Thread: brk() function and performance

brk() function and performance

From: Andrew Sullivan
Date:
Hi,

We're running PostgreSQL 7.1.3 (I know, I know) on Solaris 7 on two
Sun E4500s with 8 CPUs and 16 Gig of RAM.

We have noticed that one of the machines is considerably slower than
the other.  We have traced the problem to the brk() function call.

We were having some trouble with certain queries, because we were
spending a lot of time moving blocks between the OS filesystem
buffers and the Postgres shared buffer.  That was on server A.  So,
we played with some settings, and settled on shared_buffers = 262144.

On server B, we did not see the same problem.  There,
shared_buffers=8192.  That is the only difference between the
machines, except that server A also sees a lot more interactive
traffic (server B is a replicated copy, and handles a number of
read-only queries).

We increased the shared_buffers setting on server A a few weeks ago,
and saw an immediate (albeit slight, but enough for us) improvement
in the system.  But in the past few days, we have experienced
sluggish behaviour.

By selecting a single line from a frequently-accessed, relatively
small table (< 300 rows), and selecting on an indexed field, we get
the following difference in the truss output:

Server A: the query takes 700-800 ms.
syscall      seconds   calls
brk              .27      62

Server B: the query takes 200-300 ms.
syscall      seconds   calls
brk              .02      64

Everything else is the same.

The backend has been running since the shared memory change.  I was
wondering if perhaps the problem is what brk() is doing.  Maybe it
needs a contiguous segment, and when it goes to allocate more of its
reserved memory, it has to shift the whole thing around?  (If so,
this is a clear reason why not to use huge shared buffers.)  The
Solaris man page doesn't make clear how this works (and in fact seems
to suggest that brk() shouldn't be used).

Any remarks, pointers, or suggestions would be welcome.  I'm stumped.
This is very puzzling.

A

--
----
Andrew Sullivan                               87 Mowat Avenue
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M6K 3E3
                                         +1 416 646 3304 x110


Re: brk() function and performance

From: Andrew Sullivan
Date:
On Thu, Jul 11, 2002 at 12:30:12PM -0400, Andrew Sullivan wrote:
> Hi,
>
> We're running PostgreSQL 7.1.3 (I know, I know) on Solaris 7 on two
> Sun E4500s with 8 CPUs and 16 Gig of RAM.
>
> We have noticed that one of the machines is considerably slower than
> the other.  We have traced the problem to the brk() function call.

My Sun-loving colleague, Sorin Iszlai, wondered why this problem was
cropping up, and remembered the qsort() debacle.  So he did some
tests.  Guess what?  Here's what he found:

> I ran some tests with the realloc() function from the standard lib;
> If the application calls realloc() 4096 times the results are:

> - if linked with bsdmalloc, realloc() calls brk() 17 times only:
> syscall      seconds   calls
> brk              .40      17

> - and without bsdmalloc :
> syscall      seconds   calls
> brk             1.36   24527

At this rate, I'm beginning to get the feeling that maybe getting
FreeBSD to work well on 64 bit Sun machines is the most important
project we could undertake ;-)

Anyway, I'm going to do some tests with this, but in the meantime, if
anyone has any views on the subject, insights, or experience, it'd be
much appreciated.

Thanks.

A

--
----
Andrew Sullivan                               87 Mowat Avenue
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M6K 3E3
                                         +1 416 646 3304 x110


Re: brk() function and performance

From: Bruce Momjian
Date:
Yow.  What are those Solaris engineers doing over there?

---------------------------------------------------------------------------

Andrew Sullivan wrote:
> On Thu, Jul 11, 2002 at 12:30:12PM -0400, Andrew Sullivan wrote:
> > Hi,
> >
> > We're running PostgreSQL 7.1.3 (I know, I know) on Solaris 7 on two
> > Sun E4500s with 8 CPUs and 16 Gig of RAM.
> >
> > We have noticed that one of the machines is considerably slower than
> > the other.  We have traced the problem to the brk() function call.
>
> My Sun-loving colleague, Sorin Iszlai, wondered why this problem was
> cropping up, and remembered the qsort() debacle.  So he did some
> tests.  Guess what?  Here's what he found:
>
> > I ran some tests with the realloc() function from the standard lib;
> > If the application calls realloc() 4096 times the results are:
>
> > - if linked with bsdmalloc, realloc() calls brk() 17 times only:
> > syscall      seconds   calls
> > brk              .40      17
>
> > - and without bsdmalloc :
> > syscall      seconds   calls
> > brk             1.36   24527
>
> At this rate, I'm beginning to get the feeling that maybe getting
> FreeBSD to work well on 64 bit Sun machines is the most important
> project we could undertake ;-)
>
> Anyway, I'm going to do some tests with this, but in the meantime, if
> anyone has any views on the subject, insights, or experience, it'd be
> much appreciated.
>
> Thanks.
>
> A

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: brk() function and performance

From: Gregory Seidman
Date:
Andrew Sullivan sez:
} On Thu, Jul 11, 2002 at 12:30:12PM -0400, Andrew Sullivan wrote:
[...]
} > We have noticed that one of the machines is considerably slower than
} > the other.  We have traced the problem to the brk() function call.
}
} My Sun-loving colleague, Sorin Iszlai, wondered why this problem was
} cropping up, and remembered the qsort() debacle.  So he did some
} tests.  Guess what?  Here's what he found:
}
} > I ran some tests with the realloc() function from the standard lib;
} > If the application calls realloc() 4096 times the results are:
}
} > - if linked with bsdmalloc, realloc() calls brk() 17 times only:
} > syscall      seconds   calls
} > brk              .40      17
}
} > - and without bsdmalloc :
} > syscall      seconds   calls
} > brk             1.36   24527
}
} At this rate, I'm beginning to get the feeling that maybe getting
} FreeBSD to work well on 64 bit Sun machines is the most important
} project we could undertake ;-)
}
} Anyway, I'm going to do some tests with this, but in the meantime, if
} anyone has any views on the subject, insights, or experience, it'd be
} much appreciated.

Way back when I was a college freshman or sophomore, I was talking to a
professor who mentioned having had tremendous problems with the brk()
system call robbing him of system performance. His solution, since brk() is
called when malloc decides it needs another page or so, was to allocate a
*tremendous* amount of memory at the very beginning of his run, then free
it all. This meant that Solaris mapped a whole bunch of pages to his app
with just one brk() call, and once it was released it was in malloc's free
list. The pages weren't swapped/paged or anything because until they were
written to or read from, they didn't even really exist except in the OS's
internal tables. It just took the OS out of the loop in memory allocation.

This may or may not be a good solution. I would expect it to fail or have
bad performance characteristics on at least some flavors of Unix, and
probably Windows. Still, it might be worth looking into on Solaris.

} Thanks.
} A
--Greg


Re: brk() function and performance

From: Andrew Sullivan
Date:
On Tue, Jul 16, 2002 at 10:28:02AM -0400, Andrew Sullivan wrote:
> On Thu, Jul 11, 2002 at 12:30:12PM -0400, Andrew Sullivan wrote:
> >
> > We have noticed that one of the machines is considerably slower than
> > the other.  We have traced the problem to the brk() function call.

More news, in case anyone is interested.

It appears, after poking around the Net, that Sun ships their
poor-performing malloc as the default on purpose, because it uses
less memory.  You can set your CFLAGS="-lbsdmalloc" if you want to
use the BSD library (which is on the system by default), or even just
set LD_PRELOAD to pick up the BSD malloc instead (the latter seems to
work just fine for the postmaster, but it breaks some other things,
so I think I'd compile against it instead for any real work).   The
BSD malloc uses about 4 times the memory of the Solaris version, but
it's plenty faster.  Memory is cheap.

Further tests, however, seem to indicate that brk() is not our main
problem.  On a test machine today, we found simple selects on a table
with only a couple hundred rows are taking > 300 milliseconds when we
set the shared buffers to some large number (like enough to allocate
a Gig of memory), more than 250 ms when running with about 512 Meg of
shared memory, but under 125 ms when running with a small shared
buffer setting (say, enough to allocate less than 200 meg -- one test
we allocated only 4 meg).  The main culprit seems to be a memset()
call that happens over and over to the same address.  I've no idea
why, but there it is.

The same results are _not_ found in testing with 7.2.1.  In that
case, allocating a Gig of shared memory does not seem to affect the
result at all.  The only question is whether they might be if we ran
a lot of updates against the 7.2.x tree.  (We tarred up and copied the
data tree from production, since I had it from a recent maintenance
period; but we had to use pg_dump to put the data into the 7.2
database, obviously).  We'll do a great whack of updates, and see if
that makes a difference.

A

--
----
Andrew Sullivan                               87 Mowat Avenue
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M6K 3E3
                                         +1 416 646 3304 x110


Re: brk() function and performance

From: Tom Lane
Date:
Andrew Sullivan <andrew@libertyrms.info> writes:
> On a test machine today, we found simple selects on a table
> with only a couple hundred rows are taking > 300 milliseconds when we
> set the shared buffers to some large number (like enough to allocate
> a Gig of memory), more than 250 ms when running with about 512 Meg of
> shared memory, but under 125 ms when running with a small shared
> buffer setting (say, enough to allocate less than 200 meg -- one test
> we allocated only 4 meg).  The main culprit seems to be a memset()
> call that happens over and over to the same address.  I've no idea
> why, but there it is.

Hmph.  There are some places in the bufmgr that do sequential scans of
the whole buffer array, which might account for a slowdown with huge
numbers of buffers.  I do not think any of them are in hotspot paths
however --- at least not in any recent release.  This test was on
7.1.something, wasn't it?

Could you recompile with profiling enabled and see where the time is
really going with the large number of buffers?

> The same results are _not_ found in testing with 7.2.1.

This might mean we already fixed the bottleneck, in which case the
question becomes less interesting (at least to me ;-)).

            regards, tom lane

Re: brk() function and performance

From: Bruce Momjian
Date:
Any update on this?

---------------------------------------------------------------------------

Andrew Sullivan wrote:
> On Tue, Jul 16, 2002 at 10:28:02AM -0400, Andrew Sullivan wrote:
> > On Thu, Jul 11, 2002 at 12:30:12PM -0400, Andrew Sullivan wrote:
> > >
> > > We have noticed that one of the machines is considerably slower than
> > > the other.  We have traced the problem to the brk() function call.
>
> More news, in case anyone is interested.
>
> It appears, after poking around the Net, that Sun ships their
> poor-performing malloc as the default on purpose, because it uses
> less memory.  You can set your CFLAGS="-lbsdmalloc" if you want to
> use the BSD library (which is on the system by default), or even just
> set LD_PRELOAD to pick up the BSD malloc instead (the latter seems to
> work just fine for the postmaster, but it breaks some other things,
> so I think I'd compile against it instead for any real work).   The
> BSD malloc uses about 4 times the memory of the Solaris version, but
> it's plenty faster.  Memory is cheap.
>
> Further tests, however, seem to indicate that brk() is not our main
> problem.  On a test machine today, we found simple selects on a table
> with only a couple hundred rows are taking > 300 milliseconds when we
> set the shared buffers to some large number (like enough to allocate
> a Gig of memory), more than 250 ms when running with about 512 Meg of
> shared memory, but under 125 ms when running with a small shared
> buffer setting (say, enough to allocate less than 200 meg -- one test
> we allocated only 4 meg).  The main culprit seems to be a memset()
> call that happens over and over to the same address.  I've no idea
> why, but there it is.
>
> The same results are _not_ found in testing with 7.2.1.  In that
> case, allocating a Gig of shared memory does not seem to affect the
> result at all.  The only question is whether they might be if we ran
> a lot of updates against the 7.2.x tree.  (We tarred up and copied the
> data tree from production, since I had it from a recent maintenance
> period; but we had to use pg_dump to put the data into the 7.2
> database, obviously).  We'll do a great whack of updates, and see if
> that makes a difference.
>
> A

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: brk() function and performance

From: Andrew Sullivan
Date:
On Tue, Aug 27, 2002 at 12:31:00PM -0400, Bruce Momjian wrote:
>
> Any update on this?

Sorry, yes. . .

>
> ---------------------------------------------------------------------------
>
> Andrew Sullivan wrote:

> >
> > The same results are _not_ found in testing with 7.2.1.  In that
> > case, allocating a Gig of shared memory does not seem to affect the
> > result at all.  The only question is whether they might be if we ran
> > a lot of updates against the 7.2.x tree.  (We tarred up and copied the
> > data tree from production, since I had it from a recent maintenance
> > period; but we had to use pg_dump to put the data into the 7.2
> > database, obviously).  We'll do a great whack of updates, and see if
> > that makes a difference.

We ran 100,000 updates against the same record on a table (vacuuming,
sometimes, of course), and were unable to reproduce the slowdown.  My
best bet is that someone happened to fix this problem by accident.
It could have been related to any of dozens of improvements in 7.2,
of course, but whatever it was, it seems to be gone.

A

--
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110