Discussion: Memory Allocation


Memory Allocation

From: "Ryan Hansen"
Date:

Hey all,

 

This may be more of a Linux question than a PG question, but I’m wondering if any of you have successfully allocated more than 8 GB of memory to PG before.

 

I have a fairly robust server running Ubuntu Hardy Heron, 24 GB of memory, and I’ve tried to commit half the memory to PG’s shared buffer, but it seems to fail.  I’m setting the kernel shared memory accordingly using sysctl, which seems to work fine, but when I set the shared buffer in PG and restart the service, it fails if it’s above about 8 GB.  I actually have it currently set at 6 GB.
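(For reference, the knobs involved are roughly these -- the numbers shown are illustrative rather than my exact values:)

    # /etc/sysctl.conf -- SysV shared memory limits (numbers illustrative)
    kernel.shmmax = 13958643712    # max segment size in bytes (~13 GB, > shared_buffers + overhead)
    kernel.shmall = 3407872        # total shared memory in 4 kB pages (shmmax / 4096)
    # apply with: sudo sysctl -p

    # postgresql.conf
    shared_buffers = 12GB          # half of the 24 GB; startup fails once this passes ~8 GB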

 

I don’t have the exact failure message handy, but I can certainly get it if that helps.  Mostly I’m just looking to know if there’s any general reason why it would fail, some inherent kernel or db limitation that I’m unaware of. 

 

If it matters, this DB is going to be hosting and processing hundreds of GB and eventually TB of data, it’s a heavy read-write system, not transactional processing, just a lot of data file parsing (python/bash) and bulk loading.  Obviously the disks get hit pretty hard already, so I want to make the most of the large amount of available memory wherever possible.  So I’m trying to tune in that direction.

 

Any info is appreciated.

 

Thanks!

Re: Memory Allocation

From: Alan Hodgson
Date:
On Wednesday 26 November 2008, "Ryan Hansen"
<ryan.hansen@brightbuilders.com> wrote:
> This may be more of a Linux question than a PG question, but I'm
> wondering if any of you have successfully allocated more than 8 GB of
> memory to PG before.
>

CentOS 5, 24GB shared_buffers on one server here. No problems.

--
Alan

Re: Memory Allocation

From: Carlos Moreno
Date:
Ryan Hansen wrote:
>
> Hey all,
>
> This may be more of a Linux question than a PG question, but I’m
> wondering if any of you have successfully allocated more than 8 GB of
> memory to PG before.
>
> I have a fairly robust server running Ubuntu Hardy Heron, 24 GB of
> memory, and I’ve tried to commit half the memory to PG’s shared
> buffer, but it seems to fail.
>

Though I'm not sure why this is happening or whether it is normal, I
would suggest that such a setting may be too high. From the Annotated
postgresql.conf document at

http://www.powerpostgresql.com/Downloads/annotated_conf_80.html,

the suggested range is 8 to 400MB. They specifically say that it
should never be set to more than 1/3 of the available memory, which
in your case is precisely the 8GB figure (I guess that's just a
coincidence --- I doubt that the server would be written so that it
fails to start if shared_buffers is more than 1/3 of available RAM).

Another important parameter that you don't mention is the
effective_cache_size, which that same document suggests should
be about 2/3 of available memory. (this tells the planner the amount
of data that it can "probabilistically" expect to reside in memory due
to caching, and as such, the planner is likely to produce more
accurate estimates and thus better query optimizations).

Maybe you could set shared_buffers to, say, 1 or 2GB (that's already
beyond the recommended figure, but given that you have 24GB, it
may not hurt), and then effective_cache_size to 16GB or so?
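In postgresql.conf terms, that would be something like:

    # postgresql.conf -- the figures suggested above
    shared_buffers = 2GB
    effective_cache_size = 16GB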

HTH,

Carlos
--


Re: Memory Allocation

From: Tom Lane
Date:
"Ryan Hansen" <ryan.hansen@brightbuilders.com> writes:
> I have a fairly robust server running Ubuntu Hardy Heron, 24 GB of memory,
> and I've tried to commit half the memory to PG's shared buffer, but it seems
> to fail.  I'm setting the kernel shared memory accordingly using sysctl,
> which seems to work fine, but when I set the shared buffer in PG and restart
> the service, it fails if it's above about 8 GB.

Fails how?  And what PG version is that?

FWIW, while there are various schools of thought on how large to make
shared_buffers, pretty much everybody agrees that half of physical RAM
is not the sweet spot.  What you're likely to get is maximal
inefficiency with every active disk page cached twice --- once in kernel
space and once in shared_buffers.

            regards, tom lane

Re: Memory Allocation

From: Scott Carey
Date:

Tuning for bulk loading:

 

Make sure the Linux kernel parameters in /proc/sys/vm related to the page cache are set well.

Set swappiness to 0 or 1.

Make sure you understand and configure /proc/sys/vm/dirty_background_ratio and /proc/sys/vm/dirty_ratio well.

With enough RAM the default on some kernel versions is way, way off (40% of RAM with dirty pages!  yuck).

http://www.westnet.com/~gsmith/content/linux-pdflush.htm

If postgres is doing a lot of caching for you you probably want dirty_ratio at 10% or less, and you'll want the OS to start flushing to disk sooner rather than later.  A dirty_background_ratio of 3% with 24GB of RAM  is 720MB -- a pretty big buffer.  I would not personally want this buffer to be larger than 5 seconds of max write speed of the disk I/O.
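A minimal sketch of the vm knobs mentioned here (the ratios are illustrative; size them against your RAM and disk write speed as described above):

    # /etc/sysctl.conf -- page cache writeback tuning (values illustrative)
    vm.swappiness = 1                # as recommended above (0 or 1)
    vm.dirty_ratio = 10              # hard ceiling: writers must flush beyond this % of RAM
    vm.dirty_background_ratio = 3    # background flushing starts here; 3% of 24GB ~= 720MB,
                                     # so consider going lower per the 5-seconds-of-writes rule
    # apply with: sudo sysctl -p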

 

You'll need to tune your background writer to be aggressive enough to actually write data fast enough so that checkpoints don't suck, and tune your checkpoint size and settings as well.  Turn on checkpoint logging on the database and run tests while looking at the output of those.

Ideally, most of your batch writes have made it to the OS before the checkpoint, and the OS has actually started moving most of it to disk.  If your settings are wrong, you'll have the data buffered twice, and most or nearly all of it will be in memory when the checkpoint happens, and the checkpoint will take a LONG time.  The default Linux settings + default postgres settings + large shared_buffers will almost guarantee this situation for bulk loads.

Both have to be configured with complementary settings.  If you have a large postgres buffer, the OS buffer should be small and write more aggressively.  If you have a small postgres buffer, the OS can be more lazy and cache much more.
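A sketch of the corresponding postgresql.conf knobs (8.3-era parameter names; the numbers are illustrative starting points, not recommendations):

    # postgresql.conf -- checkpoint / background writer tuning (values illustrative)
    log_checkpoints = on                  # log each checkpoint so you can see how long it takes
    checkpoint_segments = 64              # more WAL between checkpoints for bulk loads
    checkpoint_timeout = 15min
    checkpoint_completion_target = 0.7    # spread checkpoint writes over the interval
    bgwriter_delay = 200ms
    bgwriter_lru_maxpages = 1000          # let the background writer push dirty buffers out faster
    bgwriter_lru_multiplier = 4.0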

 

 


Re: Memory Allocation

From: "Kevin Grittner"
Date:
>>> Scott Carey <scott@richrelevance.com> wrote:
> Set swappiness to 0 or 1.

We recently converted all 72 remote county databases from 8.2.5 to
8.3.4.  In preparation we ran a test conversion of a large county over
and over with different settings to see what got us the best
performance.  Setting swappiness below the default degraded
performance for us in those tests for identical data, same hardware,
no other changes.

Our best guess is that code which really wasn't getting called got
swapped out leaving more space in the OS cache, but that's just a
guess.  Of course, I'm sure people would not be recommending it if
they hadn't done their own benchmarks to confirm that this setting
actually improved things in their environments, so the lesson here is
to test for your environment when possible.

-Kevin

Re: Memory Allocation

From: Scott Carey
Date:
Swappiness optimization is going to vary.  Definitely test on your own.

For a bulk load database with a large page cache, swappiness = 60 (the default) is _GUARANTEED_ to force the OS to swap out some of Postgres while in heavy use.  This is heavily dependent on the page cache size, work_mem size, and concurrency.  I've had significantly increased performance setting this value low (1000x! -- if your DB starts swapping postgres, you're performance-DEAD).  The default has the OS targeting close to 60% of the memory for page cache.  On a 32GB server, with a 7GB postgres buffer cache and several concurrent queries reading GBs of data and using 500MB+ work_mem (huge aggregates), the default swappiness will choose to page out postgres with about 19GB of disk page cache left to evict, with disastrous results.  And that is a read-only test.  Tests with writes can trigger it earlier if combined with bad dirty_buffers settings.

The root of the problem is that the Linux paging algorithm estimates that I/O for file read access is as costly as I/O for paging.  A reasonable assumption for a desktop, a ridiculously false assumption for a large database with high capacity DB file I/O and a much lower capability swap file.  Not only that -- page-in is almost always near pure random reads, but DB I/O is often sequential.  So losing 100MB of cached db file takes a lot less time to scan back in than 100MB of the application.

If you do have enough other applications that are idle that take up RAM that should be pushed out to disk from time to time (perhaps your programs that are doing the bulk loading?) a higher value is useful.  Although it is not exact, think of the swappiness value as the percentage of RAM that the OS would prefer page cache to applications (very roughly).

The more RAM you have and the larger your postgres memory usage, the lower the swappiness value should be.  60% of 24GB is ~14.5GB; if you have that much stuff in RAM that should be paged out to save space, try it.

I currently use a value of 1, on a 32GB machine, and about 600MB of 'stuff' gets paged out normally, 1400MB under heavy load.  This is a dedicated machine.  Higher values page out more stuff, which increases the cache size and helps performance a little, but under heavy load it hits the paging wall and falls over.  The small improvement in performance when the system is not completely stressed is not worth risking hitting the wall for me.

***For a bulk load database, one is optimizing for _writes_ and extra page cache doesn't help writes like it does reads.***

When I use a machine with misc. other lower priority apps and less RAM, I have found larger values to be helpful.

If your DB is configured with a low shared_buffers and small work_mem, you probably want the OS to use that much memory for disk pages, and again a higher swappiness may be more optimal.

Like all of these settings, tune to your application and test.  Many of these settings are things that go hand in hand with others, but alone don't make as much sense.  Tuning Postgres to do most of the caching and making the OS get out of the way is far different than tuning the OS to do as much caching work as possible and minimizing postgres.  Which of those two strategies is best is highly application dependent, somewhat OS dependent, and also hardware dependent.
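For what it's worth, checking and changing this is just a sysctl away (a minimal sketch; the value of 1 mirrors what I use above, so adjust per your own tests):

    # set swappiness low now, and persist it across reboots
    sudo sysctl -w vm.swappiness=1
    echo 'vm.swappiness = 1' | sudo tee -a /etc/sysctl.conf

    # watch whether anything actually gets swapped while the load runs
    grep -E 'Swap(Total|Free)' /proc/meminfo
    vmstat 5        # si/so columns show swap-in/swap-out activity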


Re: Memory Allocation

From: "Kevin Grittner"
Date:
I'm hoping that through compare/contrast we might help someone start
closer to their own best values....

>>> Scott Carey <scott@richrelevance.com> wrote:
> Tests with writes can trigger it earlier if combined with bad
> dirty_buffers settings.

We've never, ever modified dirty_buffers settings from defaults.

> The root of the problem is that the Linux paging algorithm estimates
> that I/O for file read access is as costly as I/O for paging.  A
> reasonable assumption for a desktop, a ridiculously false assumption
> for a large database with high capacity DB file I/O and a much lower
> capability swap file.

Our swap file is not on lower speed drives.

> If you do have enough other applications that are idle that take up
> RAM that should be pushed out to disk from time to time (perhaps your
> programs that are doing the bulk loading?) a higher value is useful.

Bulk loading was ssh cat | psql.
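(Roughly along these lines -- the host, file, and database names are hypothetical:

    ssh loadhost "cat county_dump.sql" | psql county_db

)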

> The more RAM you have and the larger your postgres memory usage, the
> lower the swappiness value should be.

I think the test environment had 8 GB RAM with 256 MB in
shared_buffers.  For the conversion we had high work_mem and
maintenance_work_mem settings, and we turned fsync off, along with a
few other settings we would never use during production.
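(In postgresql.conf terms, something like the sketch below -- the parameter names are the ones mentioned above; the numbers other than shared_buffers are hypothetical:

    # conversion-time settings only; never run production this way
    shared_buffers = 256MB
    work_mem = 256MB                 # "high" for the conversion; figure hypothetical
    maintenance_work_mem = 1GB       # figure hypothetical
    fsync = off

)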

> I currently use a value of 1, on a 32GB machine, and about 600MB of
> 'stuff' gets paged out normally, 1400MB under heavy load.

Outside of bulk load, we've rarely seen anything swap, even under
load.

> ***For a bulk load database, one is optimizing for _writes_ and extra
> page cache doesn't help writes like it does reads.***

I'm thinking that it likely helps when indexing tables for which data
has recently been loaded.  It also might help minimize head movement
and/or avoid the initial disk hit for a page which subsequently gets
hint bits set.

> Like all of these settings, tune to your application and test.

We sure seem to agree on that.

-Kevin