Thread: why does swap not recover?

why does swap not recover?

From
Richard Yen
Date:
Hi everyone,

We've recently encountered some swapping issues on our CentOS 64GB Nehalem machine, running postgres 8.4.2.
Unfortunately, I was foolish enough to set shared_buffers to 40GB.  I was wondering if anyone would have any insight
into why the swapping suddenly starts, but never recovers?

<img src="http://richyen.com/i/swap.png">

Note, the machine has been up and running since mid-December 2009.  It was only on March 8 that this swapping began, and
it's never recovered.

If we look at dstat, we find the following:

<img src="http://richyen.com/i/dstat.png">

Note that it is constantly paging in, but never paging out.  This would indicate that it's constantly reading from
swap, but never writing out to it.  Why would postgres do this? (postgres is pretty much the only thing running on this
machine).
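
One way to cross-check a dstat reading like this is to sample the kernel's own swap counters. The sketch below assumes a Linux host where `/proc/vmstat` exposes the `pswpin` and `pswpout` counters. The helper names and the sample values are made up for illustration and are not measurements from this machine.

```python
def parse_vmstat(text):
    """Parse the 'name value' lines of /proc/vmstat into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def swap_delta(before, after):
    """Return (pages swapped in, pages swapped out) between two samples."""
    return (after["pswpin"] - before["pswpin"],
            after["pswpout"] - before["pswpout"])


# Hypothetical sample contents; on a real host, read /proc/vmstat twice,
# a few seconds apart, and pass in the two file contents instead.
sample_1 = "pgpgin 1000\npswpin 500\npswpout 200\n"
sample_2 = "pgpgin 1800\npswpin 900\npswpout 200\n"

swapped_in, swapped_out = swap_delta(parse_vmstat(sample_1), parse_vmstat(sample_2))
print(f"swapped in: {swapped_in} pages, swapped out: {swapped_out} pages")
```

A growing `pswpin` alongside a flat `pswpout` would match the pattern described: pages are read back from swap without new pages being written to it.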

I'm planning on lowering the shared_buffers to a more sane value, like 25GB (pgtune recommends this for a Mixed-purpose
machine) or less (pgtune recommends 14GB for an OLTP machine).  However, before I do this (and possibly resolve the
issue), I was hoping to see if anyone would have an explanation for the constant reading from swap, but never writing
back.

--Richard

Re: why does swap not recover?

From
Scott Marlowe
Date:
On Fri, Mar 26, 2010 at 5:57 PM, Richard Yen <dba@richyen.com> wrote:
> Hi everyone,
>
> We've recently encountered some swapping issues on our CentOS 64GB Nehalem

What version of CentOS?  How up to date is it?  Are there any other
settings that aren't defaults in things like /etc/sysctl.conf?

Re: why does swap not recover?

From
Scott Carey
Date:
On Mar 26, 2010, at 4:57 PM, Richard Yen wrote:

> Hi everyone,
>
> We've recently encountered some swapping issues on our CentOS 64GB Nehalem machine, running postgres 8.4.2.
> Unfortunately, I was foolish enough to set shared_buffers to 40GB.  I was wondering if anyone would have any insight
> into why the swapping suddenly starts, but never recovers?
>
> <img src="http://richyen.com/i/swap.png">
>
> Note, the machine has been up and running since mid-December 2009.  It was only on March 8 that this swapping began,
> and it's never recovered.
>
> If we look at dstat, we find the following:
>
> <img src="http://richyen.com/i/dstat.png">
>
> Note that it is constantly paging in, but never paging out.  This would indicate that it's constantly reading from
> swap, but never writing out to it.  Why would postgres do this? (postgres is pretty much the only thing running on this
> machine).
>
> I'm planning on lowering the shared_buffers to a more sane value, like 25GB (pgtune recommends this for a
> Mixed-purpose machine) or less (pgtune recommends 14GB for an OLTP machine).  However, before I do this (and possibly
> resolve the issue), I was hoping to see if anyone would have an explanation for the constant reading from swap, but
> never writing back.

Until recently, Linux did not account for shared memory properly in its swap 'aggressiveness' decisions.
Setting shared_buffers larger than 35% of RAM is asking for trouble.

You could try adjusting the 'swappiness' setting on the fly and seeing how it reacts, but one consequence of that is
trading off disk swapping for kswapd using up tons of CPU, causing other trouble.

Either use one of the last few kernel versions (I forget which addressed the memory accounting issues, and haven't
tried it myself), or turn shared_buffers down.  I recommend trying 10GB or so to start.
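
To put the sizes mentioned in this thread side by side, here is a small worked calculation. It assumes the 35% figure is meant as a fraction of the machine's 64GB of RAM. The labels and the helper name are only for illustration.

```python
RAM_GB = 64             # machine size from the original post
LIMIT_FRACTION = 0.35   # guideline quoted above, assumed to be a fraction of RAM

# shared_buffers sizes that come up in the thread, in GB
candidates_gb = {
    "current setting": 40,
    "pgtune, mixed-purpose": 25,
    "pgtune, OLTP": 14,
    "suggested starting point": 10,
}


def fraction_of_ram(shared_buffers_gb, ram_gb=RAM_GB):
    """Return shared_buffers as a fraction of total RAM."""
    return shared_buffers_gb / ram_gb


for label, size_gb in candidates_gb.items():
    frac = fraction_of_ram(size_gb)
    verdict = "over" if frac > LIMIT_FRACTION else "within"
    print(f"{label}: {size_gb} GB = {frac:.1%} of RAM ({verdict} the 35% guideline)")
```

Under that reading, both the current 40GB and the 25GB mixed-purpose value exceed the guideline, while 14GB and 10GB fall within it.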

>
> --Richard
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance


Re: why does swap not recover?

From
Craig James
Date:
On 3/26/10 4:57 PM, Richard Yen wrote:
> Hi everyone,
>
> We've recently encountered some swapping issues on our CentOS 64GB Nehalem machine, running postgres 8.4.2.
> Unfortunately, I was foolish enough to set shared_buffers to 40GB.  I was wondering if anyone would have any insight
> into why the swapping suddenly starts, but never recovers?
>
> <img src="http://richyen.com/i/swap.png">
>
> Note, the machine has been up and running since mid-December 2009.  It was only on March 8 that this swapping began,
> and it's never recovered.
>
> If we look at dstat, we find the following:
>
> <img src="http://richyen.com/i/dstat.png">
>
> Note that it is constantly paging in, but never paging out.

This happens when you have too many processes using too much space to fit in real memory, but none of them are changing
their memory image.  If the system swaps a process in, but that process doesn't change anything in memory, then there
are no dirty pages and the kernel can just kick the process out of memory without writing anything back to the swap disk
-- the data in the swap are still valid.

It's a classic problem when processes are running round-robin. Say you have space for 100 processes, but you're running
101 processes.  When you get to #101, #1 is the oldest so it swaps out.  Then #1 runs, and #2 is the oldest, so it
gets kicked out.  Then #2 runs and kicks out #3 ... and so forth.  Going from 100 to 101 processes brings the system
nearly to a halt.
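
The round-robin case can be sketched as a toy simulation. It assumes each process is swapped as a single unit, eviction is strictly oldest-first, and processes never modify their memory. Real kernels swap individual pages and use more elaborate policies, and the function name is made up for illustration.

```python
from collections import OrderedDict


def simulate(num_procs, frames, rounds):
    """Run processes round-robin with oldest-first eviction.

    Processes only read their memory, so a copy already on swap stays valid.
    Returns (swap_ins, swap_outs).
    """
    resident = OrderedDict()   # process id -> None, ordered oldest-first
    on_swap = set()            # processes with a still-valid copy on swap
    swap_ins = 0
    swap_outs = 0

    for _ in range(rounds):
        for pid in range(num_procs):
            if pid in resident:
                resident.move_to_end(pid)          # mark as most recently used
                continue
            if len(resident) >= frames:
                victim = resident.popitem(last=False)[0]   # evict the oldest
                if victim not in on_swap:
                    on_swap.add(victim)            # first eviction: write to swap
                    swap_outs += 1
                # otherwise the swap copy is still valid: drop without writing
            if pid in on_swap:
                swap_ins += 1                      # read the image back from swap
            resident[pid] = None
    return swap_ins, swap_outs


print("100 processes, 100 slots:", simulate(100, 100, 10))
print("101 processes, 100 slots:", simulate(101, 100, 10))
```

With 100 processes nothing is ever swapped. With 101, every process is written to swap once, and from then on every turn needs a swap-in while no further swap-outs occur, which is the "paging in but never out" pattern.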

Some operating systems try to use tricks to keep this from happening, but it's a hard problem to solve.

Craig

Re: why does swap not recover?

From
Richard Yen
Date:
On Mar 26, 2010, at 5:25 PM, Scott Carey wrote:
> Linux until recently does not account for shared memory properly in its swap 'aggressiveness' decisions.
> Setting shared_buffers larger than 35% is asking for trouble.
>
> You could try adjusting the 'swappiness' setting on the fly and seeing how it reacts, but one consequence of that is
> trading off disk swapping for kswapd using up tons of CPU causing other trouble.

Thanks for the tip.  I believe we've tried tuning the 'swappiness' setting on the fly, but it had no effect.  We're
hypothesizing that perhaps 'swappiness' only comes into effect at the beginning of a process, so we would have to
restart the daemon to actually make it go into effect--would you know about this?

> Either use one of the last few kernel versions (I forget which addressed the memory accounting issues, and haven't
> tried it myself), or turn shared_buffers down.  I recommend trying 10GB or so to start.

We're currently using CentOS with kernel 2.6.18-164.6.1.el5 and all the default settings.  If this is after the one that
dealt with memory accounting issues, I agree that I'll likely have to lower my shared_buffers.

My sysctl.conf shows the following:
> kernel.msgmnb = 65536
> kernel.msgmax = 65536
> kernel.shmmax = 68719476736
> kernel.shmall = 4294967296
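
For reference, these two shared memory limits can be converted to more readable units. `kernel.shmmax` is expressed in bytes and `kernel.shmall` in pages. The 4096-byte page size is an assumption (the usual x86-64 value) and can be checked with `getconf PAGE_SIZE`.

```python
PAGE_SIZE = 4096             # bytes per page; assumed, verify with `getconf PAGE_SIZE`
GIB = 1024 ** 3

shmmax_bytes = 68719476736   # kernel.shmmax: largest single segment, in bytes
shmall_pages = 4294967296    # kernel.shmall: total shared memory, in pages

shmmax_gib = shmmax_bytes / GIB
shmall_gib = shmall_pages * PAGE_SIZE / GIB

print(f"shmmax = {shmmax_gib:.0f} GiB per segment")
print(f"shmall = {shmall_gib:.0f} GiB in total")
```

Under that page-size assumption, neither limit would stop a 40GB shared_buffers segment from being allocated, so these settings do not themselves cap shared memory below physical RAM.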

BTW, I forgot to mention that I'm using FusionIO drives for my data storage, but I'm pretty sure this is not relevant
to the issue I'm having.

Thanks for the help!
--Richard

Re: why does swap not recover?

From
Josh Berkus
Date:
On 3/26/10 4:57 PM, Richard Yen wrote:
> I'm planning on lowering the shared_buffers to a more sane value, like 25GB (pgtune recommends this for a
> Mixed-purpose machine) or less (pgtune recommends 14GB for an OLTP machine).  However, before I do this (and possibly
> resolve the issue), I was hoping to see if anyone would have an explanation for the constant reading from swap, but
> never writing back.

Postgres does not control how swap is used.  This would be an operating
system issue.  Leaving aside the distinct possibility of a bug in
handling swap (nobody seems to do it well), there's the distinct
possibility that you're actually pinning more memory on the system than
it has (through various processes) and it's wisely shifted some
read-only files to the swap (as opposed to read-write ones).  But that's
a fairly handwavy guess.

--
                                  -- Josh Berkus
                                     PostgreSQL Experts Inc.
                                     http://www.pgexperts.com

Re: why does swap not recover?

From
Robert Haas
Date:
On Fri, Mar 26, 2010 at 7:57 PM, Richard Yen <dba@richyen.com> wrote:
> Note that it is constantly paging in, but never paging out.  This would indicate that it's constantly reading from
> swap, but never writing out to it.  Why would postgres do this? (postgres is pretty much the only thing running on this
> machine).
>
> I'm planning on lowering the shared_buffers to a more sane value, like 25GB (pgtune recommends this for a
> Mixed-purpose machine) or less (pgtune recommends 14GB for an OLTP machine).  However, before I do this (and possibly
> resolve the issue), I was hoping to see if anyone would have an explanation for the constant reading from swap, but
> never writing back.

Reading a page in from swap still leaves that data on the disk.  So it
may be that you're reading in pages from disk, not modifying them,
discarding them (without any need to write them out since they're
still on disk), and then reading them in again when they're accessed
again.

...Robert