Thread: Linux machine aggressively clearing cache

Linux machine aggressively clearing cache

From
Joshua Berkus
Date:
Have run across some memory behavior on Linux I've never seen before.

Server running RHEL6 with 96GB of RAM.
Kernel 2.6.32
PostgreSQL 9.0
208GB database with fairly random accesses over 50% of the database.

Now, here's the weird part: even after a week of uptime, only 21 to 25GB of cache is ever used, and there's constantly
20GB to 35GB free memory.  This would mean a small working set, except that we see constant reads from disk (1 to
15MB/s) and around 1/3 of queries are slowed by iowaits.

In an effort to test this, we deliberately ran a pg_dump.  This did grow the cache to all available memory, but Linux
rapidly cleared the cache (flushing to disk) down to 25GB within an hour.

sys.kernel.vm parameters are all defaults.  None of the parameters seem to specifically relate to the size of the page
cache.

Has anyone ever seen this before?  What did you do about it?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco
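
One way to watch the symptom described in this message (a small page cache alongside a lot of free memory) is to sample /proc/meminfo over time. The following is a minimal Python sketch and was not posted in the thread. The function names `parse_meminfo`, `cache_summary`, and `read_cache_summary` are hypothetical, and the sketch assumes the usual `MemTotal`, `MemFree`, `Buffers`, and `Cached` fields, reported in kB.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field name: integer value}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def cache_summary(meminfo):
    """Return total, free, and buffers+cached sizes in GB (1 GB = 1024 * 1024 kB)."""
    kb_per_gb = 1024 * 1024
    return {
        "total_gb": meminfo["MemTotal"] / kb_per_gb,
        "free_gb": meminfo["MemFree"] / kb_per_gb,
        "cache_gb": (meminfo["Buffers"] + meminfo["Cached"]) / kb_per_gb,
    }


def read_cache_summary(path="/proc/meminfo"):
    """Read the given meminfo file (Linux only) and summarize it."""
    with open(path) as f:
        return cache_summary(parse_meminfo(f.read()))
```

Calling `read_cache_summary()` from a cron job or a loop and logging the result would show whether `cache_gb` really plateaus while `free_gb` stays large.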

Re: Linux machine aggressively clearing cache

From
Claudio Freire
Date:
On Tue, Mar 27, 2012 at 5:06 PM, Joshua Berkus <josh@agliodbs.com> wrote:
> In an effort to test this, we deliberately ran a pg_dump.  This did grow the cache to all available memory, but Linux
> rapidly cleared the cache (flushing to disk) down to 25GB within an hour.

This would happen if some queries (or some program) briefly uses that
much memory (pushing the cache off RAM).

Re: Linux machine aggressively clearing cache

From
Dave Crooke
Date:
This may just be a typo, but if you really did create write (dirty) block device cache by writing the pg_dump file somewhere, then that is what it's supposed to do ;) Linux is more aggressive about write cache and will allow more of it to build up than e.g. HP-UX which will start to throttle process-to-cache writes to avoid getting too far behind.

Read cache of course does not need to be flushed and can simply be dumped when the memory is needed, and so Linux will keep more or less unlimited amounts of read cache until it needs the memory for something else. Here is the output of "free" on my laptop, showing ~2.5GB of read cache that can be freed almost instantly if needed for process memory, write cache, kernel buffers, etc. The -/+ line shows the net amount being used by processes.

dave:~$ free
             total       used       free     shared    buffers     cached
Mem:       8089056    7476424     612632          0     603508    2556584
-/+ buffers/cache:    4316332    3772724
Swap:     24563344    1176284   23387060
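
As a worked check of how the "-/+ buffers/cache" row relates to the "Mem:" row above, here is a minimal Python sketch (not part of the original message; the function name `buffers_cache_line` is hypothetical). It assumes the layout printed by older versions of free, where that row is derived by moving buffers and cached from "used" to "free".

```python
def buffers_cache_line(used, free, buffers, cached):
    """Recompute the "-/+ buffers/cache" row of older `free` output (values in kB)."""
    # Memory actually held by processes: "used" minus reclaimable cache.
    used_by_processes = used - buffers - cached
    # Memory obtainable without swapping: "free" plus reclaimable cache.
    free_if_cache_dropped = free + buffers + cached
    return used_by_processes, free_if_cache_dropped


# Values copied from the Mem: row above.
print(buffers_cache_line(used=7476424, free=612632, buffers=603508, cached=2556584))
# prints (4316332, 3772724), matching the -/+ buffers/cache row
```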

Redirecting  pg_dump >/dev/null  will read the DB without writing anything, but it's pretty resource intensive. If you just want to get the database tables into the OS read cache, you can do it much more cheaply with   sudo tar cvf - /var/lib/postgresql/8.4/main/base | cat >/dev/null  or similar. (The pipe through cat is needed because GNU tar detects when its stdout is connected directly to /dev/null and then cheats and doesn't do the reads.)
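
The same cache-warming idea (read every file and throw the data away) can be sketched in Python. This is an illustration only, not something from the original message; the function name `read_tree` and the 1 MB chunk size are assumptions.

```python
import os


def read_tree(root, chunk_size=1024 * 1024):
    """Read every regular file under `root` and discard the data.

    Returns the total number of bytes read. Reading the files pulls their
    blocks through the OS page cache, like the tar pipeline above.
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path):
                continue  # skip sockets, broken symlinks, and similar entries
            with open(path, "rb") as f:
                while True:
                    chunk = f.read(chunk_size)
                    if not chunk:
                        break
                    total += len(chunk)
    return total
```

For example, `read_tree("/var/lib/postgresql/8.4/main/base")` would cover the same directory as the tar command, provided it is run as a user that can read the data files.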

In the second "free" output below, the kernel has grabbed what it can for cache, leaving only ~64MB of actual free memory for instant use.

dave:~$ pg_dump -F c hyper9db >/dev/null
dave:~$ free
             total       used       free     shared    buffers     cached
Mem:       8089056    8024252      64804          0     287432    3797956
-/+ buffers/cache:    3938864    4150192
Swap:     24563344    1166556   23396788
dave:~$

Cheers
Dave

On Tue, Mar 27, 2012 at 3:06 PM, Joshua Berkus <josh@agliodbs.com> wrote:
... but Linux rapidly cleared the cache (flushing to disk) down to 25GB within an hour.

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: Linux machine aggressively clearing cache

From
Joshua Berkus
Date:
> This may just be a typo, but if you really did create write (dirty)
> block device cache by writing the pg_dump file somewhere, then that
> is what it's supposed to do ;)

The pg_dump was run across the network, so the only caching on the machine was read caching.

> Read cache of course does not need to be flushed and can simply be
> dumped when the memory is needed, and so Linux will keep more or
> less unlimited amounts of read cache until it needs the memory for
> something else ....

Right, that's the normal behavior.  Except not on this machine.

--Josh

Re: Linux machine aggressively clearing cache

From
Josh Berkus
Date:
>> Read cache of course does not need to be flushed and can simply be
>> dumped when the memory is needed, and so Linux will keep more or
>> less unlimited amounts of read cache until it needs the memory for
>> something else ....
>
> Right, that's the normal behavior.  Except not on this machine.

So this turned out to be a Linux kernel issue.  Will document it on
www.databasesoup.com.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

Re: Linux machine aggressively clearing cache

From
Steve Crawford
Date:
On 03/30/2012 05:51 PM, Josh Berkus wrote:
>
> So this turned out to be a Linux kernel issue.  Will document it on
> www.databasesoup.com.
Anytime soon? About to build two PostgreSQL servers and wondering if you
have uncovered a kernel version or similar issue to avoid.

Cheers,
Steve


Re: Linux machine aggressively clearing cache

From
Josh Berkus
Date:
On 4/12/12 8:47 AM, Steve Crawford wrote:
> On 03/30/2012 05:51 PM, Josh Berkus wrote:
>>
>> So this turned out to be a Linux kernel issue.  Will document it on
>> www.databasesoup.com.
> Anytime soon? About to build two PostgreSQL servers and wondering if you
> have uncovered a kernel version or similar issue to avoid.

Yeah, I'll blog it.


--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

Re: Linux machine aggressively clearing cache

From
Ross Reedstrom
Date:
On Wed, Apr 18, 2012 at 05:09:29PM -0700, Josh Berkus wrote:
> On 4/12/12 8:47 AM, Steve Crawford wrote:
> > On 03/30/2012 05:51 PM, Josh Berkus wrote:
> >>
> >> So this turned out to be a Linux kernel issue.  Will document it on
> >> www.databasesoup.com.
> > Anytime soon? About to build two PostgreSQL servers and wondering if you
> > have uncovered a kernel version or similar issue to avoid.
>
> Yeah, I'll blog it.

Since I'm doing some backlog catchup, I'll do some community/archive service
and provide the link:

http://www.databasesoup.com/2012/04/red-hat-kernel-cache-clearing-issue.html

Ross
--
Ross Reedstrom, Ph.D.                                 reedstrm@rice.edu
Systems Engineer & Admin, Research Scientist        phone: 713-348-6166
Connexions                  http://cnx.org            fax: 713-348-3665
Rice University MS-375, Houston, TX 77005
GPG Key fingerprint = F023 82C8 9B0E 2CC6 0D8E  F888 D3AE 810E 88F0 BEDE