Discussion: Relationship between ulimit and effective_cache_size, work_mem, shared_buffers, maintenance_work_mem and vacuum


From: Sandeep Srinivasa
hi guys,
we have a single Ubuntu 10.04 box on which we are going to be running a Postgres 8.4 server, a Ruby on Rails (Passenger) app and a Solr search server.
I was looking at ways to optimize the postgres database and yet limit the amount of memory that it can consume.
 
I am going to set shared_buffers to 256MB, work_mem to 12MB, and temp_buffers to 20MB (on a 4GB machine).
Now, the "effective cache size" variable seems more of a hint to the query planner, than any hard limit on the database server. 

Q1. If I add "ulimit -m" and "ulimit -v" lines in my postgres upstart files, will that be good enough to hard-limit Postgres memory usage?

Q2. Once I have decided my max memory allocation (call it MY_ULIMIT), should effective_cache_size be set to MY_ULIMIT - 256 - 12 - 20? Round it off to MY_ULIMIT - 512MB, maybe...

Q3. Or will doing something like this play havoc with the query planner, or cause unexplained OOM kills/crashes? I ask this because there are other variables that I am not sure will play nicely with ulimit:
1. Will this affect the memory usage of vacuum (I am going to be using the default vacuum settings for 8.4)? Ideally I would want to have some control over it as well.
2. Would I have to tune max_connections, max_files_per_process (and any related variables)?
3. When I turn on WAL, would I have to tune wal_buffers accordingly, and set effective_cache_size to account for wal_buffers as well?


thanks
-Sandeep

From: Greg Smith

Sandeep Srinivasa wrote:
>
> I was looking at ways to optimize the postgres database and yet limit
> the amount of memory that it can consume.

You can't usefully limit the amount of memory that PostgreSQL will
consume yet.  Each client can easily allocate multiples of work_mem
worth of memory as they run queries, and there's temp_buffers to be
concerned about too.  One way to work around problems like this is to
significantly limit the number of clients that can be running at a time,
using something like a connection pooler, and then keep a low setting
for max_connections.  This can have some positive performance benefits
of its own, and it will keep the number of clients (the only real source
of variable allocations on the server) from getting too large.  Software
suitable for that purpose includes pgbouncer and pgpool.
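
As a rough sketch of that combination (the database name, port and pool sizes here are only placeholders, not recommendations, and pgbouncer's auth settings are omitted):

    # postgresql.conf: keep the hard cap on backends low
    max_connections = 40

    # pgbouncer.ini (illustrative values)
    [databases]
    mydb = host=127.0.0.1 port=5432 dbname=mydb

    [pgbouncer]
    listen_port = 6432
    default_pool_size = 20    # real server connections per database/user pair
    max_client_conn = 200     # application connections accepted and queued

The application then connects to port 6432 instead of 5432, and the pooler makes sure only a bounded number of real backends ever exist.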

> Now, the "effective cache size" variable seems more of a hint to the
> query planner, than any hard limit on the database server.

That's correct.  It doesn't allocate anything.  Doesn't limit anything
either.

> Q1. If I add "ulimit -m" and "ulimit -v" lines in my postgres upstart
> files, will that be good enough to hard-limit Postgres memory usage?

After fighting a few random crashes where the server runs into ulimit,
you will find that trying to hard limit PostgreSQL memory usage is more
trouble than it's worth.  It's really a bad place to go.
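
(For reference, what the question describes would amount to something like this, either as an upstart stanza or in a wrapper script; purely illustrative, and as above I would not recommend it:

    # upstart stanza form: limit <resource> <soft> <hard>, address space in bytes
    limit as 2147483648 2147483648

    # or in a wrapper script before starting the postmaster; ulimit -v takes kB
    ulimit -v 2097152

A backend that runs into the limit simply fails whatever allocation it was making, which shows up as seemingly random query errors or crashes rather than anything graceful.)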

> Q2. Once I have decided my max memory allocation (call it MY_ULIMIT),
> should effective_cache_size be set to MY_ULIMIT - 256 - 12 - 20?
> Round it off to MY_ULIMIT - 512MB, maybe...

effective_cache_size has no relation to the limits.  Take a look at how
much of your operating system cache you think might be free at any
time.  Figure out what percentage of that you might want PostgreSQL to
be able to use sometime.  Set effective_cache_size to it.  If a query
goes wild and decides to execute a really bad query plan that reads a
bunch of data, it will trash the operating system cache regardless; you
can't stop it like this.
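
For example, on the 4GB box described here the estimate might go something like this (all of these numbers are rough guesses, not measurements):

    # total RAM                                 4096MB
    # minus shared_buffers                       256MB
    # minus OS, Rails/Passenger, Solr, etc.    ~1500MB  (check with free/top under real load)
    # leaves roughly 2GB of OS cache the database can hope to benefit from
    effective_cache_size = 2GB    # just a planner hint; nothing is allocated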

> Q3. Or will doing something like this play havoc with the query
> planner, or cause unexplained OOM kills/crashes?

If you ever let the system get so low on RAM that the Linux OOM killer
becomes active, it will almost always kill the main database server
process, the one that spawns all the clients off, due to how Linux does
shared memory accounting.  This is another really bad thing to be avoided.

> 1. Will this affect the memory usage of vacuum (I am going to be using
> the default vacuum settings for 8.4)? Ideally I would want to have some
> control over it as well.

Each of the autovacuum processes (defaulting to 3) will use up to
maintenance_work_mem worth of memory when they are running.  You should
account for that when estimating peak usage.
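
A back-of-the-envelope peak estimate might look like this (purely illustrative; the 40-connection figure is from the pooling sketch earlier, the rest from the original question, and a single query can use several work_mem-sized allocations):

    #   shared_buffers                                 256MB
    # + backends: 40 connections * 12MB work_mem       480MB  (worst case higher, as noted above)
    # + backends: 40 connections * 20MB temp_buffers   800MB  (only if every session uses temp tables)
    # + autovacuum: 3 workers * maintenance_work_mem
    # + wal_buffers and other shared memory overhead (a few MB)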

> 2. Would I have to tune max_connections, max_files_per_process (and
> any related variables)?

Limiting max_connections, and accordingly dealing with the fact that
some connections might be refused temporarily in your application, is
the most effective thing you can do here.  max_files_per_process is
really secondary to any of the other bits you're asking about.

> 3. When I turn on WAL, would I have to tune wal_buffers accordingly,
> and set effective_cache_size to account for wal_buffers as well?

Set wal_buffers somewhere between 1MB and 16MB, include it in the
general server shared memory overhead, and then ignore it.  It takes up
a little memory but isn't nearly as important as these other bits.
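
In postgresql.conf terms that's just something like:

    wal_buffers = 8MB    # anywhere in the 1MB to 16MB range; count it as part of shared memory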

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us


From: Sandeep Srinivasa

Thanks for the comprehensive reply. It definitely cleared a lot of things up for me.

regards
Sandeep
