Thread: Config parameters

Config parameters

From: "Jeremy Haile"
I'm curious what parameters you guys typically *always* adjust on new
PostgreSQL installs.

I am working with a database that contains several large tables (10-20
million rows) and many smaller tables (hundreds of rows).  My system
has 2 GB of RAM currently, although I will be upping it to 4 GB soon.

My motivation in asking this question is to make sure I'm not making a
big configuration no-no by missing a parameter, and also for my own
checklist of parameters I should almost always set when configuring a
new install.

The parameters that I almost always change when installing a new system
are shared_buffers, max_fsm_pages, checkpoint_segments, and
effective_cache_size.
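
For reference, that block of my postgresql.conf currently looks
something like this (the numbers are just what I happen to run on the
2 GB box, not recommendations):

# postgresql.conf -- illustrative values for a 2 GB machine
shared_buffers = 50000            # in 8 kB pages, so ~400 MB
max_fsm_pages = 2000000           # free-space slots for the big tables
checkpoint_segments = 16          # fewer, larger checkpoints
effective_cache_size = 131072     # ~1 GB; a guess at the OS cache size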

Are there any parameters I'm missing that should always be changed
when deploying to a decent server?


Re: Config parameters

From: Richard Huxton
Jeremy Haile wrote:
> I'm curious what parameters you guys typically *always* adjust on new
> PostgreSQL installs.

> The parameters that I almost always change when installing a new system
> are shared_buffers, max_fsm_pages, checkpoint_segments, and
> effective_cache_size.

Always: work_mem, maintenance_work_mem
Also consider temp_buffers and random_page_cost.

A lot will depend on how much of the data you handle ends up cached.
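
To see what you're running with now, SHOW works for any of these:

SHOW work_mem;
SHOW maintenance_work_mem;
SHOW temp_buffers;
SHOW random_page_cost;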

--
   Richard Huxton
   Archonet Ltd

Re: Config parameters

From: "Jeremy Haile"
What is a decent default setting for work_mem and maintenance_work_mem,
considering I am regularly querying tables that are tens of millions of
rows and have 2-4 GB of RAM?

Also - what is the best way to determine decent settings for
temp_buffers and random_page_cost?


On Tue, 02 Jan 2007 16:34:19 +0000, "Richard Huxton" <dev@archonet.com>
said:
> Jeremy Haile wrote:
> > I'm curious what parameters you guys typically *always* adjust on new
> > PostgreSQL installs.
>
> > The parameters that I almost always change when installing a new system
> > are shared_buffers, max_fsm_pages, checkpoint_segments, and
> > effective_cache_size.
>
> Always: work_mem, maintenance_work_mem
> Also consider temp_buffers and random_page_cost.
>
> A lot will depend on how much of the data you handle ends up cached.
>
> --
>    Richard Huxton
>    Archonet Ltd

Re: Config parameters

From: Richard Huxton
Jeremy Haile wrote:
> What is a decent default setting for work_mem and maintenance_work_mem,
> considering I am regularly querying tables that are tens of millions of
> rows and have 2-4 GB of RAM?

Well, work_mem will depend on your query-load. Queries that do a lot of
sorting should benefit from increased work_mem. You only have limited
RAM though, so it's a balancing act between memory used to cache disk
and per-process sort memory. Note that work_mem is per sort, so you can
use multiples of that amount in a single query. You can issue a "set" to
change the value for a session.
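
For example (table and column names here are made up):

SET work_mem = '64MB';    -- session-local; on pre-8.2 use kB: 65536
SELECT * FROM big_table   -- stand-in for a large, sort-heavy query
ORDER BY some_column;
RESET work_mem;           -- back to the server default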

How you set maintenance_work_mem will depend on whether you vacuum
continually (e.g. autovacuum) or at set times.

> Also - what is the best way to determine decent settings for
> temp_buffers and random_page_cost?

With all of these, testing I'm afraid. The only sure thing you can say
is that random_page_cost should be 1 if all your database fits in RAM.
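
A quick way to check whether that's plausible (8.1 and later):

SELECT pg_size_pretty(pg_database_size(current_database()));

If that comes out comfortably under your RAM, the random/sequential
distinction mostly disappears once the cache is warm.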

--
   Richard Huxton
   Archonet Ltd

Re: Config parameters

From: "Jeremy Haile"
Thanks for the information!

Are there any rule-of-thumb starting points for these values that you
use when setting up servers?  I'd at least like a starting point for
testing different values.

For example, I'm sure setting a default work_mem of 100MB is usually
overkill - but is 5MB usually a reasonable number?  20MB?  My system
does not have a huge number of concurrent users, but they are hitting
large tables.  I'm not sure what numbers people usually use here
successfully.

For maintenance_work_mem, I turned off autovacuum to save on
overhead, but run a vacuum analyze once an hour.  My current database
characteristics are heavy insert (bulk inserts every 5 minutes) and a
moderate amount of selects on large, heavily indexed tables.

For temp_buffers - any rule of thumb starting point?  What's the best
way to evaluate if this number is adjusted correctly?

For random_page_cost - is the default of 4 pretty good for most drives?
Do you usually bump it up to 3 on modern servers?  I've usually done
internal RAID setups, but the database I'm currently working on is
hitting a SAN over fiber.

I realize that these values can vary a lot based on a variety of factors
- but I'd love some more advice on what good rule-of-thumb starting
points are for experimentation and how to evaluate whether the values
are set correctly. (in the case of temp_buffers and work_mem especially)


On Tue, 02 Jan 2007 18:49:54 +0000, "Richard Huxton" <dev@archonet.com>
said:
> Jeremy Haile wrote:
> > What is a decent default setting for work_mem and maintenance_work_mem,
> > considering I am regularly querying tables that are tens of millions of
> > rows and have 2-4 GB of RAM?
>
> Well, work_mem will depend on your query-load. Queries that do a lot of
> sorting should benefit from increased work_mem. You only have limited
> RAM though, so it's a balancing act between memory used to cache disk
> and per-process sort memory. Note that work_mem is per sort, so you can
> use multiples of that amount in a single query. You can issue a "set" to
> change the value for a session.
>
> How you set maintenance_work_mem will depend on whether you vacuum
> continually (e.g. autovacuum) or at set times.
>
> > Also - what is the best way to determine decent settings for
> > temp_buffers and random_page_cost?
>
> With all of these, testing I'm afraid. The only sure thing you can say
> is that random_page_cost should be 1 if all your database fits in RAM.
>
> --
>    Richard Huxton
>    Archonet Ltd

Re: Config parameters

From: Scott Marlowe
On Tue, 2007-01-02 at 13:19, Jeremy Haile wrote:
> Thanks for the information!
>
> Are there any rule-of-thumb starting points for these values that you
> use when setting up servers?  I'd at least like a starting point for
> testing different values.
>
> For example, I'm sure setting a default work_mem of 100MB is usually
> overkill - but is 5MB usually a reasonable number?  20MB?  My system
> does not have a huge number of concurrent users, but they are hitting
> large tables.  I'm not sure what numbers people usually use here
> successfully.

The setting for work_mem is very dependent on how many connections
you'll be processing simultaneously, and how likely they are to be
doing sorts.

If you'll only ever have 5 connections to a database on a machine with
a lot of memory, then setting it to 100M is probably fine.  Keep in
mind, the limit is PER SORT, not per query.  A reasonable goal is to
cap the memory that could go to sorts at about 25% of the machine's
total RAM.

So, on a 4 Gig machine you could divide 1G (25%) by the total possible
connections, then again by the average number of sorts you'd expect per
query / connection to get an idea.
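
For example, with made-up numbers: 25% of 4 Gig is 1 Gig; if
max_connections is 50 and you expect an average of 2 sorts per query,
that's 1G / 50 / 2 = roughly 10 Meg for work_mem.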

Also, you can set it smaller than that, and for a given connection, set
it on the fly when needed.

<CONNECT>
SET work_mem = '100MB';  -- or SET work_mem = 102400; (bare values are kB)
SELECT ..... ;
<DISCONNECT>

And you run less risk of blowing out the machine with joe user's random
query.

> For maintenance_work_mem, I turned off autovacuum to save on
> overhead, but run a vacuum analyze once an hour.  My current database
> characteristics are heavy insert (bulk inserts every 5 minutes) and a
> moderate amount of selects on large, heavily indexed tables.

Did you turn off stats collection as well?  That's really the major
performance issue with autovacuum, not autovacuum itself.  Plus, if
you've got a table that really needs vacuuming every 5 minutes to keep
the database healthy, you may be working against yourself by turning off
autovacuum.

I.e. the cure may be worse than the disease.  OTOH, if you don't delete
/ update often, then don't worry about it.
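
For reference, these are the knobs involved (names as of 8.1/8.2;
they've been reworked in later releases):

# postgresql.conf
stats_start_collector = on   # master switch for the stats collector
stats_row_level = on         # row-level stats; autovacuum requires this
autovacuum = on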

> For temp_buffers - any rule of thumb starting point?  What's the best
> way to evaluate if this number is adjusted correctly?

Haven't researched temp_buffers at all.

> For random_page_cost - is the default of 4 pretty good for most drives?
> Do you usually bump it up to 3 on modern servers?  I've usually done
> internal RAID setups, but the database I'm currently working on is
> hitting a SAN over fiber.

random_page_cost is the hardest one to come up with a proper setting for.  If
you're hitting a RAID10 with 40 disk drives or some other huge drive
array, you might need to crank up random_page_cost to some very large
number, as sequential accesses are often preferred there.  I believe
there were some posts by Luke Lonergan (sp) a while back where he had
set random_page_cost to 20 or something even higher on a large system
like that.

On data sets that fit in memory, the cost nominally approaches 1.  On
smaller work group servers with a single mirror set for a drive
subsystem and moderate to large data sets, I've found values of 1.4 to
3.0 to be reasonable, depending on the workload.

> I realize that these values can vary a lot based on a variety of factors
> - but I'd love some more advice on what good rule-of-thumb starting
> points are for experimentation and how to evaluate whether the values
> are set correctly. (in the case of temp_buffers and work_mem especially)

To see if the values are good or not, run a variety of your worst
queries on the machine while varying the settings to see which run
best.  That will at least let you know if you're close.  While you
can't change shared_buffers on the fly, you can change work_mem and
random_page_cost on the fly, per connection, to see the change.
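
For example, from psql (the selects are placeholders for your own
worst queries):

SET work_mem = '32MB';
SET random_page_cost = 2.5;
EXPLAIN ANALYZE SELECT ..... ;   -- problem query here
SET random_page_cost = 4;
EXPLAIN ANALYZE SELECT ..... ;   -- same query; compare plan and timing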


Re: Config parameters

From: "Jeremy Haile"
> So, on a 4 Gig machine you could divide 1G (25%) by the total possible
> connections, then again by the average number of sorts you'd expect per
> query / connection to get an idea.

Thanks for the advice.  I'll experiment with higher work_mem settings,
as I am regularly doing sorts on large datasets.  I imagine the default
setting isn't very optimal in my case.


> Did you turn off stats collection as well?  That's really the major
> performance issue with autovacuum, not autovacuum itself.

I did turn off stats collection.  I'm not sure how much of a difference
it makes, but I was trying to squeeze every ounce of performance out of
the database.


> I.e. the cure may be worse than the disease.  OTOH, if you don't delete
> / update often, then don't worry about it.

I hardly ever delete.  I do update regularly, but only on small
tables, so it doesn't make as big of a difference.  I do huge inserts,
which is why turning off stats/autovacuum gives me some performance
benefit.  I usually only do deletes nightly in large batches, so
vacuuming/analyzing once an hour works fairly well.


> Haven't researched temp_buffers at all.

Do you usually change temp_buffers?  Mine is currently at the default
setting.  I guess I could arbitrarily bump it up - but I'm not sure what
the consequences would be or how to tell if it is set correctly.


> random_page_cost is the hardest one to come up with a proper setting for.

This definitely sounds like the hardest one to figure out, since it
seems to be almost all trial-and-error.  I'll play with some different
values.  This is only used by the query planner, right?  How much of a
performance difference does tweaking this number usually make?  (i.e.
how much improvement would someone usually expect when they find that
2.5 works better than 4?)


> While you can't change shared_buffers on the fly, you can change
> work_mem and random_page_cost on the fly, per connection, to see the
> change.

Thanks for the advice.  I was aware you could change work_mem on the
fly, but didn't think about setting random_page_cost on-the-fly.