Обсуждение: Recommended optimisations slows down PostgreSQL 8.4

Поиск
Список
Период
Сортировка

Recommended optimisations slows down PostgreSQL 8.4

От
Waldo Nell
Дата:
I have PostgreSQL 8.4.8 on Ubuntu Linux x64.  Server is a Core i7 950 with 6GB of RAM.  2GB of RAM us used by Java,
somesmall amount by the kernel / services and the rest is available to PostgreSQL.  Hard drive is a single 7200 RPM
SATA1TB Caviar Black HDD.  No other applications / processes are running when I perform my tests. 

I have an application that performs about 80% reads and 20% writes for a specific billrun.   It takes about 60 minutes
tocomplete, and I can have it perform precisely the same queries repeatedly.  I have consistently showed that when
shared_buffers= 24MB (the default), and wal_buffers = 64kB, the system completes the process in 50 minutes.  When I
bumpshared_buffers to 1500MB, the system slows down and takes 60 minutes to complete the same process.  Changing that
backto 24MB, but then changing wal_buffers to 16MB has the same impact - performance drops from 50 minutes to about 61
minutes. Changing those two parameters back to the defaults returns the time to 50 minutes. 

fsync = off for these tests - not sure if it is relevant.  All other settings are at their defaults.

Please explain why the system is slower with the recommended values for these two settings?  The DB is about 74GB, the
largesttable has 180 million rows. 

Re: Recommended optimisations slows down PostgreSQL 8.4

От
"ktm@rice.edu"
Дата:
On Thu, Aug 11, 2011 at 04:35:34PM -0700, Waldo Nell wrote:
> I have PostgreSQL 8.4.8 on Ubuntu Linux x64.  Server is a Core i7 950 with 6GB of RAM.  2GB of RAM us used by Java,
somesmall amount by the kernel / services and the rest is available to PostgreSQL.  Hard drive is a single 7200 RPM
SATA1TB Caviar Black HDD.  No other applications / processes are running when I perform my tests. 
>
> I have an application that performs about 80% reads and 20% writes for a specific billrun.   It takes about 60
minutesto complete, and I can have it perform precisely the same queries repeatedly.  I have consistently showed that
whenshared_buffers = 24MB (the default), and wal_buffers = 64kB, the system completes the process in 50 minutes.  When
Ibump shared_buffers to 1500MB, the system slows down and takes 60 minutes to complete the same process.  Changing that
backto 24MB, but then changing wal_buffers to 16MB has the same impact - performance drops from 50 minutes to about 61
minutes. Changing those two parameters back to the defaults returns the time to 50 minutes. 
>
> fsync = off for these tests - not sure if it is relevant.  All other settings are at their defaults.
>
> Please explain why the system is slower with the recommended values for these two settings?  The DB is about 74GB,
thelargest table has 180 million rows. 

One guess is that you are using the defaults for other costing parameters and they
do not accurately reflect your system. This means that it will be a crap shoot as
to whether a plan is faster or slower and what will affect the timing.

Regards,
Ken

Re: Recommended optimisations slows down PostgreSQL 8.4

От
Waldo Nell
Дата:
On 2011-08-11, at 17:18 , ktm@rice.edu wrote:

> One guess is that you are using the defaults for other costing parameters and they
> do not accurately reflect your system. This means that it will be a crap shoot as
> to whether a plan is faster or slower and what will affect the timing.

Ok, but I thought the way to best optimise PostgreSQL is to start with the parameters having the biggest impact and
workfrom there.  To adjust multiple parameters would not give a clear indication as to the benefit of each, as they may
canceleach other out. 

To test your theory, what other parameters should I be looking at?  Here are some more with their current values:

random_page_cost = 4.0
effective_cache_size = 128MB

Remember this runs on SATA so random seeks are not as fast as say SSD.

Re: Recommended optimisations slows down PostgreSQL 8.4

От
Greg Smith
Дата:
On 08/11/2011 07:35 PM, Waldo Nell wrote:
> Please explain why the system is slower with the recommended values for these two settings?

If the other parameters are at their defaults, the server is probably
executing a checkpoint every few seconds running your test.  I'd wager
your log is filled with checkpoint warnings, about them executing too
frequently.

Each time a checkpoint happens, all of shared_buffers is dumped out.  So
if you increase shared_buffers, but don't space the checkpoints out
more, it just ends up writing the same data over and over again.  Using
a smaller shared_buffers lets the OS deal with that problem instead, so
it actually ends up being more efficient.

Basically, you can't increase shared_buffers usefully without also
increasing checkpoint_segments.  All three of shared_buffers,
wal_buffers, and checkpoint_segments have to go up before you'll see the
expected benefit from raising any of them; you can't change a parameter
at a time and expect an improvement.  Try this:

shared_buffers=512MB
wal_buffers=16MB
checkpoint_segments=64

And see how that does.  If that finally beats your 50 minute record, you
can see if further increase to shared_buffers and checkpoint_segments
continue to help from there.  Effective upper limits on your server are
probably around 2GB for shared_buffers and around 256 for
checkpoint_segments; they could be lower if your application uses a lot
of transitory data (gets read/written once, so the database cache is no
better than the OS one).

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us


Re: Recommended optimisations slows down PostgreSQL 8.4

От
Waldo Nell
Дата:
On 2011-08-11, at 18:17 , Greg Smith wrote:

> shared_buffers=512MB
> wal_buffers=16MB
> checkpoint_segments=64

Thanks for the advice.  I tried these values... And it is even worse - went up to 63 minutes (from 60 minutes).  Like I
saidthis load is read mostly.  My 80 / 20% might be a bit inaccurate, if so it could be more like 90% read and 10%
write. I do not see any checkpoint warnings in my logs. 

I guess that means the OS cache is better for this particular use case than the postgresql cache?


Re: Recommended optimisations slows down PostgreSQL 8.4

От
Merlin Moncure
Дата:
On Thu, Aug 11, 2011 at 7:27 PM, Waldo Nell <pwnell@telkomsa.net> wrote:
>
> On 2011-08-11, at 17:18 , ktm@rice.edu wrote:
>
>> One guess is that you are using the defaults for other costing parameters and they
>> do not accurately reflect your system. This means that it will be a crap shoot as
>> to whether a plan is faster or slower and what will affect the timing.
>
> Ok, but I thought the way to best optimise PostgreSQL is to start with the parameters having the biggest impact and
workfrom there.  To adjust multiple parameters would not give a clear indication as to the benefit of each, as they may
canceleach other out. 

A couple points:
*) shared buffers is a highly nuanced setting that is very workload
dependent.  it mainly affects write heavy loads, and the pattern of
writing is very important in terms of the benefits you may or may not
see. it also changes checkpoint behavior -- this will typically
manifest as a negative change with raising buffers but this can be
mitigated. if your i/o becomes very bursty after raising this setting
it's a red flag that more tuning is required.

*) fsync = off: throw the book out on traditional tuning advice.  with
this setting (dangerously) set, the o/s is essentially responsible for
i/o patterns so you should focus your tuning efforts there.  the
benefits of raising shared buffers don't play as much in this case.

> To test your theory, what other parameters should I be looking at?  Here are some more with their current values:
>
> random_page_cost = 4.0
> effective_cache_size = 128MB

*) these settings affect query plans.  changing them could have no
affect or dramatic effect depending on the specific queries you have
and if they or chosen badly due to overly default conservative
settings.  the postgresql planner has gotten pretty accurate over the
years in the sense that you will want to tune these to be as close to
reality as possible.

In my opinion before looking at postgresql.conf you need to make sure
your queries and their plans are good. fire up pgfouine and see where
those 60 minutes are gettings spent.   maybe you have a problem query
that demands optimization.

merlin

Re: Recommended optimisations slows down PostgreSQL 8.4

От
Waldo Nell
Дата:
On 2011-08-12, at 09:32 , Merlin Moncure wrote:

> In my opinion before looking at postgresql.conf you need to make sure
> your queries and their plans are good. fire up pgfouine and see where
> those 60 minutes are gettings spent.   maybe you have a problem query
> that demands optimization.

Thanks for your advice, I am sure to look into what you said.  I might just add some background information.  The
processused to take 266 minutes to complete - which I got down to 49 minutes.  I spent a LOT of time optimising
queries,implementing multithreading to utilise the extra cores and place more load on the DB that way etc.  So that
beingdone as best I can for now, I am focussing on the DB itself.  I am a firm believer the best place to optimise is
firstthe queries / code THEN the hardware / fine tuning the parameters. 

The fsync = off was because the production system runs on a uber expensive SAN system with multipathing over Fibre
Channel,it is on UPS and backup generators in a secure datacenter, and we have daily backups we can fall back to. 

Re: Recommended optimisations slows down PostgreSQL 8.4

От
"Kevin Grittner"
Дата:
Waldo Nell <pwnell@telkomsa.net> wrote:

> The fsync = off was because the production system runs on a uber
> expensive SAN system with multipathing over Fibre Channel, it is
> on UPS and backup generators in a secure datacenter, and we have
> daily backups we can fall back to.

Turning fsync off in production may be OK as long as those daily
backups aren't in the same building as the uber expensive SAN, and
it's really OK to fall back on a daily backup if the database server
crashes or locks up.  By the way, I never trust a backup until I
have successfully restored from it; you should probably do that at
least on some periodic basis if you can't do it every time.

The other thing I would point out is that if you are tuning with
different table sizes, RAM sizes, or I/O performance characteristics
from production, the database tuning in one environment may not have
much to do with what works best in the other environment.

As for why the recommended settings are having paradoxical effects
in this environment -- this advice is generally based on systems
with fsync on and a RAID controller with battery-backed cache
configured for write-back.  I don't know how well the advice
generalizes to a single spindle with fsync off.

-Kevin

Re: Recommended optimisations slows down PostgreSQL 8.4

От
Greg Smith
Дата:
On 08/12/2011 12:28 PM, Waldo Nell wrote:
> I guess that means the OS cache is better for this particular use case
> than the postgresql cache?

There you go.  It's not magic; the database cache has some properties
that work very well for some workloads.  And for others, you might as
well let the OS deal with it.  The fact that you have the option of
adjusting the proportions here is a controversial design point, but it
does let you tune to your workload in this sort of case.

>  The fsync = off was because the production system runs on a uber expensive SAN system with multipathing over Fibre
Channel,it is>  on UPS and backup generators in a secure datacenter, and we have daily backups we can fall back to. 


The safer alternative is to turn synchronous_commit off and increase
wal_writer_delay.  That may not be quite as fast as turning fsync off,
but the risk level is a lot lower too.  The first time someone
accidentally unplugs a part of your server, you'll realize that the UPS
and generators don't really provide very much protection against the
things that actually happen in a data center.  Having backups is great,
but needing to restore from them is no fun.

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us


Re: Recommended optimisations slows down PostgreSQL 8.4

От
Waldo Nell
Дата:
On 2011-08-12, at 10:10 , Kevin Grittner wrote:

> Turning fsync off in production may be OK as long as those daily
> backups aren't in the same building as the uber expensive SAN, and
> it's really OK to fall back on a daily backup if the database server
> crashes or locks up.  By the way, I never trust a backup until I
> have successfully restored from it; you should probably do that at
> least on some periodic basis if you can't do it every time.

Yes we have daily tape backups that are taken off site.  And since we refresh QA from prod at least 4 times a month, we
knowthe backups are good on a frequent basis.  Very valid points. 

>
> The other thing I would point out is that if you are tuning with
> different table sizes, RAM sizes, or I/O performance characteristics
> from production, the database tuning in one environment may not have
> much to do with what works best in the other environment.

My DB is an exact duplicate I took from production, however my testing is on different hardware and since I cannot
afforda SAN for testing purposes, I am testing on a 7200rpm SATA drive so yeah I guess that is true... I will need to
performancetest on production environment. 

>
> As for why the recommended settings are having paradoxical effects
> in this environment -- this advice is generally based on systems
> with fsync on and a RAID controller with battery-backed cache
> configured for write-back.  I don't know how well the advice
> generalizes to a single spindle with fsync off.

Thanks, I will be sure to carry on testing on production.


Re: Recommended optimisations slows down PostgreSQL 8.4

От
Tom Lane
Дата:
Waldo Nell <pwnell@telkomsa.net> writes:
> I have PostgreSQL 8.4.8 on Ubuntu Linux x64.  Server is a Core i7 950
> with 6GB of RAM.  2GB of RAM us used by Java, some small amount by the
> kernel / services and the rest is available to PostgreSQL.

[ and the DB is 74GB, and things get slower when raising shared_buffers
  from 24MB to 1500MB ]

One other point here is that with the DB so much larger than available
RAM, you are almost certainly doing lots of I/O (unless your test case
has lots of locality of reference).  With small shared_buffers, the
space for kernel disk cache amounts to 3 or so GB, and that's your
primary buffer against duplicate I/Os.  When you crank shared_buffers
up to half that, you now have two buffer pools of about the same size
independently trying to cache the most-used parts of the DB.  This is
likely to not work too well and result in much more I/O.  You save some
shared-buffers-to-kernel-buffers transfers with more shared_buffers, but
if the amount of disk I/O goes up a lot in consequence, you'll come out
way behind.

            regards, tom lane

Re: Recommended optimisations slows down PostgreSQL 8.4

От
Stephen Frost
Дата:
* Waldo Nell (pwnell@telkomsa.net) wrote:
> The fsync = off was because the production system runs on a uber expensive SAN system with multipathing over Fibre
Channel,it is on UPS and backup generators in a secure datacenter, and we have daily backups we can fall back to. 

So, two points: #1- the uber-expensive SAN should make twiddling fsync
have much less of an effect on performance than in a non-SAN/non-BBWC
environment, so you might validate that you really need it off.  #2- the
SAN, FC, UPS, etc will be of no help if the OS or PG crash.  Seems
pretty harsh to resort back to a daily backup in the event the OS reboots
due to some wacky NMI, or the ASR going haywire..

    Thanks,

        Stephen

Вложения