Josh Berkus <josh@agliodbs.com> writes:
> Folks,
>
> At OSDL we're seeing a weird performance crash on 8.1cvs. What's weird about
> it is that it doesn't happen all the time -- about 1 out of 4 test runs.
> What it looks like happens sometimes is that performance drops dramatically
> at the first checkpoint, and never comes back. But there are oprofiles and
> things to make a more insightful analysis:
>
> 3 test runs exhibit it:
> http://khack.osdl.org/stp/301531/0.html
> http://khack.osdl.org/stp/301736/0.html
> http://khack.osdl.org/stp/301730/0.html

That drop-off at 60 minutes is the *first* checkpoint?! On an 80m test run?
That's a totally unrealistic configuration. Do you have any reason to think
the drop-off isn't just because all the pending I/O that you've postponed for
so long is finally having to get written out? Worse, it's forcing Postgres to
fsync files after 60m of I/O has been performed, flushing huge queues of I/O.

The benchmarks performed in this configuration are completely bogus. They
aren't including the time to checkpoint the last 20m of I/O, a quarter of all
the I/O in the test.

You really have to lower the checkpoint timeout to something realistic, like
5m or so. Otherwise these tests are just useless.
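To put rough numbers on that, here is a minimal sketch in Python. It assumes checkpoints fire only on the timeout, at fixed intervals from the start of the run, and that I/O is spread evenly over the run. Neither assumption is exact for a real server, and the function name is made up for illustration.

```python
def uncheckpointed_tail(run_minutes, checkpoint_timeout_minutes):
    """Return (tail_minutes, tail_fraction) of I/O left after the last checkpoint.

    Assumes checkpoints fire only on the timeout, at fixed intervals from the
    start of the run, and that I/O is spread evenly over the run.
    """
    if run_minutes <= 0 or checkpoint_timeout_minutes <= 0:
        raise ValueError("durations must be positive")
    # Minutes of work done after the last timed checkpoint.
    tail = run_minutes % checkpoint_timeout_minutes
    return tail, tail / run_minutes


# Compare the 60m timeout from the test runs with the suggested 5m.
for timeout in (60, 5):
    tail, fraction = uncheckpointed_tail(80, timeout)
    print(f"timeout {timeout}m: last {tail}m of I/O never checkpointed ({fraction:.0%})")
```

Under those assumptions the 60m timeout leaves 20m (25%) of the I/O outside any checkpoint, while a 5m timeout can never leave more than 5m uncounted on a run of any length.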
--
greg