Thread: [repost] Help me develop new commit_delay advice

[repost] Help me develop new commit_delay advice

From
Peter Geoghegan
Date:
This has been reposted to this list from the pgsql-hackers list, at
the request of Josh Berkus. Hopefully there will be more interest
here.

---------- Forwarded message ----------
From: Peter Geoghegan <peter@2ndquadrant.com>
Date: 29 July 2012 16:39
Subject: Help me develop new commit_delay advice
To: PG Hackers <pgsql-hackers@postgresql.org>


Many of you will be aware that the behaviour of commit_delay was
recently changed. Now, the delay only occurs within the group commit
leader backend, and not within each and every backend committing a
transaction:

http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=f11e8be3e812cdbbc139c1b4e49141378b118dee

For those of you that didn't follow this development, I should point
out that I wrote a blogpost that described the idea, which will serve
as a useful summary:

http://pgeoghegan.blogspot.com/2012/06/towards-14000-write-transactions-on-my.html

I made what may turn out to be a useful observation during the
development of the patch, which was that for both the tpc-b.sql and
insert.sql pgbench-tools scripts, a commit_delay of half of my
wal_sync_method's raw sync time (the average duration of one sync
operation) looked optimal. I use Linux, so my wal_sync_method happened
to be fdatasync. I measured this using pg_test_fsync.

The devel docs still say of commit_delay and commit_siblings: "Good
values for these parameters are not yet clear; experimentation is
encouraged". This has been the case since Postgres 7.1 (i.e. it has
never been clear what good values were - the folk wisdom was actually
that commit_delay should always be set to 0). I hope to be able to
formulate some folk wisdom about setting commit_delay from 9.3 on,
that may go on to be accepted as an official recommendation within the
docs.

I am rather curious as to what experimentation shows optimal values
for commit_delay to be for a representative cross-section of hardware.
In particular, I'd like to see if setting commit_delay to half of raw
sync time appears to be optimal for both insert.sql and tpc-b.sql
workloads across different types of hardware with different sync
times. Now, it may be sort of questionable to take those workloads as
general proxies for performance, not least since they will literally
give Postgres as many *completely uniform* transactions as it can
handle. However, it is hard to think of another, better general proxy
for performance that is likely to be accepted as such, and will allow
us to reason about setting commit_delay.

While I am not completely confident that we can formulate a widely
useful, simple piece of advice, I am encouraged by the fact that a
commit_delay of 4,000 worked very well for both tpc-b.sql and
insert.sql workloads on my laptop, beating out settings of 3,000 and
5,000 on each benchmark. I am also encouraged by the fact that in some
cases, including both the insert.sql and tpc-b.sql cases that I've
already described elsewhere, there is actually no downside to setting
commit_delay - transaction throughput naturally improves, but
transaction latency is actually improved a bit too (or at least the
average and worst-case latencies are). This is presumably due to the amelioration
of resource contention (from greater commit batching) more than
compensating for the obvious downside of adding a delay.

It would be useful, for a start, if I had numbers for a battery-backed
write cache. I don't have access to one right now though, nor do I
have access to any more interesting hardware, which is one reason why
I'm asking for help with this.

I like to run "sync" prior to running pg_test_fsync, just in case.

[peter@peterlaptop pg_test_fsync]$ sync

I then interpret the following output:

[peter@peterlaptop pg_test_fsync]$ pg_test_fsync
2 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.

Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
        open_datasync                     112.940 ops/sec
        fdatasync                         114.035 ops/sec
        fsync                              21.291 ops/sec
*** SNIP ***

So if I can perform 114.035 8KiB sync operations per second, that's an
average of about 1 per 8.77 milliseconds, or 8770 microseconds to put
it in the units that commit_delay speaks. It is my hope that we will
find that when this number is halved, we will arrive at a figure that
is worth recommending as a general useful setting for commit_delay for
the system. I guess I could gain some additional insight by simply
changing my wal_sync_method, but I'd find it more interesting to look
at organic setups with faster (not slower) sync times than my system's
fdatasync. For those who are able to help me here, I'd like to see
pgbench-tools workloads for both tpc-b.sql and insert.sql with
incrementing values of commit_delay (increments of, say, 1000
microseconds, perhaps with less granularity where it isn't needed),
from 0 to $(1.5 times raw sync time) microseconds.
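
The arithmetic above can be sketched in Python. This is only an illustration: the function names are made up for the example, the "half of raw sync time" rule is the hypothesis being tested here rather than an established recommendation, and the 1000-microsecond step is simply the increment suggested above.

```python
import re

# Hypothetical helper names; only the pg_test_fsync output format and the
# "half of raw sync time" hypothesis come from the message above.

def parse_ops_per_sec(output: str, method: str) -> float:
    """Return the ops/sec figure reported for one wal_sync_method line."""
    pattern = rf"^\s*{re.escape(method)}\s+([0-9.]+)\s+ops/sec"
    match = re.search(pattern, output, flags=re.MULTILINE)
    if match is None:
        raise ValueError(f"no result found for {method!r}")
    return float(match.group(1))

def sync_time_us(ops_per_sec: float) -> float:
    """Average duration of one sync operation, in microseconds."""
    return 1_000_000 / ops_per_sec

def candidate_delays(ops_per_sec: float, step_us: int = 1000) -> list[int]:
    """commit_delay values to benchmark: 0 up to 1.5 times the raw sync time."""
    upper = 1.5 * sync_time_us(ops_per_sec)
    return list(range(0, int(upper) + 1, step_us))

sample = """
        open_datasync                     112.940 ops/sec
        fdatasync                         114.035 ops/sec
        fsync                              21.291 ops/sec
"""

ops = parse_ops_per_sec(sample, "fdatasync")
raw_us = sync_time_us(ops)
print(f"raw sync time: {raw_us:.0f} us")
print(f"half of it:    {raw_us / 2:.0f} us")
print("sweep:", candidate_delays(ops))
```

For the fdatasync figure above this gives a raw sync time of about 8769 microseconds, half of which is about 4385, and a sweep running from 0 to 13000 in steps of 1000.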

Thanks
--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

Re: [repost] Help me develop new commit_delay advice

From
Greg Smith
Date:
On 08/02/2012 02:02 PM, Peter Geoghegan wrote:
> I made what may turn out to be a useful observation during the
> development of the patch, which was that for both the tpc-b.sql and
> insert.sql pgbench-tools scripts, a commit_delay of half of my
> wal_sync_method's reported raw sync speed looked optimal.

I dug up what I wrote when trying to provide better advice for this
circa V8.3.  That never really gelled into something worth publishing at
the time.  But I see some patterns similar to what you're reporting,
so maybe this will be useful input to you now.  My tests then included a
7200RPM drive and a system with a BBWC.

In the BBWC case, the only useful tuning I found was to add a very small
amount of commit_delay, possibly increasing the siblings too.  I was
using http://benjiyork.com/blog/2007/04/sleep-considered-harmful.html to
figure out the minimum sleep resolution on the server (3us at the time)
and setting commit_delay to that; then increasing commit_siblings to 10
or 20.  Jignesh Shah came back with something in the same sort of range
then at
http://jkshah.blogspot.com/2007/07/specjappserver2004-and-postgresql_09.html
, setting commit_delay=10.
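
As a rough illustration of what "minimum sleep resolution" means, the sketch below times very short sleeps and reports the shortest one observed. It is not the method from the article linked above, the function name is made up for the example, and Python's interpreter overhead means the figure is only an upper bound on what a C program such as the server itself would see.

```python
import time

def min_sleep_resolution_us(requested_us: float = 1.0, samples: int = 200) -> float:
    """Shortest observed wall-clock duration of a very short sleep, in microseconds."""
    best = float("inf")
    for _ in range(samples):
        start = time.perf_counter()
        time.sleep(requested_us / 1_000_000)  # ask for a sleep of requested_us
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)             # keep the fastest round trip
    return best * 1_000_000

print(f"shortest observed sleep: {min_sleep_resolution_us():.1f} us")
```

The result varies with the operating system, kernel timer configuration and load, so it is worth running several times on an otherwise idle machine.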

On the 7200RPM drive (~= 115 TPS), a delay of 1/2 of the drive's
rotation time was consistently what worked best for me across multiple
tests too.  I also found that lowering commit_siblings all the way to 1
could significantly improve the 2 client case when you did that.  Here
are my notes from then:

commit_delay=4500, commit_siblings=1:  By waiting 1/2 a revolution if
there's another active transaction, I get a small improvement at the
low-end (around an extra 20 TPS between 2 and 6 clients), while not
doing much damage to the higher client loads.  This might
be a useful tuning if your expected number of active clients is low,
you don't have a good caching controller, but you'd like a little more
oomph out of things.  The results for 7000 usec were almost as good.
But in general, if you're stuck choosing between two commit_delay values
you should use the smaller one as it will be less likely to have a bad
impact on low client loads.

I also found that a high delay, applied only when a lot of clients were
usually involved, worked a bit better than a 1/2 rotation:

commit_delay=10000, commit_siblings=20:  At higher client loads, there's
almost invariably another commit coming right behind yours if you wait a
bit.  Just plan to wait a bit more than an entire rotation between
commits.  This buys me about an extra 30 TPS on the high client loads,
which is a small improvement (<5%) but might be worthwhile.
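
For reference, the rotation arithmetic behind those two settings can be sketched as follows. The function name is made up for the example, and the notes above do not say exactly how 4500 and 10000 were chosen; the code only shows how they compare with the computed rotation times.

```python
def rotation_time_us(rpm: int) -> float:
    """Time for one full platter rotation, in microseconds."""
    return 60_000_000 / rpm

full = rotation_time_us(7200)
half = full / 2

print(f"full rotation: {full:.0f} us")
print(f"half rotation: {half:.0f} us")

# The settings from the notes above, compared with the computed values.
print("commit_delay=4500 vs half rotation:", 4500 - half)
print("commit_delay=10000 vs full rotation:", 10000 - full)
```

For 7200 RPM this gives roughly 8333 microseconds per rotation and 4167 for half of one, so 4500 sits a little above half a rotation and 10000 a little above a full one.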

The fact that the optimal delay seemed to vary a bit based on the
number of siblings was one of the challenges I kept running into
then.  Making the GUC settings even more complicated for this doesn't
seem a productive step forward for the average user.

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com


Re: [repost] Help me develop new commit_delay advice

From
Peter Geoghegan
Date:
On 6 September 2012 04:20, Greg Smith <greg@2ndquadrant.com> wrote:
> On 08/02/2012 02:02 PM, Peter Geoghegan wrote:
> I dug up what I wrote when trying to provide better advice for this circa
> V8.3.  That never really gelled into something worth publishing at the time.
> But I see some patterns similar to what you're reporting, so maybe this
> will be useful input to you now.  That included a 7200RPM drive and a system
> with a BBWC.

So, did either Josh or Greg ever get as far as producing numbers for
drives with faster fsyncs than the ~8,000 us fsync speed of my
laptop's disk?

I'd really like to be able to make a firm recommendation as to how
commit_delay should be set, and have been having a hard time beating
the half raw-sync time recommendation, even with a relatively narrow
benchmark (that is, the alternative pgbench-tools scripts). My
observation is that it is generally better to ameliorate the risk of
increased latency through a higher commit_siblings setting rather than
through a lower commit_delay, though it would be easy to overdo it.
commit_delay can now be thought of as a way to bring the benefits of
group commit to workloads that could in principle benefit, but would
otherwise not benefit much from it, such as workloads with lots of
small writes but not too many clients.

One idea I had, which is really more -hackers material, was to test if
backends with a transaction are inCommit (that's a PGXACT field),
rather than just having a transaction, within MinimumActiveBackends().
The idea is that commit_siblings would represent the number of
backends imminently committing needed to delay, rather than the number
of backends in a transaction. It is far from clear that that's a good
idea, but that's perhaps just because the pgbench infrastructure is a
poor proxy for real workloads, which have variable-sized transactions.
Pretty much all pgbench transactions commit imminently anyway.
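
The difference between the two checks can be shown with a toy model. This is not PostgreSQL's C code: the Backend class, its field names and the should_delay function are all hypothetical stand-ins, and the sketch only contrasts counting backends that have a transaction with counting backends that are about to commit.

```python
from dataclasses import dataclass

@dataclass
class Backend:
    in_transaction: bool  # backend has an open transaction
    in_commit: bool       # backend is committing (toy stand-in for inCommit)

def should_delay(others: list[Backend], commit_siblings: int,
                 require_in_commit: bool) -> bool:
    """Decide whether the group commit leader should sleep for commit_delay."""
    if require_in_commit:
        # Proposed idea: count only backends that are imminently committing.
        active = sum(1 for b in others if b.in_transaction and b.in_commit)
    else:
        # Existing behaviour as described above: any backend with a transaction.
        active = sum(1 for b in others if b.in_transaction)
    return active >= commit_siblings

# Five other backends hold transactions, but only one is about to commit.
others = [Backend(True, False)] * 4 + [Backend(True, True)]

print(should_delay(others, commit_siblings=5, require_in_commit=False))
print(should_delay(others, commit_siblings=5, require_in_commit=True))
```

With long-running transactions like these the two checks disagree, which is the case pgbench rarely exercises because nearly all of its transactions commit right away.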

Another idea which I have lost faith in - because it has been hard to
prove that client count is really relevant - was the notion that
commit_delay should be a dynamically adapting function of the client
(with transactions) count. Setting commit_delay to 1/2 raw sync time
appears optimal at any client count that is > 1. The effect at 2
clients is quite noticeable.

I have a rather busy schedule right now, and cannot spend too many
more cycles on this. I'd like to reach a consensus on this soon. Just
giving the 1/2 raw sync time the official blessing of being included
in the docs should be the least we do, though. It is okay if the
wording is a bit equivocal - that has to be better than the current
advice, which is (to paraphrase) "we don't really have a clue; you
tell us".

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services