Thread: sudden drop in statement turnaround latency -- yay!

From:
"Merlin Moncure"
Date:

I took advantage of the holidays to update a production server (dual
Opteron on win2k) from an 11/16 build (about beta5 or so) to the latest
release candidate.  No configuration changes were made, just a binary
swap and a server stop/start.

I was shocked to see that statement latency dropped by a fairly large
margin.  Here is a log snippet taken as measured from the client
application:

0.000278866 sec:  data1_read_key_item_vendor_file_0 params: $1=005988 $2=002255
0.00032731  sec:  data1_read_key_item_link_file_1 params: $1=005988
0.000327063 sec:  data1_read_key_bm_header_file_0 params: $1=008704
0.000304915 sec:  data1_read_key_item_vendor_file_0 params: $1=008704 $2=000117
0.00029838  sec:  data1_read_key_item_link_file_1 params: $1=008704
0.0003252   sec:  data1_read_key_bm_header_file_0 params: $1=000268
0.000274747 sec:  data1_read_key_item_vendor_file_0 params: $1=000268 $2=000117
0.000324275 sec:  data1_read_key_item_link_file_1 params: $1=000268

These are statements that are run (AFAIK) the fastest possible way, which
is using prepared statements over parse/bind.  The previous latencies
usually varied between .0005 and .0007 sec, but never below .5 ms for an
index read.  Now, as demonstrated by the log, I'm getting times less than
half that figure.  I benchmarked a traversal over a bill of materials
(several thousand statements) and noticed about a 40% reduction in time
to complete the operation.
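
For anyone who wants to put numbers on a snippet like the one above, here is a minimal sketch that parses the "N sec:" prefix of each log line and summarizes it. The latency values are copied from the log in this message (parameters trimmed); the helper name `parse_latencies` and the constant `OLD_FLOOR_MS` are made up for illustration and are not part of the client application.

```python
import re
from statistics import mean

# Latency lines copied from the log snippet above (parameters trimmed).
LOG = """\
0.000278866 sec:  data1_read_key_item_vendor_file_0
0.00032731  sec:  data1_read_key_item_link_file_1
0.000327063 sec:  data1_read_key_bm_header_file_0
0.000304915 sec:  data1_read_key_item_vendor_file_0
0.00029838  sec:  data1_read_key_item_link_file_1
0.0003252   sec:  data1_read_key_bm_header_file_0
0.000274747 sec:  data1_read_key_item_vendor_file_0
0.000324275 sec:  data1_read_key_item_link_file_1
"""


def parse_latencies(text):
    """Return the leading 'N sec:' value of each log line, in seconds."""
    pattern = re.compile(r"^\s*([0-9.]+)\s+sec:", re.MULTILINE)
    return [float(m.group(1)) for m in pattern.finditer(text)]


latencies = parse_latencies(LOG)
avg_ms = mean(latencies) * 1000.0
worst_ms = max(latencies) * 1000.0
OLD_FLOOR_MS = 0.5  # "never below .5 ms" before the upgrade (hypothetical name)

print(f"n={len(latencies)} mean={avg_ms:.3f} ms max={worst_ms:.3f} ms")
print(f"mean as a fraction of the old floor: {avg_ms / OLD_FLOOR_MS:.2f}")
```

Eight samples are far too few for a real benchmark; the sketch only shows how the comparison against the old 0.5 ms floor can be made mechanical.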

I wonder exactly what changed and when this happened.  Has anybody else
noticed a similar change?

Merlin

From:
Tom Lane
Date:

"Merlin Moncure" writes:
> I took advantage of the holidays to update a production server (dual
> Opteron on win2k) from an 11/16 build (about beta5 or so) to the latest
> release candidate.  No configuration changes were made, just a binary
> swap and a server stop/start.

> I was shocked to see that statement latency dropped by a fairly large
> margin.

Hmm ... I trawled through the CVS logs since 11/16, and did not see very
many changes that looked like they might improve performance (list
attached) --- and even of those, hardly any looked like the change would
be significant.  Do you know whether the query plans changed?  Are you
running few enough queries per connection that backend startup overhead
might be an issue?

            regards, tom lane


2004-12-15 14:16  tgl

    * src/backend/access/nbtree/nbtutils.c: Calculation of
    keys_are_unique flag was wrong for cases involving redundant
    cross-datatype comparisons.  Per example from Merlin Moncure.

2004-12-02 10:32  momjian

    * configure, configure.in, doc/src/sgml/libpq.sgml,
    doc/src/sgml/ref/copy.sgml, src/interfaces/libpq/fe-connect.c,
    src/interfaces/libpq/fe-print.c, src/interfaces/libpq/fe-secure.c,
    src/interfaces/libpq/libpq-fe.h, src/interfaces/libpq/libpq-int.h:
    Rework libpq threaded SIGPIPE handling to avoid interference with
    calling applications.  This is done by blocking sigpipe in the
    libpq thread and using sigpending/sigwait to possibly discard any
    sigpipe we generated.

2004-12-01 20:34  tgl

    * src/: backend/optimizer/path/costsize.c,
    backend/optimizer/util/plancat.c,
    test/regress/expected/geometry.out,
    test/regress/expected/geometry_1.out,
    test/regress/expected/geometry_2.out,
    test/regress/expected/inherit.out, test/regress/expected/join.out,
    test/regress/sql/inherit.sql, test/regress/sql/join.sql: Make some
    adjustments to reduce platform dependencies in plan selection.  In
    particular, there was a mathematical tie between the two possible
    nestloop-with-materialized-inner-scan plans for a join (ie, we
    computed the same cost with either input on the inside), resulting
    in a roundoff error driven choice, if the relations were both small
    enough to fit in sort_mem.  Add a small cost factor to ensure we
    prefer materializing the smaller input.  This changes several
    regression test plans, but with any luck we will now have more
    stability across platforms.
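
A toy illustration of the tie described in the entry above, under loud assumptions: the cost function here is invented for the example and is not the formula in costsize.c, and the penalty value is arbitrary. It only shows how a symmetric cost leaves the choice of inner input undetermined, and how a small extra charge per materialized inner row makes the smaller inner input win.

```python
def toy_nestloop_cost(outer_rows, inner_rows, materialize_penalty=0.0):
    """Hypothetical cost model, NOT the planner's real formula."""
    # Symmetric part: every outer row is paired with every inner row.
    pair_cost = outer_rows * inner_rows
    # Assumed tie-breaker: charge a little per materialized inner row.
    return pair_cost + materialize_penalty * inner_rows


small, large = 10, 1000

# Without the penalty both orientations cost exactly the same.
tie = toy_nestloop_cost(small, large) == toy_nestloop_cost(large, small)

# With it, putting the smaller input on the inside is strictly cheaper.
small_inner = toy_nestloop_cost(large, small, materialize_penalty=0.001)
large_inner = toy_nestloop_cost(small, large, materialize_penalty=0.001)

print(tie, small_inner < large_inner)
```

The toy does not reproduce the roundoff behaviour itself; in the real planner the tied costs were computed along different paths, which is how floating-point noise ended up picking the winner.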

2004-12-01 14:00  tgl

    * doc/src/sgml/catalogs.sgml, doc/src/sgml/diskusage.sgml,
    doc/src/sgml/perform.sgml, doc/src/sgml/release.sgml,
    src/backend/access/nbtree/nbtree.c, src/backend/catalog/heap.c,
    src/backend/catalog/index.c, src/backend/commands/vacuum.c,
    src/backend/commands/vacuumlazy.c,
    src/backend/optimizer/util/plancat.c,
    src/backend/optimizer/util/relnode.c, src/include/access/genam.h,
    src/include/nodes/relation.h, src/test/regress/expected/case.out,
    src/test/regress/expected/inherit.out,
    src/test/regress/expected/join.out,
    src/test/regress/expected/join_1.out,
    src/test/regress/expected/polymorphism.out: Change planner to use
    the current true disk file size as its estimate of a relation's
    number of blocks, rather than the possibly-obsolete value in
    pg_class.relpages.  Scale the value in pg_class.reltuples
    correspondingly to arrive at a hopefully more accurate number of
    rows.  When pg_class contains 0/0, estimate a tuple width from the
    column datatypes and divide that into current file size to estimate
    number of rows.  This improved methodology allows us to jettison
    the ancient hacks that put bogus default values into pg_class when
    a table is first created.  Also, per a suggestion from Simon, make
    VACUUM (but not VACUUM FULL or ANALYZE) adjust the value it puts
    into pg_class.reltuples to try to represent the mean tuple density
    instead of the minimal density that actually prevails just after
    VACUUM.  These changes alter the plans selected for certain
    regression tests, so update the expected files accordingly.  (I
    removed join_1.out because it's not clear if it still applies; we
    can add back any variant versions as they are shown to be needed.)
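
The row estimate described in the entry above can be sketched roughly as follows. This is a simplified reading of the commit message, not the code in plancat.c: the function name, the parameter names, and the 8192-byte block size are assumptions, and the sketch ignores page and tuple header overhead.

```python
BLOCK_SIZE = 8192  # assumed page size in bytes


def estimate_rows(relpages, reltuples, current_pages, est_tuple_width=None):
    """Estimate a row count from stored stats plus the current file size.

    relpages / reltuples stand in for the pg_class values; current_pages is
    the relation's true size on disk, in blocks.
    """
    if relpages > 0:
        # Scale the stored tuple count by how much the file grew or shrank.
        density = reltuples / relpages
        return density * current_pages
    # pg_class holds 0/0: fall back to a width guess from the column types.
    if est_tuple_width is None or est_tuple_width <= 0:
        raise ValueError("need a tuple width estimate when pg_class has 0/0")
    return (current_pages * BLOCK_SIZE) / est_tuple_width


# Table last analyzed at 100 pages / 10000 rows, now 250 pages on disk.
print(estimate_rows(100, 10000, 250))
# Never-analyzed table: 10 pages on disk, guessed 64-byte rows.
print(estimate_rows(0, 0, 10, est_tuple_width=64))
```

The point of the scaling is that a stale pg_class entry still carries a usable tuples-per-page density even when the page count itself is out of date.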

2004-11-21 17:57  tgl

    * src/backend/utils/hash/dynahash.c: Fix rounding problem in
    dynahash.c's decision about when the target fill factor has been
    exceeded.  We usually run with ffactor == 1, but the way the test
    was coded, it wouldn't split a bucket until the actual fill factor
    reached 2.0, because of use of integer division.  Change from > to
    >= so that it will split more aggressively when the table starts to
    get full.
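
The rounding effect described in the entry above is easy to reproduce with truncating integer division. The sketch below is an illustration only, not the dynahash.c source; the function names and the bucket count of 16 are made up.

```python
def should_split_old(nentries, nbuckets, ffactor=1):
    # Integer division truncates, so 1.9 entries per bucket still counts as 1.
    return nentries // nbuckets > ffactor


def should_split_new(nentries, nbuckets, ffactor=1):
    # Same truncating division, but ">=" fires once the target is reached.
    return nentries // nbuckets >= ffactor


def first_split_point(test, nbuckets, ffactor=1):
    """Smallest entry count at which `test` asks for a bucket split."""
    nentries = 0
    while not test(nentries, nbuckets, ffactor):
        nentries += 1
    return nentries


NBUCKETS = 16
old_at = first_split_point(should_split_old, NBUCKETS)
new_at = first_split_point(should_split_new, NBUCKETS)

# Actual fill factor at the moment each test first requests a split.
print(old_at / NBUCKETS, new_at / NBUCKETS)  # prints "2.0 1.0"
```

With ffactor == 1 the old test waits until the table holds twice as many entries as buckets, which matches the 2.0 figure in the commit message.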

2004-11-21 17:48  tgl

    * src/backend/utils/mmgr/portalmem.c: Reduce the default size of
    the PortalHashTable in order to save a few cycles during
    transaction exit.  A typical session probably wouldn't have as many
    as half a dozen portals open at once, so the original value of 64
    seems far larger than needed.

2004-11-20 15:19  tgl

    * src/backend/utils/cache/relcache.c: Avoid scanning the relcache
    during AtEOSubXact_RelationCache when there is nothing to do, which
    is most of the time.  This is another simple improvement to cut
    subtransaction entry/exit overhead.

2004-11-20 15:16  tgl

    * src/backend/storage/lmgr/lock.c: Reduce the default size of the
    local lock hash table.  There's usually no need for it to be nearly
    as big as the global hash table, and since it's not in shared
    memory it can grow if it does need to be bigger.  By reducing the
    size, we speed up hash_seq_search(), which saves a significant
    fraction of subtransaction entry/exit overhead.
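
Why a smaller table speeds up a sequential scan: a scan over a bucket-array hash table has to touch every bucket, occupied or not, so its cost follows the table size rather than the number of entries. The sketch below uses a toy chained hash table, not the dynahash implementation; the sizes 1024 and 16 and the three keys are arbitrary assumptions.

```python
def make_table(nbuckets, keys):
    """Build a toy chained hash table as a list of per-bucket lists."""
    buckets = [[] for _ in range(nbuckets)]
    for key in keys:
        buckets[hash(key) % nbuckets].append(key)
    return buckets


def seq_scan(buckets):
    """Visit every bucket in order; return (entries found, buckets touched)."""
    found = []
    touched = 0
    for chain in buckets:
        touched += 1
        found.extend(chain)
    return found, touched


keys = [1, 2, 3]  # assumed: only a handful of entries are live at once

found_big, touched_big = seq_scan(make_table(1024, keys))
found_small, touched_small = seq_scan(make_table(16, keys))

# Same entries either way, but far fewer buckets to walk in the small table.
print(touched_big, touched_small)
```

Both scans return the same three entries; only the number of empty buckets walked differs.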

2004-11-19 19:48  tgl

    * src/backend/tcop/postgres.c: Move pgstat_report_tabstat() call so
    that stats are not reported to the collector until the transaction
    commits.  Per recent discussion, this should avoid confusing
    autovacuum when an updating transaction runs for a long time.

2004-11-16 22:13  neilc

    * src/backend/access/: hash/hash.c, nbtree/nbtree.c:
    Micro-optimization of markpos() and restrpos() in btree and hash
    indexes.  Rather than using ReadBuffer() to increment the reference
    count on an already-pinned buffer, we should use
    IncrBufferRefCount() as it is faster and does not require acquiring
    the BufMgrLock.

2004-11-16 19:14  tgl

    * src/: backend/main/main.c, backend/port/win32/signal.c,
    backend/postmaster/pgstat.c, backend/postmaster/postmaster.c,
    include/port/win32.h: Fix Win32 problems with signals and sockets,
    by making the forkexec code even uglier than it was already :-(.
    Also, on Windows only, use temporary shared memory segments instead
    of ordinary files to pass over critical variable values from
    postmaster to child processes.  Magnus Hagander