Re: [PATCH] Incremental sort (was: PoC: Partial sort)

From James Coleman
Subject Re: [PATCH] Incremental sort (was: PoC: Partial sort)
Date
Msg-id CAAaqYe_JMQh0FeSSynr_j66ZNNiX3BjKoLiMgMF4zna1m0tcVg@mail.gmail.com
Whole thread Raw Message
In response to Re: [PATCH] Incremental sort (was: PoC: Partial sort)  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: [PATCH] Incremental sort (was: PoC: Partial sort)  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
On Tue, Jun 25, 2019 at 12:02 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
>
> On Mon, Jun 24, 2019 at 07:34:19PM -0400, James Coleman wrote:
> >On Mon, Jun 24, 2019 at 4:16 PM Tomas Vondra
> ><tomas.vondra@2ndquadrant.com> wrote:
> >>
> >> On Mon, Jun 24, 2019 at 01:00:50PM -0400, James Coleman wrote:
> >> >On Mon, Jun 24, 2019 at 12:56 PM Simon Riggs <simon@2ndquadrant.com> wrote:
> >> >>
> >> >> On Mon, 24 Jun 2019 at 16:10, James Coleman <jtc331@gmail.com> wrote:
> >> >>>
> >> >>> On Thu, Jun 13, 2019 at 11:38:12PM -0400, James Coleman wrote:
> >...
> >> >>> As I see it the two most significant concerning cases right now are:
> >> >>> 1. Very large batches (in particular where the batch is effectively
> >> >>> all of the matching rows such that we're really just doing a standard
> >> >>> sort).
> >> >>> 2. Many very small batches.
> >> >>
> >> >>
> >> >> What is the specific use case for this? This sounds like quite a general case.
> >> >
> >> >They are both general cases in some sense, but the concerns lie mostly
> >> >with what happens when they're unexpectedly encountered. For example,
> >> >if the expected row count or group size is off by a good bit and we
> >> >effectively have to perform a sort of all (or most) possible rows.
> >> >
> >> >If we can get the performance to a point where that misestimated row
> >> >count or group size doesn't much matter, then ISTM including the patch
> >> >becomes a much more obvious total win.
> >> >
> >>
> >> Yes, that seems like a reasonable approach. Essentially, we're trying to
> >> construct plausible worst case examples, and then minimize the overhead
> >> compared to regular sort. If we get sufficiently close, then it's fine
> >> to rely on somewhat shaky stats - like group size estimates.
> >
> >I have a bit of a mystery in my performance testing. I've been setting
> >up a table like so:
> >
> >create table foo(pk serial primary key, owner_fk integer, created_at timestamp);
> >insert into foo(owner_fk, created_at)
> >select fk_t.i, now() - (time_t.i::text || ' minutes')::interval
> >from generate_series(1, 10000) time_t(i)
> >cross join generate_series(1, 1000) fk_t(i);
> >-- double up on one set to guarantee matching prefixes
> >insert into foo (owner_fk, created_at) select owner_fk, created_at
> >from foo where owner_fk = 23;
> >create index idx_foo_on_owner_and_created_at on foo(owner_fk, created_at);
> >analyze foo;
> >
> >and then I have the following query:
> >
> >select *
> >from foo
> >where owner_fk = 23
> >order by created_at desc, pk desc
> >limit 20000;
> >
> >The idea here is to force a bit of a worst case for small groups: we
> >have 10,000 batches (i.e., equal prefix groups) of 2 tuples each and
> >then query with a limit matching the actual number of rows we know
> >will match the query -- so even though there's a limit we're forcing a
> >total sort (and also guaranteeing both plans have to touch the same
> >number of rows). Note: I know that batches of size 1 are actually the
> >worst case, but I chose batches of two because I've also been testing
> >a change that would skip the sort entirely for single-tuple batches.
> >
> >On master (really the commit right before the current revision of the
> >patch), I get:
> >latency average = 14.271 ms
> >tps = 70.075243 (excluding connections establishing)
> >
> >With the patch (and incremental sort enabled):
> >latency average = 11.975 ms
> >tps = 83.512090 (excluding connections establishing)
> >
> >With the patch (but incremental sort disabled):
> >latency average = 11.884 ms
> >tps = 84.149834 (excluding connections establishing)
> >
> >All of those are 60-second pgbench runs with a single thread.
> >
> >So we have a very substantial speedup with the patch *even if the new
> >feature isn't enabled*. I've confirmed the plan looks the same with the
> >patch (incremental sort disabled) as on master. The only changes that
> >would seem to really affect execution time are the changes to
> >tuplesort.c, but looking through them I don't see anything I'd expect
> >to change things so dramatically.
> >
> >Any thoughts on this?
> >
>
> I can reproduce the same thing, so it's not just you. On my machine, I see
> these tps numbers (average of 10 runs, 60 seconds each):
>
>   master:        65.177
>   patched (on):  80.368
>   patched (off): 80.750
>
> The numbers are very consistent (within 1 tps).
>
> I've done a bit of CPU profiling, and on master I see this:
>
>     13.84%  postgres  postgres            [.] comparetup_heap
>      4.83%  postgres  postgres            [.] qsort_tuple
>      3.87%  postgres  postgres            [.] pg_ltostr_zeropad
>      3.55%  postgres  postgres            [.] pg_ltoa
>      3.19%  postgres  postgres            [.] AllocSetAlloc
>      2.68%  postgres  libc-2.28.so        [.] __GI___strlen_sse2
>      2.38%  postgres  postgres            [.] LWLockRelease
>      2.38%  postgres  postgres            [.] AppendSeconds.constprop.9
>      2.22%  postgres  libc-2.28.so        [.] __memmove_sse2_unaligned_erms
>      2.17%  postgres  postgres            [.] GetPrivateRefCountEntry
>      2.03%  postgres  postgres            [.] j2date
>      ...
>
> while on patched versions I see this:
>
>      4.60%  postgres  postgres            [.] pg_ltostr_zeropad
>      4.51%  postgres  postgres            [.] pg_ltoa
>      3.50%  postgres  postgres            [.] AllocSetAlloc
>      3.34%  postgres  libc-2.28.so        [.] __GI___strlen_sse2
>      2.99%  postgres  postgres            [.] LWLockRelease
>      2.84%  postgres  postgres            [.] AppendSeconds.constprop.9
>      2.65%  postgres  postgres            [.] GetPrivateRefCountEntry
>      2.64%  postgres  postgres            [.] j2date
>      2.60%  postgres  postgres            [.] printtup
>      2.56%  postgres  postgres            [.] heap_hot_search_buffer
>      ...
>      1.35%  postgres  postgres            [.] comparetup_heap
>      ...
>
> So either we're calling comparetup_heap less often, or it's cheaper.
>
> But it seems to be very dependent on the data set. If you do this:
>
>     create table foo_2 as select * from foo order by random();
>     alter table foo_2 add primary key (pk);
>     create index idx_foo_2_on_owner_and_created_at on foo_2 (owner_fk, created_at);
>
> and then run the test against this table, there's no difference.
>
> So my guess is this particular data set triggers slightly different
> behavior in tuplesort, reducing the cost of comparetup_heap. The speedup
> is quite significant (~20% on my system); the question is how widely
> applicable it is.

Thanks for confirming!
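
As a quick sanity check on the batch structure in the setup quoted
above, something like the following (just a sketch, reusing the foo
table and filter from that setup) should report 10,000 groups of
exactly 2 rows each, which is what makes the limit of 20,000 cover
every matching row:

select count(*) as n_groups, min(n) as min_rows, max(n) as max_rows
from (
    select created_at, count(*) as n
    from foo
    where owner_fk = 23
    group by created_at
) g;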

Given the patch contents I don't see any obvious reason why either of
those possibilities (calling comparetup_heap less often, or it being
cheaper) is likely. Is that something we should investigate further?
Or is it just a nice happy accident that we should ignore since it's
dataset-specific?
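
One thing that might be worth checking (again just a sketch; I haven't
verified the numbers) is whether the physical ordering of the two
tables explains it -- pg_stats correlation should be near 1.0 for foo
and near 0 for the randomized foo_2:

select tablename, attname, correlation
from pg_stats
where tablename in ('foo', 'foo_2')
  and attname in ('owner_fk', 'created_at');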

Anyway, when evaluating the patch's performance in the face of oddities
like this, would you compare incremental sort enabled vs. disabled on
the same revision, or patched vs. master, when determining whether
there's a regression? I think I could make an argument either way.
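
For concreteness, the enabled vs. disabled comparison on the same
revision would be something like this (a sketch; it assumes the patch's
GUC is named enable_incremental_sort):

set enable_incremental_sort = on;
explain (analyze, buffers)
  select * from foo
  where owner_fk = 23
  order by created_at desc, pk desc
  limit 20000;

-- then the same explain after:
set enable_incremental_sort = off;

versus running the same query through pgbench on master and on the
patched build, as above.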

James Coleman


