Re: [PATCH] Negative Transition Aggregate Functions (WIP)

Поиск
Список
Период
Сортировка
От Florian Pflug
Тема Re: [PATCH] Negative Transition Aggregate Functions (WIP)
Дата
Msg-id F33008A9-ED48-4D7D-BAB0-D3E2D695582B@phlo.org
обсуждение исходный текст
Ответ на Re: [PATCH] Negative Transition Aggregate Functions (WIP)  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: [PATCH] Negative Transition Aggregate Functions (WIP)  (Florian Pflug <fgp@phlo.org>)
Список pgsql-hackers
On Apr9, 2014, at 21:35 , Tom Lane <tgl@sss.pgh.pa.us> wrote:
> As a quick check, I compared aggregation performance in HEAD, non-assert
> builds, with and without --disable-float8-byval on a 64-bit machine.
> So this tests replacing a pass-by-val transition datatype with a
> pass-by-ref one without any other changes.  There's essentially no
> difference in performance of sum(int4), AFAICT, but that's because
> int4_sum goes out of its way to cheat and avoid palloc overhead.

> I looked to the bit_and() aggregates to see what would happen to
> an aggregate not thus optimized.  As expected, int4 and int8 bit_and
> are just about the same speed if int8 is pass by value ... but if it's
> pass by ref, the int8 case is a good 60% slower.

True, but that just means that aggregate transition functions really
ought to update the state in-place, no?

> So added palloc overhead, at least, is a no-go.  I see that the patched
> version of sum(int4) avoids that trap, but nonetheless it's replaced a
> pretty cheap transition function with a less cheap function, namely the
> function previously used for avg(int4).  A quick test says that avg(int4)
> is about five percent slower than sum(int4), so that's the kind of hit
> we'd be taking on non-windowed aggregations if we do it like this.

That's rather surprising though. AFAICS, there's isn't much difference
between the two transfer functions int4_sum and int4_avg_accum at all.

The differences come down to (+ denoting things which ought to make
int4_avg_accum slower compared to int4_sum, - denoting the opposite)
1. +) int4_avg_accum calls AggCheckCallContext2. -) int4_sum checks if the state is NULL (it never is for
int4_avg_accum)3.+) int4_avg_accum uses ARR_HASNULL, ARR_SIZE, ARR_OVERHEAD_NONULLS      to verify that the state is a
2-elementarray without NULL entries4. -) int4_sum checks if the input is NULL
 

The number of conditional branches should be about the same (and all are
seldomly taken). The validity checks on the state array, i.e. (3), should
be rather cheap I think - not quite as cheap as PG_ARGISNULL maybe, but
not so much more expensive either. That leaves the AggCheckCallContext call.
If that call costs us 5%, maybe we can find a way to make it faster, or
get rid of it entirely? Still seems a lot of a call of a not-very-complex
function, though...

I'll go and check the disassembly - maybe something in int4_avg_accum turns
out to be more complex than is immediately obvious. I'll also try to create
a call profile, unless you already have one from your test runs.

best regards,
Florian Pflug





В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: WIP patch (v2) for updatable security barrier views
Следующее
От: Stephen Frost
Дата:
Сообщение: Re: WIP patch (v2) for updatable security barrier views