Re: [PATCH] Negative Transition Aggregate Functions (WIP)

From: David Rowley
Subject: Re: [PATCH] Negative Transition Aggregate Functions (WIP)
Date:
Msg-id: CAApHDvp2XHnS9ypMv+yEbLGYBvGUT49WJ7Xv4fN8D2agDsGOkQ@mail.gmail.com
In reply to: Re: [PATCH] Negative Transition Aggregate Functions (WIP) (Florian Pflug <fgp@phlo.org>)
Responses: Re: [PATCH] Negative Transition Aggregate Functions (WIP) (David Rowley <dgrowleyml@gmail.com>)
Re: [PATCH] Negative Transition Aggregate Functions (WIP) (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers
On Sat, Jan 18, 2014 at 10:17 AM, Florian Pflug <fgp@phlo.org> wrote:
On Jan17, 2014, at 20:34 , David Rowley <dgrowleyml@gmail.com> wrote:
> On Fri, Jan 17, 2014 at 3:05 PM, Florian Pflug <fgp@phlo.org> wrote:
>
>> I've now shuffled things around so that we can use inverse transition functions
>> even if only some aggregates provide them, and to allow inverse transition
>> functions to force a restart by returning NULL. The rules regarding NULL handling
>> are now the following
>
> Maybe this is me thinking out loud, but I'm just thinking about the numeric case again.
> Since the patch can now handle inverse transition functions returning NULL when they
> fail to perform inverse transitions, I'm wondering if we could add an "expectedscale"
> to NumericAggState: set it to -1 initially; when we get the first value, set the
> expectedscale to the dscale of that numeric; then, if we get anything but that value,
> set the expectedscale back to -1 again. If we are asked to perform an inverse
> transition with an expectedscale of -1, we return NULL; otherwise we can perform
> the inverse transition...

You could do better than that - the numeric problem amounts to tracking the maximum
scale AFAICS, so it'd be sufficient to restart if we remove a value whose scale equals
the current maximum. But if we do SUM(numeric) at all, I think we should do so
without requiring restarts - I still think the overhead of tracking the maximum scale
within the aggregated values isn't that bad. If we zero the dscale counters lazily,
the number of entries we have to zero is bounded by the maximum dscale we encounter.
Since actually summing N digits is probably more expensive than zeroing them, and since
we pay the zeroing overhead only once per aggregation but save potentially many
summations, I'm pretty sure we come out ahead by quite some margin.


We'll work that out; I don't think it will take very long to code up your idea either.
I just thought that my idea was good enough and very cheap too. Won't all numerics that are actually stored in a column have the same scale anyway? Has it only been a problem because we've been testing with random numeric literals the whole time?

The test turned out to become:
    if (state->expectedScale == -1)
        state->expectedScale = X.dscale;
    else if (state->expectedScale != X.dscale)
        state->expectedScale = -2;

This happens in do_numeric_accum; then, when we perform the inverse transition, I just check whether expectedScale < 0 and return NULL if so.

I'm not set on it, and I'm willing to try the lazy zeroing of the scale tracker array, but I'm just not quite sure what extra real use cases it covers that the one above does not. Perhaps my way won't perform inverse transitions if the query did sum(numericCol / 10) or so.

I'll be committing this to the github repo very soon, so we can hack away at the scale tracking code once it's back in.

David Rowley
