Re: [PATCH] Negative Transition Aggregate Functions (WIP)

Поиск
Список
Период
Сортировка
От David Rowley
Тема Re: [PATCH] Negative Transition Aggregate Functions (WIP)
Дата
Msg-id CAApHDvqnKRxJKRVURkQ19Nc7diaLbYw9Espa4Bo4OhtpfbzrMw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [PATCH] Negative Transition Aggregate Functions (WIP)  (David Rowley <dgrowleyml@gmail.com>)
Ответы Re: [PATCH] Negative Transition Aggregate Functions (WIP)  (Florian Pflug <fgp@phlo.org>)
Re: [PATCH] Negative Transition Aggregate Functions (WIP)  (David Rowley <dgrowleyml@gmail.com>)
Список pgsql-hackers
On Sat, Jan 18, 2014 at 10:42 AM, David Rowley <dgrowleyml@gmail.com> wrote:

You could do better than that - the numeric problem amounts to tracking the maximum
scale AFAICS, so it'd be sufficient to restart if we remove a value whose scale equals
the current maximum. But if we do SUM(numeric) at all, I think we should do so
without requiring restarts - I still think the overhead of tracking the maximum scale
within the aggregated values isn't that bad. If we zero the dscale counters lazily,
the numbers of entries we have to zero is bound by the maximum dscale we encounter.
Since actually summing N digits is probably more expensive than zeroing them, and since
we pay the zeroing overhead only once per aggregation but save potentially many
summations, I'm pretty sure we come out ahead by quite some margin.


We'll work that out, I don't think it will take very long to code up your idea either.
I just thought that my idea was good enough and very cheap too, won't all numerics that are actually stored in a column have the same scale anyway? Is it not only been a problem because we've been testing with random numeric literals the whole time?

The test turned out to become:
if (state->expectedScale == -1)
state->expectedScale = X.dscale;
else if (state->expectedScale != X.dscale)
state->expectedScale = -2;

In do_numeric_accum, then when we do the inverse I just check if expectedScale < 0 then return NULL.

I'm not set on it, and I'm willing to try the lazy zeroing of the scale tracker array, but I'm just not quite sure what extra real use cases it covers that the one above does not. Perhaps my way won't perform inverse transitions if the query did sum(numericCol / 10) or so.

I'll be committing this to the github repo very soon, so we can hack away at the scale tracking code once it's back in.


Ok, we're back up to 86 aggregate function / type combinations with inverse transition functions.
I've commited my latest work up to github and here's a fresh patch which I will need to do more tests on.

The test (below) that used to fail a few versions ago is back in there and it's now passing.

SELECT SUM(n::numeric) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
  FROM (VALUES(1,1.01),(2,2),(3,3)) v(i,n);

In this case it won't use inverse transitions because the forward transition function detects that the scale is not fixed.

I tested throwing some numerics into a table and I'm pretty happy with the results.

create table num (num numeric(10,2) not null);
insert into num (num) select * from generate_series(1,20000);
select sum(num) over(order by num rows between current row and unbounded following) from num; -- 124ms
select sum(num / 10) over(order by num rows between current row and unbounded following) from num; -- 254ms
select sum(num / 1) over(order by num rows between current row and unbounded following) from num; -- 108156.917 ms

The divide by 1 case is slow because of that weird 20 trailing zero instead of 16 when dividing a numeric by 1 and that causes the inverse transition function to return NULL because the scale changes.

I've not tested an unpatched version yet to see how that divide by 1 query performs on that but I'll get to that soon.

I'm thinking that the idea about detecting the numeric range with floating point types and performing an inverse transition providing the range has not gone beyond some defined danger zone is not material for this patch... I think it would be not a great deal of work to code, but the testing involved would probably make this patch not possible for 9.4

The latest version of the patch is attached.

Regards

David Rowley
 
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Noah Misch
Дата:
Сообщение: Re: Triggers on foreign tables
Следующее
От: Tom Lane
Дата:
Сообщение: Re: [PATCH] Negative Transition Aggregate Functions (WIP)