Re: Combining Aggregates

From David Rowley
Subject Re: Combining Aggregates
Date
Msg-id CAApHDvpgXhghtpmuKPhnBj9ZDeEPy-8C0StXgG-GuTPAMdYp6A@mail.gmail.com
In response to Re: Combining Aggregates  (Kouhei Kaigai <kaigai@ak.jp.nec.com>)
List pgsql-hackers
On 18 February 2015 at 21:13, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
This patch itself looks good as infrastructure towards
the big picture; however, we have not yet reached consensus
on how combine functions are used in place of the usual
transition functions.

Thank you for taking the time to look at the patch.

An aggregate function usually consumes one or more values extracted
from a tuple, then updates its internal state according to those
arguments. The existing transition function updates its internal
state on the assumption of one function call per record.
On the other hand, the new combine function allows the internal
state to be updated with partially aggregated values that were
processed by a preprocessor node.
An aggregate function is represented by an Aggref node in the plan
tree; however, we have no reliable way to determine which function
should be called to update the internal state of the aggregate.
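
To make the distinction above concrete, here is a minimal Python sketch (not PostgreSQL code) of an averaging aggregate split into two phases. The names `transition`, `combine`, `finalize`, and `partial_aggregate`, and the `(count, sum)` state layout, are assumptions chosen for illustration only.

```python
from typing import Iterable, List, Tuple

State = Tuple[int, float]  # (row count, running sum); illustrative state layout


def transition(state: State, value: float) -> State:
    """Per-record update: consumes one raw value extracted from a tuple."""
    count, total = state
    return (count + 1, total + value)


def combine(left: State, right: State) -> State:
    """Merges two partial states; its input is a state, not a raw value."""
    return (left[0] + right[0], left[1] + right[1])


def finalize(state: State) -> float:
    """Final function: turns the internal state into the result (an average)."""
    count, total = state
    return total / count


def partial_aggregate(values: Iterable[float]) -> State:
    """What a hypothetical preprocessor node would do for its share of the rows."""
    state: State = (0, 0.0)
    for v in values:
        state = transition(state, v)
    return state


rows = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
chunks: List[List[float]] = [rows[:2], rows[2:5], rows[5:]]

# Single pass: one transition call per record.
single_pass = finalize(partial_aggregate(rows))

# Two phases: one partial state per chunk, merged by the upper node.
merged: State = (0, 0.0)
for chunk in chunks:
    merged = combine(merged, partial_aggregate(chunk))
two_phase = finalize(merged)
```

Both paths produce the same average, but the upper node only gets there by calling `combine` rather than `transition`, which is the decision the plan currently has no way to express.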


This is true; nothing is done in the planner to set any sort of state in the aggregation nodes to tell them whether to call the final function or not. It's quite hard to know how far to go with this patch. It's really only intended to provide the necessary infrastructure for things like parallel query and various other possible uses of aggregate combine functions. I don't think it's really appropriate for this patch to add such a property to any nodes, as there would still be nothing in the planner to actually set those properties. The only way I can think of to get around this is to implement the simplest use of combine aggregate functions; the problem with that is that the simplest case is not at all simple.
 
 
For example, avg(float) has an internal state of type float[3],
holding the number of rows, the sum of X, and the sum of X^2.
If a combine function is to update this internal state with
partially aggregated values, its argument must be float[3].
That is obviously incompatible with float8_accum(float), which is
the transition function of avg(float).
I think we need a new flag on the Aggref node to inform the executor
which function should be called to update the internal state of the
aggregate. The executor cannot decide this without such a hint.
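
The type mismatch and the proposed flag can be sketched in Python as follows. This is only a model of the idea: `AggMode`, `advance`, and `combine_float3_state` are hypothetical names, `float8_accum` here is a Python stand-in for the transition function named above, and nothing below reflects an actual field on Aggref.

```python
from enum import Enum
from typing import List, Union


class AggMode(Enum):
    """Hypothetical flag carried by the aggregate reference (an assumption)."""
    TRANSITION = "transition"  # the input is a raw value
    COMBINE = "combine"        # the input is a partial float[3] state


def float8_accum(state: List[float], x: float) -> List[float]:
    """Models the transition function: state is [N, sum(X), sum(X^2)]."""
    n, sx, sxx = state
    return [n + 1.0, sx + x, sxx + x * x]


def combine_float3_state(state: List[float], partial: List[float]) -> List[float]:
    """Hypothetical combine function: adds two [N, sum(X), sum(X^2)] states."""
    return [a + b for a, b in zip(state, partial)]


def advance(mode: AggMode, state: List[float],
            arg: Union[float, List[float]]) -> List[float]:
    """Executor-side dispatch: the flag selects which function updates the state."""
    if mode is AggMode.TRANSITION:
        return float8_accum(state, arg)
    return combine_float3_state(state, arg)


def float8_avg(state: List[float]) -> float:
    """Final function for the average: sum(X) / N."""
    n, sx, _ = state
    return sx / n


# Lower nodes: per-record transitions over two disjoint sets of rows.
p1 = [0.0, 0.0, 0.0]
for x in (1.0, 2.0):
    p1 = advance(AggMode.TRANSITION, p1, x)
p2 = [0.0, 0.0, 0.0]
for x in (3.0, 6.0):
    p2 = advance(AggMode.TRANSITION, p2, x)

# Upper node: inputs are float[3] states, so the combine path is required.
total = [0.0, 0.0, 0.0]
for partial in (p1, p2):
    total = advance(AggMode.COMBINE, total, partial)
```

Passing a partial state to `float8_accum` would fail or silently corrupt the state, which is why the sketch routes every update through an explicit mode.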

Also, do you have any ideas about pushing aggregate functions down
across joins? Although the research is a bit old, I found a
systematic approach to pushing aggregates down across a join.
https://cs.uwaterloo.ca/research/tr/1993/46/file.pdf
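
As a rough illustration of what pushing an aggregate below a join means, the Python sketch below computes a per-region sum in two ways: aggregating after the join, and aggregating partially per join key before the join and then combining. The tables, column names, and function names are hypothetical, the sketch assumes `customers.id` is unique, and it is not claimed to be the method of the linked paper or of any existing patch.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical tables: customers(id, region) and orders(customer_id, amount).
customers: List[Tuple[int, str]] = [(1, "east"), (2, "east"), (3, "west")]
orders: List[Tuple[int, float]] = [
    (1, 10.0), (1, 5.0), (2, 7.0), (3, 2.0), (3, 8.0),
]


def aggregate_after_join() -> Dict[str, float]:
    """Baseline plan: join every order to its customer, then sum per region."""
    region_of = dict(customers)
    result: Dict[str, float] = defaultdict(float)
    for customer_id, amount in orders:
        if customer_id in region_of:
            result[region_of[customer_id]] += amount
    return dict(result)


def aggregate_before_join() -> Dict[str, float]:
    """Pushed-down plan: partial sums per join key, then join, then combine."""
    partial: Dict[int, float] = defaultdict(float)
    for customer_id, amount in orders:
        partial[customer_id] += amount  # transition step, below the join

    result: Dict[str, float] = defaultdict(float)
    for customer_id, region in customers:
        if customer_id in partial:
            result[region] += partial[customer_id]  # combine step, above the join
    return dict(result)
```

In the second plan the join sees one row per customer instead of one row per order, and the step above the join merges partial sums rather than raw amounts, which is where a combine function would come in.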


I've not read the paper yet, but I do have a very incomplete WIP patch to do this. I've just not had much time to work on it.
 
I think it would be great if core functionality supported this
query rewriting feature based on cost estimation, without external
modules.
 
Regards

David Rowley
