Re: Combining Aggregates

From Atri Sharma
Subject Re: Combining Aggregates
Date
Msg-id CAOeZVid3R6SV7R2EFvK36YzWMEU3g5rYJKAUNQqKcP3crTFMew@mail.gmail.com
In response to Re: Combining Aggregates  (Kouhei Kaigai <kaigai@ak.jp.nec.com>)
Responses Re: Combining Aggregates  (Kouhei Kaigai <kaigai@ak.jp.nec.com>)
List pgsql-hackers


On Wed, Dec 17, 2014 at 6:05 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
Simon,

The concept looks good to me. I think the new combined function should
take a state data type as its argument and update the state object
of the aggregate function. In other words, the combined function behaves like
a transition function but updates the state object from a summary
of multiple rows. Right?

It also needs some enhancement around the Aggref/AggrefExprState structures to
indicate which function should be called at execution time.
Combined functions are usually not needed. AggrefExprState updates its
internal state using the transition function row by row. However, once someone
pushes an aggregate function down across table joins, combined functions have
to be called instead of transition functions.
I'd like to suggest that Aggref get a new flag to indicate that this aggregate
expects a state object instead of a scalar value.

Also, I'd like to suggest one more flag in Aggref that tells it not to generate
the final result and to return the state object instead.



So are you proposing not calling transfuncs at all and just using combined functions?

That sounds counterintuitive to me. I don't see why you would want to avoid transfns entirely, even for the aggregate push-down case that you mentioned.

From Simon's example mentioned upthread:

PRE-AGGREGATED PLAN
Aggregate
-> Join
     -> PreAggregate (doesn't call finalfn)
          -> Scan BaseTable1
     -> Scan BaseTable2

finalfn wouldn't be called. Instead, the combined function would be responsible for taking the pre-aggregate results and combining them (unless, of course, I am missing something).
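To make the division of labour concrete, here is a minimal Python sketch, assuming an avg-style aggregate whose state is a (sum, count) pair. The names `avg_transfn`, `avg_combinefn`, `avg_finalfn` and `pre_aggregate` are hypothetical and are not PostgreSQL APIs. The only point is that the pre-aggregate step still runs the transition function row by row and emits raw states, and the final function runs once, after the states have been combined.

```python
from functools import reduce

def avg_transfn(state, value):
    """Transition function: fold one input row into the (sum, count) state."""
    total, count = state
    return (total + value, count + 1)

def avg_combinefn(left, right):
    """Combine function: merge two partial states; it never sees raw rows."""
    return (left[0] + right[0], left[1] + right[1])

def avg_finalfn(state):
    """Final function: turn the state into the result (None for no rows)."""
    total, count = state
    return total / count if count else None

def pre_aggregate(rows):
    """Pre-aggregate step: runs the transfn per row and skips the finalfn."""
    return reduce(avg_transfn, rows, (0, 0))

# Two independently pre-aggregated chunks of the same column.
partials = [pre_aggregate(chunk) for chunk in ([1, 2, 3], [4, 5])]
combined = reduce(avg_combinefn, partials, (0, 0))
result = avg_finalfn(combined)
```

Note that combining already-finalized averages would give the wrong answer, which is why the pre-aggregate step has to hand over states rather than results.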

Special-casing transition state updating in Aggref seems like a bad idea to me. I think it would be better to make it more explicit, i.e. add a new node on top that does the combination (it would be primarily responsible for calling the combined function).
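As a rough illustration of the "separate node on top" idea, the following Python sketch models plan nodes as generators. Everything here is an assumption made for illustration: `pre_aggregate_node` and `combine_node` are hypothetical names, the aggregate is a plain sum, and the join from the plan above is left out to keep the example short. It is not how the PostgreSQL executor is structured.

```python
from collections import defaultdict

def pre_aggregate_node(rows, transfn, init):
    """Child node: groups (key, value) rows and emits (key, state) pairs.

    It calls only the transition function and never the final function.
    """
    states = defaultdict(lambda: init)
    for key, value in rows:
        states[key] = transfn(states[key], value)
    yield from states.items()

def combine_node(children, combinefn, finalfn):
    """Hypothetical top node: merges states from its children, then finalizes."""
    merged = {}
    for child in children:
        for key, state in child:
            merged[key] = combinefn(merged[key], state) if key in merged else state
    for key, state in merged.items():
        yield key, finalfn(state)

def add(a, b):
    """For a sum aggregate the transfn and combinefn happen to coincide."""
    return a + b

left = pre_aggregate_node([("a", 1), ("b", 2), ("a", 3)], add, 0)
right = pre_aggregate_node([("a", 10), ("b", 20)], add, 0)
totals = dict(combine_node([left, right], add, lambda state: state))
```

With this shape the child nodes need no knowledge of whether their output will be combined later; only the top node knows about the combine function.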

Not a good source of inspiration, but the way SQL Server does it (Exchange operator + Stream Aggregate) seems intuitive to me, and having the combination operation as a separate top node might be a cleaner way.

I may be wrong though.

Regards,

Atri
