Re: Final Patch for GROUPING SETS

From Atri Sharma
Subject Re: Final Patch for GROUPING SETS
Msg-id CAOeZVidbB5v0FPG2=_QcROGvYJe+41LMiJ1Xy_ptqFZjNnp0Hw@mail.gmail.com
In reply to Re: Final Patch for GROUPING SETS  (Noah Misch <noah@leadboat.com>)
Responses Re: Final Patch for GROUPING SETS  (Noah Misch <noah@leadboat.com>)
List pgsql-hackers


> ChainAggregate is a bit like a node having two parents, a Sort and a
> GroupAggregate.  However, the graph edge between ChainAggregate and its
> GroupAggregate is a tuplestore instead of the usual, synchronous
> ExecProcNode().

Well, I don't buy the two-parents theory. The Sort nodes are stacked in between the ChainAggregate nodes, so there is still only a single edge. However, as you rightly said, there is a shared tuplestore; note, though, that only the ChainAggregate at the head of the chain has the top GroupAggregate as its parent.

> Suppose one node orchestrated all sorting and aggregation.  Call it a
> MultiGroupAggregate for now.  It wouldn't harness Sort nodes, because it
> performs aggregation between tuplesort_puttupleslot() calls.  Instead, it
> would directly manage two Tuplesortstate, CUR and NEXT.  The node would have
> an initial phase similar to ExecSort(), in which it drains the outer node to
> populate the first CUR.  After that, it looks more like agg_retrieve_direct(),
> except that CUR is the input source, and each tuple drawn is also put into
> NEXT.  When done with one CUR, swap CUR with NEXT and reinitialize NEXT.  This
> design does not add I/O consumption or require a nonstandard communication
> channel between executor nodes.  Tom, Andrew, does that look satisfactory?
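
To make the CUR/NEXT proposal above concrete, here is a toy model in Python. It is only a sketch and not executor code: plain lists and sorted() stand in for Tuplesortstate, the function name multi_group_aggregate and the count-only aggregate are made up for illustration, and each sort order is aggregated over its non-empty column prefixes (the grand-total set is left out for brevity).

```python
def multi_group_aggregate(rows, sort_orders):
    """Toy model of the proposed node: one pass over CUR per sort order.

    rows        -- list of dicts, standing in for the outer node's output
    sort_orders -- list of column tuples, one per required sort order
    Returns (columns, group_key, count) tuples; count is the only aggregate.
    """
    def sort_key(order):
        return lambda row: tuple(row[c] for c in order)

    results = []
    # Initial phase, similar to ExecSort(): drain the input into the first CUR.
    cur = sorted(rows, key=sort_key(sort_orders[0]))

    for i, order in enumerate(sort_orders):
        is_last = i == len(sort_orders) - 1
        nxt = []  # stands in for the NEXT tuplesort (hypothetical stand-in)
        # One running group per rollup level: prefixes from longest to shortest.
        levels = [order[:n] for n in range(len(order), 0, -1)]
        state = {lvl: None for lvl in levels}  # level -> [group_key, count]

        for row in cur:
            for lvl in levels:
                key = tuple(row[c] for c in lvl)
                if state[lvl] is None or state[lvl][0] != key:
                    if state[lvl] is not None:
                        # Group boundary: emit the finished group.
                        results.append((lvl, state[lvl][0], state[lvl][1]))
                    state[lvl] = [key, 0]
                state[lvl][1] += 1
            if not is_last:
                nxt.append(row)  # each tuple drawn from CUR is also put into NEXT

        # Flush the groups still open at end of input.
        for lvl in levels:
            if state[lvl] is not None:
                results.append((lvl, state[lvl][0], state[lvl][1]))

        if not is_last:
            # Swap: NEXT, sorted for the following order, becomes the new CUR.
            cur = sorted(nxt, key=sort_key(sort_orders[i + 1]))
    return results
```

Calling it with sort_orders=[("a", "b"), ("c",)] reads the input once, aggregates (a,b) and (a) on the first pass while filling NEXT, then re-sorts by c and aggregates (c) on the second pass.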


So you are essentially proposing merging ChainAggregate and its corresponding Sort node?

So the structure would be something like:

GroupAggregate
--> MultiGroupAgg (a,b)
----> MultiGroupAgg (c,d) ...

I am not sure I understand you correctly. Only the top-level GroupAggregate node projects the result of the entire operation. The key to ChainAggregate nodes is that each one handles the grouping sets that fit a single ROLLUP list, i.e. those that can be computed with a single sort order. There can be multiple such lists in a single GROUPING SETS operation; however, our current design has only a single top GroupAggregate node, plus one ChainAggregate node + Sort node per sort order. If you are proposing to replace the GroupAggregate node and the entire stack of ChainAggregate + Sort nodes with a single MultiGroupAggregate node, I do not understand how it would handle the multiple sort orders. If you are proposing to replace only each ChainAggregate + Sort pair with a single MultiGroupAgg node, that node still shares the tuplestore with the top-level GroupAggregate node.
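
To illustrate what "grouping sets that fit a single ROLLUP list" means, here is a small Python sketch that splits grouping sets into chains of nested sets, one chain per sort order. The greedy strategy and the names chains_by_sort_order and sort_order are assumptions made for this illustration; this is not the planner logic in the patch, and the greedy split is not guaranteed to give the fewest chains.

```python
def chains_by_sort_order(grouping_sets):
    """Greedily split grouping sets into chains of nested sets.

    Every chain holds sets where each one is a subset of the previous one,
    so a single sort order can serve the whole chain.
    """
    chains = []
    # Largest sets first, so smaller sets can attach to an existing chain.
    for gs in sorted(map(frozenset, grouping_sets), key=len, reverse=True):
        for chain in chains:
            if gs <= chain[-1]:  # fits under the smallest set of this chain
                chain.append(gs)
                break
        else:
            chains.append([gs])  # needs a sort order of its own
    return chains


def sort_order(chain):
    """Column order that makes every set in the chain a prefix of the sort."""
    order = []
    for gs in reversed(chain):  # smallest set supplies the leading columns
        order.extend(sorted(gs - set(order)))
    return order
```

For the sets (a,b), (a), (c,d), (c) this gives two chains with sort orders a,b and c,d, matching the two levels in the structure sketched above.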

I am pretty sure I have messed up my understanding of your proposal. Please correct me if I am wrong.

Regards,

Atri


--
Regards,
 
Atri
l'apprenant
