Re: Final Patch for GROUPING SETS

From: Andrew Gierth
Subject: Re: Final Patch for GROUPING SETS
Msg-id: 87wq0bbpht.fsf@news-spur.riddles.org.uk
In reply to: Re: Final Patch for GROUPING SETS  (Andres Freund <andres@anarazel.de>)
Responses: Re: Final Patch for GROUPING SETS  (Andres Freund <andres@anarazel.de>)
           Re: Final Patch for GROUPING SETS  (Andres Freund <andres@anarazel.de>)
List: pgsql-hackers

>>>>> "Andres" == Andres Freund <andres@anarazel.de> writes:

Andres> My problem is that, unless I very much misunderstand something,
Andres> the current implementation can end up requiring roughly #sets *
Andres> #input of additional space for the "sidechannel tuplestore" in
Andres> some bad cases.  That happens if you group by a couple clauses
Andres> that each lead to a high number of groups.

The actual upper bound for the tuplestore size is the size of the
_result_ of the grouping, less one or two rows. You get that in cases
like grouping sets (unique_col, rollup(constant_col)), which seems
sufficiently pathological not to be worth worrying about greatly.

In normal cases, the size of the tuplestore is the size of the result
minus the rows processed directly by the top node. So the only way the
size can be an issue is if the result set size itself is also an issue,
and in that case I don't really think that this is going to be a matter
of significant concern.
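
To put numbers on that, here is a minimal sketch of the row arithmetic for
grouping sets (unique_col, rollup(constant_col)) over N input rows. It assumes
the expansion into three sets, (unique_col), (constant_col) and (), and it
takes the number of rows emitted directly by the top node as a parameter,
since that depends on which sets the top node handles. The function names are
illustrative only and do not appear in the patch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Total result rows of a grouping-sets query: each grouping set
 * contributes one output row per distinct group it produces.
 */
size_t
result_rows(const size_t *groups_per_set, size_t nsets)
{
    size_t total = 0;

    for (size_t i = 0; i < nsets; i++)
        total += groups_per_set[i];
    return total;
}

/*
 * Rows held in the side-channel tuplestore: every result row that the
 * top node does not emit directly.  rows_from_top_node is an input here
 * because it depends on which sets the top node handles (assumption).
 */
size_t
tuplestore_rows(size_t total_result_rows, size_t rows_from_top_node)
{
    assert(rows_from_top_node <= total_result_rows);
    return total_result_rows - rows_from_top_node;
}
```

With N = 1000000 the per-set group counts are {N, 1, 1}, so result_rows gives
N + 2. If the top node emits the two rollup rows itself, tuplestore_rows
gives N, i.e. the result size less two rows.
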
Andres> A rough sketch of what I'm thinking of is:

I'm not sure I'd do it quite like that. Rather, have a wrapper function
get_outer_tuple that calls ExecProcNode and, if appropriate, writes the
tuple to a tuplesort before returning it; use that in place of
ExecProcNode in agg_retrieve_direct and when building the hash table.
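
As a rough illustration of that wrapper, here is a self-contained mock in C.
ChildNode and child_next stand in for the outer plan node and ExecProcNode,
and SideStore is a fixed-size stand-in for the tuplesort; none of these types,
nor the MockAggState layout, are the real PostgreSQL structures. Only the
shape of get_outer_tuple is the point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct Tuple { int key; int value; } Tuple;

/* Hypothetical child plan node: iterates over an in-memory array. */
typedef struct ChildNode {
    const Tuple *rows;
    size_t       nrows;
    size_t       next;
} ChildNode;

/* Stand-in for ExecProcNode: returns the next tuple, or NULL at end. */
const Tuple *
child_next(ChildNode *node)
{
    if (node->next >= node->nrows)
        return NULL;
    return &node->rows[node->next++];
}

/* Stand-in for a tuplesort; a real one would spill to disk, not overflow. */
#define SIDE_CAPACITY 64
typedef struct SideStore {
    const Tuple *saved[SIDE_CAPACITY];
    size_t       count;
} SideStore;

bool
side_store_put(SideStore *store, const Tuple *tuple)
{
    if (store->count >= SIDE_CAPACITY)
        return false;
    store->saved[store->count++] = tuple;
    return true;
}

/* Mock aggregate state; side is NULL when no later pass needs the input. */
typedef struct MockAggState {
    ChildNode *outer;
    SideStore *side;
} MockAggState;

/*
 * Fetch the next input tuple and, if a later pass needs it, save a
 * reference in the side store before handing it to the caller.
 */
const Tuple *
get_outer_tuple(MockAggState *agg)
{
    const Tuple *tuple = child_next(agg->outer);

    if (tuple != NULL && agg->side != NULL)
    {
        if (!side_store_put(agg->side, tuple))
            abort();            /* mock only: capacity exceeded */
    }
    return tuple;
}
```

Callers that previously pulled tuples from the child directly would call
get_outer_tuple instead, so the copy into the side store happens in one place
regardless of whether the consumer is the sorted path or the hash-table build.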

The problem with trying to turn agg_retrieve_direct inside-out (to make
it look more like agg_retrieve_chained) is that it potentially projects
multiple output groups (not just multiple-result projections) from a
single input tuple, so it has to have some control over whether a tuple
is read or not. (agg_retrieve_chained avoids this problem because it can
loop over the projections, since it's writing to the tuplestore rather
than returning to the caller.)

Andres> I think this is quite doable and seems likely to actually end
Andres> up with easier to understand code.  But unfortunately it seems
Andres> to be big enough of a change to make it unlikely to be done in
Andres> sufficient quality until the freeze.  I'll nonetheless work a
Andres> couple hours on it tomorrow.

Andres> Andrew, is that a structure you could live with, or not?

Well, I still think the opaque-blobless isn't nice, but I retract some
of my previous concerns; I can see a way to do it that doesn't
significantly impinge on the difficulty of adding hash support.

It sounds like I have more time immediately available than you do. As
discussed on IRC, I'll take the first shot, and we'll see how far I can
get.

-- 
Andrew (irc:RhodiumToad)


