Re: OLAP CUBE/ROLLUP Operators and GROUP BY grouping sets

From: Robert Bedell
Subject: Re: OLAP CUBE/ROLLUP Operators and GROUP BY grouping sets
Date:
Msg-id: 200312172308104.SM00984@xavier
In reply to: Re: OLAP CUBE/ROLLUP Operators and GROUP BY grouping sets  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: OLAP CUBE/ROLLUP Operators and GROUP BY grouping sets  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: OLAP CUBE/ROLLUP Operators and GROUP BY grouping sets  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: OLAP CUBE/ROLLUP Operators and GROUP BY grouping sets  (Sailesh Krishnamurthy <sailesh@cs.berkeley.edu>)
List: pgsql-hackers
> Hannu Krosing <hannu@tm.ee> writes:
> > OTOH, I'm not sure if hash aggregates can already spill to disk if not
> > enough memory is available for keeping them all.
> 
> They do not, which is something it'd be good to fix, since if the
> planner drastically underestimates the number of groups, you could end
> up with a hashtable far bigger than the advertised SortMem.  Even if
> this doesn't end up with an "out of memory" error, it could still drive
> you into swap hell.
> 
> (Maybe that should be Robert's get-your-feet-wet project.)

Ok, I've been looking through the executor - nodeGroup.c, nodeAgg.c,
dynahash.c, execGrouping.c, etc.  I've found the places where hashed
aggregation occurs.  It looks like hashed groups simply read the entire
result into memory via a function chain through lookup_hash_entry() ->
LookupTupleHashEntry() -> hash_search().  I think LookupTupleHashEntry()
is the best place to put the code to spill over onto disk, since it is
used only by the Group, Agg, and Subplan executor nodes.  Putting it there
gives the added benefit of letting hashed subselect results spill over onto
disk as well (not sure if that's the right thing to do).  I've put a rough
sketch of the control flow I'm picturing right after the questions below.
A few questions:

1) When does the optimizer set the Agg plan node to the HASHED strategy?  The
hashed implementation reads through its entire input before returning
anything, so presumably it isn't always the right choice.  Sorry if I should
RTFM on this one....

2) What mechanism would be best to use for storing the data on disk?  I know
there is a temporary table mechanism; I'll be hunting for that shortly.
(There's a sketch of one idea at the very end of this message.)

3) What should define the spillover point?  The documentation points to the
'sort_mem' parameter for this, but the code doesn't appear to actually
enforce it for hashed aggregation yet.  Similarly for the hash_search()
function - I didn't see anything there that spills to disk.

4) Once entries can spill to disk, should LookupTupleHashEntry() be worried
about the pointers it receives...and similarly for hash_search()?

Obviously (2) is the really big question.  What's the best way to do this?
I still have a bit more to go before I understand what's going on, but I'm
starting to grasp it.  Any tips would be appreciated! :)
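
For (2), the direction I keep coming back to is the tuplestore code in
src/backend/utils/sort/tuplestore.c, which already knows how to overflow into
BufFile temp files, so the executor node wouldn't have to care about the
on-disk format at all.  I haven't checked the exact signatures against CVS
tip yet - the calls below are from memory, so treat them as an assumption:

    /* Sketch only: function names/arguments quoted from memory.
     * 'tuple' stands for the current input HeapTuple. */
    Tuplestorestate *spillstore;

    /* Keep up to SortMem kilobytes in memory, then quietly switch to a
     * temp file. */
    spillstore = tuplestore_begin_heap(false,    /* no random access needed */
                                       false,    /* not inter-transaction */
                                       SortMem);

    /* While filling the hash table: any tuple whose group we refuse to
     * create gets stashed for the next pass. */
    tuplestore_puttuple(spillstore, (void *) tuple);

    /* Later pass: read the stashed tuples back and aggregate the groups
     * we skipped the first time around. */
    for (;;)
    {
        bool  should_free;
        void *spilled = tuplestore_gettuple(spillstore, true, &should_free);

        if (spilled == NULL)
            break;
        /* ... feed it back through lookup_hash_entry() ... */
        if (should_free)
            pfree(spilled);
    }

    tuplestore_end(spillstore);

If tuplestore turns out to be too heavyweight, using buffile.c directly would
be the other obvious candidate.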

Cheers!

Robert


