Re: Large Scale Aggregation (HashAgg Enhancement)

From: Simon Riggs
Subject: Re: Large Scale Aggregation (HashAgg Enhancement)
Date:
Msg-id: 1137459287.3180.237.camel@localhost.localdomain
In reply to: Re: Large Scale Aggregation (HashAgg Enhancement)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: Large Scale Aggregation (HashAgg Enhancement)  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers
On Mon, 2006-01-16 at 14:43 -0500, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > For HJ we write each outer tuple to its own file-per-batch in the order
> > they arrive. Reading them back in preserves the original ordering. So
> > yes, caution required, but I see no difficulty, just reworking the HJ
> > code (nodeHashjoin and nodeHash). What else do you see?
> 
> With dynamic adjustment of the hash partitioning, some tuples will go
> through multiple temp files before they ultimately get eaten, and
> different tuples destined for the same aggregate may take different
> paths through the temp files depending on when they arrive.  It's not
> immediately obvious that ordering is preserved when that happens.
> I think it can be made to work but it may take different management of
> the temp files than hashjoin uses.  (Worst case, we could use just a
> single temp file for all unprocessed tuples, but this would result in
> extra I/O.)
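The repartitioning step Tom describes can be sketched with a toy model (hypothetical, not the nodeHash code): when the batch count grows, an already-spilled batch file is re-dealt across the larger set of files. If the old file is replayed in arrival order before any newly arriving tuples are appended, each key's tuples stay in their original relative order.

```python
def rebatch(old_file, new_nbatch):
    """Toy re-deal of one spilled batch file across a larger batch count.

    Each "file" is just a Python list acting as a FIFO. Replaying the old
    file front-to-back means every key's tuples land in the new files in
    the same relative order they originally arrived.
    """
    new_files = {b: [] for b in range(new_nbatch)}
    for t in old_file:                        # in-order replay keeps FIFO
        new_files[hash(t[0]) % new_nbatch].append(t)
    return new_files

# Tuples that shared a batch under (hash % 2) split across (hash % 4),
# but each key's relative order survives the move.
old = [("x", 1), ("y", 1), ("x", 2), ("y", 2)]
new = rebatch(old, 4)
```

The caveat in the toy model matches the caution above: order is only guaranteed if replay finishes before new outer tuples are appended to the same files.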

Sure, the hash table is dynamic, but we read all the inner rows to build
the hash table (nodeHash) before we fetch any outer rows (nodeHJ).
Why would we continue to dynamically adjust the hash table after the
start of the outer scan? (I see that we do this, as you say, but I am
surprised.)
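The file-per-batch scheme I am relying on can be sketched like this (a toy model, not the nodeHashjoin code): each batch file is a FIFO, outer tuples are appended as they arrive, and reading a file back replays its tuples in arrival order.

```python
NBATCH = 4  # illustrative batch count; the real planner chooses this

def spill(outer_tuples, nbatch=NBATCH):
    """Toy file-per-batch spill: append each tuple to its batch's FIFO.

    Each "file" is a list; appending on write and scanning front-to-back
    on read means every grouping key's tuples come back in the order
    they originally arrived.
    """
    files = [[] for _ in range(nbatch)]
    for t in outer_tuples:
        files[hash(t[0]) % nbatch].append(t)  # append preserves arrival order
    return files

outer = [("a", 1), ("b", 1), ("a", 2), ("c", 1), ("a", 3)]
files = spill(outer)
# Reading any one batch file back replays its tuples in arrival order,
# so all tuples destined for the same aggregate stay in sequence.
```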

Best Regards, Simon Riggs
