Re: Large Scale Aggregation (HashAgg Enhancement)

From Tom Lane
Subject Re: Large Scale Aggregation (HashAgg Enhancement)
Date
Msg-id 17440.1137440589@sss.pgh.pa.us
In response to Re: Large Scale Aggregation (HashAgg Enhancement)  (Simon Riggs <simon@2ndquadrant.com>)
Responses Re: Large Scale Aggregation (HashAgg Enhancement)  (Simon Riggs <simon@2ndquadrant.com>)
List pgsql-hackers
Simon Riggs <simon@2ndquadrant.com> writes:
> For HJ we write each outer tuple to its own file-per-batch in the order
> they arrive. Reading them back in preserves the original ordering. So
> yes, caution required, but I see no difficulty, just reworking the HJ
> code (nodeHashjoin and nodeHash). What else do you see?

With dynamic adjustment of the hash partitioning, some tuples will go
through multiple temp files before they ultimately get eaten, and
different tuples destined for the same aggregate may take different
paths through the temp files depending on when they arrive.  It's not
immediately obvious that ordering is preserved when that happens.
I think it can be made to work but it may take different management of
the temp files than hashjoin uses.  (Worst case, we could use just a
single temp file for all unprocessed tuples, but this would result in
extra I/O.)
        regards, tom lane
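
The ordering concern raised above can be illustrated with a toy model. This is only a sketch under stated assumptions, not PostgreSQL's hashjoin or hashagg code: the function names, the `grow_at` parameter, using the integer key as its own hash, and the single doubling from 2 to 4 batches are all hypothetical choices made for illustration. The first function routes tuples through per-batch temp files and re-routes them when a file is read back after the batch count has grown; the second keeps one temp file for all unprocessed tuples, as in the worst-case fallback mentioned in the message.

```python
from collections import defaultdict


def per_batch_files(tuples, grow_at):
    """Toy model: one temp file per batch, batch = key % nbatch.

    nbatch doubles from 2 to 4 just before tuple number `grow_at` arrives
    (hypothetical trigger standing in for "memory filled up").
    Returns (consumption order per key, number of tuple writes).
    """
    nbatch = 2
    files = defaultdict(list)      # batch number -> spilled tuples, in write order
    consumed = defaultdict(list)   # key -> tags in the order the aggregate sees them
    writes = 0
    for i, (key, tag) in enumerate(tuples):
        if i == grow_at:
            nbatch = 4
        batch = key % nbatch
        if batch == 0:
            consumed[key].append(tag)          # batch 0 is aggregated in memory
        else:
            files[batch].append((key, tag))
            writes += 1
    for batch in range(1, nbatch):
        for key, tag in files[batch]:
            new_batch = key % nbatch
            if new_batch == batch:
                consumed[key].append(tag)
            else:
                # Written under the old nbatch: append to the file it belongs to now.
                files[new_batch].append((key, tag))
                writes += 1
    return dict(consumed), writes


def single_file(tuples, grow_at):
    """Toy model: one temp file for all unprocessed tuples, rewritten on each pass."""
    nbatch = 2
    spill = []
    consumed = defaultdict(list)
    writes = 0
    for i, (key, tag) in enumerate(tuples):
        if i == grow_at:
            nbatch = 4
        if key % nbatch == 0:
            consumed[key].append(tag)
        else:
            spill.append((key, tag))
            writes += 1
    for batch in range(1, nbatch):
        remaining = []
        for key, tag in spill:
            if key % nbatch == batch:
                consumed[key].append(tag)
            else:
                remaining.append((key, tag))   # kept in arrival order, but rewritten
                writes += 1
        spill = remaining
    return dict(consumed), writes


if __name__ == "__main__":
    arrivals = [(3, "a"), (1, "x"), (3, "b"), (1, "y")]
    print(per_batch_files(arrivals, grow_at=2))  # key 3 is consumed as b, a (5 writes)
    print(single_file(arrivals, grow_at=2))      # key 3 is consumed as a, b (8 writes)
```

In this model the tuple (3, "a") is written to file 1 while there are two batches, and (3, "b") goes straight to file 3 after the split. When file 1 is read back, (3, "a") is appended to file 3 behind (3, "b"), so the aggregate for key 3 sees its inputs out of arrival order. The single-file variant keeps arrival order at the cost of more tuple writes.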

