Re: Spilling hashed SetOps and aggregates to disk

Поиск
Список
Период
Сортировка
От Jeff Davis
Тема Re: Spilling hashed SetOps and aggregates to disk
Дата
Msg-id 1529625425.3820.81.camel@j-davis.com
обсуждение исходный текст
Ответ на Re: Spilling hashed SetOps and aggregates to disk  (David Gershuni <dgershun@cs.cmu.edu>)
Список pgsql-hackers
On Thu, 2018-06-21 at 13:44 -0700, David Gershuni wrote:
> To handle hash collisions, we can do the following:
> 
> 1) We track the current hash code we’re processing, in ascending
> order.
> 
> 2) Instead of combining one group at at time, we’ll maintain a list
> of
> all groups we’ve seen that match the current hash code.
> 
> 3) When we peak at a spill file, if its next group’s hash code
> matches
> our current hash code, we’ll check to see if that group matches any
> of the
> groups in our list. If so, we’ll combine it with the matching group.
> If not,
> we’ll add this group to our list.
> 
> 4) Once the head of each spill file exceeds the current hash code,
> we’ll
> emit our list as output, empty it, and advance to the next hash code.
> 
> Does that seem reasonable?

It seems algorithmically reasonable but awfully complex. I don't think
that's a good path to take.

I still only see two viable approaches: 

(1) Spill tuples into hash partitions: works and is a reasonably simple
extension of our existing code. This is basically what the last patch I
posted does (posted before grouping sets, so needs to be updated).

(2) Spill tuples into a sort: I like this approach because it can be
done simply (potentially less code than #1), and could be further
improved without adding a ton of complexity. It may even eliminate the
need for the planner to choose between hashagg and sort. The problem
here is data types that have a hash opclass but no btree opclass. I
might try to sidestep this problem by saying that data types with no
btree opclass will not obey work_mem.

Additionally, there are two other ideas that we might want to do
regardless of which approach we take:

* support evicting transition states from the hash table, so that we
can handle clustered input better

* refactor grouping sets so that phases are separate executor nodes so
that we can undo some of the complexity in nodeAgg.c

Regards,
    Jeff Davis



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Bruce Momjian
Дата:
Сообщение: Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)
Следующее
От: Craig Ringer
Дата:
Сообщение: Re: PATCH: backtraces for error messages