Spilling hashed SetOps and aggregates to disk

From: Heikki Linnakangas
Subject: Spilling hashed SetOps and aggregates to disk
Msg-id: 87be3bd5-6b13-d76e-5618-6db0a4db584d@iki.fi
Responses: Re: Spilling hashed SetOps and aggregates to disk  (Andres Freund <andres@anarazel.de>)
List: pgsql-hackers

Hi,

HashAggs and hashed SetOps currently cannot spill to disk. If the 
planner's estimate of the number of entries is badly off, the hash 
table can grow past what fits in memory, and you might run out of 
memory at execution time.

For HashAggs, this was discussed in depth a couple of years ago at [1]. 
SetOps have the same issue, but fixing that is simpler, as you don't 
need to handle arbitrary aggregate transition values and functions.
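
To illustrate why the SetOp case is simpler, here is a minimal Python 
sketch of a hashed set operation. It assumes the per-group state is 
nothing more than two counters (rows seen from the left and right 
inputs). The function name hashed_setop and its parameters are made up 
for this illustration and are not PostgreSQL APIs.

```python
from collections import defaultdict


def hashed_setop(left, right, op="INTERSECT", all_rows=False):
    """Toy in-memory hashed set operation (illustrative names only)."""
    # Per-group state is fixed-size: [rows from left, rows from right].
    # There is no arbitrary aggregate transition value to store or spill.
    counts = defaultdict(lambda: [0, 0])
    for row in left:
        counts[row][0] += 1
    for row in right:
        counts[row][1] += 1

    out = []
    for row, (n_left, n_right) in counts.items():
        if op == "INTERSECT":
            n = min(n_left, n_right) if all_rows else int(n_left > 0 and n_right > 0)
        elif op == "EXCEPT":
            n = max(n_left - n_right, 0) if all_rows else int(n_left > 0 and n_right == 0)
        else:
            raise ValueError("op must be 'INTERSECT' or 'EXCEPT'")
        # Emit the row as many times as the set operation calls for.
        out.extend([row] * n)
    return out
```

Because each group needs only two integers, a spilled entry can be 
written out and merged back by adding counters, which is not true in 
general for aggregate transition values.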

So a while back, I started hacking on spilling SetOps, with the idea 
that the code to deal with that could later be reused to deal with 
HashAggs, too. I didn't get very far, but I'm posting this in this very 
unfinished form to show what I've got, because I had a chat on this with 
Jeff Davis and some others last week.

The logtape.c interface would be very useful for this. When you start 
spilling, you want to create many spill files, so that when reloaded, 
each file will fit comfortably in memory. With logtape.c, you can have 
many logical tapes, without the overhead of real files. Furthermore, if 
you need to re-spill because a spill file grows too big in the first 
pass, logtape.c allows reusing the space "on-the-fly". The only problem 
with the current logtape interface is that it requires specifying the 
number of "tapes" upfront, when the tapeset is created. However, I was 
planning to change that, anyway [2].
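
The partition-and-re-spill idea can be sketched roughly as follows. 
This is a toy model under stated assumptions, not the patch and not the 
logtape.c API: plain Python lists stand in for logical tapes, the memory 
limit is expressed as a maximum number of groups, and the names 
spill_count, max_groups and n_partitions are hypothetical. It also 
keeps the first groups in memory and spills only the overflow, which is 
one possible scheme among several.

```python
def spill_count(rows, max_groups, n_partitions=4, level=0):
    """Count rows per key while holding at most max_groups keys in memory."""
    if max_groups < 1 or n_partitions < 1:
        raise ValueError("max_groups and n_partitions must be at least 1")

    table = {}
    # Stand-ins for logical tapes: one list per spill partition.
    partitions = [[] for _ in range(n_partitions)]

    for key in rows:
        if key in table:
            table[key] += 1
        elif len(table) < max_groups:
            table[key] = 1
        else:
            # The table is full: route the row to a partition chosen by
            # hash. Mixing in `level` changes the split on every re-spill.
            partitions[hash((level, key)) % n_partitions].append(key)

    # A key is either in the table or in exactly one partition, never
    # both, so the partial results can be combined without merging.
    result = dict(table)
    for part in partitions:
        if part:
            # Reload the partition; it re-spills if it is still too big.
            result.update(spill_count(part, max_groups, n_partitions, level + 1))
    return result
```

Each pass absorbs at least one new key into the in-memory table, so the 
recursion terminates even when the hash splits the overflow badly. With 
real logical tapes, the space of a partition that has been read back 
could be reused by the partitions written during a re-spill.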

[1] https://www.postgresql.org/message-id/1407706010.6623.16.camel%40jeff-desktop

[2] https://www.postgresql.org/message-id/420a0ec7-602c-d406-1e75-1ef7ddc58d83%40iki.fi

- Heikki

