Re: Spilling hashed SetOps and aggregates to disk

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: Spilling hashed SetOps and aggregates to disk
Дата
Msg-id 20180604185205.epue25jzpavokupf@alap3.anarazel.de
обсуждение исходный текст
Ответ на Spilling hashed SetOps and aggregates to disk  (Heikki Linnakangas <hlinnaka@iki.fi>)
Ответы Re: Spilling hashed SetOps and aggregates to disk  (David Rowley <david.rowley@2ndquadrant.com>)
Re: Spilling hashed SetOps and aggregates to disk  (Jeff Davis <pgsql@j-davis.com>)
Список pgsql-hackers
Hi,

On 2018-06-04 10:32:47 +0200, Heikki Linnakangas wrote:
> Hash Aggs and SetOps are currently not spilled to disk. If the planner's
> estimate on the number of entries is badly off, you might run out of memory
> at execution time, if all the entries don't fit in memory.
> 
> For HashAggs, this was discussed in depth a couple of years ago at [1].
> SetOps have the same issue, but fixing that is simpler, as you don't need to
> handle arbitrary aggregate transition values and functions.

That part has gotten a bit easier since, because we have serialize /
deserialize operations for aggregates these days.

I wonder whether, at least for aggregates, the better fix wouldn't be to
switch to feeding the tuples into tuplesort upon memory exhaustion and
doing a sort based aggregate.  We have most of the infrastructure to do
that due to grouping sets. It's just the pre-existing in-memory tuples
that'd be problematic, in that the current transition values would need
to serialized as well.  But with a stable sort that'd not be
particularly problematic, and that could easily be achieved.

Greetings,

Andres Freund


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andres Freund
Дата:
Сообщение: Re: [HACKERS] Moving relation extension locks out of heavyweightlock manager
Следующее
От: Andres Freund
Дата:
Сообщение: Re: plans for PostgreSQL 12