Re: Spilling hashed SetOps and aggregates to disk

From: Jeff Davis
Subject: Re: Spilling hashed SetOps and aggregates to disk
Date:
Msg-id: 1528177607.2742.32.camel@j-davis.com
In reply to: Re: Spilling hashed SetOps and aggregates to disk  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Replies: Re: Spilling hashed SetOps and aggregates to disk  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Re: Spilling hashed SetOps and aggregates to disk  (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers
On Tue, 2018-06-05 at 07:04 +0200, Tomas Vondra wrote:
> I expect the eviction strategy to be the primary design challenge of
> this patch. The other bits will be mostly determined by this one
> piece.

Not sure I agree that this is the primary challenge.

The cases that benefit from eviction are probably a minority. I see two
categories that would benefit:

  * Natural clustering in the heap. This sounds fairly common, but a
lot of the cases that come to mind are too low-cardinality to be
compelling; e.g. timestamps grouped by hour/day/month. If someone has
run into a high-cardinality natural grouping case, let me know.
  * ARRAY_AGG (or similar): individual state values can be large enough
that we need to evict to avoid exceeding work_mem even if not adding
any new groups.

In either case, it seems like a fairly simple eviction strategy would
work. For instance, we could just evict the entire hash table if
work_mem is exceeded or if the hit rate on the hash table falls below a
certain threshold. If there is really something important that should
have stayed in the hash table, it will go back in soon anyway.
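To make the idea concrete, here is a minimal sketch of whole-table eviction (hypothetical illustration, not PostgreSQL code): a count aggregation where the group limit stands in for a byte-based work_mem budget, and a Python list stands in for the spill file on disk. Evicted partial states are merged in a final pass, so any group that was evicted "goes back in" when its key reappears.

```python
from collections import defaultdict

def hash_agg_with_eviction(tuples, work_mem_groups=4):
    """Count-per-key aggregation with whole-table eviction.

    `work_mem_groups` is a stand-in for a byte-based work_mem limit;
    `spilled` is a stand-in for partial aggregate states written to disk.
    """
    spilled = []
    table = defaultdict(int)
    for key in tuples:
        table[key] += 1
        if len(table) > work_mem_groups:
            # Evict the entire hash table once the budget is exceeded;
            # hot groups simply re-enter on their next input tuple.
            spilled.extend(table.items())
            table.clear()
    spilled.extend(table.items())

    # Final merge pass combines the spilled partial states per key.
    final = defaultdict(int)
    for key, count in spilled:
        final[key] += count
    return dict(final)
```

With naturally clustered input, most tuples hit the in-memory table between evictions, which is why even this crude strategy can avoid a full sort on large data sets.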

So why should eviction be a major driver for the entire design? I agree
it should be an area of improvement for the future, so let me know if
you see a major problem, but I haven't been as focused on eviction.

> While the primary goal of the patch is addressing the OOM risks in
> hash
> aggregate, I think it'd be a mistake to see it just that way. I see
> it
> could allow us to perform hash aggregate more often, even if we know
> the
> groups won't fit into work_mem. If we could estimate how much of the
> aggregate state we'll have to spill to disk, we could still prefer
> hashagg over groupagg. We would pay the price for eviction, but on
> large
> data sets that can be massively cheaper than having to do sort.

Agreed. I ran some tests of my patch in the last round, and they
strongly supported choosing HashAgg a lot more often. A lot of sort
improvements have been made though, so I really need to run some new
numbers.

> far away), but it would be unfortunate to make this improvement
> impossible/more difficult in the future.

If you see anything that would make this difficult in the future, let
me know.

Regards,
     Jeff Davis


