Re: Default setting for enable_hashagg_disk

Поиск
Список
Период
Сортировка
От Peter Geoghegan
Тема Re: Default setting for enable_hashagg_disk
Дата
Msg-id CAH2-Wz=19NR-sNxAbstxtrOgmLWJRCpCfB+1zM247rf+iDvPcw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Default setting for enable_hashagg_disk  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Список pgsql-hackers
On Sat, Jul 11, 2020 at 4:23 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> I find that example rather suspicious. I mean, what exactly in the
> GroupAgg plan would consume this memory? Surely it'd have to be some
> node below the grouping, but sort shouldn't do that, no?
>
> Seems strange.

Well, I imagine hash aggregate manages to use much less memory than
the equivalent groupagg's sort, even though to the optimizer it
appears as if hash agg should end up using more memory (which is not
allowed by the optimizer when it exceeds work_mem, regardless of
whether or not it's faster). It may also be relevant that Hash agg can
use less memory simply by being faster. Going faster could easily
reduce the memory usage for the system as a whole, even when you
assume individual group agg nodes use more memory for as long as they
run. So in-memory hash agg is effectively less memory hungry.

It's not a great example of a specific case that we'd regress by not
having hash_mem/hash_mem_multiplier. It's an overestimate where older
releases accidentally got a bad, slow plan, not an underestimate where
older releases "lived beyond their means but got away with it" by
getting a good, fast plan. ISTM that the example is a good example of
the strange dynamics involved.

> I agree grouping estimates are often quite off, and I kinda agree with
> introducing hash_mem (or at least with the concept that hashing is more
> sensitive to amount of memory than sort). Not sure it's the right espace
> hatch to the hashagg spill problem, but maybe it is.

The hash_mem/hash_mem_multiplier proposal aims to fix the problem
directly, and not be an escape hatch, because we don't like escape
hatches. I think that that probably fixes many or most of the problems
in practice, at least assuming that the admin is willing to tune it.
But a small number of remaining installations may still need a "true"
escape hatch. There is an argument for having both, though I hope that
the escape hatch can be avoided.

-- 
Peter Geoghegan



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Sascha Kuhl
Дата:
Сообщение: Re: WIP: BRIN multi-range indexes
Следующее
От: Tomas Vondra
Дата:
Сообщение: Re: POC: postgres_fdw insert batching