Re: Default setting for enable_hashagg_disk

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: Default setting for enable_hashagg_disk
Дата
Msg-id 20200624193122.cqolrvl44fh4w4s4@alap3.anarazel.de
обсуждение исходный текст
Ответ на Re: Default setting for enable_hashagg_disk  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: Default setting for enable_hashagg_disk  (Jeff Davis <pgsql@j-davis.com>)
Список pgsql-hackers
Hi,

On 2020-06-24 14:40:50 -0400, Tom Lane wrote:
> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> > On Wed, Jun 24, 2020 at 01:29:56PM -0400, Tom Lane wrote:
> >> If we feel we need something to let people have the v12 behavior
> >> back, let's have
> >> (1) enable_hashagg on/off --- controls planner, same as it ever was
> >> (2) enable_hashagg_spill on/off --- controls executor by disabling spill
>
> > What if a user specifies
> >    enable_hashagg = on
> >    enable_hashagg_spill = off
>
> It would probably be reasonable for the planner to behave as it did
> pre-v13, that is not choose hashagg if it estimates that work_mem
> would be exceeded.  (So, okay, that means enable_hashagg_spill
> affects both planner and executor ... but ISTM it's just one
> behavior not two.)

There's two different reasons for spilling in the executor right now:

1) The planner estimated that we'd need to spill, and that turns out to
   be true. There seems no reason to not spill in that case (as long as
   it's enabled/chosen in the planner).

2) The planner didn't think we'd need to spill, but we end up using more
   than work_mem memory.

nodeAgg.c already treats those separately:

void
hash_agg_set_limits(double hashentrysize, uint64 input_groups, int used_bits,
                    Size *mem_limit, uint64 *ngroups_limit,
                    int *num_partitions)
{
    int            npartitions;
    Size        partition_mem;

    /* if not expected to spill, use all of work_mem */
    if (input_groups * hashentrysize < work_mem * 1024L)
    {
        if (num_partitions != NULL)
            *num_partitions = 0;
        *mem_limit = work_mem * 1024L;
        *ngroups_limit = *mem_limit / hashentrysize;
        return;
    }

We can't sensibly disable spilling when chosen at plan time, because
that'd lead to *more* OOMS than in v12.

ISTM that we should have one option that controls whether 1) is done,
and one that controls whether 2) is done. Even if the option for 2 is
off, we still should spill when the option for 1) chooses a spilling
plan.  I don't think it makes sense for one of those options to
influence the other implicitly.

So maybe enable_hashagg_spilling_plan for 1) and
hashagg_spill_on_exhaust for 2).

Greetings,

Andres Freund



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Robert Haas
Дата:
Сообщение: Re: Default setting for enable_hashagg_disk
Следующее
От: Fabien COELHO
Дата:
Сообщение: Re: Why forbid "INSERT INTO t () VALUES ();"