Re: Default setting for enable_hashagg_disk

From: Amit Kapila
Subject: Re: Default setting for enable_hashagg_disk
Date:
Msg-id: CAA4eK1KgTD5=7xcUrbnuOo2cYTCNHZjaByGf_rUZcWYv5x8uMw@mail.gmail.com
In reply to: Re: Default setting for enable_hashagg_disk  (Peter Geoghegan <pg@bowt.ie>)
List: pgsql-hackers
On Mon, Jul 13, 2020 at 9:50 PM Peter Geoghegan <pg@bowt.ie> wrote:
>
> On Mon, Jul 13, 2020 at 6:13 AM Stephen Frost <sfrost@snowman.net> wrote:
> > Yes, increasing work_mem isn't unusual, at all.
>
> It's unusual as a way of avoiding OOMs!
>
> > Eh?  That's not at all what it looks like- they were getting OOM's
> > because they set work_mem to be higher than the actual amount of memory
> > they had and the Sort before the GroupAgg was actually trying to use all
> > that memory.  The HashAgg ended up not needing that much memory because
> > the aggregated set wasn't actually that large.  If anything, this shows
> > exactly what Jeff's fine work here is (hopefully) going to give us- the
> > option to plan a HashAgg in such cases, since we can accept spilling to
> disk if we end up underestimating, or take advantage of that HashAgg
> > being entirely in memory if we overestimate.
>
> I very specifically said that it wasn't a case where something like
> hash_mem would be expected to make all the difference.
>
> > Having looked back, I'm not sure that I'm really in the minority
> > regarding the proposal to add this at this time either- there's been a
> > few different comments that it's too late for v13 and/or that we should
> > see if we actually end up with users seriously complaining about the
> > lack of a separate way to specify the memory for a given node type,
> > and/or that if we're going to do this then we should have a broader set
> > of options covering other nodes types too, all of which are positions
> > that I agree with.
>
> By proposing to do nothing at all, you are very clearly in a small
> minority. While (for example) I might have debated the details with
> David Rowley a lot recently, and you couldn't exactly say that we're
> in agreement, our two positions are nevertheless relatively close
> together.
>
> AFAICT, the only other person that has argued that we should do
> nothing (have no new GUC) is Bruce, which was a while ago now. (Amit
> said something similar, but has since softened his opinion [1]).
>

To be clear, my vote for PG13 is not to do anything until we have clear
evidence of regressions.  In the email you quoted, I was trying to say
that due to parallelism we might not have the problem for which we are
planning to provide an escape-hatch or hash_mem GUC.  I think the
reason for the delay in reaching agreement is that there is no clear
evidence of the problem (user-reported cases or results from benchmarks
such as TPC-H), unless I have missed something.

Having said that, I understand that we have to reach some conclusion
to close this open item, and if the majority of people are in favor of
the escape-hatch or hash_mem solution then we will have to do one of
those.
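
To make the trade-off concrete, a rough sketch of how this might look
from a user session; the hash_mem name below is only the proposal as
discussed in this thread, not a committed GUC, and the values are
illustrative:

    -- What a user can already do today: keep hash aggregation from
    -- being chosen for a problematic query, at the cost of getting a
    -- sort-based GroupAggregate plan instead.
    SET enable_hashagg = off;

    -- What a hash_mem-style GUC would add: a separate, larger memory
    -- budget for hash-based nodes, so a HashAgg that modestly exceeds
    -- work_mem can stay in memory instead of spilling.
    SET work_mem = '64MB';
    SET hash_mem = '256MB';   -- hypothetical name/form, per this thread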

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


