Re: Default setting for enable_hashagg_disk

Поиск
Список
Период
Сортировка
От Justin Pryzby
Тема Re: Default setting for enable_hashagg_disk
Дата
Msg-id 20200624123843.GX4107@telsasoft.com
обсуждение исходный текст
Ответ на Re: Default setting for enable_hashagg_disk  (Bruce Momjian <bruce@momjian.us>)
Ответы Re: Default setting for enable_hashagg_disk  (Bruce Momjian <bruce@momjian.us>)
Re: Default setting for enable_hashagg_disk  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
On Wed, Jun 24, 2020 at 05:06:28AM -0400, Bruce Momjian wrote:
> On Wed, Jun 24, 2020 at 02:11:57PM +1200, David Rowley wrote:
> > On Tue, 23 Jun 2020 at 08:24, Jeff Davis <pgsql@j-davis.com> wrote:
> > > Another way of looking at it is that the weird behavior is already
> > > there in v12, so there are already users relying on this weird behavior
> > > as a crutch for some other planner mistake. The question is whether we
> > > want to:
> > >
> > > (a) take the weird behavior away now as a consequence of implementing
> > > disk-based HashAgg; or
> > > (b) support the weird behavior forever; or
> > > (c) introduce a GUC now to help transition away from the weird behavior
> > >
> > > The danger with (c) is that it gives users more time to become more
> > > reliant on the weird behavior; and worse, a GUC could be seen as an
> > > endorsement of the weird behavior rather than a path to eliminating it.
> > > So we could intend to do (c) and end up with (b). We can mitigate this
> > > with documentation warnings, perhaps.
> > 
> > So, I have a few thoughts on this subject. I understand both problem
> > cases have been mentioned before on this thread, but just to reiterate
> > the two problem cases that we really would rather people didn't hit.
> 
> I appreciated this summary since I wasn't fully following the issues.
> 
> > As for GUCs to try to help the group of users who, *I'm certain*, will
> > have problems with PG13's plan choice. I think the overloaded
> > enable_hashagg option is a really nice compromise.   We don't really
> > have any other executor node type that has multiple GUCs controlling
> > its behaviour, so I believe it would be nice to keep it that way.
...
> It would seem merge join has almost the same complexities as the new
> hash join code, since it can spill to disk doing sorts for merge joins,
> and adjusting work_mem is the only way to control that spill to disk.  I
> don't remember anyone complaining about spills to disk during merge
> join, so I am unclear why we would need a such control for hash join.

It loooks like merge join was new in 8.3.  I don't think that's a good analogy,
since the old behavior was still available with enable_mergejoin=off.

I think a better analogy would be if we now changed sort nodes beneath merge
join to use at most 0.5*work_mem, with no way of going back to using
1.0*work_mem.

-- 
Justin



В списке pgsql-hackers по дате отправления:

Предыдущее
От: David Rowley
Дата:
Сообщение: Re: Default setting for enable_hashagg_disk
Следующее
От: ROS Didier
Дата:
Сообщение: RE: PostgreSQL and big data - FDW