Re: Default setting for enable_hashagg_disk

From: David Rowley
Subject: Re: Default setting for enable_hashagg_disk
Date:
Msg-id: CAApHDvpD_q4wbSnzQJoKT+19uChxSkkw5t-Q2X--o-K8T2TCWQ@mail.gmail.com
In reply to: Re: Default setting for enable_hashagg_disk  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: Default setting for enable_hashagg_disk  ("David G. Johnston" <david.g.johnston@gmail.com>)
List: pgsql-hackers
On Sat, 11 Jul 2020 at 10:00, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Stephen Frost <sfrost@snowman.net> writes:
> > I don't see hash_mem as being any kind of proper fix- it's just punting
> > to the user saying "we can't figure this out, how about you do it" and,
> > worse, it's in conflict with how we already ask the user that question.
> > Turning it into a multiplier doesn't change that either.
>
> Have you got a better proposal that is reasonably implementable for v13?
> (I do not accept the argument that "do nothing" is a better proposal.)
>
> I agree that hash_mem is a stopgap, whether it's a multiplier or no,
> but at this point it seems difficult to avoid inventing a stopgap.
> Getting rid of the process-global work_mem setting is a research project,
> and one I wouldn't even count on having results from for v14.  In the
> meantime, it seems dead certain that there are applications for which
> the current behavior will be problematic.  hash_mem seems like a cleaner
> and more useful stopgap than the "escape hatch" approach, at least to me.

If we're going to end up going down the route of something like
hash_mem for PG13, wouldn't it be better to have something more like
hashagg_mem that adjusts the memory limit for Hash Aggregate only?
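To illustrate the shape of that knob (hashagg_mem is purely the
hypothetical name above, not an existing GUC, and the table and column
names are made up):

SET work_mem = '64MB';       -- still governs sorts, hash joins, etc.
SET hashagg_mem = '640MB';   -- hypothetical: larger cap for Hash Aggregate only

EXPLAIN (ANALYZE, BUFFERS)
SELECT some_col, count(*)
FROM some_table
GROUP BY some_col;           -- only the Hash Aggregate would get the 640MB limit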

Stephen mentions in [1] that:
> Users who are actually hit by this in a negative way
> have an option- increase work_mem to reflect what was actually happening
> already.

Peter is not a fan of that idea, which can only be because it would
also increase the maximum memory consumption allowed by other nodes in
the plan. My concern is that if we do hash_mem and have that control
the memory allowances for both Hash Joins and Hash Aggs, then that
solution is no better than Stephen's idea whenever the plan contains
only Hash Joins and Hash Aggs.
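To spell that out (hash_mem is still the proposed, non-existent GUC,
and the tables are invented for the example): in a plan whose
memory-hungry nodes are all hash-based, these two approaches impose
exactly the same per-node limits, so hash_mem buys nothing over
Stephen's suggestion.

-- Stephen's suggestion: just raise work_mem
SET work_mem = '256MB';

-- The hash_mem proposal: leave work_mem alone, raise only the hash nodes
SET work_mem = '32MB';
SET hash_mem = '256MB';      -- hypothetical GUC

EXPLAIN
SELECT o.customer_id, count(*)
FROM orders o
JOIN customers c ON c.id = o.customer_id   -- Hash Join: 256MB cap either way
GROUP BY o.customer_id;                     -- Hash Aggregate: 256MB cap either way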

As much as I want to see us give users some reasonable way to get the
same performance they're used to, I'm concerned that giving them
something that works for many of their use cases is not really as good
as giving them something that works in all of their use cases. A user
who has a partitioned table with a good number of partitions and
partition-wise joins enabled might not like it if their Hash Join plan
suddenly consumes hash_mem * nPartitions because they've set hash_mem
to 10x of work_mem for the sake of some other plan that needs that to
maintain PG12's performance in PG13. If that user is unable to adjust
hash_mem because of this, they're not going to be very satisfied that
we added hash_mem to allow their query to perform as well as it did in
PG12. They'll be exposed to the same OOM risk they had in PG12 if they
were to increase hash_mem here.
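To put rough numbers on that scenario (the figures are invented for
illustration, and hash_mem remains the hypothetical GUC):

SET work_mem = '64MB';
SET hash_mem = '640MB';                 -- hypothetical, 10x work_mem
SET enable_partitionwise_join = on;

-- With two tables partitioned into 32 matching partitions, a
-- partition-wise hash join builds a hash table per partition pair, so
-- in the worst case described above that's roughly
--   640MB * 32 = ~20GB of build memory,
-- versus roughly 64MB * 32 = ~2GB under a PG12-style work_mem-only cap.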

David

[1] https://www.postgresql.org/message-id/20200710143415.GJ12375@tamriel.snowman.net


