Discussion: Add mention of execution time memory for enable_partitionwise_* GUCs
Over on [1], there's a complaint about a query OOMing because the use of
enable_partitionwise_aggregate caused a plan with 1000 Hash Aggregate nodes.

The only mention in the docs is of the additional memory and CPU required for
query planning when that GUC is enabled. There's no mention that execution
could use up to work_mem * nparts more memory. I think that's bad and we
should fix it.

I've attached my proposal to fix that.

David

[1] https://postgr.es/m/3603c380-d094-136e-e333-610914fb3e80%40gmx.net
Attachments
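For readers unfamiliar with the failure mode, here is a minimal sketch of how it can arise (the table, column names, and partition count are hypothetical, chosen only to mirror the report in [1]):

```sql
-- Hypothetical hash-partitioned table with many partitions.
CREATE TABLE measurements (sensor_id int, val numeric)
    PARTITION BY HASH (sensor_id);
CREATE TABLE measurements_p0 PARTITION OF measurements
    FOR VALUES WITH (MODULUS 1000, REMAINDER 0);
-- ... and likewise for remainders 1 through 999.

SET enable_partitionwise_aggregate = on;
EXPLAIN SELECT sensor_id, avg(val)
FROM measurements
GROUP BY sensor_id;
-- The plan now contains one aggregate node per partition.  Each hash
-- table is entitled to grow to roughly work_mem (times
-- hash_mem_multiplier for hash-based nodes), so peak execution memory
-- can approach 1000 * work_mem rather than a single work_mem.
```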
On Thu, Jul 18, 2024 at 4:03 AM David Rowley <dgrowleyml@gmail.com> wrote:
>
> Over on [1], there's a complaint about a query OOMing because the use
> of enable_partitionwise_aggregate caused a plan with 1000 Hash
> Aggregate nodes.
>
> The only mention in the docs is the additional memory requirements and
> CPU for query planning when that GUC is enabled. There's no mention
> that execution could use work_mem * nparts more memory to be used. I
> think that's bad and we should fix it.
>
> I've attached my proposal to fix that.

If those GUCs are enabled, the planner consumes a large amount of memory and
also takes longer, irrespective of whether a partitionwise plan is used or
not. That's why the default is false. If the majority of those joins use
nested loops, or use index scans instead of sorting, memory consumption won't
be as large. Saying that it "can" result in a large increase in execution
memory is not accurate. But I agree that we need to mention the effect of
work_mem on partitionwise join/aggregation.

I had an offlist email exchange with Dimitrios where I suggested that we
should mention this in the work_mem description, i.e. change "Note that a
complex query might perform several sort and hash operations" to "Note that a
complex query or a query using partitionwise aggregates or joins might
perform several sort and hash operations". And in the description of the
enable_partitionwise_* GUCs, mention that "Each partitionwise join or
aggregation which performs sorting/hashing may consume work_mem worth of
memory, increasing the total memory consumed during query execution."

--
Best Wishes,
Ashutosh Bapat
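The per-operation behaviour of work_mem that the proposed wording refers to can be sketched as follows (the table names here are made up for illustration and do not come from the patch):

```sql
SET work_mem = '64MB';
-- A single plan may contain several sort and hash operations, each of
-- which may use up to work_mem concurrently: the hash join plus the
-- sorts below could together use a small multiple of work_mem.  With
-- the enable_partitionwise_* GUCs on, that count is further multiplied
-- by the number of partitions taking part in the join or aggregation.
EXPLAIN SELECT a.k, count(*)
FROM big_a a JOIN big_b b USING (k)
GROUP BY a.k
ORDER BY count(*) DESC;
```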
On Thu, 18 Jul 2024 at 21:24, Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
> If those GUCs are enabled, the planner consumes large amount of memory
> and also takes longer irrespective of whether partitionwise plan is
> used or not. That's why the default is false. If majority of those
> joins use nested loop memory, or use index scans instead sorting,
> memory consumption won't be as large. Saying that it "can" result in
> large increase in execution memory is not accurate. But I agree that
> we need to mention the effect of work_mem on partitionwise
> join/aggregation.

hmm? Please tell me what word other than "can" best describes something that
is possible to happen but does not always happen under all circumstances.

David
On Thu, Jul 18, 2024 at 3:33 PM David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Thu, 18 Jul 2024 at 21:24, Ashutosh Bapat
> <ashutosh.bapat.oss@gmail.com> wrote:
> > If those GUCs are enabled, the planner consumes large amount of memory
> > and also takes longer irrespective of whether partitionwise plan is
> > used or not. That's why the default is false. If majority of those
> > joins use nested loop memory, or use index scans instead sorting,
> > memory consumption won't be as large. Saying that it "can" result in
> > large increase in execution memory is not accurate. But I agree that
> > we need to mention the effect of work_mem on partitionwise
> > join/aggregation.
>
> hmm? please tell me what word other than "can" best describes
> something that is possible to happen but does not always happen under
> all circumstances.

May I suggest "may"? :) [1], [2], [3].

My point is, we need to highlight the role of work_mem. So modify both the
descriptions.

[1] https://www.thesaurus.com/e/grammar/can-vs-may/
[2] https://www.britannica.com/dictionary/eb/qa/modal-verbs-may-might-can-could-and-ought
[3] https://www.merriam-webster.com/grammar/when-to-use-can-and-may

--
Best Wishes,
Ashutosh Bapat
On Thu, 18 Jul 2024 at 22:28, Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
>
> On Thu, Jul 18, 2024 at 3:33 PM David Rowley <dgrowleyml@gmail.com> wrote:
> > hmm? please tell me what word other than "can" best describes
> > something that is possible to happen but does not always happen under
> > all circumstances.
>
> May I suggest "may"? :) [1], [2], [3].

Is this a wind-up?

If it's not, I disagree that "may" is a better choice. The possibility
example in your first link says "It may rain tomorrow. (possibility)", but
that's something someone would only say if there were some evidence to
support it, e.g. ominous clouds on the horizon at dusk, or a >0% chance of
precipitation in the weather forecast. Nobody is going to say that unless
there's some supporting evidence. For the executor using work_mem * nparts,
we've no evidence either. It's just a >0% possibility with no supporting
evidence.

> My point is, we need to highlight the role of work_mem. So modify both
> the descriptions.

I considered writing about work_mem, but felt I wanted to keep it as brief as
possible and just have some words that might make someone think twice. The
details in the work_mem documentation should inform the reader that work_mem
is per executor node. It likely wouldn't hurt to have more documentation
around which executor node types can use work_mem, which use work_mem *
hash_mem_multiplier, and which use neither. We tend not to write too much
about executor nodes in the documentation, so I'm not proposing that for this
patch.

David

[1] https://www.thesaurus.com/e/grammar/can-vs-may/
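As a hedged illustration of the distinction drawn above (this is not part of the proposed patch): sort-based nodes are limited to work_mem, hash-based nodes to work_mem * hash_mem_multiplier, and plain scans by neither:

```sql
SET work_mem = '64MB';
SET hash_mem_multiplier = 2.0;  -- the default in recent PostgreSQL releases
-- A Sort node may use up to 64MB before spilling to disk.
-- A Hash or HashAggregate node may use up to 64MB * 2.0 = 128MB.
-- Sequential and index scans are not bounded by either setting.
```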
On Thu, Jul 18, 2024 at 4:24 PM David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Thu, 18 Jul 2024 at 22:28, Ashutosh Bapat
> <ashutosh.bapat.oss@gmail.com> wrote:
> >
> > On Thu, Jul 18, 2024 at 3:33 PM David Rowley <dgrowleyml@gmail.com> wrote:
> > > hmm? please tell me what word other than "can" best describes
> > > something that is possible to happen but does not always happen under
> > > all circumstances.
> >
> > May I suggest "may"? :) [1], [2], [3].
>
> Is this a wind-up?
>
> If it's not, I disagree that "may" is a better choice.

I am not a native English speaker and might have made a mistake when
interpreting the definitions. Will leave that aside.

> > My point is, we need to highlight the role of work_mem. So modify both
> > the descriptions.
>
> I considered writing about work_mem, but felt I wanted to keep it as
> brief as possible and just have some words that might make someone
> think twice. The details in the work_mem documentation should inform
> the reader that work_mem is per executor node. It likely wouldn't
> hurt to have more documentation around which executor node types can
> use a work_mem, which use work_mem * hash_mem_multiplier and which use
> neither. We tend to not write too much about executor nodes in the
> documents, so I'm not proposing that for this patch.

Something I didn't write in my first reply but wanted to discuss was the
intention behind adding those GUCs. Sorry for missing it in my first email.
According to [1], these GUCs were added because of the increased memory
consumption and time taken during query planning. The execution time memory
consumption issue was known even back then, but that was not the reason the
GUCs were defaulted to off. Your patch, however, proposes to change the
narrative. In the same thread [1], you will find a discussion about turning
the default to on once we have fixed the planner's memory and time
consumption. We have patches addressing those issues [2] [3]. With your
proposed changes we will almost never have a chance to turn those GUCs on by
default. That seems a rather sad prospect.

I am fine if we want to mention that the executor may consume a large amount
of memory when these GUCs are turned on. Users may decide to turn them off if
they cannot afford to spend that much memory during execution. But I don't
like tying execution time memory consumption to the default being off.

[1] https://www.postgresql.org/message-id/CA+TgmoYggDp6k-HXNAgrykZh79w6nv2FevpYR_jeMbrORDaQrA@mail.gmail.com
[2] https://www.postgresql.org/message-id/CAExHW5stmOUobE55pMt83r8UxvfCph%2BPvo5dNpdrVCsBgXEzDQ%40mail.gmail.com
[3] https://www.postgresql.org/message-id/flat/CAJ2pMkZNCgoUKSE+_5LthD+KbXKvq6h2hQN8Esxpxd+cxmgomg@mail.gmail.com

--
Best Wishes,
Ashutosh Bapat
On Fri, 19 Jul 2024 at 17:24, Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
> According to [1] these GUCs were added because of increased
> memory consumption during planning and time taken to plan the query.
> The execution time memory consumption issue was known even back then
> but the GUC was not set to default because of that. But your patch
> proposes to change the narrative. In the same thread [1], you will
> find the discussion about turning the default to ON once we have fixed
> planner's memory and time consumption. We have patches addressing
> those issues [2] [3]. With your proposed changes we will almost never
> have a chance to turn those GUCs ON by default. That seems a rather
> sad prospect.

Sad prospect for whom? If the day comes and we're considering enabling these
GUCs by default, I think we'll likely also be considering whether it'll be
sad for users who get an increased likelihood of OOM kills because the chosen
plan uses $nparts times more memory to execute than the old plan. Can you
honestly say that you have no concern about increased executor memory
consumption if we switched on these GUCs by default? I was certainly
concerned about it in [5], so I dropped that patch after realising what could
happen.

> I am fine if we want to mention that the executor may consume a large
> amount of memory when these GUCs are turned ON. Users may decide to
> turn those OFF if they can not afford to spend that much memory during
> execution. But I don't like tying execution time consumption with
> default OFF.

If you were to fix the planner issues then this text would need to be
revisited anyway. However, I seriously question your judgment that fixing
those alone would be enough to allow us to switch these on by default. It
seems unlikely that the planner would use anything close to a realistic
work_mem setting in extra memory per partition, but that's what the executor
can do, given enough data per partition.
I'd say any analysis that found planner memory and time to be the only
reasons to turn these GUCs off by default was incomplete. Maybe it was a case
of stopping looking for reasons once enough had been found to make the
choice. If not, then they found the molehill and failed to notice the
mountain.

David

[4] https://postgr.es/m/3603c380-d094-136e-e333-610914fb3e80%40gmx.net
[5] https://postgr.es/m/CAApHDvojKdBR3MR59JXmaCYbyHB6Q_5qPRU+dy93En8wm+XiDA@mail.gmail.com
Re: Add mention of execution time memory for enable_partitionwise_* GUCs
From: Dimitrios Apostolou
Thank you for the patch improving the docs; I think it's a clear improvement
over before.

On Thu, 18 Jul 2024, David Rowley wrote:
> I considered writing about work_mem, but felt I wanted to keep it as
> brief as possible and just have some words that might make someone
> think twice. The details in the work_mem documentation should inform
> the reader that work_mem is per executor node. It likely wouldn't
> hurt to have more documentation around which executor node types can
> use a work_mem, which use work_mem * hash_mem_multiplier and which use
> neither. We tend to not write too much about executor nodes in the
> documents, so I'm not proposing that for this patch.

This is the only part I think is missing, since we now know (measurements in
[1], reproducible scenario in [2]) that the number of partitions plays an
important role in sizing the RAM of the server. It's just too big not to
mention that the worst case will be n_partitions * work_mem.

[1] https://www.postgresql.org/message-id/d26e67d3-74bc-60aa-bf24-2a8fb83efe9c%40gmx.net
[2] https://www.postgresql.org/message-id/af6ed790-a5fe-19aa-1141-927595604c01%40gmx.net

I would also like to add an entry about this issue, with links to the above
pages, to the TODO page at [3], as this is the only bugtracker I'm aware of.
Am I doing it right by bringing it up for approval on this list?

[3] https://wiki.postgresql.org/wiki/Todo

Thanks,
Dimitris
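A rough way to put a number on that n_partitions * work_mem worst case might be a query along these lines (a sketch only; 'measurements' stands in for the actual partitioned table, and the true peak also depends on hash_mem_multiplier and on how many nodes run concurrently):

```sql
-- Estimate worst-case sort/hash memory for a partitionwise plan:
-- one work_mem per leaf partition.
SELECT count(*) AS n_partitions,
       current_setting('work_mem') AS work_mem,
       pg_size_pretty(count(*) *
           pg_size_bytes(current_setting('work_mem'))) AS worst_case
FROM pg_partition_tree('measurements')
WHERE isleaf;
```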