Обсуждение: WITH NOT MATERIALIZED and DML CTEs
Currently, WITH a AS NOT MATERIALIZED (INSERT ...) would silently disregard the "NOT MATERIALIZED" instruction and execute the data- modifying CTE to completion (as per the long-standing DML CTE rule). This seems like an omission to me. Ideally, the presence of an explicit "NOT MATERIALIZED" clause on a data-modifying CTE should disable the "run to completion" logic. It is understandably late in the 12 cycle, so maybe prohibit NOT MATERIALIZED with DML altogheter and revisit this in 13? Thoughts? Elvis
Hi, On 2019-06-03 11:45:51 -0400, Elvis Pranskevichus wrote: > Currently, WITH a AS NOT MATERIALIZED (INSERT ...) would silently > disregard the "NOT MATERIALIZED" instruction and execute the data- > modifying CTE to completion (as per the long-standing DML CTE rule). > > This seems like an omission to me. Ideally, the presence of an explicit > "NOT MATERIALIZED" clause on a data-modifying CTE should disable the > "run to completion" logic. I don't see us ever doing that. The result of minor costing and other planner changes would yield different updated data. That'll just create endless bug reports. > It is understandably late in the 12 cycle, so maybe prohibit NOT > MATERIALIZED with DML altogheter and revisit this in 13? I could see us adding an error, or just continuing to silently ignore it. Greetings, Andres Freund
On Monday, June 3, 2019 11:50:15 A.M. EDT Andres Freund wrote: > > This seems like an omission to me. Ideally, the presence of an > > explicit "NOT MATERIALIZED" clause on a data-modifying CTE should > > disable the "run to completion" logic. > > I don't see us ever doing that. The result of minor costing and other > planner changes would yield different updated data. That'll just > create endless bug reports. I understand why the rule exists in the first place, but I think that an explicit opt-in signals the assumption of responsibility and opens the possibility of using this in a well-defined evaluation context, such as CASE WHEN. Elvis
Elvis Pranskevichus <elprans@gmail.com> writes: > On Monday, June 3, 2019 11:50:15 A.M. EDT Andres Freund wrote: >>> This seems like an omission to me. Ideally, the presence of an >>> explicit "NOT MATERIALIZED" clause on a data-modifying CTE should >>> disable the "run to completion" logic. >> I don't see us ever doing that. The result of minor costing and other >> planner changes would yield different updated data. That'll just >> create endless bug reports. > I understand why the rule exists in the first place, but I think that an > explicit opt-in signals the assumption of responsibility and opens the > possibility of using this in a well-defined evaluation context, such as > CASE WHEN. TBH, if you think it's well-defined, you're wrong. I concur with Andres that turning off run-to-completion for DMLs would be disastrous. For just one obvious point, what about firing AFTER triggers? It's already the case that the planner will silently ignore NOT MATERIALIZED for other cases where it can't inline the CTE for semantic or implementation reasons -- see comments in SS_process_ctes(). I see no good reason to treat the DML exception much differently from other exceptions, such as presence of volatile functions or recursion. regards, tom lane
On Monday, June 3, 2019 12:09:46 P.M. EDT Tom Lane wrote: > > I understand why the rule exists in the first place, but I think > > that an explicit opt-in signals the assumption of responsibility > > and opens the possibility of using this in a well-defined > > evaluation context, such as CASE WHEN. > > TBH, if you think it's well-defined, you're wrong. The documentation seems to strongly suggest otherwise: "When it is essential to force evaluation order, a CASE construct (see Section 9.17) can be used. ... CASE construct used in this fashion will defeat optimization attempts" Are there cases where this is not true outside of the documented exceptions (i.e. immutable early-eval and aggregates)? Elvis
Elvis Pranskevichus <elprans@gmail.com> writes: > On Monday, June 3, 2019 12:09:46 P.M. EDT Tom Lane wrote: >>> I understand why the rule exists in the first place, but I think >>> that an explicit opt-in signals the assumption of responsibility >>> and opens the possibility of using this in a well-defined >>> evaluation context, such as CASE WHEN. >> TBH, if you think it's well-defined, you're wrong. > The documentation seems to strongly suggest otherwise: > "When it is essential to force evaluation order, a CASE construct (see > Section 9.17) can be used. ... CASE construct used in this fashion will > defeat optimization attempts" > Are there cases where this is not true outside of the documented > exceptions (i.e. immutable early-eval and aggregates)? CASE is a scalar-expression construct. It's got little to do with the timing of scan/join operations such as row fetches. We offer users essentially no control over when those happen ... other than the guarantees about CTE materialization, which are exactly what you say you want to give up. regards, tom lane
On Monday, June 3, 2019 1:03:44 P.M. EDT Tom Lane wrote: > CASE is a scalar-expression construct. It's got little to do with > the timing of scan/join operations such as row fetches. We offer > users essentially no control over when those happen ... other than > the guarantees about CTE materialization, which are exactly what > you say you want to give up. In the general case, yes, but I *can* use a scalar-returning INSERT CTE in a THEN clause as a subquery. Useful for a conditional INSERT, when you can't use ON CONFLICT. Anyway, I understand that the complications are probably not worth it. Thanks, Elvis
On Mon, Jun 03, 2019 at 11:45:51AM -0400, Elvis Pranskevichus wrote: > Currently, WITH a AS NOT MATERIALIZED (INSERT ...) would silently > disregard the "NOT MATERIALIZED" instruction and execute the data- > modifying CTE to completion (as per the long-standing DML CTE rule). > > This seems like an omission to me. Ideally, the presence of an explicit > "NOT MATERIALIZED" clause on a data-modifying CTE should disable the > "run to completion" logic. It might be worth documenting the fact that NOT MATERIALIZED doesn't affect DML CTEs, just as it doesn't affect statements with volatile functions and recursive CTEs. Best, David. -- David Fetter <david(at)fetter(dot)org> http://fetter.org/ Phone: +1 415 235 3778 Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
David Fetter <david@fetter.org> writes: > It might be worth documenting the fact that NOT MATERIALIZED doesn't > affect DML CTEs, just as it doesn't affect statements with volatile > functions and recursive CTEs. We already do: However, if a WITH query is non-recursive and side-effect-free (that is, it is a SELECT containing no volatile functions) then it can be folded into the parent query, allowing joint optimization of the two query levels. By default, this happens if the parent query references the WITH query just once, but not if it references the WITH query more than once. You can override that decision by specifying MATERIALIZED to force separate calculation of the WITH query, or by specifying NOT MATERIALIZED to force it to be merged into the parent query. The latter choice risks duplicate computation of the WITH query, but it can still give a net savings if each usage of the WITH query needs only a small part of the WITH query's full output. regards, tom lane
On Mon, Jun 03, 2019 at 07:33:35PM -0400, Tom Lane wrote: > David Fetter <david@fetter.org> writes: > > It might be worth documenting the fact that NOT MATERIALIZED doesn't > > affect DML CTEs, just as it doesn't affect statements with volatile > > functions and recursive CTEs. > > We already do: > > However, if a WITH query is non-recursive and side-effect-free (that > is, it is a SELECT containing no volatile functions) then it can be I guess this part makes it pretty clear that DML isn't part of the party just yet. Best, David. -- David Fetter <david(at)fetter(dot)org> http://fetter.org/ Phone: +1 415 235 3778 Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On 2019-Jun-03, Andres Freund wrote: > On 2019-06-03 11:45:51 -0400, Elvis Pranskevichus wrote: > > It is understandably late in the 12 cycle, so maybe prohibit NOT > > MATERIALIZED with DML altogheter and revisit this in 13? > > I could see us adding an error, or just continuing to silently ignore > it. Hmm, shouldn't we be throwing an error for that case? I'm not sure it's defensible that we don't. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services