Centralizing protective copying of utility statements

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Centralizing protective copying of utility statements
Дата
Msg-id 931771.1623893989@sss.pgh.pa.us
обсуждение исходный текст
Ответы Re: Centralizing protective copying of utility statements  (Zhihong Yu <zyu@yugabyte.com>)
Re: Centralizing protective copying of utility statements  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Centralizing protective copying of utility statements  (Andres Freund <andres@anarazel.de>)
Список pgsql-hackers
I haven't pushed my quick-hack fix for bug #17053 ([1]) because
I wasn't really satisfied with band-aiding that problem in one
more place.  I took a look around to see if we could protect
against the whole class of scribble-on-a-utility-statement
issues in a more centralized way.

What I found is that there are only two places that call
ProcessUtility with a statement that might be coming from the plan
cache.  _SPI_execute_plan is always doing so, so it can just
unconditionally copy the statement.  The other one is
PortalRunUtility, which can examine the Portal to see if the
parsetree came out of cache or not.  Having added copyObject
calls there, we can get rid of the retail calls that exist
in not-quite-enough utility statement execution routines.

I think this would have been more complicated before plpgsql
started using the plancache; at least, some of the comments
removed here refer to plpgsql as being an independent hazard.
Also, I didn't risk removing any copyObject calls that are
further down than the top level of statement execution handlers.
Some of those are visibly still necessary, and others are hard
to be sure about.

Although this adds some overhead in the form of copying of
utility node trees that won't actually mutate during execution,
I think that won't be too bad because those trees tend to be
small and hence cheap to copy.  The statements that can have
a lot of substructure usually contain expression trees or the
like, which do have to be copied for safety.  Moreover, we buy
back a lot of cost by removing pointless copying when we're
not executing on a cached plan.

(BTW, in case you are wondering: this hazard only exists for
utility statements, because we long ago made the executor
not modify the Plan tree it's given.)

This is possibly too aggressive to consider for back-patching.
In the back branches, perhaps we should just use my original
localized fix.  Another conservative (but expensive) answer
for the back branches is to add the new copyObject calls
but not remove any of the old ones.

Thoughts?

            regards, tom lane

[1] https://www.postgresql.org/message-id/flat/17053-3ca3f501bbc212b4%40postgresql.org

diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 89a4f8f810..b6eacd5baa 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -438,14 +438,8 @@ BeginCopyTo(ParseState *pstate,
         /*
          * Run parse analysis and rewrite.  Note this also acquires sufficient
          * locks on the source table(s).
-         *
-         * Because the parser and planner tend to scribble on their input, we
-         * make a preliminary copy of the source querytree.  This prevents
-         * problems in the case that the COPY is in a portal or plpgsql
-         * function and is executed repeatedly.  (See also the same hack in
-         * DECLARE CURSOR and PREPARE.)  XXX FIXME someday.
          */
-        rewritten = pg_analyze_and_rewrite(copyObject(raw_query),
+        rewritten = pg_analyze_and_rewrite(raw_query,
                                            pstate->p_sourcetext, NULL, 0,
                                            NULL);

diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index dce882012e..0982851715 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -299,14 +299,8 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
          * rewriter.  We do not do AcquireRewriteLocks: we assume the query
          * either came straight from the parser, or suitable locks were
          * acquired by plancache.c.
-         *
-         * Because the rewriter and planner tend to scribble on the input, we
-         * make a preliminary copy of the source querytree.  This prevents
-         * problems in the case that CTAS is in a portal or plpgsql function
-         * and is executed repeatedly.  (See also the same hack in EXPLAIN and
-         * PREPARE.)
          */
-        rewritten = QueryRewrite(copyObject(query));
+        rewritten = QueryRewrite(query);

         /* SELECT should never rewrite to more or less than one SELECT query */
         if (list_length(rewritten) != 1)
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 9a60865d19..7045f6ad55 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -256,14 +256,8 @@ ExplainQuery(ParseState *pstate, ExplainStmt *stmt,
      * rewriter.  We do not do AcquireRewriteLocks: we assume the query either
      * came straight from the parser, or suitable locks were acquired by
      * plancache.c.
-     *
-     * Because the rewriter and planner tend to scribble on the input, we make
-     * a preliminary copy of the source querytree.  This prevents problems in
-     * the case that the EXPLAIN is in a portal or plpgsql function and is
-     * executed repeatedly.  (See also the same hack in DECLARE CURSOR and
-     * PREPARE.)  XXX FIXME someday.
      */
-    rewritten = QueryRewrite(castNode(Query, copyObject(stmt->query)));
+    rewritten = QueryRewrite(castNode(Query, stmt->query));

     /* emit opening boilerplate */
     ExplainBeginOutput(es);
@@ -441,8 +435,7 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
     {
         /*
          * We have to rewrite the contained SELECT and then pass it back to
-         * ExplainOneQuery.  It's probably not really necessary to copy the
-         * contained parsetree another time, but let's be safe.
+         * ExplainOneQuery.
          */
         CreateTableAsStmt *ctas = (CreateTableAsStmt *) utilityStmt;
         List       *rewritten;
@@ -463,7 +456,7 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
             return;
         }

-        rewritten = QueryRewrite(castNode(Query, copyObject(ctas->query)));
+        rewritten = QueryRewrite(castNode(Query, ctas->query));
         Assert(list_length(rewritten) == 1);
         ExplainOneQuery(linitial_node(Query, rewritten),
                         CURSOR_OPT_PARALLEL_OK, ctas->into, es,
@@ -482,7 +475,7 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
         DeclareCursorStmt *dcs = (DeclareCursorStmt *) utilityStmt;
         List       *rewritten;

-        rewritten = QueryRewrite(castNode(Query, copyObject(dcs->query)));
+        rewritten = QueryRewrite(castNode(Query, dcs->query));
         Assert(list_length(rewritten) == 1);
         ExplainOneQuery(linitial_node(Query, rewritten),
                         dcs->options, NULL, es,
diff --git a/src/backend/commands/policy.c b/src/backend/commands/policy.c
index 5cacc088cd..5d469309ce 100644
--- a/src/backend/commands/policy.c
+++ b/src/backend/commands/policy.c
@@ -747,12 +747,12 @@ CreatePolicy(CreatePolicyStmt *stmt)
     addNSItemToQuery(with_check_pstate, nsitem, false, true, true);

     qual = transformWhereClause(qual_pstate,
-                                copyObject(stmt->qual),
+                                stmt->qual,
                                 EXPR_KIND_POLICY,
                                 "POLICY");

     with_check_qual = transformWhereClause(with_check_pstate,
-                                           copyObject(stmt->with_check),
+                                           stmt->with_check,
                                            EXPR_KIND_POLICY,
                                            "POLICY");

@@ -922,7 +922,7 @@ AlterPolicy(AlterPolicyStmt *stmt)

         addNSItemToQuery(qual_pstate, nsitem, false, true, true);

-        qual = transformWhereClause(qual_pstate, copyObject(stmt->qual),
+        qual = transformWhereClause(qual_pstate, stmt->qual,
                                     EXPR_KIND_POLICY,
                                     "POLICY");

@@ -946,7 +946,7 @@ AlterPolicy(AlterPolicyStmt *stmt)
         addNSItemToQuery(with_check_pstate, nsitem, false, true, true);

         with_check_qual = transformWhereClause(with_check_pstate,
-                                               copyObject(stmt->with_check),
+                                               stmt->with_check,
                                                EXPR_KIND_POLICY,
                                                "POLICY");

diff --git a/src/backend/commands/portalcmds.c b/src/backend/commands/portalcmds.c
index d34cc39fde..3ea30bcbc9 100644
--- a/src/backend/commands/portalcmds.c
+++ b/src/backend/commands/portalcmds.c
@@ -76,14 +76,8 @@ PerformCursorOpen(ParseState *pstate, DeclareCursorStmt *cstmt, ParamListInfo pa
      * rewriter.  We do not do AcquireRewriteLocks: we assume the query either
      * came straight from the parser, or suitable locks were acquired by
      * plancache.c.
-     *
-     * Because the rewriter and planner tend to scribble on the input, we make
-     * a preliminary copy of the source querytree.  This prevents problems in
-     * the case that the DECLARE CURSOR is in a portal or plpgsql function and
-     * is executed repeatedly.  (See also the same hack in EXPLAIN and
-     * PREPARE.)  XXX FIXME someday.
      */
-    rewritten = QueryRewrite((Query *) copyObject(query));
+    rewritten = QueryRewrite(query);

     /* SELECT should never rewrite to more or less than one query */
     if (list_length(rewritten) != 1)
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 65d8ac65b1..5e03c7c5aa 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -78,12 +78,9 @@ PrepareQuery(ParseState *pstate, PrepareStmt *stmt,
     /*
      * Need to wrap the contained statement in a RawStmt node to pass it to
      * parse analysis.
-     *
-     * Because parse analysis scribbles on the raw querytree, we must make a
-     * copy to ensure we don't modify the passed-in tree.  FIXME someday.
      */
     rawstmt = makeNode(RawStmt);
-    rawstmt->stmt = (Node *) copyObject(stmt->query);
+    rawstmt->stmt = stmt->query;
     rawstmt->stmt_location = stmt_location;
     rawstmt->stmt_len = stmt_len;

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 028e8ac46b..4e23c7fce5 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -4408,8 +4408,7 @@ ATPrepCmd(List **wqueue, Relation rel, AlterTableCmd *cmd,
      * Copy the original subcommand for each table.  This avoids conflicts
      * when different child tables need to make different parse
      * transformations (for example, the same column may have different column
-     * numbers in different children).  It also ensures that we don't corrupt
-     * the original parse tree, in case it is saved in plancache.
+     * numbers in different children).
      */
     cmd = copyObject(cmd);

diff --git a/src/backend/commands/view.c b/src/backend/commands/view.c
index f2642dba6c..4df05a0b33 100644
--- a/src/backend/commands/view.c
+++ b/src/backend/commands/view.c
@@ -417,12 +417,9 @@ DefineView(ViewStmt *stmt, const char *queryString,
     /*
      * Run parse analysis to convert the raw parse tree to a Query.  Note this
      * also acquires sufficient locks on the source table(s).
-     *
-     * Since parse analysis scribbles on its input, copy the raw parse tree;
-     * this ensures we don't corrupt a prepared statement, for example.
      */
     rawstmt = makeNode(RawStmt);
-    rawstmt->stmt = (Node *) copyObject(stmt->query);
+    rawstmt->stmt = stmt->query;
     rawstmt->stmt_location = stmt_location;
     rawstmt->stmt_len = stmt_len;

diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index b8bd05e894..70800a649e 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -2532,6 +2532,14 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
                 ProcessUtilityContext context;
                 QueryCompletion qc;

+                /*
+                 * Copy the utility statement out of the plan cache, in case
+                 * its processing scribbles on the statement's node tree.
+                 * (This could be refined if we knew which utility statements
+                 * do that.)
+                 */
+                stmt = copyObject(stmt);
+
                 /*
                  * If the SPI context is atomic, or we were not told to allow
                  * nonatomic operations, tell ProcessUtility this is an atomic
diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c
index c9708e38f4..81d3e7990c 100644
--- a/src/backend/parser/parse_utilcmd.c
+++ b/src/backend/parser/parse_utilcmd.c
@@ -11,10 +11,6 @@
  * Hence these functions are now called at the start of execution of their
  * respective utility commands.
  *
- * NOTE: in general we must avoid scribbling on the passed-in raw parse
- * tree, since it might be in a plan cache.  The simplest solution is
- * a quick copyObject() call before manipulating the query tree.
- *
  *
  * Portions Copyright (c) 1996-2021, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -177,12 +173,6 @@ transformCreateStmt(CreateStmt *stmt, const char *queryString)
     Oid            existing_relid;
     ParseCallbackState pcbstate;

-    /*
-     * We must not scribble on the passed-in CreateStmt, so copy it.  (This is
-     * overkill, but easy.)
-     */
-    stmt = copyObject(stmt);
-
     /* Set up pstate */
     pstate = make_parsestate(NULL);
     pstate->p_sourcetext = queryString;
@@ -2824,12 +2814,6 @@ transformIndexStmt(Oid relid, IndexStmt *stmt, const char *queryString)
     if (stmt->transformed)
         return stmt;

-    /*
-     * We must not scribble on the passed-in IndexStmt, so copy it.  (This is
-     * overkill, but easy.)
-     */
-    stmt = copyObject(stmt);
-
     /* Set up pstate */
     pstate = make_parsestate(NULL);
     pstate->p_sourcetext = queryString;
@@ -2925,12 +2909,6 @@ transformStatsStmt(Oid relid, CreateStatsStmt *stmt, const char *queryString)
     if (stmt->transformed)
         return stmt;

-    /*
-     * We must not scribble on the passed-in CreateStatsStmt, so copy it.
-     * (This is overkill, but easy.)
-     */
-    stmt = copyObject(stmt);
-
     /* Set up pstate */
     pstate = make_parsestate(NULL);
     pstate->p_sourcetext = queryString;
@@ -2993,9 +2971,6 @@ transformStatsStmt(Oid relid, CreateStatsStmt *stmt, const char *queryString)
  *
  * actions and whereClause are output parameters that receive the
  * transformed results.
- *
- * Note that we must not scribble on the passed-in RuleStmt, so we do
- * copyObject() on the actions and WHERE clause.
  */
 void
 transformRuleStmt(RuleStmt *stmt, const char *queryString,
@@ -3070,7 +3045,7 @@ transformRuleStmt(RuleStmt *stmt, const char *queryString,

     /* take care of the where clause */
     *whereClause = transformWhereClause(pstate,
-                                        (Node *) copyObject(stmt->whereClause),
+                                        stmt->whereClause,
                                         EXPR_KIND_WHERE,
                                         "WHERE");
     /* we have to fix its collations too */
@@ -3142,8 +3117,7 @@ transformRuleStmt(RuleStmt *stmt, const char *queryString,
             addNSItemToQuery(sub_pstate, newnsitem, false, true, false);

             /* Transform the rule action statement */
-            top_subqry = transformStmt(sub_pstate,
-                                       (Node *) copyObject(action));
+            top_subqry = transformStmt(sub_pstate, action);

             /*
              * We cannot support utility-statement actions (eg NOTIFY) with
@@ -3325,12 +3299,6 @@ transformAlterTableStmt(Oid relid, AlterTableStmt *stmt,
     AlterTableCmd *newcmd;
     ParseNamespaceItem *nsitem;

-    /*
-     * We must not scribble on the passed-in AlterTableStmt, so copy it. (This
-     * is overkill, but easy.)
-     */
-    stmt = copyObject(stmt);
-
     /* Caller is responsible for locking the relation */
     rel = relation_open(relid, NoLock);
     tupdesc = RelationGetDescr(rel);
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index f7f08e6c05..4e02e1821c 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -1144,6 +1144,15 @@ PortalRunUtility(Portal portal, PlannedStmt *pstmt,
     else
         portal->portalSnapshot = NULL;

+    /*
+     * If the Portal's statement list belongs to a cached plan, we must not
+     * damage the original statement node tree.  Processing of the statement
+     * might scribble on the node tree, so copy the statement.  (This could be
+     * refined if we knew which utility statements do that.)
+     */
+    if (portal->cplan)
+        pstmt = copyObject(pstmt);
+
     ProcessUtility(pstmt,
                    portal->sourceText,
                    isTopLevel ? PROCESS_UTILITY_TOPLEVEL : PROCESS_UTILITY_QUERY,
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 1a8fc16773..81f34dfb69 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -485,6 +485,10 @@ CheckRestrictedOperation(const char *cmdname)
  *    qc: where to store command completion status data.  May be NULL,
  *        but if not, then caller must have initialized it.
  *
+ * Some utility statement execution functions will scribble on the passed
+ * "pstmt" node tree.  It is caller's responsibility that that tree not be
+ * something that must be preserved.
+ *
  * Caller MUST supply a queryString; it is not allowed (anymore) to pass NULL.
  * If you really don't have source text, you can pass a constant string,
  * perhaps "(query not available)".

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Thomas Munro
Дата:
Сообщение: Re: A qsort template
Следующее
От: vignesh C
Дата:
Сообщение: Re: locking [user] catalog tables vs 2pc vs logical rep