Re: POC: converting Lists into arrays

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: POC: converting Lists into arrays
Дата
Msg-id 17220.1565301350@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: POC: converting Lists into arrays  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: POC: converting Lists into arrays  (David Rowley <david.rowley@2ndquadrant.com>)
Список pgsql-hackers
I wrote:
> BTW, further on the subject of performance --- I'm aware of at least
> these topics for follow-on patches:

> * Fix places that are maintaining arrays parallel to Lists for
> access-speed reasons (at least simple_rte_array, append_rel_array,
> es_range_table_array).

Attached is a patch that removes simple_rte_array in favor of just
accessing the query's rtable directly.  I concluded that there was
not much point in messing with simple_rel_array or append_rel_array,
because they are not in fact just mirrors of some List.  There's no
List at all of baserel RelOptInfos, and while we do have a list of
AppendRelInfos, it's a compact, random-order list not one indexable
by child relid.

Having done this, though, I'm a bit discouraged about whether to commit
it.  In light testing, it's not any faster than HEAD and in complex
queries seems to actually be a bit slower.  I suspect the reason is
that we've effectively replaced
    root->simple_rte_array[i]
with
    root->parse->rtable->elements[i-1]
and the two extra levels of indirection are taking their toll.

It'd be possible to get rid of one of those indirections by maintaining a
copy of root->parse->rtable directly in PlannerInfo; but that throws away
most of the intellectual appeal of not having two sources of truth to
maintain, and it won't completely close the performance gap.

Other minor objections include:

* Many RTE accesses now look randomly different from adjacent 
RelOptInfo accesses.

* The inheritance-expansion code is a bit sloppy about how much it will
expand these arrays, which means it's possible in corner cases for
length(parse->rtable) to be less than root->simple_rel_array_size-1.
This resulted in a crash in add_other_rels_to_query, which was assuming
it could fetch a possibly-null RTE pointer from indexes up to
simple_rel_array_size-1.  While that wasn't hard to fix, I wonder
whether any third-party code has similar assumptions.

So on the whole, I'm inclined not to do this.  There are some cosmetic
bits of this patch that I do want to commit though: I found some
out-of-date comments, and I realized that it's pretty dumb not to
just merge setup_append_rel_array into setup_simple_rel_arrays.
The division of labor there is inconsistent with the fact that
there's no such division in expand_planner_arrays.

I still have hopes for getting rid of es_range_table_array though,
and will look at that tomorrow or so.

            regards, tom lane

diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 06a2058..8bd1c47 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -2198,7 +2198,7 @@ postgresPlanDirectModify(PlannerInfo *root,
     }
     else
         foreignrel = root->simple_rel_array[resultRelation];
-    rte = root->simple_rte_array[resultRelation];
+    rte = planner_rt_fetch(resultRelation, root);
     fpinfo = (PgFdwRelationInfo *) foreignrel->fdw_private;

     /*
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index e9ee32b..fe7f8b1 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -307,7 +307,7 @@ set_base_rel_sizes(PlannerInfo *root)
         if (rel->reloptkind != RELOPT_BASEREL)
             continue;

-        rte = root->simple_rte_array[rti];
+        rte = planner_rt_fetch(rti, root);

         /*
          * If parallelism is allowable for this query in general, see whether
@@ -349,7 +349,7 @@ set_base_rel_pathlists(PlannerInfo *root)
         if (rel->reloptkind != RELOPT_BASEREL)
             continue;

-        set_rel_pathlist(root, rel, rti, root->simple_rte_array[rti]);
+        set_rel_pathlist(root, rel, rti, planner_rt_fetch(rti, root));
     }
 }

@@ -1008,7 +1008,7 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
             continue;

         childRTindex = appinfo->child_relid;
-        childRTE = root->simple_rte_array[childRTindex];
+        childRTE = planner_rt_fetch(childRTindex, root);

         /*
          * The child rel's RelOptInfo was already created during
@@ -1239,7 +1239,7 @@ set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,

         /* Re-locate the child RTE and RelOptInfo */
         childRTindex = appinfo->child_relid;
-        childRTE = root->simple_rte_array[childRTindex];
+        childRTE = planner_rt_fetch(childRTindex, root);
         childrel = root->simple_rel_array[childRTindex];

         /*
@@ -3742,9 +3742,8 @@ print_relids(PlannerInfo *root, Relids relids)
     {
         if (!first)
             printf(" ");
-        if (x < root->simple_rel_array_size &&
-            root->simple_rte_array[x])
-            printf("%s", root->simple_rte_array[x]->eref->aliasname);
+        if (x <= list_length(root->parse->rtable))
+            printf("%s", planner_rt_fetch(x, root)->eref->aliasname);
         else
             printf("%d", x);
         first = false;
diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c
index d19ff41..2576439 100644
--- a/src/backend/optimizer/plan/analyzejoins.c
+++ b/src/backend/optimizer/plan/analyzejoins.c
@@ -609,7 +609,7 @@ rel_supports_distinctness(PlannerInfo *root, RelOptInfo *rel)
     }
     else if (rel->rtekind == RTE_SUBQUERY)
     {
-        Query       *subquery = root->simple_rte_array[rel->relid]->subquery;
+        Query       *subquery = planner_rt_fetch(rel->relid, root)->subquery;

         /* Check if the subquery has any qualities that support distinctness */
         if (query_supports_distinctness(subquery))
@@ -660,7 +660,7 @@ rel_is_distinct_for(PlannerInfo *root, RelOptInfo *rel, List *clause_list)
     else if (rel->rtekind == RTE_SUBQUERY)
     {
         Index        relid = rel->relid;
-        Query       *subquery = root->simple_rte_array[relid]->subquery;
+        Query       *subquery = planner_rt_fetch(relid, root)->subquery;
         List       *colnos = NIL;
         List       *opids = NIL;
         ListCell   *l;
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index f232569..5689330 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -4472,7 +4472,7 @@ create_hashjoin_plan(PlannerInfo *root,
             Var           *var = (Var *) node;
             RangeTblEntry *rte;

-            rte = root->simple_rte_array[var->varno];
+            rte = planner_rt_fetch(var->varno, root);
             if (rte->rtekind == RTE_RELATION)
             {
                 skewTable = rte->relid;
diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index 73da0c2..4a2aaa2 100644
--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -148,7 +148,7 @@ add_other_rels_to_query(PlannerInfo *root)
     for (rti = 1; rti < root->simple_rel_array_size; rti++)
     {
         RelOptInfo *rel = root->simple_rel_array[rti];
-        RangeTblEntry *rte = root->simple_rte_array[rti];
+        RangeTblEntry *rte;

         /* there may be empty slots corresponding to non-baserel RTEs */
         if (rel == NULL)
@@ -159,6 +159,7 @@ add_other_rels_to_query(PlannerInfo *root)
             continue;

         /* If it's marked as inheritable, look for children. */
+        rte = planner_rt_fetch(rti, root);
         if (rte->inh)
             expand_inherited_rtentry(root, rel, rte, rti);
     }
@@ -351,7 +352,7 @@ find_lateral_references(PlannerInfo *root)
 static void
 extract_lateral_references(PlannerInfo *root, RelOptInfo *brel, Index rtindex)
 {
-    RangeTblEntry *rte = root->simple_rte_array[rtindex];
+    RangeTblEntry *rte = planner_rt_fetch(rtindex, root);
     List       *vars;
     List       *newvars;
     Relids        where_needed;
@@ -1086,7 +1087,7 @@ process_security_barrier_quals(PlannerInfo *root,
                                int rti, Relids qualscope,
                                bool below_outer_join)
 {
-    RangeTblEntry *rte = root->simple_rte_array[rti];
+    RangeTblEntry *rte = planner_rt_fetch(rti, root);
     Index        security_level = 0;
     ListCell   *lc;

diff --git a/src/backend/optimizer/plan/planmain.c b/src/backend/optimizer/plan/planmain.c
index df3f8c2..3398bde 100644
--- a/src/backend/optimizer/plan/planmain.c
+++ b/src/backend/optimizer/plan/planmain.c
@@ -79,9 +79,7 @@ query_planner(PlannerInfo *root,
     root->initial_rels = NIL;

     /*
-     * Make a flattened version of the rangetable for faster access (this is
-     * OK because the rangetable won't change any more), and set up an empty
-     * array for indexing base relations.
+     * Set up arrays for accessing base relations and AppendRelInfos.
      */
     setup_simple_rel_arrays(root);

@@ -99,7 +97,7 @@ query_planner(PlannerInfo *root,
         if (IsA(jtnode, RangeTblRef))
         {
             int            varno = ((RangeTblRef *) jtnode)->rtindex;
-            RangeTblEntry *rte = root->simple_rte_array[varno];
+            RangeTblEntry *rte = planner_rt_fetch(varno, root);

             Assert(rte != NULL);
             if (rte->rtekind == RTE_RESULT)
@@ -157,12 +155,6 @@ query_planner(PlannerInfo *root,
     }

     /*
-     * Populate append_rel_array with each AppendRelInfo to allow direct
-     * lookups by child relid.
-     */
-    setup_append_rel_array(root);
-
-    /*
      * Construct RelOptInfo nodes for all base relations used in the query.
      * Appendrel member relations ("other rels") will be added later.
      *
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 0f918dd..7a38955 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -1754,17 +1754,6 @@ inheritance_planner(PlannerInfo *root)
         root->simple_rel_array = save_rel_array;
         root->append_rel_array = save_append_rel_array;

-        /* Must reconstruct master's simple_rte_array, too */
-        root->simple_rte_array = (RangeTblEntry **)
-            palloc0((list_length(final_rtable) + 1) * sizeof(RangeTblEntry *));
-        rti = 1;
-        foreach(lc, final_rtable)
-        {
-            RangeTblEntry *rte = lfirst_node(RangeTblEntry, lc);
-
-            root->simple_rte_array[rti++] = rte;
-        }
-
         /* Put back adjusted rowmarks, too */
         root->rowMarks = final_rowmarks;
     }
diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index 5a11c12..a05ed10 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -132,17 +132,11 @@ plan_set_operations(PlannerInfo *root)
     /*
      * We'll need to build RelOptInfos for each of the leaf subqueries, which
      * are RTE_SUBQUERY rangetable entries in this Query.  Prepare the index
-     * arrays for that.
+     * arrays for those, and for AppendRelInfos in case they're needed.
      */
     setup_simple_rel_arrays(root);

     /*
-     * Populate append_rel_array with each AppendRelInfo to allow direct
-     * lookups by child relid.
-     */
-    setup_append_rel_array(root);
-
-    /*
      * Find the leftmost component Query.  We need to use its column names for
      * all generated tlists (else SELECT INTO won't work right).
      */
@@ -150,7 +144,7 @@ plan_set_operations(PlannerInfo *root)
     while (node && IsA(node, SetOperationStmt))
         node = ((SetOperationStmt *) node)->larg;
     Assert(node && IsA(node, RangeTblRef));
-    leftmostRTE = root->simple_rte_array[((RangeTblRef *) node)->rtindex];
+    leftmostRTE = planner_rt_fetch(((RangeTblRef *) node)->rtindex, root);
     leftmostQuery = leftmostRTE->subquery;
     Assert(leftmostQuery != NULL);

@@ -225,7 +219,7 @@ recurse_set_operations(Node *setOp, PlannerInfo *root,
     if (IsA(setOp, RangeTblRef))
     {
         RangeTblRef *rtr = (RangeTblRef *) setOp;
-        RangeTblEntry *rte = root->simple_rte_array[rtr->rtindex];
+        RangeTblEntry *rte = planner_rt_fetch(rtr->rtindex, root);
         Query       *subquery = rte->subquery;
         PlannerInfo *subroot;
         RelOptInfo *final_rel;
diff --git a/src/backend/optimizer/util/inherit.c b/src/backend/optimizer/util/inherit.c
index 38bc61e..5ed7737 100644
--- a/src/backend/optimizer/util/inherit.c
+++ b/src/backend/optimizer/util/inherit.c
@@ -481,12 +481,10 @@ expand_single_inheritance_child(PlannerInfo *root, RangeTblEntry *parentrte,
     }

     /*
-     * Store the RTE and appinfo in the respective PlannerInfo arrays, which
-     * the caller must already have allocated space for.
+     * Store the appinfo in the associated PlannerInfo array, which the caller
+     * must already have allocated space for.
      */
     Assert(childRTindex < root->simple_rel_array_size);
-    Assert(root->simple_rte_array[childRTindex] == NULL);
-    root->simple_rte_array[childRTindex] = childrte;
     Assert(root->append_rel_array[childRTindex] == NULL);
     root->append_rel_array[childRTindex] = appinfo;

@@ -601,7 +599,7 @@ expand_appendrel_subquery(PlannerInfo *root, RelOptInfo *rel,

         /* find the child RTE, which should already exist */
         Assert(childRTindex < root->simple_rel_array_size);
-        childrte = root->simple_rte_array[childRTindex];
+        childrte = planner_rt_fetch(childRTindex, root);
         Assert(childrte != NULL);

         /* Build the child RelOptInfo. */
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 98e9948..848ca1a 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -177,7 +177,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
          * index while we hold lock on the parent rel, and no lock type used
          * for queries blocks any other kind of index operation.
          */
-        lmode = root->simple_rte_array[varno]->rellockmode;
+        lmode = planner_rt_fetch(varno, root)->rellockmode;

         foreach(l, indexoidlist)
         {
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index 37d228c..1f9f037 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -67,46 +67,26 @@ static void build_child_join_reltarget(PlannerInfo *root,

 /*
  * setup_simple_rel_arrays
- *      Prepare the arrays we use for quickly accessing base relations.
+ *      Prepare the arrays we use for quickly accessing base relations
+ *      and AppendRelInfos.
  */
 void
 setup_simple_rel_arrays(PlannerInfo *root)
 {
-    Index        rti;
+    /* Arrays are accessed using RT indexes (1..N) */
+    int            size = list_length(root->parse->rtable) + 1;
     ListCell   *lc;

-    /* Arrays are accessed using RT indexes (1..N) */
-    root->simple_rel_array_size = list_length(root->parse->rtable) + 1;
+    root->simple_rel_array_size = size;

-    /* simple_rel_array is initialized to all NULLs */
+    /*
+     * simple_rel_array is initialized to all NULLs, since no RelOptInfos
+     * exist yet.  It'll be filled by later calls to build_simple_rel().
+     */
     root->simple_rel_array = (RelOptInfo **)
-        palloc0(root->simple_rel_array_size * sizeof(RelOptInfo *));
-
-    /* simple_rte_array is an array equivalent of the rtable list */
-    root->simple_rte_array = (RangeTblEntry **)
-        palloc0(root->simple_rel_array_size * sizeof(RangeTblEntry *));
-    rti = 1;
-    foreach(lc, root->parse->rtable)
-    {
-        RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc);
-
-        root->simple_rte_array[rti++] = rte;
-    }
-}
-
-/*
- * setup_append_rel_array
- *        Populate the append_rel_array to allow direct lookups of
- *        AppendRelInfos by child relid.
- *
- * The array remains unallocated if there are no AppendRelInfos.
- */
-void
-setup_append_rel_array(PlannerInfo *root)
-{
-    ListCell   *lc;
-    int            size = list_length(root->parse->rtable) + 1;
+        palloc0(size * sizeof(RelOptInfo *));

+    /* append_rel_array is not needed if there are no AppendRelInfos */
     if (root->append_rel_list == NIL)
     {
         root->append_rel_array = NULL;
@@ -116,6 +96,12 @@ setup_append_rel_array(PlannerInfo *root)
     root->append_rel_array = (AppendRelInfo **)
         palloc0(size * sizeof(AppendRelInfo *));

+    /*
+     * append_rel_array is filled with any already-existing AppendRelInfos,
+     * which currently could only come from UNION ALL flattening.  We might
+     * add more later during inheritance expansion, but it's the
+     * responsibility of the expansion code to update the array properly.
+     */
     foreach(lc, root->append_rel_list)
     {
         AppendRelInfo *appinfo = lfirst_node(AppendRelInfo, lc);
@@ -133,8 +119,12 @@ setup_append_rel_array(PlannerInfo *root)

 /*
  * expand_planner_arrays
- *        Expand the PlannerInfo's per-RTE arrays by add_size members
+ *        Expand the PlannerInfo's per-baserel arrays by add_size members
  *        and initialize the newly added entries to NULLs
+ *
+ * Note: this causes the append_rel_array to become allocated even if
+ * it was not before.  This is okay for current uses, because we only call
+ * this when adding child relations, which always have AppendRelInfos.
  */
 void
 expand_planner_arrays(PlannerInfo *root, int add_size)
@@ -145,12 +135,6 @@ expand_planner_arrays(PlannerInfo *root, int add_size)

     new_size = root->simple_rel_array_size + add_size;

-    root->simple_rte_array = (RangeTblEntry **)
-        repalloc(root->simple_rte_array,
-                 sizeof(RangeTblEntry *) * new_size);
-    MemSet(root->simple_rte_array + root->simple_rel_array_size,
-           0, sizeof(RangeTblEntry *) * add_size);
-
     root->simple_rel_array = (RelOptInfo **)
         repalloc(root->simple_rel_array,
                  sizeof(RelOptInfo *) * new_size);
@@ -190,7 +174,7 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptInfo *parent)
         elog(ERROR, "rel %d already exists", relid);

     /* Fetch RTE for relation */
-    rte = root->simple_rte_array[relid];
+    rte = planner_rt_fetch(relid, root);
     Assert(rte != NULL);

     rel = makeNode(RelOptInfo);
diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c
index 23c74f7..05c18a4 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -793,7 +793,7 @@ statext_is_compatible_clause_internal(PlannerInfo *root, Node *clause,
     /* (Var op Const) or (Const op Var) */
     if (is_opclause(clause))
     {
-        RangeTblEntry *rte = root->simple_rte_array[relid];
+        RangeTblEntry *rte = planner_rt_fetch(relid, root);
         OpExpr       *expr = (OpExpr *) clause;
         Var           *var;

@@ -922,7 +922,7 @@ static bool
 statext_is_compatible_clause(PlannerInfo *root, Node *clause, Index relid,
                              Bitmapset **attnums)
 {
-    RangeTblEntry *rte = root->simple_rte_array[relid];
+    RangeTblEntry *rte = planner_rt_fetch(relid, root);
     RestrictInfo *rinfo = (RestrictInfo *) clause;
     Oid            userid;

diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 7eba59e..1c18499 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -4648,7 +4648,7 @@ static void
 examine_simple_variable(PlannerInfo *root, Var *var,
                         VariableStatData *vardata)
 {
-    RangeTblEntry *rte = root->simple_rte_array[var->varno];
+    RangeTblEntry *rte = planner_rt_fetch(var->varno, root);

     Assert(IsA(rte, RangeTblEntry));

@@ -5130,7 +5130,7 @@ get_actual_variable_range(PlannerInfo *root, VariableStatData *vardata,
     if (rel == NULL || rel->indexlist == NIL)
         return false;
     /* If it has indexes it must be a plain relation */
-    rte = root->simple_rte_array[rel->relid];
+    rte = planner_rt_fetch(rel->relid, root);
     Assert(rte->rtekind == RTE_RELATION);

     /* Search through the indexes to see if any match our problem */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index e3c579e..6f3115a 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -203,15 +203,7 @@ struct PlannerInfo
     int            simple_rel_array_size;    /* allocated size of array */

     /*
-     * simple_rte_array is the same length as simple_rel_array and holds
-     * pointers to the associated rangetable entries.  This lets us avoid
-     * rt_fetch(), which can be a bit slow once large inheritance sets have
-     * been expanded.
-     */
-    RangeTblEntry **simple_rte_array;    /* rangetable as an array */
-
-    /*
-     * append_rel_array is the same length as the above arrays, and holds
+     * append_rel_array is the same length as simple_rel_array, and holds
      * pointers to the corresponding AppendRelInfo entry indexed by
      * child_relid, or NULL if none.  The array itself is not allocated if
      * append_rel_list is empty.
@@ -365,14 +357,9 @@ struct PlannerInfo
 };


-/*
- * In places where it's known that simple_rte_array[] must have been prepared
- * already, we just index into it to fetch RTEs.  In code that might be
- * executed before or after entering query_planner(), use this macro.
- */
+/* Handy macro for getting the RTE with rangetable index rti */
 #define planner_rt_fetch(rti, root) \
-    ((root)->simple_rte_array ? (root)->simple_rte_array[rti] : \
-     rt_fetch(rti, (root)->parse->rtable))
+    ((RangeTblEntry *) list_nth((root)->parse->rtable, (rti)-1))

 /*
  * If multiple relations are partitioned the same way, all such partitions
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index 182ffee..a12af54 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -277,7 +277,6 @@ extern Path *reparameterize_path_by_child(PlannerInfo *root, Path *path,
  * prototypes for relnode.c
  */
 extern void setup_simple_rel_arrays(PlannerInfo *root);
-extern void setup_append_rel_array(PlannerInfo *root);
 extern void expand_planner_arrays(PlannerInfo *root, int add_size);
 extern RelOptInfo *build_simple_rel(PlannerInfo *root, int relid,
                                     RelOptInfo *parent);

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Ibrar Ahmed
Дата:
Сообщение: Re: Small const correctness patch
Следующее
От: Thomas Munro
Дата:
Сообщение: Re: Locale support