Proposed patch for qual pushdown into UNION/INTERSECT

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Proposed patch for qual pushdown into UNION/INTERSECT
Дата
Msg-id 23670.1030607122@sss.pgh.pa.us
обсуждение исходный текст
Ответы Re: Proposed patch for qual pushdown into UNION/INTERSECT  (Stephan Szabo <sszabo@megazone23.bigpanda.com>)
Re: Proposed patch for qual pushdown into UNION/INTERSECT  (Stephan Szabo <sszabo@megazone23.bigpanda.com>)
Список pgsql-patches
Seems to work in preliminary testing, but I'd like someone else to
give it a try before I decide it works.  Patch is against CVS tip.

            regards, tom lane


*** src/backend/optimizer/path/allpaths.c.orig    Thu Jun 20 16:29:29 2002
--- src/backend/optimizer/path/allpaths.c    Thu Aug 29 03:40:36 2002
***************
*** 46,51 ****
--- 46,56 ----
                          RangeTblEntry *rte);
  static RelOptInfo *make_one_rel_by_joins(Query *root, int levels_needed,
                        List *initial_rels);
+ static bool subquery_is_pushdown_safe(Query *subquery, Query *topquery);
+ static bool recurse_pushdown_safe(Node *setOp, Query *topquery);
+ static void subquery_push_qual(Query *subquery, Index rti, Node *qual);
+ static void recurse_push_qual(Node *setOp, Query *topquery,
+                               Index rti, Node *qual);


  /*
***************
*** 297,327 ****
       * generate a better plan for the subquery than evaluating all the
       * subquery output rows and then filtering them.
       *
!      * There are several cases where we cannot push down clauses:
!      *
!      * 1. If the subquery contains set ops (UNION/INTERSECT/EXCEPT) we do not
!      * push down any qual clauses, since the planner doesn't support quals
!      * at the top level of a setop.  (With suitable analysis we could try
!      * to push the quals down into the component queries of the setop, but
!      * getting it right seems nontrivial.  Work on this later.)
!      *
!      * 2. If the subquery has a LIMIT clause or a DISTINCT ON clause, we must
!      * not push down any quals, since that could change the set of rows
!      * returned.  (Actually, we could push down quals into a DISTINCT ON
!      * subquery if they refer only to DISTINCT-ed output columns, but
!      * checking that seems more work than it's worth.  In any case, a
!      * plain DISTINCT is safe to push down past.)
!      *
!      * 3. If the subquery has any functions returning sets in its target list,
!      * we do not push down any quals, since the quals
!      * might refer to those tlist items, which would mean we'd introduce
!      * functions-returning-sets into the subquery's WHERE/HAVING quals.
!      * (It'd be sufficient to not push down quals that refer to those
!      * particular tlist items, but that's much clumsier to check.)
!      *
!      * 4. We do not push down clauses that contain subselects, mainly because
!      * I'm not sure it will work correctly (the subplan hasn't yet
!      * transformed sublinks to subselects).
       *
       * Non-pushed-down clauses will get evaluated as qpquals of the
       * SubqueryScan node.
--- 302,312 ----
       * generate a better plan for the subquery than evaluating all the
       * subquery output rows and then filtering them.
       *
!      * There are several cases where we cannot push down clauses.
!      * Restrictions involving the subquery are checked by
!      * subquery_is_pushdown_safe().  Also, we do not push down clauses that
!      * contain subselects, mainly because I'm not sure it will work correctly
!      * (the subplan hasn't yet transformed sublinks to subselects).
       *
       * Non-pushed-down clauses will get evaluated as qpquals of the
       * SubqueryScan node.
***************
*** 329,339 ****
       * XXX Are there any cases where we want to make a policy decision not to
       * push down, because it'd result in a worse plan?
       */
!     if (subquery->setOperations == NULL &&
!         subquery->limitOffset == NULL &&
!         subquery->limitCount == NULL &&
!         !has_distinct_on_clause(subquery) &&
!         !expression_returns_set((Node *) subquery->targetList))
      {
          /* OK to consider pushing down individual quals */
          List       *upperrestrictlist = NIL;
--- 314,321 ----
       * XXX Are there any cases where we want to make a policy decision not to
       * push down, because it'd result in a worse plan?
       */
!     if (rel->baserestrictinfo != NIL &&
!         subquery_is_pushdown_safe(subquery, subquery))
      {
          /* OK to consider pushing down individual quals */
          List       *upperrestrictlist = NIL;
***************
*** 351,375 ****
              }
              else
              {
!                 /*
!                  * We need to replace Vars in the clause (which must refer
!                  * to outputs of the subquery) with copies of the
!                  * subquery's targetlist expressions.  Note that at this
!                  * point, any uplevel Vars in the clause should have been
!                  * replaced with Params, so they need no work.
!                  */
!                 clause = ResolveNew(clause, rti, 0,
!                                     subquery->targetList,
!                                     CMD_SELECT, 0);
!                 subquery->havingQual = make_and_qual(subquery->havingQual,
!                                                      clause);
!
!                 /*
!                  * We need not change the subquery's hasAggs or
!                  * hasSublinks flags, since we can't be pushing down any
!                  * aggregates that weren't there before, and we don't push
!                  * down subselects at all.
!                  */
              }
          }
          rel->baserestrictinfo = upperrestrictlist;
--- 333,340 ----
              }
              else
              {
!                 /* Push it down */
!                 subquery_push_qual(subquery, rti, clause);
              }
          }
          rel->baserestrictinfo = upperrestrictlist;
***************
*** 547,553 ****
--- 512,694 ----
  }

  /*****************************************************************************
+  *            PUSHING QUALS DOWN INTO SUBQUERIES
+  *****************************************************************************/
+
+ /*
+  * subquery_is_pushdown_safe - is a subquery safe for pushing down quals?
+  *
+  * subquery is the particular component query being checked.  topquery
+  * is the top component of a set-operations tree (the same Query if no
+  * set-op is involved).
+  *
+  * Conditions checked here:
+  *
+  * 1. If the subquery has a LIMIT clause or a DISTINCT ON clause, we must
+  * not push down any quals, since that could change the set of rows
+  * returned.  (Actually, we could push down quals into a DISTINCT ON
+  * subquery if they refer only to DISTINCT-ed output columns, but
+  * checking that seems more work than it's worth.  In any case, a
+  * plain DISTINCT is safe to push down past.)
   *
+  * 2. If the subquery has any functions returning sets in its target list,
+  * we do not push down any quals, since the quals
+  * might refer to those tlist items, which would mean we'd introduce
+  * functions-returning-sets into the subquery's WHERE/HAVING quals.
+  * (It'd be sufficient to not push down quals that refer to those
+  * particular tlist items, but that's much clumsier to check.)
+  *
+  * 3. If the subquery contains EXCEPT or EXCEPT ALL set ops we cannot push
+  * quals into it, because that would change the results.  For subqueries
+  * using UNION/UNION ALL/INTERSECT/INTERSECT ALL, we can push the quals
+  * into each component query, so long as all the component queries share
+  * identical output types.  (That restriction could probably be relaxed,
+  * but it would take much more code to include type coercion code into
+  * the quals, and I'm also concerned about possible semantic gotchas.)
+  */
+ static bool
+ subquery_is_pushdown_safe(Query *subquery, Query *topquery)
+ {
+     SetOperationStmt *topop;
+
+     /* Check points 1 and 2 */
+     if (subquery->limitOffset != NULL ||
+         subquery->limitCount != NULL ||
+         has_distinct_on_clause(subquery) ||
+         expression_returns_set((Node *) subquery->targetList))
+         return false;
+
+     /* Are we at top level, or looking at a setop component? */
+     if (subquery == topquery)
+     {
+         /* Top level, so check any component queries */
+         if (subquery->setOperations != NULL)
+             if (!recurse_pushdown_safe(subquery->setOperations, topquery))
+                 return false;
+     }
+     else
+     {
+         /* Setop component must not have more components (too weird) */
+         if (subquery->setOperations != NULL)
+             return false;
+         /* Setop component output types must match top level */
+         topop = (SetOperationStmt *) topquery->setOperations;
+         Assert(topop && IsA(topop, SetOperationStmt));
+         if (!tlist_same_datatypes(subquery->targetList,
+                                   topop->colTypes,
+                                   true))
+             return false;
+
+     }
+     return true;
+ }
+
+ /*
+  * Helper routine to recurse through setOperations tree
+  */
+ static bool
+ recurse_pushdown_safe(Node *setOp, Query *topquery)
+ {
+     if (IsA(setOp, RangeTblRef))
+     {
+         RangeTblRef *rtr = (RangeTblRef *) setOp;
+         RangeTblEntry *rte = rt_fetch(rtr->rtindex, topquery->rtable);
+         Query       *subquery = rte->subquery;
+
+         Assert(subquery != NULL);
+         return subquery_is_pushdown_safe(subquery, topquery);
+     }
+     else if (IsA(setOp, SetOperationStmt))
+     {
+         SetOperationStmt *op = (SetOperationStmt *) setOp;
+
+         /* EXCEPT is no good */
+         if (op->op == SETOP_EXCEPT)
+             return false;
+         /* Else recurse */
+         if (!recurse_pushdown_safe(op->larg, topquery))
+             return false;
+         if (!recurse_pushdown_safe(op->rarg, topquery))
+             return false;
+     }
+     else
+     {
+         elog(ERROR, "recurse_pushdown_safe: unexpected node %d",
+              (int) nodeTag(setOp));
+     }
+     return true;
+ }
+
+ /*
+  * subquery_push_qual - push down a qual that we have determined is safe
+  */
+ static void
+ subquery_push_qual(Query *subquery, Index rti, Node *qual)
+ {
+     if (subquery->setOperations != NULL)
+     {
+         /* Recurse to push it separately to each component query */
+         recurse_push_qual(subquery->setOperations, subquery, rti, qual);
+     }
+     else
+     {
+         /*
+          * We need to replace Vars in the qual (which must refer
+          * to outputs of the subquery) with copies of the
+          * subquery's targetlist expressions.  Note that at this
+          * point, any uplevel Vars in the qual should have been
+          * replaced with Params, so they need no work.
+          *
+          * This step also ensures that when we are pushing into a setop
+          * tree, each component query gets its own copy of the qual.
+          */
+         qual = ResolveNew(qual, rti, 0,
+                           subquery->targetList,
+                           CMD_SELECT, 0);
+         subquery->havingQual = make_and_qual(subquery->havingQual,
+                                              qual);
+
+         /*
+          * We need not change the subquery's hasAggs or
+          * hasSublinks flags, since we can't be pushing down any
+          * aggregates that weren't there before, and we don't push
+          * down subselects at all.
+          */
+     }
+ }
+
+ /*
+  * Helper routine to recurse through setOperations tree
+  */
+ static void
+ recurse_push_qual(Node *setOp, Query *topquery,
+                   Index rti, Node *qual)
+ {
+     if (IsA(setOp, RangeTblRef))
+     {
+         RangeTblRef *rtr = (RangeTblRef *) setOp;
+         RangeTblEntry *rte = rt_fetch(rtr->rtindex, topquery->rtable);
+         Query       *subquery = rte->subquery;
+
+         Assert(subquery != NULL);
+         subquery_push_qual(subquery, rti, qual);
+     }
+     else if (IsA(setOp, SetOperationStmt))
+     {
+         SetOperationStmt *op = (SetOperationStmt *) setOp;
+
+         recurse_push_qual(op->larg, topquery, rti, qual);
+         recurse_push_qual(op->rarg, topquery, rti, qual);
+     }
+     else
+     {
+         elog(ERROR, "recurse_push_qual: unexpected node %d",
+              (int) nodeTag(setOp));
+     }
+ }
+
+ /*****************************************************************************
+  *            DEBUG SUPPORT
   *****************************************************************************/

  #ifdef OPTIMIZER_DEBUG
*** src/backend/optimizer/prep/prepunion.c.orig    Fri Aug  2 14:15:06 2002
--- src/backend/optimizer/prep/prepunion.c    Thu Aug 29 03:16:12 2002
***************
*** 66,72 ****
  static List *generate_append_tlist(List *colTypes, bool flag,
                       List *input_plans,
                       List *refnames_tlist);
- static bool tlist_same_datatypes(List *tlist, List *colTypes, bool junkOK);
  static Node *adjust_inherited_attrs_mutator(Node *node,
                                 adjust_inherited_attrs_context *context);

--- 66,71 ----
***************
*** 579,585 ****
   * Resjunk columns are ignored if junkOK is true; otherwise presence of
   * a resjunk column will always cause a 'false' result.
   */
! static bool
  tlist_same_datatypes(List *tlist, List *colTypes, bool junkOK)
  {
      List       *i;
--- 578,584 ----
   * Resjunk columns are ignored if junkOK is true; otherwise presence of
   * a resjunk column will always cause a 'false' result.
   */
! bool
  tlist_same_datatypes(List *tlist, List *colTypes, bool junkOK)
  {
      List       *i;
*** src/include/optimizer/prep.h.orig    Thu Jun 20 16:29:51 2002
--- src/include/optimizer/prep.h    Thu Aug 29 03:16:03 2002
***************
*** 43,46 ****
--- 43,48 ----
                         Index old_rt_index, Oid old_relid,
                         Index new_rt_index, Oid new_relid);

+ extern bool tlist_same_datatypes(List *tlist, List *colTypes, bool junkOK);
+
  #endif   /* PREP_H */

В списке pgsql-patches по дате отправления:

Предыдущее
От: Neil Conway
Дата:
Сообщение: Re: revised patch for PL/PgSQL table functions
Следующее
От: Neil Conway
Дата:
Сообщение: Re: rules regression test fix