Re: any way to use indexscan to get last X values with "order by Y limit X" clause?

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: any way to use indexscan to get last X values with "order by Y limit X" clause?
Дата
Msg-id 23329.1055695990@sss.pgh.pa.us
обсуждение исходный текст
Ответ на any way to use indexscan to get last X values with "order by Y limit X" clause?  (Tomaz Borstnar <tomaz.borstnar@over.net>)
Ответы Re: any way to use indexscan to get last X values
Список pgsql-performance
Tomaz Borstnar <tomaz.borstnar@over.net> writes:
> SELECT thread, modifystamp, count(id) AS tcount, abstime(modifystamp) AS
> latest, max(id) as maxid FROM tjavendan WHERE approved='Y'  GROUP BY
> thread, modifystamp ORDER BY modifystamp desc, thread desc limit 40

> Having backwards reading of index would really help here.

The only way that a fast-start plan is useful is if there is a way to do
it with no explicit sort steps at all.  A sort step must read its entire
input before it can produce any output, so you completely blow the
chance of not reading the whole table as soon as there's any sorting.

There are a couple of reasons why this query can't be done using only an
initial indexscan to sort the data:

1. You don't have a suitable index.  Neither an index on modifystamp
alone nor an index on thread alone is of any use to produce a two-column
ordering; you need a two-column index on (modifystamp, thread).

2. The GROUP BY and ORDER BY steps require different sort orders, and so
even if an index satisfied one, there'd still be a sort needed for the
other.  This is partly your fault (writing the columns in different
orders) and partly the system's fault: it's implicitly taking the GROUP
BY entries to be equivalent to ORDER BY ASC, which is overspecification.

I've applied the attached patch to CVS tip to cure the latter problem.
With this, a two-column index, and compatible column ordering in ORDER
BY and GROUP BY, I get a reasonable-looking fast-start plan.  The patch
will not apply exactly against 7.3 because there's a renamed function
call in there, but you could make it work with a little effort.

            regards, tom lane


*** src/backend/parser/analyze.c.orig    Fri Jun  6 11:04:02 2003
--- src/backend/parser/analyze.c    Sun Jun 15 12:05:34 2003
***************
*** 1787,1799 ****
       */
      qry->havingQual = transformWhereClause(pstate, stmt->havingClause);

!     qry->groupClause = transformGroupClause(pstate,
!                                             stmt->groupClause,
!                                             qry->targetList);
!
      qry->sortClause = transformSortClause(pstate,
                                            stmt->sortClause,
                                            qry->targetList);

      qry->distinctClause = transformDistinctClause(pstate,
                                                    stmt->distinctClause,
--- 1787,1804 ----
       */
      qry->havingQual = transformWhereClause(pstate, stmt->havingClause);

!     /*
!      * Transform sorting/grouping stuff.  Do ORDER BY first because both
!      * transformGroupClause and transformDistinctClause need the results.
!      */
      qry->sortClause = transformSortClause(pstate,
                                            stmt->sortClause,
                                            qry->targetList);
+
+     qry->groupClause = transformGroupClause(pstate,
+                                             stmt->groupClause,
+                                             qry->targetList,
+                                             qry->sortClause);

      qry->distinctClause = transformDistinctClause(pstate,
                                                    stmt->distinctClause,
*** src/backend/parser/parse_clause.c.orig    Fri Jun  6 11:04:02 2003
--- src/backend/parser/parse_clause.c    Sun Jun 15 12:19:14 2003
***************
*** 1124,1130 ****
   *      transform a GROUP BY clause
   */
  List *
! transformGroupClause(ParseState *pstate, List *grouplist, List *targetlist)
  {
      List       *glist = NIL,
                 *gl;
--- 1124,1131 ----
   *      transform a GROUP BY clause
   */
  List *
! transformGroupClause(ParseState *pstate, List *grouplist,
!                      List *targetlist, List *sortClause)
  {
      List       *glist = NIL,
                 *gl;
***************
*** 1132,1152 ****
      foreach(gl, grouplist)
      {
          TargetEntry *tle;

          tle = findTargetlistEntry(pstate, lfirst(gl),
                                    targetlist, GROUP_CLAUSE);

          /* avoid making duplicate grouplist entries */
!         if (!targetIsInSortList(tle, glist))
!         {
!             GroupClause *grpcl = makeNode(GroupClause);
!
!             grpcl->tleSortGroupRef = assignSortGroupRef(tle, targetlist);

!             grpcl->sortop = ordering_oper_opid(tle->resdom->restype);
!
!             glist = lappend(glist, grpcl);
          }
      }

      return glist;
--- 1133,1173 ----
      foreach(gl, grouplist)
      {
          TargetEntry *tle;
+         Oid            ordering_op;
+         GroupClause *grpcl;

          tle = findTargetlistEntry(pstate, lfirst(gl),
                                    targetlist, GROUP_CLAUSE);

          /* avoid making duplicate grouplist entries */
!         if (targetIsInSortList(tle, glist))
!             continue;

!         /*
!          * If the GROUP BY clause matches the ORDER BY clause, we want to
!          * adopt the ordering operators from the latter rather than using
!          * the default ops.  This allows "GROUP BY foo ORDER BY foo DESC" to
!          * be done with only one sort step.  Note we are assuming that any
!          * user-supplied ordering operator will bring equal values together,
!          * which is all that GROUP BY needs.
!          */
!         if (sortClause &&
!             ((SortClause *) lfirst(sortClause))->tleSortGroupRef ==
!             tle->resdom->ressortgroupref)
!         {
!             ordering_op = ((SortClause *) lfirst(sortClause))->sortop;
!             sortClause = lnext(sortClause);
          }
+         else
+         {
+             ordering_op = ordering_oper_opid(tle->resdom->restype);
+             sortClause = NIL;    /* disregard ORDER BY once match fails */
+         }
+
+         grpcl = makeNode(GroupClause);
+         grpcl->tleSortGroupRef = assignSortGroupRef(tle, targetlist);
+         grpcl->sortop = ordering_op;
+         glist = lappend(glist, grpcl);
      }

      return glist;
*** src/include/parser/parse_clause.h.orig    Fri Mar 21 20:49:38 2003
--- src/include/parser/parse_clause.h    Sun Jun 15 12:03:13 2003
***************
*** 22,28 ****
  extern bool interpretInhOption(InhOption inhOpt);
  extern Node *transformWhereClause(ParseState *pstate, Node *where);
  extern List *transformGroupClause(ParseState *pstate, List *grouplist,
!                      List *targetlist);
  extern List *transformSortClause(ParseState *pstate, List *orderlist,
                      List *targetlist);
  extern List *transformDistinctClause(ParseState *pstate, List *distinctlist,
--- 22,28 ----
  extern bool interpretInhOption(InhOption inhOpt);
  extern Node *transformWhereClause(ParseState *pstate, Node *where);
  extern List *transformGroupClause(ParseState *pstate, List *grouplist,
!                      List *targetlist, List *sortClause);
  extern List *transformSortClause(ParseState *pstate, List *orderlist,
                      List *targetlist);
  extern List *transformDistinctClause(ParseState *pstate, List *distinctlist,

В списке pgsql-performance по дате отправления:

Предыдущее
От: Stephan Szabo
Дата:
Сообщение: Re: any way to use indexscan to get last X values with
Следующее
От: Manfred Koizar
Дата:
Сообщение: Re: 7.3 vs 7.2 - different query plan, bad performance