Re: Add proper planner support for ORDER BY / DISTINCT aggregates

Поиск
Список
Период
Сортировка
От Ranier Vilela
Тема Re: Add proper planner support for ORDER BY / DISTINCT aggregates
Дата
Msg-id CAEudQAq-H2yNzCni1N_PW7PfrWr4jYMhzRAiQiGyAQ4vt4-sTg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Add proper planner support for ORDER BY / DISTINCT aggregates  (David Rowley <dgrowleyml@gmail.com>)
Ответы Re: Add proper planner support for ORDER BY / DISTINCT aggregates  (David Rowley <dgrowleyml@gmail.com>)
Список pgsql-hackers
Em seg., 12 de jul. de 2021 às 09:04, David Rowley <dgrowleyml@gmail.com> escreveu:
On Sun, 13 Jun 2021 at 03:07, David Rowley <dgrowleyml@gmail.com> wrote:
>
> Please find attached my WIP patch.  It's WIP due to what I mentioned
> in the above paragraph and also because I've not bothered to add JIT
> support for the new expression evaluation steps.

I've split this patch into two parts.
Hi, I'll take a look.


0001 Adds planner support for ORDER BY aggregates.
/* Normal transition function without ORDER BY / DISTINCT. */
Is it possible to avoid entering to initialize args if 'argno >= pertrans->numTransInputs'?
Like this:

if (!pertrans->aggsortrequired && argno < pertrans->numTransInputs)

And if argos is '>' that pertrans->numTransInputs?
The test shouldn't be, inside the loop?

+ /*
+ * Don't initialize args for any ORDER BY clause that might
+ * exist in a presorted aggregate.
+ */
+ if (argno >= pertrans->numTransInputs)
+ break;

I think that or can reduce the scope of variable 'sortlist' or simply remove it?

a)
+ /* Determine pathkeys for aggregate functions with an ORDER BY */
+ if (parse->groupingSets == NIL && root->numOrderedAggs > 0 &&
+ (qp_extra->groupClause == NIL || root->group_pathkeys))
+ {
+ ListCell   *lc;
+ List   *pathkeys = NIL;
+
+ foreach(lc, root->agginfos)
+ {
+ AggInfo    *agginfo = (AggInfo *) lfirst(lc);
+ Aggref   *aggref = agginfo->representative_aggref;
+ List   *sortlist;
+

or better

b)
+ /* Determine pathkeys for aggregate functions with an ORDER BY */
+ if (parse->groupingSets == NIL && root->numOrderedAggs > 0 &&
+ (qp_extra->groupClause == NIL || root->group_pathkeys))
+ {
+ ListCell   *lc;
+ List   *pathkeys = NIL;
+
+ foreach(lc, root->agginfos)
+ {
+ AggInfo    *agginfo = (AggInfo *) lfirst(lc);
+ Aggref   *aggref = agginfo->representative_aggref;
+
+ if (AGGKIND_IS_ORDERED_SET(aggref->aggkind))
+ continue;
+
+ /* DISTINCT aggregates not yet supported by the planner */
+ if (aggref->aggdistinct != NIL)
+ continue;
+
+ if (aggref->aggorder == NIL)
+ continue;
+
+ /*
+ * Find the pathkeys with the most sorted derivative of the first
+ * Aggref. For example, if we determine the pathkeys for the first
+ * Aggref to be {a}, and we find another with {a,b}, then we use
+ * {a,b} since it's useful for more Aggrefs than just {a}.  We
+ * currently ignore anything that might have a longer list of
+ * pathkeys than the first Aggref if it is not contained in the
+ * pathkeys for the first agg.  We can't practically plan for all
+ * orders of each Aggref, so this seems like the best compromise.
+ */
+ if (pathkeys == NIL)
+ {
+ pathkeys = make_pathkeys_for_sortclauses(root,  aggref->aggorder ,
+ aggref->args);
+ aggref->aggpresorted = true;
+ }
+ else
+ {
+ List   *pathkeys2 = make_pathkeys_for_sortclauses(root,
+  aggref->aggorder,
+  aggref->args);


0002 is a WIP patch for DISTINCT support.  This still lacks JIT
support and I'm still not certain of the best where to store the
previous value or tuple to determine if the current one is distinct
from it.
In the patch 0002, I think that can reduce the scope of variable 'aggstate'?

+ EEO_CASE(EEOP_AGG_PRESORTED_DISTINCT_SINGLE)
+ {
+ AggStatePerTrans pertrans = op->d.agg_presorted_distinctcheck.pertrans;
+ Datum value = pertrans->transfn_fcinfo->args[1].value;
+ bool isnull = pertrans->transfn_fcinfo->args[1].isnull;
+
+ if (!pertrans->haslast ||
+ pertrans->lastisnull != isnull ||
+ !DatumGetBool(FunctionCall2Coll(&pertrans->equalfnOne,
+ pertrans->aggCollation,
+ pertrans->lastdatum, value)))
+ {
+ if (pertrans->haslast && !pertrans->inputtypeByVal)
+ pfree(DatumGetPointer(pertrans->lastdatum));
+
+ pertrans->haslast = true;
+ if (!isnull)
+ {
+ AggState   *aggstate = castNode(AggState, state->parent);
+
+ /*
+ * XXX is it worth having a dedicted ByVal version of this
+ * operation so that we can skip switching memory contexts
+ * and do a simple assign rather than datumCopy below?
+ */
+ MemoryContext oldContext;
+
+ oldContext = MemoryContextSwitchTo(aggstate->curaggcontext->ecxt_per_tuple_memory);

What do you think?

regards,
Ranier Vilela

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Josef Šimánek
Дата:
Сообщение: Re: [PATCH] Don't block HOT update by BRIN index
Следующее
От: Tom Lane
Дата:
Сообщение: Re: proposal: possibility to read dumped table's name from file