Re: Deduplicate aggregates and transition functions in planner

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: Deduplicate aggregates and transition functions in planner
Дата
Msg-id 20201029174810.tk4s7ycfrwde7m3x@alap3.anarazel.de
обсуждение исходный текст
Ответ на Re: Deduplicate aggregates and transition functions in planner  (Heikki Linnakangas <hlinnaka@iki.fi>)
Ответы Re: Deduplicate aggregates and transition functions in planner  (Heikki Linnakangas <hlinnaka@iki.fi>)
Список pgsql-hackers
Hi,

On 2020-10-29 10:17:20 +0200, Heikki Linnakangas wrote:
> On 28/10/2020 21:59, Andres Freund wrote:
> > It wouldn't surprise me to see a small execution time speedup here -
> > I've seen the load of the aggno show up in profiles.
> 
> I think you'd be hard-pressed to find a real-life query where it
> matters. But if you don't care about real life:

I was actually thinking about a different angle - that the evaluation of
an Aggref can be faster, because we need less indirection to find the
aggno. As you have already implemented for the JITed code, but removing
it for the expression code looks easy enough too. You'd need a lot of
groups and presumably a fair number of Aggrefs to see it.

Attached is a quick version of what I am thinking wrt AggrefExprState.


> > > @@ -783,14 +783,13 @@ ExecInitExprRec(Expr *node, ExprState *state,
> > >                   scratch.opcode = EEOP_AGGREF;
> > >                   scratch.d.aggref.astate = astate;
> > > -                astate->aggref = aggref;
> > > +                astate->aggno = aggref->aggno;
> > >                   if (state->parent && IsA(state->parent, AggState))
> > >                   {
> > >                       AggState   *aggstate = (AggState *) state->parent;
> > > -                    aggstate->aggs = lappend(aggstate->aggs, astate);
> > > -                    aggstate->numaggs++;
> > > +                    aggstate->aggs = lappend(aggstate->aggs, aggref);
> > 
> > Hm. Why is aggstate->aggs still built during expression initialization?
> > Imo that's a pretty huge wart that also introduces more
> > order-of-operation brittleness to executor startup.
> 
> The Agg node itself doesn't include any information about the aggregates and
> transition functions. Because of that, ExecInitAgg needs a "representive"
> Aggref for each transition state and agg, to initialize the per-trans and
> per-agg structs. The expression initialization makes those Aggrefs available
> for ExecInitAgg.

> Instead of collecting all the Aggrefs in a list, ExecInitExprRec() could set
> each representative Aggref directly in the right per-trans and per-agg
> struct, based on the 'aggno' and 'aggtransno' fields.

Hold on a second: To me the question is why is it the right design that
the Agg node doesn't have the information about "aggregates and
transition functions"? Agg e.g. already does directly contains the group
keys...

My concern wouldn't really be addressed if we replace the lappend() in
ExecInitExprRec() with setting something "directly in the right
per-trans...". I think it's structurally wrong to have to discover
Aggrefs at execution time at all.

Perhaps the easiest incremental step would be to have something like a
CookedAggref { int aggno; } and then just store the Aggref nodes in
Agg->aggs, with aggno referencing that...


> I'd like to get rid of the "representative Aggrefs" altogether, and pass
> information about the transition and final functions from planner to
> executor in some other form. But that's exactly what got me into the
> refactoring that was ballooning out of hand that I mentioned.

Fair.


Greetings,

Andres Freund

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: duplicate function oid symbols
Следующее
От: Andres Freund
Дата:
Сообщение: Re: Autovacuum worker doesn't immediately exit on postmaster death