Re: overhead due to casting extra parameters with aggregates (over and over)

From: Tom Lane
Subject: Re: overhead due to casting extra parameters with aggregates (over and over)
Msg-id: 10046.1569257616@sss.pgh.pa.us
In response to: overhead due to casting extra parameters with aggregates (over and over)  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses: Re: overhead due to casting extra parameters with aggregates (over and over)  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List: pgsql-hackers
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> I've been working on a custom aggregate, and I've run into some fairly
> annoying overhead due to casting direct parameters over and over. I'm
> wondering if there's a way to eliminate this, somehow, without having to
> do an explicit cast.

> Imagine you have a simple aggregate:

>   CREATE AGGREGATE tdigest_percentile(double precision, int, double precision[])
>   (
>     ...
>   );

> with two direct parameters (actually, I'm not sure that's the correct
> term, because this is not an ordered-set aggregate and [1] only talks
> about direct parameters in that context). Anyway, I'm talking about the
> extra parameters, after the 'double precision' value to aggregate.

But you're not telling the system that those are direct parameters,
at least not if you mean that they can only legitimately have one value
across the whole query.  As-is, they're just more aggregated arguments
so we have to evaluate them again at each row.

It's fairly messy that the SQL spec ties direct arguments to ordered-set
aggregates; you'd think there'd be some value in treating those features
as orthogonal.  I'm not sure how we could wedge them into the syntax
otherwise, though :-(.  You could perhaps convert your aggregate to
an ordered-set aggregate, but then you'd be paying for a sort that
you don't need, IIUC.
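
For comparison, here is a minimal sketch of how a direct argument looks with a built-in ordered-set aggregate. It uses only percentile_cont and generate_series, both standard PostgreSQL; the column alias x is just an illustrative name.

```sql
-- 0.5 is a direct argument: it appears before WITHIN GROUP and is
-- evaluated once per aggregation, not once per input row.
-- x is the aggregated argument, and it is what forces the sort.
SELECT percentile_cont(0.5) WITHIN GROUP (ORDER BY x)
FROM generate_series(1, 100) AS t(x);
```

The ORDER BY inside WITHIN GROUP is the sort cost mentioned above, which a regular aggregate such as tdigest_percentile would not otherwise pay.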

> After a while, I've realized that the issue is casting - the CTE
> produces numeric[] array, and we do the cast to double precision[] on
> every call to the state transition function (and we do ~10M of those).

The only reason that the CTE reference is cheap is that we understand
that it's stable so we don't have to recompute it each time; otherwise
you'd be moaning about that more than the cast.  As you say, the short
term workaround is to do the casting inside the sub-select.  I think the
long term fix is to generically avoid re-computing stable subexpressions.
There was a patch for that a year or so ago but the author never finished
it, AFAIR.
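
A rough sketch of the short-term workaround, assuming a query shaped like the one described above. The aggregate signature is the one quoted earlier; the CTE name perc, the column p, the table t, and its double precision column v are hypothetical names used only for illustration.

```sql
-- Slow variant: the CTE yields numeric[], so the implicit cast to
-- double precision[] is repeated on every transition function call.
WITH perc AS (
    SELECT array_agg(i / 100.0) AS p        -- integer / numeric gives numeric
    FROM generate_series(1, 99) AS s(i)
)
SELECT tdigest_percentile(v, 100, (SELECT p FROM perc))
FROM t;                                     -- t(v double precision) is assumed

-- Workaround: do the cast inside the sub-select, so it is part of the
-- stable sub-select result instead of being applied per row.
WITH perc AS (
    SELECT array_agg(i / 100.0) AS p
    FROM generate_series(1, 99) AS s(i)
)
SELECT tdigest_percentile(v, 100, (SELECT p::double precision[] FROM perc))
FROM t;
```

Casting in the CTE itself (array_agg(...)::double precision[]) should have the same effect, since either way the aggregate receives a value that already has the declared parameter type.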

            regards, tom lane


