Обсуждение: overhead due to casting extra parameters with aggregates (over andover)
Hi,
I've been working on a custom aggregate, and I've ran into some fairly
annoying overhead due to casting direct parameters over and over. I'm
wondering if there's a way to eliminate this, somehow, without having to
do an explicit cast.
Imagine you have a simple aggregate:
  CREATE AGGREGATE tdigest_percentile(double precision, int, double precision[])
  (
    ...
  );
with two direct parameters (actually, I'm not sure that's the correct
term, becuse this is not an ordered-set aggregate and [1] only talks
about direct parameters in that context). Anyway, I'm talking about the
extra parameters, after the 'double precision' value to aggregate.
The last parameter is an array of values in [0.0,1.0], representing
percentiles (similarly to what we do in percentile_cont). It's annoying
to write literal values, so let's use CTE to generate the array:
  WITH
    perc AS (SELECT array_agg(i/100.0) AS p
             FROM generate_series(1,99) s(i))
  SELECT
    SELECT tdigest_percentile(random(), 100, (SELECT p FROM perc))
    FROM generate_series(1,10000000);
which does work, but it's running for ~180 seconds. When used with an
explicit array literal, it runs in ~1.6 second.
  SELECT tdigest_percentile(random(), 100, ARRAY[0.01, ..., 0.99]))
  FROM generate_series(1,10000000);
After a while, I've realized that the issue is casting - the CTE
produces numeric[] array, and we do the cast to double precision[] on
every call to the state transition function (and we do ~10M of those).
The cast is fairly expensive - much more expensive than the aggregate
itself. The explicit literal ends up being the right type, so the whole
query is much faster.
And indeed, adding the explicit cast to the CTE query
  WITH
    perc AS (SELECT array_agg((i/100.0)::double precision) AS p
             FROM generate_series(1,99) s(i))
  SELECT
    SELECT tdigest_percentile(random(), 100, (SELECT p FROM perc))
    FROM generate_series(1,10000000);
does the trick - the query is ~1.6s again.
I wonder if there's a chance to detect and handle this without having to
do the cast over and over? I'm thinking that's not quite possible,
because the value is not actually guaranteed to be the same for all
calls (even though it's the case for the example I've given).
But maybe we could flag the parameter somehow, to make it more like the
direct parameter (which is only evaluated once). I don't really need
those extra parameters in the transition function at all, it's fine to
just get it to the final function (and there should be far fewer calls
to those).
regards
[1] https://www.postgresql.org/docs/current/sql-createaggregate.html
-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 
			
		Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> I've been working on a custom aggregate, and I've ran into some fairly
> annoying overhead due to casting direct parameters over and over. I'm
> wondering if there's a way to eliminate this, somehow, without having to
> do an explicit cast.
> Imagine you have a simple aggregate:
>   CREATE AGGREGATE tdigest_percentile(double precision, int, double precision[])
>   (
>     ...
>   );
> with two direct parameters (actually, I'm not sure that's the correct
> term, becuse this is not an ordered-set aggregate and [1] only talks
> about direct parameters in that context). Anyway, I'm talking about the
> extra parameters, after the 'double precision' value to aggregate.
But you're not telling the system that those are direct parameters,
at least not if you mean that they can only legitimately have one value
across the whole query.  As-is, they're just more aggregated arguments
so we have to evaluate them again at each row.
It's fairly messy that the SQL spec ties direct arguments to ordered-set
aggregates; you'd think there'd be some value in treating those features
as orthogonal.  I'm not sure how we could wedge them into the syntax
otherwise, though :-(.  You could perhaps convert your aggregate to
an ordered-set aggregate, but then you'd be paying for a sort that
you don't need, IIUC.
> After a while, I've realized that the issue is casting - the CTE
> produces numeric[] array, and we do the cast to double precision[] on
> every call to the state transition function (and we do ~10M of those).
The only reason that the CTE reference is cheap is that we understand
that it's stable so we don't have to recompute it each time; otherwise
you'd be moaning about that more than the cast.  As you say, the short
term workaround is to do the casting inside the sub-select.  I think the
long term fix is to generically avoid re-computing stable subexpressions.
There was a patch for that a year or so ago but the author never finished
it, AFAIR.
            regards, tom lane
			
		On Mon, Sep 23, 2019 at 12:53:36PM -0400, Tom Lane wrote: >Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: >> I've been working on a custom aggregate, and I've ran into some fairly >> annoying overhead due to casting direct parameters over and over. I'm >> wondering if there's a way to eliminate this, somehow, without having to >> do an explicit cast. > >> Imagine you have a simple aggregate: > >> CREATE AGGREGATE tdigest_percentile(double precision, int, double precision[]) >> ( >> ... >> ); > >> with two direct parameters (actually, I'm not sure that's the correct >> term, becuse this is not an ordered-set aggregate and [1] only talks >> about direct parameters in that context). Anyway, I'm talking about the >> extra parameters, after the 'double precision' value to aggregate. > >But you're not telling the system that those are direct parameters, >at least not if you mean that they can only legitimately have one value >across the whole query. As-is, they're just more aggregated arguments >so we have to evaluate them again at each row. > Understood. >It's fairly messy that the SQL spec ties direct arguments to ordered-set >aggregates; you'd think there'd be some value in treating those features >as orthogonal. I'm not sure how we could wedge them into the syntax >otherwise, though :-(. You could perhaps convert your aggregate to >an ordered-set aggregate, but then you'd be paying for a sort that >you don't need, IIUC. > Yeah, having to do the sort (and keep all the data) is exactly what the tdigest is meant to eliminate, so making it an ordered-set aggregate is exactly the thing I don't want to do. Also, it disables parallel query, which is another reason not to do that. >> After a while, I've realized that the issue is casting - the CTE >> produces numeric[] array, and we do the cast to double precision[] on >> every call to the state transition function (and we do ~10M of those). > >The only reason that the CTE reference is cheap is that we understand >that it's stable so we don't have to recompute it each time; otherwise >you'd be moaning about that more than the cast. As you say, the short >term workaround is to do the casting inside the sub-select. I think the >long term fix is to generically avoid re-computing stable subexpressions. >There was a patch for that a year or so ago but the author never finished >it, AFAIR. > Hmmm, yeah. I'll dig throgh the archives, although it's not a very high priority - it's more a thing that surprised/bugged me while working on the custom aggregate. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services