Обсуждение: array_accum aggregate

Поиск
Список
Период
Сортировка

array_accum aggregate

От
Stephen Frost
Дата:
Greetings,

  The array_accum example aggregate in the user documentation works
  reasonably on small data sets but doesn't work too hot on large ones.
  http://www.postgresql.org/docs/8.1/static/xaggr.html

  Normally I wouldn't care particularly much but it turns out that PL/R
  uses arrays for quite a bit (eg: histograms and other statistics
  functions).  I've also heard other complaints about the performance of
  arrays, though I'm not sure if those were due to array_accum or
  something else.

  Long story short, I set out to build a faster array_accum.  Much to my
  suprise and delight, we already *had* one.  accumArrayResult() and
  makeArrayResult()/construct_md_array() appear to do a fantastic job.
  I've created a couple of 'glue' functions to expose these functions so
  they can be used in an aggregate.  I'm sure they could be improved
  upon and possibly made even smaller than they already are (90 lines
  total for both) but I'd like to throw out the idea of including them
  in core.  The aggregate created with them could also be considered for
  inclusion though I'm less concerned with that.  I don't expect general
  PostgreSQL users would have trouble creating the aggregate- I don't
  know that the average user would be able or willing to write the C
  functions.

  For comparison, the new functions run with:
  time psql -c "select aaccum(generate_series) from generate_series(1,1000000);" > /dev/null
      4.24s real     0.34s user     0.06s system

  Compared to:
  time psql -c "select array_accum(generate_series) from generate_series(1,1000000);" > /dev/null
  ...

  Well, it's still running and it's been over an hour.

  The main differences, as I see it, are: accumArrayResult() works in
  chunks of 64 elements, and uses repalloc().  array_accum uses
  array_set() which works on individual elements and uses
  palloc()/memcpy().  I appriciate that this is done because for most
  cases of array_set() it's not acceptable to modify the input and am
  not suggesting that be changed.  An alternative might be to modify
  array_set() to check if it is in an aggregate and change its behavior
  but adding the seperate functions seemed cleaner and much less
  intrusive to me.

  Please find the functions attached.

          Thanks,

            Stephen

Вложения

Re: array_accum aggregate

От
Tom Lane
Дата:
Stephen Frost <sfrost@snowman.net> writes:
>   Long story short, I set out to build a faster array_accum.  Much to my
>   suprise and delight, we already *had* one.  accumArrayResult() and
>   makeArrayResult()/construct_md_array() appear to do a fantastic job.
>   I've created a couple of 'glue' functions to expose these functions so
>   they can be used in an aggregate.  I'm sure they could be improved
>   upon and possibly made even smaller than they already are (90 lines
>   total for both) but I'd like to throw out the idea of including them
>   in core.  The aggregate created with them could also be considered for
>   inclusion though I'm less concerned with that.

Since you've set up the functions to only be usable inside an aggregate,
I don't see much of a reason why we wouldn't provide the aggregate too.
It looks like it should work to have just one polymorphic aggregate
definition, eg, array_accum(anyelement) returns anyarray.

As far as coding style goes, you're supposed to use makeArrayResult()
with accumArrayResult(), not call construct_md_array() directly.  And
copying the ArrayBuildState around like that is just plain bizarre...

Does the thing work without memory leakage (for a pass-by-ref datatype)
in a GROUP BY situation?
        regards, tom lane


Re: array_accum aggregate

От
Stephen Frost
Дата:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> >   Long story short, I set out to build a faster array_accum.  Much to my
> >   suprise and delight, we already *had* one.  accumArrayResult() and
> >   makeArrayResult()/construct_md_array() appear to do a fantastic job.
> >   I've created a couple of 'glue' functions to expose these functions so
> >   they can be used in an aggregate.  I'm sure they could be improved
> >   upon and possibly made even smaller than they already are (90 lines
> >   total for both) but I'd like to throw out the idea of including them
> >   in core.  The aggregate created with them could also be considered for
> >   inclusion though I'm less concerned with that.
>
> Since you've set up the functions to only be usable inside an aggregate,
> I don't see much of a reason why we wouldn't provide the aggregate too.

Sure, I don't see a reason these functions would be useful outside of an
aggregate in the form they need to be in for the aggregate, either..

> It looks like it should work to have just one polymorphic aggregate
> definition, eg, array_accum(anyelement) returns anyarray.

I was hoping to do that, but since it's an aggregate the ffunc format is
pre-defined to require accepting the 'internal state' and nothing else,
and to return 'anyelement' or 'anyarray' one of the inputs must be an
'anyelement' or 'anyarray', aiui.  That also implied to me that the type
passed in was expected to be the type passed out, which wouldn't
necessairly be the case here as the internal state variable is a bytea.
It might be possible to do away with the internal state variable
entirely though and make it an anyelement instead, if we can find
somewhere else to put the pointer to the ArrayBuildState.

> As far as coding style goes, you're supposed to use makeArrayResult()
> with accumArrayResult(), not call construct_md_array() directly.  And
> copying the ArrayBuildState around like that is just plain bizarre...

I had attempted to use makeArrayResult() originally and ran into some
trouble with the MemoryContextDelete() which is done during it.  The
context used was the AggState context and therefore was deleted
elsewhere.  That might have been a misunderstanding on my part though
since I recall fixing at least one or two bugs afterwards, so it may be
possible to go back and change to using makeArrayResult().  That'd
certainly make the ffunc smaller.

As for ArrayBuildState, I'm not actually copying the structure around,
just the pointer is being copied around inside of the state transistion
variable (which is a bytea).  I'm not sure where else to store the
pointer to the ArrayBuildState though, since multiple could be in play
at a given time during an aggregation, aiui.

> Does the thing work without memory leakage (for a pass-by-ref datatype)
> in a GROUP BY situation?

I expected that using the AggState context would handle free'ing the
memory at the end of the aggregation as I believe that's the context
used for the state variable itself as well.  Honestly, I wasn't entirely
sure how to create my own context or if that was the correct thing to do
in this case.  I'll see about changing it around to do that if it's
acceptable to have a context created for each instance of a group-by
aggregate, and I can figure out how. :)

I'm not sure about memory leakage during a Sort+GroupAgg..  I don't know
how that's done currently either, honestly.  It seems like memory could
be free'd during that once the ffunc is called, and an extra memory
context could do that, is that what you're referring to?
Thanks!
    Stephen

Re: array_accum aggregate

От
Stephen Frost
Дата:
* Stephen Frost (sfrost@snowman.net) wrote:
>   For comparison, the new functions run with:
>   time psql -c "select aaccum(generate_series) from generate_series(1,1000000);" > /dev/null
>       4.24s real     0.34s user     0.06s system
>
>   Compared to:
>   time psql -c "select array_accum(generate_series) from generate_series(1,1000000);" > /dev/null
>   ...
>
>   Well, it's still running and it's been over an hour.

Just to follow-up on this, the result was:
7601.36s real     0.36s user     0.02s system

Or about 2 hours.
    Enjoy,
    Stephen

Re: array_accum aggregate

От
Tom Lane
Дата:
Stephen Frost <sfrost@snowman.net> writes:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> It looks like it should work to have just one polymorphic aggregate
>> definition, eg, array_accum(anyelement) returns anyarray.

> I was hoping to do that, but since it's an aggregate the ffunc format is
> pre-defined to require accepting the 'internal state' and nothing else,
> and to return 'anyelement' or 'anyarray' one of the inputs must be an
> 'anyelement' or 'anyarray', aiui.

Hmm ... I hadn't been thinking about what the state type would need to
be, but certainly "bytea" is a lie given what you're really doing.
We've run into this same problem in contrib/intagg: sometimes you'd like
to use a state data structure that isn't any regular SQL datatype, and
in particular isn't just a single blob of memory.  That's a problem from
nodeAgg's point of view because it expects to be responsible for copying
the state value from one context to another.  Don't have an immediate
idea for a solution ...
        regards, tom lane


Re: array_accum aggregate

От
"Florian G. Pflug"
Дата:
Tom Lane wrote:
> Stephen Frost <sfrost@snowman.net> writes:
>> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>>> It looks like it should work to have just one polymorphic aggregate
>>> definition, eg, array_accum(anyelement) returns anyarray.
> 
>> I was hoping to do that, but since it's an aggregate the ffunc format is
>> pre-defined to require accepting the 'internal state' and nothing else,
>> and to return 'anyelement' or 'anyarray' one of the inputs must be an
>> 'anyelement' or 'anyarray', aiui.
> 
> Hmm ... I hadn't been thinking about what the state type would need to
> be, but certainly "bytea" is a lie given what you're really doing.
> We've run into this same problem in contrib/intagg: sometimes you'd like
> to use a state data structure that isn't any regular SQL datatype, and
> in particular isn't just a single blob of memory.  That's a problem from
> nodeAgg's point of view because it expects to be responsible for copying
> the state value from one context to another.  Don't have an immediate
> idea for a solution ...

I used an int8 as state type for an implementation of a kind of
array_accum_unique aggregate. I know that it's a ugly hack, but
it has been running on a production machine for months now, without
a problem...

The memory is alloc'd from the aggcontext, btw.

Note that I only use this aggregate in one particular query -
so there might be problems with my approach that just don't
manifest in my particular situation. For example, the aggregate
is used only on a table that is never updated, and it is only
used in select queries. So there might be problems if the executor
decides that it has to restart a query...

greetings, Florian Pflug



Re: array_accum aggregate

От
Stephen Frost
Дата:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > I was hoping to do that, but since it's an aggregate the ffunc format is
> > pre-defined to require accepting the 'internal state' and nothing else,
> > and to return 'anyelement' or 'anyarray' one of the inputs must be an
> > 'anyelement' or 'anyarray', aiui.
>
> Hmm ... I hadn't been thinking about what the state type would need to
> be, but certainly "bytea" is a lie given what you're really doing.

Indeed.  I've updated the functions quite a bit to clean things up,
including: Added many more comments, removed the unnecessary 'storage*'
pointer being used, created my own structure for tracking state
information, created a seperate memory context (tied to the AggContext),
correctly handle NULL values, and changed the ffunc to use
makeArrayResult.

I also tried just tried using polymorphic types for the functions and
for the aggregate and it appeared to just work:

    create function aaccum_sfunc (anyarray, anyelement) returns anyarray
        language 'C' AS 'aaccum.so', 'aaccum_sfunc'
    ;
    create function aaccum_ffunc (anyarray) returns anyarray language
        'C' AS '/data/sfrost/postgres/arrays/aaccum.so', 'aaccum_ffunc'
    ;
    create aggregate aaccum (
        sfunc = aaccum_sfunc,
        basetype = anyelement,
        stype = anyarray,
        finalfunc = aaccum_ffunc
    );

    select aaccum(generate_series) from generate_series(1,5);
       aaccum
    -------------
     {1,2,3,4,5}
    (1 row)

    (test is a table with one varchar column, abc)
    select aaccum(abc) from test;
     aaccum
    ---------
     {a,b,c}
    (1 row)

    (Added a column called 'hi', set to 'a', added b,b and c,b)
    select hi,aaccum(abc) from test group by hi;
     hi | aaccum
    ----+---------
     b  | {b,c}
     a  | {a,b,c}
    (2 rows)

It makes some sense that it would work as an 'anyarray' is just a
variable-length type internally and so long as nothing else attempts to
make sense out of our 'fake array' everything should be fine.

The latest version also appears to be a bit faster than the prior
version.  I'm going to be running a very large query shortly using
this aaccum and will report back how it goes.  Please let me know if
there are any other improvments or changes I should make.  I'd like to
submit this to -patches w/ the appropriate entries to have it be
included in the core distribution.  Is it acceptable to reuse the
'array_accum' name even though it was used in the documentation as an
example?  I'm thinking yes, but wanted to check.

    Thanks!

        Stephen

Вложения

Re: array_accum aggregate

От
Stephen Frost
Дата:
* Stephen Frost (sfrost@snowman.net) wrote:
> I'm going to be running a very large query shortly using
> this aaccum and will report back how it goes.

It went *very* well, actually much better than I had originally
expected.  This query used to take over 12 hours to complete (about 11
of which was after the main table involved was sorted).  With this new
aaccum in place the whole query only took about an hour, most of which
was the sort and join required by the query.  The aggregation (aaccum)
and r_hist() (R histogram function generating PNGs) took only a few
minutes.
Thanks!
    Stephen

Re: array_accum aggregate

От
"Merlin Moncure"
Дата:
On 10/10/06, Stephen Frost <sfrost@snowman.net> wrote:
> * Stephen Frost (sfrost@snowman.net) wrote:
> > I'm going to be running a very large query shortly using
> > this aaccum and will report back how it goes.
>
> It went *very* well, actually much better than I had originally
> expected.  This query used to take over 12 hours to complete (about 11
> of which was after the main table involved was sorted).  With this new
> aaccum in place the whole query only took about an hour, most of which
> was the sort and join required by the query.  The aggregation (aaccum)
> and r_hist() (R histogram function generating PNGs) took only a few
> minutes.

very cool, and definately useful imo.  i use array_accum all the time,
and have not always been happy with its performance.  if your stuff
passes muster, would be nice to see it move into core :-).

merlin