Re: Combining Aggregates

From Robert Haas
Subject Re: Combining Aggregates
Date
Msg-id CA+TgmobRsLKbVu7Mh6VrPLYO+mV5jfsizBrp_eHqNNb6110_fg@mail.gmail.com
In response to Re: Combining Aggregates  (David Rowley <david.rowley@2ndquadrant.com>)
Responses Re: Combining Aggregates  (David Rowley <david.rowley@2ndquadrant.com>)
List pgsql-hackers
On Sun, Jan 17, 2016 at 9:26 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
> hmm, so wouldn't that mean that the transition function would need to (for
> each input tuple):
>
> 1. Parse that StringInfo into tokens.
> 2. Create a new aggregate state object.
> 3. Populate the new aggregate state based on the tokenised StringInfo, this
> would perhaps require that various *_in() functions are called on each
> token.
> 4. Add the new tuple to the aggregate state.
> 5. Build a new StringInfo based on the aggregate state modified in 4.
>
> ?

I don't really know what you mean by parse the StringInfo into tokens.
The whole point of the expanded-object interface is to be able to keep
things in an expanded internal form so that you *don't* have to
repeatedly construct and deconstruct internal data structures.  I
worked up an example of this approach using string_agg(), which I
attach here.  This changes the transition type of string_agg() from
internal to text.  The same code would work for bytea_string_agg(),
which would allow removal of some other code, but this patch doesn't
do that, because the point of this is to elucidate the approach.
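
To make the contrast with the five quoted steps concrete, here is a toy
model in Python. It is only a sketch under stated assumptions: it is not
the attached patch and not PostgreSQL code, and ExpandedText,
agg_rebuild_every_row() and agg_keep_expanded() are hypothetical names
invented for illustration. The first function rebuilds and re-flattens
the state for every input row; the second keeps one expanded object alive
across calls and flattens it once at the end.

```python
from dataclasses import dataclass, field


@dataclass
class ExpandedText:
    """Hypothetical expanded form: a growable list of string pieces."""
    pieces: list = field(default_factory=list)

    def append(self, value: str, delimiter: str) -> None:
        # Add the delimiter only between values, as string_agg() does.
        if self.pieces:
            self.pieces.append(delimiter)
        self.pieces.append(value)

    def flatten(self) -> str:
        # Build the flat (on-disk style) representation.
        return "".join(self.pieces)


def agg_rebuild_every_row(values, delimiter):
    """Model of the quoted steps: expand, modify, flatten for each row."""
    flat = None
    flattens = 0
    for v in values:
        # Steps 1-3: rebuild an in-memory state from the flat form.
        state = ExpandedText() if flat is None else ExpandedText(pieces=[flat])
        state.append(v, delimiter)   # step 4: add the new value
        flat = state.flatten()       # step 5: serialize again
        flattens += 1
    return flat, flattens


def agg_keep_expanded(values, delimiter):
    """Model of the expanded-object idea: one state object, one flatten."""
    state = None
    for v in values:
        if state is None:
            state = ExpandedText()
        state.append(v, delimiter)   # modified in place, never re-parsed
    if state is None:
        return None, 0
    return state.flatten(), 1


if __name__ == "__main__":
    print(agg_rebuild_every_row(["a", "b", "c"], ","))
    print(agg_keep_expanded(["a", "b", "c"], ","))
```

Both functions produce the same aggregate result; the difference is how
often the flat form is built. In this model the per-row variant flattens
once per input value, while the expanded variant flattens exactly once.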

In my tests, this seems to be slightly slower than what we're doing
today; worst of all, it adds a handful of cycles to
advance_transition_function() even when the transition value is not an
expanded object or, indeed, not even pass-by-reference.  Some of this
could probably be fixed with a little massaging: in particular,
DatumIsReadWriteExpandedObject() seems like it could be partly or
entirely inlined, and maybe there's some other way to improve the
coding here.

Generally, I think finding a way to pass expanded objects through
nodeAgg.c would be a good thing to pursue, if we can make it work.
The immediate impetus for changing things this way would be that we
wouldn't need to add a mechanism for serializing and deserializing
internal transition states just to pass around partial aggregates.  But
there's another advantage, too: right now,
advance_transition_function() does a lot of data copying to move data
from per-call context to the per-aggregate context.  When an expanded
object is in use, this can be skipped.
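
The copy-skipping point can be illustrated with another toy model in
Python. Again, this is a sketch under assumptions and not the real
nodeAgg.c logic: ExpandedObject, advance() and run() are hypothetical
names, and the "copy" is modeled only by counting bytes. The assumption
is that a flat result built in short-lived per-call memory has to be
copied into the long-lived aggregate context on every call, whereas an
expanded object that already lives there can simply be kept.

```python
class ExpandedObject:
    """Hypothetical stand-in for a read-write expanded object that is
    assumed to live in the long-lived per-aggregate context already."""

    def __init__(self):
        self.pieces = []


def concat_flat(state, value):
    # Returns a brand-new flat value on every call.
    return (state or b"") + value


def concat_expanded(state, value):
    # Mutates the same expanded object in place and returns it.
    if state is None:
        state = ExpandedObject()
    state.pieces.append(value)
    return state


def advance(state, value, transfn, stats):
    """Toy model of advancing one aggregate by one input value."""
    new_state = transfn(state, value)
    if isinstance(new_state, ExpandedObject):
        # Assumed fast path: nothing to copy, keep the same object.
        return new_state
    # Flat result: model the copy into the aggregate context by its size.
    stats["bytes_copied"] += len(new_state)
    return new_state


def run(values, transfn):
    """Feed all values through advance() and report bytes copied."""
    stats = {"bytes_copied": 0}
    state = None
    for v in values:
        state = advance(state, v, transfn, stats)
    return state, stats["bytes_copied"]


if __name__ == "__main__":
    vals = [b"ab", b"cd", b"ef"]
    print(run(vals, concat_flat)[1])      # bytes copied for flat states
    print(run(vals, concat_expanded)[1])  # bytes copied for expanded state
```

In this model the flat variant copies the whole accumulated value on each
call, so the copied volume grows quadratically with the input, while the
expanded variant copies nothing until the final result is produced.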

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
