Re: Combining Aggregates

Поиск

Список

Период

Сортировка

От	David Rowley
Тема	Re: Combining Aggregates
Дата	21 декабря 2015 г. 09:02:35
Msg-id	CAKJS1f9hGtS0c1Zc_uGDvNARDXbC4Siie873BJ74Bhu1-YxEtQ@mail.gmail.com обсуждение исходный текст
Ответ на	Re: Combining Aggregates (David Rowley <david.rowley@2ndquadrant.com>)
Ответы	Re: Combining Aggregates Re: Combining Aggregates Re: Combining Aggregates
Список	pgsql-hackers

Дерево обсуждения

On 18 December 2015 at 01:28, David Rowley <david.rowley@2ndquadrant.com> wrote:

# select sum(x::numeric) from generate_series(1,3) x(x);
ERROR: invalid memory alloc request size 18446744072025250716

The reason that this happens is down to the executor always thinking that Aggref returns the aggtype, but in cases where we're not finalizing the aggregate the executor needs to know that we're actually returning aggtranstype instead. Hash aggregates appear to work as the trans value is just stuffed into a hash table, but with plain and sorted aggregates this gets formed into a Tuple again, and forming a tuple naturally requires the types to be set correctly! I'm not sure exactly what the best way to fix this will be. I've hacked something up in the attached test patch which gets around the problem by adding a new aggtranstype to Aggref and also an 'aggskipfinalize' field which I manually set to true in a bit of a hacky way inside the grouping planner. Then in exprType() for Aggref I conditionally return the aggtype or aggtranstype based on the aggskipfinalize setting. This is most likely not the way to properly fix this, but I'm running out of steam today to think of how it should be done, so I'm currently very open to ideas on this.

Ok, so it seems that my mindset was not in parallel process space when I was thinking about this. I think having the pointer in the Tuple is probably going to be fine for this multiple stage aggregation when that's occurring in a single backend process, but obviously if the memory that the pointer points to belongs to a worker process in a parallel aggregate situation, then bad things will happen.

Now, there has been talk of this previously, on various threads, but I don't believe any final decisions were made on how exactly it should be done. At the moment I plan to make changes as follows:

Add 3 new columns to pg_aggregate, aggserialfn, aggdeserialfn and aggserialtype These will only be required when aggtranstype is INTERNAL. Perhaps we should disallow CREATE AGGREAGET from accepting them for any other type... The return type of aggserialfn should be aggserialtype, and it should take a single parameter of aggtranstype. aggdeserialfn will be the reverse of that.
Add a new bool field to nodeAgg's state named serialStates. If this is field is set to true then when we're in finalizeAgg = false mode, we'll call the serialfn on the agg state instead of the finalfn. This will allow the serialized state to be stored in the tuple and sent off to the main backend. The combine agg node should also be set to serialStates = true, so that it knows to deserialize instead of just assuming that the agg state is of type aggtranstype.

I believe this should allow us to not cause any performance regressions by moving away from INTERNAL agg states. It should also be very efficient for internal states such as Int8TransTypeData, as this struct merely has 2 int64 fields which should be very simple to stuff into a bytea varlena type. We don't need to mess around converting the ->count and ->sum into a strings or anything.

Then in order for the planner to allow parallel aggregation all aggregates must:

Not have a DISTINCT or ORDER BY clause
Have a combinefn
If aggtranstype = INTERNAL, must have a aggserialfn and aggdeserialfn.

We can relax the requirement on 3 if we're using 2-stage aggregation, but not parallel aggregation.

Any objections, or better ideas?

That just leaves me to figure out how to set the correct Aggref->aggtype during planning, as now there's 3 possible types:

if (finalizeAggs == false)

{

if (serialStates == true)

type = aggserialtype;

else

type = aggtranstype;

}

else

type = prorettype; /* normal case */

David Rowley http://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Training & Services

В списке pgsql-hackers по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Combining Aggregates