Re: Parallel Aggregate

Поиск

Список

Период

Сортировка

От	Tomas Vondra
Тема	Re: Parallel Aggregate
Дата	15 марта 2016 г. 01:39:44
Msg-id	731afd1d-ebaf-f5d9-5f80-20f652346834@2ndquadrant.com обсуждение исходный текст
Ответ на	Re: Parallel Aggregate (David Rowley <david.rowley@2ndquadrant.com>)
Ответы	Re: Parallel Aggregate
Список	pgsql-hackers

Дерево обсуждения

Hi,

On 03/13/2016 10:44 PM, David Rowley wrote:
> On 12 March 2016 at 16:31, David Rowley <david.rowley@2ndquadrant.com> wrote:
>> I've attached an updated patch which is based on commit 7087166,
>> things are really changing fast in the grouping path area at the
>> moment, but hopefully the dust is starting to settle now.
>
> The attached patch fixes a harmless compiler warning about a
> possible uninitialised variable.

I've looked at this patch today. The patch seems quite solid, but I do 
have a few minor comments (or perhaps questions, given that this is the 
first time I looked at the patch).

1) exprCollation contains this bit:
-----------------------------------
    case T_PartialAggref:        coll = InvalidOid; /* XXX is this correct? */        break;

I doubt this is the right thing to do. Can we actually get to this piece 
of code? I haven't tried too hard, but regression tests don't seem to 
trigger this piece of code.

Moreover, if we're handling PartialAggref in exprCollation(), maybe we 
should handle it also in exprInputCollation and exprSetCollation?

And if we really need the collation there, why not to fetch the actual 
collation from the nested Aggref? Why should it be InvalidOid?

2) partial_aggregate_walker
---------------------------

I think this should follow the naming convention that clearly identifies 
the purpose of the walker, not what kind of nodes it is supposed to 
walk. So it should be:
    aggregates_allow_partial_walker

3) create_parallelagg_path
--------------------------

I do agree with the logic that under-estimates are more dangerous than 
over-estimates, so the current estimate is safer. But I think this would 
be a good place to apply the formula I proposed a few days ago (or 
rather the one Dean Rasheed proposed in response).

That is, we do know that there are numGroups in total, and each parallel 
worker sees subpath->rows then it's expected to see
    sel = (subpath->rows / rel->tuples);    perGroup = (rel->tuples / numGroups);    workerGroups = numGroups * (1 -
powl(1- s, perGroup));    numPartialGroups = numWorkers * workerGroups

It's probably better to see Dean's message from 13/3.

4) Is clauses.h the right place for PartialAggType?

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

В списке pgsql-hackers по дате отправления:

Предыдущее

От: Robert Haas
Дата: 15 марта 2016 г., 01:32:16
Сообщение: Re: Parallel Aggregate

Следующее

От: Peter Geoghegan
Дата: 15 марта 2016 г., 01:50:45
Сообщение: Re: Fix for OpenSSL error queue bug

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Parallel Aggregate

Предыдущее

Следующее