Re: Oddity with parallel safety test for scan/join target in grouping_planner

Поиск
Список
Период
Сортировка
От Etsuro Fujita
Тема Re: Oddity with parallel safety test for scan/join target in grouping_planner
Дата
Msg-id 5C85DC65.2050409@lab.ntt.co.jp
обсуждение исходный текст
Ответ на Re: Oddity with parallel safety test for scan/join target in grouping_planner  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: Oddity with parallel safety test for scan/join target in grouping_planner
Список pgsql-hackers
(2019/03/09 5:36), Tom Lane wrote:
> Etsuro Fujita<fujita.etsuro@lab.ntt.co.jp>  writes:
>> (2019/02/28 0:52), Robert Haas wrote:
>>> On Tue, Feb 26, 2019 at 7:26 AM Etsuro Fujita
>>> <fujita.etsuro@lab.ntt.co.jp>   wrote:
>>>> The parallel safety of the final scan/join target is determined from the
>>>> grouping target, not that target, which [ is wrong ]
>
>>> Your patch looks right to me.
>
>> I think it would be better for you to take this one.  Could you?
>
> I concur with Robert that this is a brown-paper-bag-grade bug in
> 960df2a97.  Please feel free to push (and don't forget to back-patch).

OK, will do.

> The only really interesting question is whether it's worth adding
> a regression test for.  I doubt it; the issue seems much too narrow.
> Usually the point of a regression test is to prevent re-introduction
> of the same/similar bug, but what class of bugs would you argue
> we'd be protecting against?

Plan degradation; without the fix, we would have this on data created by 
"pgbench -i -s 10 postgres", as shown in [1]:

postgres=# set parallel_setup_cost = 10;
postgres=# set parallel_tuple_cost = 0.001;

postgres=# explain verbose select aid+bid, sum(abalance), random() from 
pgbench_accounts group by aid+bid;
                                                  QUERY PLAN

------------------------------------------------------------------------------------------------------------
  GroupAggregate  (cost=137403.01..159903.01 rows=1000000 width=20)
    Output: ((aid + bid)), sum(abalance), random()
    Group Key: ((pgbench_accounts.aid + pgbench_accounts.bid))
    ->  Sort  (cost=137403.01..139903.01 rows=1000000 width=8)
          Output: ((aid + bid)), abalance
          Sort Key: ((pgbench_accounts.aid + pgbench_accounts.bid))
          ->  Gather  (cost=10.00..24070.67 rows=1000000 width=8)
-->            Output: (aid + bid), abalance
                Workers Planned: 2
                ->  Parallel Seq Scan on public.pgbench_accounts 
(cost=0.00..20560.67 rows=416667 width=12)
                      Output: aid, bid, abalance
(11 rows)

The final scan/join target {(aid + bid), abalance} would be parallel 
safe, but in the plan the target is not parallelized across workers. 
The reason for that is because the parallel-safety of the target is 
assessed incorrectly using the grouping target {((aid + bid)), 
sum(abalance), random()}, which would not be parallel safe.  By the fix 
we would have this:

postgres=# explain verbose select aid+bid, sum(abalance), random() from 
pgbench_accounts group by aid+bid;
                                                 QUERY PLAN

-----------------------------------------------------------------------------------------------------------
  GroupAggregate  (cost=135944.68..158444.68 rows=1000000 width=20)
    Output: ((aid + bid)), sum(abalance), random()
    Group Key: ((pgbench_accounts.aid + pgbench_accounts.bid))
    ->  Sort  (cost=135944.68..138444.68 rows=1000000 width=8)
          Output: ((aid + bid)), abalance
          Sort Key: ((pgbench_accounts.aid + pgbench_accounts.bid))
          ->  Gather  (cost=10.00..22612.33 rows=1000000 width=8)
                Output: ((aid + bid)), abalance
                Workers Planned: 2
                ->  Parallel Seq Scan on public.pgbench_accounts 
(cost=0.00..21602.33 rows=416667 width=8)
-->                  Output: (aid + bid), abalance
(11 rows)

Note that the final scan/join target is parallelized in the plan.

This would only affect plan quality a little bit, so I don't think we 
need a regression test for this, either, but the fix might destabilize 
existing plan choices, so we should apply it to HEAD only?

Best regards,
Etsuro Fujita



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Kapila
Дата:
Сообщение: Re: Portability of strtod (was Re: pgsql: Include GUC's unit, if ithas one, in out-of-range error message)
Следующее
От: Amit Langote
Дата:
Сообщение: Re: Should we add GUCs to allow partition pruning to be disabled?