Re: Fix BUG #17335: Duplicate result rows in Gather node

Поиск
Список
Период
Сортировка
От Yura Sokolov
Тема Re: Fix BUG #17335: Duplicate result rows in Gather node
Дата
Msg-id 382c3b52071a087fd4357e22959b3ddb05e524b7.camel@postgrespro.ru
обсуждение исходный текст
Ответ на Re: Fix BUG #17335: Duplicate result rows in Gather node  (David Rowley <dgrowleyml@gmail.com>)
Ответы Re: Fix BUG #17335: Duplicate result rows in Gather node  (Yura Sokolov <y.sokolov@postgrespro.ru>)
Список pgsql-hackers
В Чт, 20/01/2022 в 09:32 +1300, David Rowley пишет:
> On Fri, 31 Dec 2021 at 00:14, Yura Sokolov <y.sokolov@postgrespro.ru> wrote:
> > Suggested quick (and valid) fix in the patch attached:
> > - If Append has single child, then copy its parallel awareness.
> 
> I've been looking at this and I've gone through changing my mind about
> what's the right fix quite a number of times.
> 
> My current thoughts are that I don't really like the fact that we can
> have plans in the following shape:
> 
>  Finalize Aggregate
>    ->  Gather
>          Workers Planned: 1
>          ->  Partial Aggregate
>                ->  Parallel Hash Left Join
>                      Hash Cond: (gather_append_1.fk = gather_append_2.fk)
>                      ->  Index Scan using gather_append_1_ix on gather_append_1
>                            Index Cond: (f = true)
>                      ->  Parallel Hash
>                            ->  Parallel Seq Scan on gather_append_2
> 
> It's only made safe by the fact that Gather will only use 1 worker.
> To me, it just seems too fragile to assume that's always going to be
> the case. I feel like this fix just relies on the fact that
> create_gather_path() and create_gather_merge_path() do
> "pathnode->num_workers = subpath->parallel_workers;". If someone
> decided that was to work a different way, then we risk this breaking
> again. Additionally, today we have Gather and GatherMerge, but we may
> one day end up with more node types that gather results from parallel
> workers, or even a completely different way of executing plans.

It seems strange parallel_aware and parallel_safe flags neither affect
execution nor are properly checked.

Except parallel_safe is checked in ExecSerializePlan which is called from
ExecInitParallelPlan, which is called from ExecGather and ExecGatherMerge.
But looks like this check doesn't affect execution as well.

> 
> I think a safer way to fix this is to just not remove the
> Append/MergeAppend node if the parallel_aware flag of the only-child
> and the Append/MergeAppend don't match. I've done that in the
> attached.
> 
> I believe the code at the end of add_paths_to_append_rel() can remain as is.

I found clean_up_removed_plan_level also called from set_subqueryscan_references.
Is there a need to patch there as well?

And there is strange state:
- in the loop by subpaths, pathnode->node.parallel_safe is set to AND of
  all its subpath's parallel_safe
  (therefore there were need to copy it in my patch version),
- that means, our AppendPath is parallel_aware but not parallel_safe.
It is ridiculous a bit.

And it is strange AppendPath could have more parallel_workers than sum of
its children parallel_workers.

So it looks like whole machinery around parallel_aware/parallel_safe has
no enough consistency.

Either way, I attach you version of fix with my tests as new patch version.

regards,
Yura Sokolov

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Trevor Gross
Дата:
Сообщение: Re: WIP: System Versioned Temporal Table
Следующее
От: Marcos Pegoraro
Дата:
Сообщение: current_schema will not use an text index ?