Discussion: Fix BUG #17335: Duplicate result rows in Gather node


Fix BUG #17335: Duplicate result rows in Gather node

From
Yura Sokolov
Date:
Good day, hackers.

Problem:
- An Append path is created with parallel_aware = true explicitly.
- It has two children; one is trivial, the other has parallel_aware = false.
  The trivial child is dropped.
- A Gather/GatherMerge path takes the Append path as its child and thinks
  the child is parallel_aware = true.
- But the Append path is removed at the last stage, since it has only one child.
- Now Gather/GatherMerge thinks its child is parallel_aware, but it
  is not.
  Gather/GatherMerge runs its child twice, in a worker and in the leader,
  and gathers the same rows twice.

Reproduction code is attached (repro.sql; it is also included as a test).

A quick (and valid) fix is suggested in the attached patch:
- If the Append has a single child, copy that child's parallel awareness.
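
The fix can be illustrated with a minimal sketch.  The types below are
hypothetical, simplified stand-ins for the planner's Path and AppendPath
structures, not the real PostgreSQL definitions; they model only the three
parallel-related fields being copied.  In the actual patch the equivalent
logic would live in create_[merge]append_path.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical, simplified stand-ins for PostgreSQL's Path/AppendPath
 * structures -- illustration only, not the real planner types.
 */
typedef struct Path
{
    bool parallel_aware;
    bool parallel_safe;
    int  parallel_workers;
} Path;

typedef struct AppendPath
{
    Path  path;          /* the Append's own Path fields */
    Path *subpaths[8];   /* surviving children */
    int   nsubpaths;
} AppendPath;

/*
 * Sketch of the suggested quick fix: when the Append ends up with a
 * single child, inherit that child's parallel fields, so that a parent
 * Gather/GatherMerge does not mistakenly treat the surviving child as
 * parallel_aware after the Append itself is elided.
 */
static void
inherit_single_child_parallel_fields(AppendPath *apath)
{
    if (apath->nsubpaths == 1)
    {
        Path *child = apath->subpaths[0];

        apath->path.parallel_aware = child->parallel_aware;
        apath->path.parallel_safe = child->parallel_safe;
        apath->path.parallel_workers = child->parallel_workers;
    }
}
```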

The bug was introduced by commit 8edd0e79460b414b1d971895312e549e95e12e4f,
"Suppress Append and MergeAppend plan nodes that have a single child."

During the discussion, it was suggested [1] that those fields should be copied:

> I haven't looked into whether this does the right things for parallel
> planning --- possibly create_[merge]append_path need to propagate up
> parallel-related path fields from the single child?

But it was not so obvious [2].

A better fix could also remove the Gather/GatherMerge node when its child
is not parallel-aware.

The bug is reported at https://postgr.es/m/flat/17335-4dc92e1aea3a78af%40postgresql.org
Since there is no way to add a thread from pgsql-bugs to the commitfest, I am writing here.

[1] https://postgr.es/m/17500.1551669976%40sss.pgh.pa.us
[2] https://postgr.es/m/CAKJS1f_Wt_tL3S32R3wpU86zQjuHfbnZbFt0eqm%3DqcRFcdbLvw%40mail.gmail.com

--
regards
Yura Sokolov
y.sokolov@postgrespro.ru
funny.falcon@gmail.com

Attachments

Re: Fix BUG #17335: Duplicate result rows in Gather node

From
Dilip Kumar
Date:
On Thu, Dec 30, 2021 at 4:44 PM Yura Sokolov <y.sokolov@postgrespro.ru> wrote:
> Good day, hackers.
>
> Problem:
> - An Append path is created with parallel_aware = true explicitly.
> - It has two children; one is trivial, the other has parallel_aware = false.
>   The trivial child is dropped.
> - A Gather/GatherMerge path takes the Append path as its child and thinks
>   the child is parallel_aware = true.
> - But the Append path is removed at the last stage, since it has only one child.
> - Now Gather/GatherMerge thinks its child is parallel_aware, but it
>   is not.
>   Gather/GatherMerge runs its child twice, in a worker and in the leader,
>   and gathers the same rows twice.
>
> Reproduction code is attached (repro.sql; it is also included as a test).

Yeah, this is a problem.

> A quick (and valid) fix is suggested in the attached patch:
> - If the Append has a single child, copy that child's parallel awareness.
>
> The bug was introduced by commit 8edd0e79460b414b1d971895312e549e95e12e4f,
> "Suppress Append and MergeAppend plan nodes that have a single child."
>
> During the discussion, it was suggested [1] that those fields should be copied:
>
> > I haven't looked into whether this does the right things for parallel
> > planning --- possibly create_[merge]append_path need to propagate up
> > parallel-related path fields from the single child?
>
> But it was not so obvious [2].
>
> A better fix could also remove the Gather/GatherMerge node when its child
> is not parallel-aware.

The Gather path will only be created if we have an underlying partial path.
So I think that if we are generating the Append path only from non-partial
paths and the number of child nodes is just 1, we should not generate the
partial Append path; then the partial join will not be generated and the
Gather will eventually be avoided.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Re: Fix BUG #17335: Duplicate result rows in Gather node

From
David Rowley
Date:
On Fri, 31 Dec 2021 at 00:14, Yura Sokolov <y.sokolov@postgrespro.ru> wrote:
> Problem:
> - An Append path is created with parallel_aware = true explicitly.
> - It has two children; one is trivial, the other has parallel_aware = false.
>   The trivial child is dropped.
> - A Gather/GatherMerge path takes the Append path as its child and thinks
>   the child is parallel_aware = true.
> - But the Append path is removed at the last stage, since it has only one child.
> - Now Gather/GatherMerge thinks its child is parallel_aware, but it
>   is not.
>   Gather/GatherMerge runs its child twice, in a worker and in the leader,
>   and gathers the same rows twice.

Thanks for the report. I can confirm that I can recreate the problem
with your script.

I will look into this further later next week.

David



Re: Fix BUG #17335: Duplicate result rows in Gather node

From
Yura Sokolov
Date:
On Sat, 01/01/2022 at 15:19 +1300, David Rowley wrote:
> On Fri, 31 Dec 2021 at 00:14, Yura Sokolov <y.sokolov@postgrespro.ru> wrote:
> > Problem:
> > - An Append path is created with parallel_aware = true explicitly.
> > - It has two children; one is trivial, the other has parallel_aware = false.
> >   The trivial child is dropped.
> > - A Gather/GatherMerge path takes the Append path as its child and thinks
> >   the child is parallel_aware = true.
> > - But the Append path is removed at the last stage, since it has only one
> >   child.
> > - Now Gather/GatherMerge thinks its child is parallel_aware, but it
> >   is not.
> >   Gather/GatherMerge runs its child twice, in a worker and in the leader,
> >   and gathers the same rows twice.
> 
> Thanks for the report. I can confirm that I can recreate the problem
> with your script.
> 
> I will look into this further later next week.
> 

Good day, David.

Sorry to disturb you.
Is there any update on this?
Is there any chance this will be fixed in the next minor release?
Could this simple fix be merged before further improvements?

Yura.




Re: Fix BUG #17335: Duplicate result rows in Gather node

From
David Rowley
Date:
On Fri, 31 Dec 2021 at 00:14, Yura Sokolov <y.sokolov@postgrespro.ru> wrote:
> A quick (and valid) fix is suggested in the attached patch:
> - If the Append has a single child, copy that child's parallel awareness.

I've been looking at this and I've gone through changing my mind about
what's the right fix quite a number of times.

My current thoughts are that I don't really like the fact that we can
have plans in the following shape:

 Finalize Aggregate
   ->  Gather
         Workers Planned: 1
         ->  Partial Aggregate
               ->  Parallel Hash Left Join
                     Hash Cond: (gather_append_1.fk = gather_append_2.fk)
                     ->  Index Scan using gather_append_1_ix on gather_append_1
                           Index Cond: (f = true)
                     ->  Parallel Hash
                           ->  Parallel Seq Scan on gather_append_2

It's only made safe by the fact that Gather will only use 1 worker.
To me, it just seems too fragile to assume that's always going to be
the case. I feel like this fix just relies on the fact that
create_gather_path() and create_gather_merge_path() do
"pathnode->num_workers = subpath->parallel_workers;". If someone
decided that was to work a different way, then we risk this breaking
again. Additionally, today we have Gather and GatherMerge, but we may
one day end up with more node types that gather results from parallel
workers, or even a completely different way of executing plans.

I think a safer way to fix this is to just not remove the
Append/MergeAppend node if the parallel_aware flag of the only-child
and the Append/MergeAppend don't match. I've done that in the
attached.

I believe the code at the end of add_paths_to_append_rel() can remain as is.
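
The approach can be sketched as follows.  The types here are hypothetical
simplifications rather than the real Plan nodes, and the function name is
illustrative only; the actual patch would apply this test where single-child
Append/MergeAppend nodes are elided.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified plan node -- not the real PostgreSQL struct. */
typedef struct Plan
{
    bool parallel_aware;
} Plan;

/*
 * Sketch of the safer fix: only elide a single-child Append/MergeAppend
 * when the child's parallel_aware flag matches the Append's own.  If the
 * flags differ, keep the Append so that a parent Gather/GatherMerge's
 * assumptions about its subtree remain valid.  Returns the node that
 * should survive.
 */
static Plan *
elide_single_child_append(Plan *append, Plan *only_child)
{
    if (append->parallel_aware != only_child->parallel_aware)
        return append;          /* flags differ: keep the Append node */

    return only_child;          /* safe to remove the Append */
}
```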

David

Attachments

Re: Fix BUG #17335: Duplicate result rows in Gather node

From
Yura Sokolov
Date:
On Thu, 20/01/2022 at 09:32 +1300, David Rowley wrote:
> On Fri, 31 Dec 2021 at 00:14, Yura Sokolov <y.sokolov@postgrespro.ru> wrote:
> > A quick (and valid) fix is suggested in the attached patch:
> > - If the Append has a single child, copy that child's parallel awareness.
> 
> I've been looking at this and I've gone through changing my mind about
> what's the right fix quite a number of times.
> 
> My current thoughts are that I don't really like the fact that we can
> have plans in the following shape:
> 
>  Finalize Aggregate
>    ->  Gather
>          Workers Planned: 1
>          ->  Partial Aggregate
>                ->  Parallel Hash Left Join
>                      Hash Cond: (gather_append_1.fk = gather_append_2.fk)
>                      ->  Index Scan using gather_append_1_ix on gather_append_1
>                            Index Cond: (f = true)
>                      ->  Parallel Hash
>                            ->  Parallel Seq Scan on gather_append_2
> 
> It's only made safe by the fact that Gather will only use 1 worker.
> To me, it just seems too fragile to assume that's always going to be
> the case. I feel like this fix just relies on the fact that
> create_gather_path() and create_gather_merge_path() do
> "pathnode->num_workers = subpath->parallel_workers;". If someone
> decided that was to work a different way, then we risk this breaking
> again. Additionally, today we have Gather and GatherMerge, but we may
> one day end up with more node types that gather results from parallel
> workers, or even a completely different way of executing plans.

It seems strange that the parallel_aware and parallel_safe flags neither
affect execution nor are properly checked.

The exception is that parallel_safe is checked in ExecSerializePlan, which is
called from ExecInitParallelPlan, which is called from ExecGather and
ExecGatherMerge. But it looks like this check doesn't affect execution either.

> 
> I think a safer way to fix this is to just not remove the
> Append/MergeAppend node if the parallel_aware flag of the only-child
> and the Append/MergeAppend don't match. I've done that in the
> attached.
> 
> I believe the code at the end of add_paths_to_append_rel() can remain as is.

I found that clean_up_removed_plan_level is also called from
set_subqueryscan_references. Is there a need to patch it there as well?

And there is a strange state:
- in the loop over the subpaths, pathnode->node.parallel_safe is set to the
  AND of all the subpaths' parallel_safe flags
  (which is why my patch version needed to copy it),
- that means our AppendPath can be parallel_aware but not parallel_safe.
It is a bit ridiculous.

It is also strange that an AppendPath can have more parallel_workers than the
sum of its children's parallel_workers.

So it looks like the whole machinery around parallel_aware/parallel_safe is
not consistent enough.

Either way, I attach your version of the fix with my tests as a new patch version.

regards,
Yura Sokolov

Attachments

Re: Fix BUG #17335: Duplicate result rows in Gather node

From
Yura Sokolov
Date:
On Sun, 23/01/2022 at 14:56 +0300, Yura Sokolov wrote:
> On Thu, 20/01/2022 at 09:32 +1300, David Rowley wrote:
> > On Fri, 31 Dec 2021 at 00:14, Yura Sokolov <y.sokolov@postgrespro.ru> wrote:
> > > A quick (and valid) fix is suggested in the attached patch:
> > > - If the Append has a single child, copy that child's parallel awareness.
> > 
> > I've been looking at this and I've gone through changing my mind about
> > what's the right fix quite a number of times.
> > 
> > My current thoughts are that I don't really like the fact that we can
> > have plans in the following shape:
> > 
> >  Finalize Aggregate
> >    ->  Gather
> >          Workers Planned: 1
> >          ->  Partial Aggregate
> >                ->  Parallel Hash Left Join
> >                      Hash Cond: (gather_append_1.fk = gather_append_2.fk)
> >                      ->  Index Scan using gather_append_1_ix on gather_append_1
> >                            Index Cond: (f = true)
> >                      ->  Parallel Hash
> >                            ->  Parallel Seq Scan on gather_append_2
> > 
> > It's only made safe by the fact that Gather will only use 1 worker.
> > To me, it just seems too fragile to assume that's always going to be
> > the case. I feel like this fix just relies on the fact that
> > create_gather_path() and create_gather_merge_path() do
> > "pathnode->num_workers = subpath->parallel_workers;". If someone
> > decided that was to work a different way, then we risk this breaking
> > again. Additionally, today we have Gather and GatherMerge, but we may
> > one day end up with more node types that gather results from parallel
> > workers, or even a completely different way of executing plans.
> 
> It seems strange that the parallel_aware and parallel_safe flags neither
> affect execution nor are properly checked.
>
> The exception is that parallel_safe is checked in ExecSerializePlan, which is
> called from ExecInitParallelPlan, which is called from ExecGather and
> ExecGatherMerge. But it looks like this check doesn't affect execution either.
> 
> > I think a safer way to fix this is to just not remove the
> > Append/MergeAppend node if the parallel_aware flag of the only-child
> > and the Append/MergeAppend don't match. I've done that in the
> > attached.
> > 
> > I believe the code at the end of add_paths_to_append_rel() can remain as is.
> 
> I found that clean_up_removed_plan_level is also called from
> set_subqueryscan_references. Is there a need to patch it there as well?
>
> And there is a strange state:
> - in the loop over the subpaths, pathnode->node.parallel_safe is set to the
>   AND of all the subpaths' parallel_safe flags
>   (which is why my patch version needed to copy it),
> - that means our AppendPath can be parallel_aware but not parallel_safe.
> It is a bit ridiculous.
>
> It is also strange that an AppendPath can have more parallel_workers than
> the sum of its children's parallel_workers.
>
> So it looks like the whole machinery around parallel_aware/parallel_safe is
> not consistent enough.
>
> Either way, I attach your version of the fix with my tests as a new patch
> version.

It looks like the volatile "Memory Usage:" output in EXPLAIN breaks
'make check' sporadically.

I applied a replacement in the style of the memoize.sql test.

Why is there no way to disable the "Buckets: %d Buffers: %d Memory Usage: %dkB"
output in show_hash_info?

regards,
Yura Sokolov

Attachments

Re: Fix BUG #17335: Duplicate result rows in Gather node

From
Yura Sokolov
Date:
On Mon, 24/01/2022 at 16:24 +0300, Yura Sokolov wrote:
> On Sun, 23/01/2022 at 14:56 +0300, Yura Sokolov wrote:
> > On Thu, 20/01/2022 at 09:32 +1300, David Rowley wrote:
> > > On Fri, 31 Dec 2021 at 00:14, Yura Sokolov <y.sokolov@postgrespro.ru> wrote:
> > > > A quick (and valid) fix is suggested in the attached patch:
> > > > - If the Append has a single child, copy that child's parallel awareness.
> > > 
> > > I've been looking at this and I've gone through changing my mind about
> > > what's the right fix quite a number of times.
> > > 
> > > My current thoughts are that I don't really like the fact that we can
> > > have plans in the following shape:
> > > 
> > >  Finalize Aggregate
> > >    ->  Gather
> > >          Workers Planned: 1
> > >          ->  Partial Aggregate
> > >                ->  Parallel Hash Left Join
> > >                      Hash Cond: (gather_append_1.fk = gather_append_2.fk)
> > >                      ->  Index Scan using gather_append_1_ix on gather_append_1
> > >                            Index Cond: (f = true)
> > >                      ->  Parallel Hash
> > >                            ->  Parallel Seq Scan on gather_append_2
> > > 
> > > It's only made safe by the fact that Gather will only use 1 worker.
> > > To me, it just seems too fragile to assume that's always going to be
> > > the case. I feel like this fix just relies on the fact that
> > > create_gather_path() and create_gather_merge_path() do
> > > "pathnode->num_workers = subpath->parallel_workers;". If someone
> > > decided that was to work a different way, then we risk this breaking
> > > again. Additionally, today we have Gather and GatherMerge, but we may
> > > one day end up with more node types that gather results from parallel
> > > workers, or even a completely different way of executing plans.
> > 
> > It seems strange that the parallel_aware and parallel_safe flags neither
> > affect execution nor are properly checked.
> >
> > The exception is that parallel_safe is checked in ExecSerializePlan, which
> > is called from ExecInitParallelPlan, which is called from ExecGather and
> > ExecGatherMerge. But it looks like this check doesn't affect execution
> > either.
> > 
> > > I think a safer way to fix this is to just not remove the
> > > Append/MergeAppend node if the parallel_aware flag of the only-child
> > > and the Append/MergeAppend don't match. I've done that in the
> > > attached.
> > > 
> > > I believe the code at the end of add_paths_to_append_rel() can remain as is.
> > 
> > I found that clean_up_removed_plan_level is also called from
> > set_subqueryscan_references. Is there a need to patch it there as well?
> >
> > And there is a strange state:
> > - in the loop over the subpaths, pathnode->node.parallel_safe is set to the
> >   AND of all the subpaths' parallel_safe flags
> >   (which is why my patch version needed to copy it),
> > - that means our AppendPath can be parallel_aware but not parallel_safe.
> > It is a bit ridiculous.
> >
> > It is also strange that an AppendPath can have more parallel_workers than
> > the sum of its children's parallel_workers.
> >
> > So it looks like the whole machinery around parallel_aware/parallel_safe is
> > not consistent enough.
> >
> > Either way, I attach your version of the fix with my tests as a new patch
> > version.
> 
> It looks like the volatile "Memory Usage:" output in EXPLAIN breaks
> 'make check' sporadically.
>
> I applied a replacement in the style of the memoize.sql test.
>
> Why is there no way to disable the "Buckets: %d Buffers: %d Memory Usage:
> %dkB" output in show_hash_info?

And another attempt to fix the tests' volatility.

Attachments

Re: Fix BUG #17335: Duplicate result rows in Gather node

From
David Rowley
Date:
On Tue, 25 Jan 2022 at 17:35, Yura Sokolov <y.sokolov@postgrespro.ru> wrote:
> And another attempt to fix the tests' volatility.

FWIW, I had not really seen the point in adding a test for this.   I
did however see a point in it with your original patch. It seemed
useful there to verify that Gather and GatherMerge did what we
expected with 1 worker.

David



Re: Fix BUG #17335: Duplicate result rows in Gather node

From
David Rowley
Date:
On Tue, 25 Jan 2022 at 20:03, David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Tue, 25 Jan 2022 at 17:35, Yura Sokolov <y.sokolov@postgrespro.ru> wrote:
> > And another attempt to fix the tests' volatility.
>
> FWIW, I had not really seen the point in adding a test for this.   I
> did however see a point in it with your original patch. It seemed
> useful there to verify that Gather and GatherMerge did what we
> expected with 1 worker.

I ended up pushing just the last patch I sent.

The reason I didn't think it was worth adding a new test was that no
tests were added in the original commit.  Existing tests did cover it,
but here we're just restoring the original behaviour for one simple
case.  The test in your patch just seemed a bit more hassle than it
was worth. I struggle to imagine how we'll break this again.

David



Re: Fix BUG #17335: Duplicate result rows in Gather node

From
Yura Sokolov
Date:
On Tue, 25/01/2022 at 21:20 +1300, David Rowley wrote:
> On Tue, 25 Jan 2022 at 20:03, David Rowley <dgrowleyml@gmail.com> wrote:
> > On Tue, 25 Jan 2022 at 17:35, Yura Sokolov <y.sokolov@postgrespro.ru> wrote:
> > > And another attempt to fix the tests' volatility.
> > 
> > FWIW, I had not really seen the point in adding a test for this.   I
> > did however see a point in it with your original patch. It seemed
> > useful there to verify that Gather and GatherMerge did what we
> > expected with 1 worker.
> 
> I ended up pushing just the last patch I sent.
> 
> The reason I didn't think it was worth adding a new test was that no
> tests were added in the original commit.  Existing tests did cover it,

The existing tests didn't catch the issue. It is a pity the fix was merged
without the test case it fixes.

> but here we're just restoring the original behaviour for one simple
> case.  The test in your patch just seemed a bit more hassle than it
> was worth. I struggle to imagine how we'll break this again.

Thank you for your attention and for the fix.

regards,
Yura Sokolov.




Re: Fix BUG #17335: Duplicate result rows in Gather node

From
Tom Lane
Date:
Yura Sokolov <y.sokolov@postgrespro.ru> writes:
> On Tue, 25/01/2022 at 21:20 +1300, David Rowley wrote:
>> The reason I didn't think it was worth adding a new test was that no
>> tests were added in the original commit.  Existing tests did cover it,

> The existing tests didn't catch the issue. It is a pity the fix was merged
> without the test case it fixes.

I share David's skepticism about the value of a test case.  The
failure mode that seems likely to me is some other code path making
the same mistake, which a predetermined test would not catch.

Therefore, what I think could be useful is some very-late-stage
assertion check (probably in createplan.c) verifying that the
child of a Gather is parallel-aware.  Or maybe the condition
needs to be more general than that, but anyway the idea is for
the back end of the planner to verify that we didn't build a
silly plan.

            regards, tom lane



Re: Fix BUG #17335: Duplicate result rows in Gather node

From
David Rowley
Date:
On Wed, 26 Jan 2022 at 05:32, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Therefore, what I think could be useful is some very-late-stage
> assertion check (probably in createplan.c) verifying that the
> child of a Gather is parallel-aware.  Or maybe the condition
> needs to be more general than that, but anyway the idea is for
> the back end of the planner to verify that we didn't build a
> silly plan.

Yeah, it would be nice to have something like this.  I think to do it,
we might need to invent some sort of path traversal function that can
take a custom callback function.  The problem is that the parallel
aware path does not need to be directly below the gather/gathermerge.

For example (from select_distinct.out)

 Unique
   ->  Sort
         Sort Key: four
         ->  Gather
               Workers Planned: 2
               ->  HashAggregate
                     Group Key: four
                     ->  Parallel Seq Scan on tenk1

For this case, the custom callback would check that there's at least 1
parallel_aware subpath below the Gather/GatherMerge.

There are probably some other rules that we could Assert are true.  I
think any parallel_aware paths (unless they're scans) must contain
only parallel_aware subpaths. For example, parallel hash join must
have a parallel aware inner and outer.
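
A path traversal function with a custom callback, as described above, might
look roughly like this sketch.  The types are hypothetical simplifications; a
real implementation would need to know the subpath fields of every Path
subtype, which is exactly the maintenance cost being discussed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified path tree -- not the real planner types. */
typedef struct Path
{
    bool parallel_aware;
    struct Path *subpaths[4];
    int  nsubpaths;
} Path;

/* The callback returns true to stop the traversal early. */
typedef bool (*path_walker_callback) (Path *path, void *context);

/* Generic depth-first path traversal with a user-supplied callback. */
static bool
path_tree_walker(Path *path, path_walker_callback callback, void *context)
{
    if (path == NULL)
        return false;
    if (callback(path, context))
        return true;
    for (int i = 0; i < path->nsubpaths; i++)
    {
        if (path_tree_walker(path->subpaths[i], callback, context))
            return true;
    }
    return false;
}

static bool
is_parallel_aware(Path *path, void *context)
{
    (void) context;             /* unused in this sketch */
    return path->parallel_aware;
}

/*
 * The rule being discussed: a Gather/GatherMerge subtree must contain at
 * least one parallel_aware path somewhere below it.
 */
static bool
subtree_has_parallel_aware_path(Path *gather_child)
{
    return path_tree_walker(gather_child, is_parallel_aware, NULL);
}
```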

David



Re: Fix BUG #17335: Duplicate result rows in Gather node

From
David Rowley
Date:
On Wed, 26 Jan 2022 at 05:32, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Therefore, what I think could be useful is some very-late-stage
> assertion check (probably in createplan.c) verifying that the
> child of a Gather is parallel-aware.  Or maybe the condition
> needs to be more general than that, but anyway the idea is for
> the back end of the planner to verify that we didn't build a
> silly plan.

I had a go at writing something along these lines, but I've ended up
with something I really don't like very much.

I ended up having to write a recursive path traversal function.  It's
generic and it can be given a callback function to do whatever we like
with the Path.  The problem is, that this seems like quite a bit of
code to maintain just for plan validation in Assert builds.

Currently, the patch validates 3 rules:

1) Ensure a parallel_aware path has only parallel_aware or
parallel_safe subpaths.
2) Ensure Gather is either single_copy or contains at least one
parallel_aware subnode.
3) Ensure GatherMerge contains at least one parallel_aware subnode.

I had to relax rule #1 a little as a Parallel Append can run subnodes
that are only parallel_safe and not parallel_aware.  The problem with
relaxing this rule is that it does not catch the case that this bug
report was about. I could maybe tweak that so there's a special case
for Append to allow parallel aware or safe and ensure all other nodes
have only parallel_safe subnodes. I just don't really like that
special case as it's likely to get broken/forgotten over time when we
add new nodes.

I'm unsure if just being able to enforce rules #2 and #3 make this worthwhile.

Happy to listen to other people's opinions and ideas on this.  Without
those, I'm unlikely to try to push this any further.

David

Attachments

Re: Fix BUG #17335: Duplicate result rows in Gather node

From
Robert Haas
Date:
On Thu, Feb 3, 2022 at 7:08 PM David Rowley <dgrowleyml@gmail.com> wrote:
> Currently, the patch validates 3 rules:
>
> 1) Ensure a parallel_aware path has only parallel_aware or
> parallel_safe subpaths.

I think that every path that is parallel_aware must also be
parallel_safe. So checking for either parallel_aware or parallel_safe
should be equivalent to just checking parallel_safe, unless I am
confused.

I think the actual rule is: every path under a Gather or GatherMerge
must be parallel-safe.

I don't think there's any real rule about what has to be under
parallel-aware paths -- except that it would have to be all
parallel-safe stuff, because the whole thing is under a Gather
(Merge). There may seem to be such a rule, but I suspect it's just an
accident of whatever code we have now rather than anything intrinsic.

> 2) Ensure Gather is either single_copy or contains at least one
> parallel_aware subnode.

I agree that this one is a rule which we could check.

> 3) Ensure GatherMerge contains at least one parallel_aware subnode.

This one, too.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Fix BUG #17335: Duplicate result rows in Gather node

From
David Rowley
Date:
Thanks for having a look at this.

On Fri, 4 Feb 2022 at 13:48, Robert Haas <robertmhaas@gmail.com> wrote:
> I think the actual rule is: every path under a Gather or GatherMerge
> must be parallel-safe.

I've adjusted the patch so that it counts parallel_aware and
parallel_safe Paths independently and verifies everything below a
Gather[Merge] is parallel_safe.

The diff stat currently looks like:

src/backend/optimizer/plan/createplan.c | 230
1 file changed, 230 insertions(+)

I still feel this is quite a bit of code for what we're getting here.
I'd be more for it if the path traversal function existed for some
other reason and I was just adding the callback functions and Asserts.

I'm keen to hear what others think about that.

David

Attachments

Re: Fix BUG #17335: Duplicate result rows in Gather node

From
Zhihong Yu
Date:


On Tue, Feb 8, 2022 at 1:11 PM David Rowley <dgrowleyml@gmail.com> wrote:
> Thanks for having a look at this.
>
> On Fri, 4 Feb 2022 at 13:48, Robert Haas <robertmhaas@gmail.com> wrote:
> > I think the actual rule is: every path under a Gather or GatherMerge
> > must be parallel-safe.
>
> I've adjusted the patch so that it counts parallel_aware and
> parallel_safe Paths independently and verifies everything below a
> Gather[Merge] is parallel_safe.
>
> The diff stat currently looks like:
>
> src/backend/optimizer/plan/createplan.c | 230
> 1 file changed, 230 insertions(+)
>
> I still feel this is quite a bit of code for what we're getting here.
> I'd be more for it if the path traversal function existed for some
> other reason and I was just adding the callback functions and Asserts.
>
> I'm keen to hear what others think about that.
>
> David
Hi,

+           break;
+           case T_MergeAppend:

The case for T_MergeAppend should be indented one level to the left.

+       case T_Result:
+           if (IsA(path, ProjectionPath))

Since the remaining sub-cases don't have a subpath, they are covered by the final `else` block; MinMaxAggPath and GroupResultPath don't need to be checked.

For contains_a_parallel_aware_path(), it seems path_type_counter() could return a bool indicating whether the walker should stop early (when the parallel-aware count reaches 1).

Cheers

Re: Fix BUG #17335: Duplicate result rows in Gather node

From
Robert Haas
Date:
On Tue, Feb 8, 2022 at 4:11 PM David Rowley <dgrowleyml@gmail.com> wrote:
> I still feel this is quite a bit of code for what we're getting here.
> I'd be more for it if the path traversal function existed for some
> other reason and I was just adding the callback functions and Asserts.
>
> I'm keen to hear what others think about that.

My view is that functions like path_tree_walker are good things to
have on general principle. I find it likely that it will find other
uses, and that if we don't add as part of this patch, someone will add
it for some other reason in the future. So I would not really count
that in deciding how big this patch is, and the rest of what you have
here is pretty short and to the point.

There is the more difficult philosophical question of whether it's
worth expending any code on this at all. I think it is pretty clear
that this has positive value: it could easily prevent >0 future bugs,
which IMHO is not bad for such a small patch. However, it does feel a
little bit primitive somehow, in the sense that there are a lot of
things you could do wrong which this wouldn't catch. For example, a
Gather with no parallel-aware node under it is probably busted, unless
someone invents new kinds of parallel operators that work differently
from what we have now. But a join beneath a Gather that is not itself
parallel-aware should have a parallel-aware node under exactly one
side of the join. If there's a parallel scan on both sides or neither
side, even with stuff on top of it, that's wrong. But a parallel-aware
join could do something else, e.g. Parallel Hash Join expects a
parallel path on both sides. Some other parallel-aware join type could
expect a parallel path on exactly one side without caring which one,
or on one specific side, or maybe even on neither side.

What we're really reasoning about here is whether the input is going
to be partitioned across multiple executions of the plan in a proper
way. A Gather is going to run the same plan in all of its workers, so
it wants a subplan that when run in all workers will together produce
all output rows. Parallel-aware scans partition the results across
workers, so they behave that way. A non-parallel aware join will work
that way if it joins a partition the input on one side to all of the
input from the other side, hence the rule I describe above. For
aggregates, you can't safely apply a plain old Aggregate operation
either to a regular scan or to a parallel-aware scan and get the right
answer, which is why we need Partial and Finalize stages for parallel
query. But for a lot of other nodes, like Materialize, their output
will have the same properties as the input: if the subplan of a
Materialize node produces all the rows on each execution, the
Materialize node will too; if it produces a partition of the output
rows each time it's executed, once per worker, the Materialize node
will do the same. And I think it's that kind of case that leads to the
check we have here, that there ought to be a parallel-aware node in
there someplace.

It might be the case that there's some more sophisticated check we
could be doing here that would be more satisfying than the one you've
written, but I'm not sure. Such a check might end up needing to know
the behavior of the existing nodes in a lot of detail, which then
wouldn't help with finding bugs in new functionality we add in the
future. In that sense, the kind of simple check you've got here has
something to recommend it: it won't catch everything people can do
wrong, but when it does trip, chances are good it's found a bug, and
it's got a good chance of continuing to work as well as it does today
even in the face of future additions. So I guess I'm mildly in favor
of it, but I would also find it entirely reasonable if you were to
decide it's not quite worth it.

-- 
Robert Haas
EDB: http://www.enterprisedb.com