Re: Parallel Append can break run-time partition pruning
От | David Rowley |
---|---|
Тема | Re: Parallel Append can break run-time partition pruning |
Дата | |
Msg-id | CAApHDvqQR7i6feyEvFaE_gr2UgX0=BNHW63tyHHZrsvu5UEYrA@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: Parallel Append can break run-time partition pruning (Amit Langote <amitlangote09@gmail.com>) |
Ответы |
Re: Parallel Append can break run-time partition pruning
(Amit Langote <amitlangote09@gmail.com>)
|
Список | pgsql-hackers |
On Fri, 16 Oct 2020 at 03:01, Amit Langote <amitlangote09@gmail.com> wrote: > > Going over the last few emails, it seems that David held off from > committing the patch, because of the lack of confidence in its > robustness. With the patch, a sub-partitioned child's partial > Append's child paths will *always* be pulled up into the parent's set > of partial child paths thus preventing the nesting of Appends, which > the run-time pruning code can't currently cope with. The lack of > confidence seems to be due to the fact that always pulling up a > sub-Append's child paths into the parent partial Append's child paths > *may* cause the latter to behave wrongly under parallelism and the new > code structure will prevent add_partial_path() from being the > arbitrator of whether such a path is really the best in terms of cost. > > If we can't be confident in that approach, maybe we should consider > making the run-time pruning code cope with nested Appends? I've been thinking about this and my thoughts are: There are other cases where we don't pullup sub-Merge Append nodes anyway, so I think we should just make run-time pruning work for more than just the top-level Append/Merge Append. The case I'm thinking about is the code added in 959d00e9dbe for ordered Append scans. It's possible a sub-partitioned table has partitions which don't participate in the same ordering. We need to keep the sub-Merge Append in those cases... well, at least when there's more than 1 partition remaining after plan-time pruning. I've attached the patch which, pending a final look, I'm proposing to commit for master only. I just don't quite think this is a bug fix, and certainly, due to the invasiveness of the proposed patch, that means no backpatch. I fixed all the stuff about the incorrectly set partitioned_rels list. What I ended up with there is making it accumulate_append_subpath's job to also pullup the sub-paths partitioned_rels fields when pulling up a nested Append/MergeAppend. If there's no pullup, there then we don't care about the sub-path's partitioned_rels. That's for it to deal with. With that, I think that run-time pruning will only get the RT indexes for partitions that we actually have sub-paths for. That's pretty good, so I added an Assert() to verify that in make_partitionedrel_pruneinfo(). (I hope I don't regret that later) This does mean we need to maintain a different partitioned_rels list for each Append path we consider. So there's a number (six) of these now between add_paths_to_append_rel() and generate_orderedappend_paths(). To try to minimise the impact of that, I've changed the field so that instead of being a List of IntLists, it's just a List of Relids. The top-level list just contains a single element until you get a UNION ALL that selects from a partitioned table on each side of the union. Merging sub-path partitioned rel RT indexes into the existing element is now pretty cheap as it just uses bms_add_members() rather the list_concat we'd have had to have used if it was still a List of IntLists. After fixing up how partitioned_rels is built, I saw there were no usages of RelOptInfo.partitioned_child_rels, so I got rid of it. I did another couple of little cleanups and wrote some regression tests to test all this. Overall I'm fairly happy with this, especially getting rid of a partitioned table related field from RelOptInfo. David
Вложения
В списке pgsql-hackers по дате отправления: