Re: Parallel Append can break run-time partition pruning

Поиск
Список
Период
Сортировка
От David Rowley
Тема Re: Parallel Append can break run-time partition pruning
Дата
Msg-id CAApHDvqQR7i6feyEvFaE_gr2UgX0=BNHW63tyHHZrsvu5UEYrA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Parallel Append can break run-time partition pruning  (Amit Langote <amitlangote09@gmail.com>)
Ответы Re: Parallel Append can break run-time partition pruning  (Amit Langote <amitlangote09@gmail.com>)
Список pgsql-hackers
On Fri, 16 Oct 2020 at 03:01, Amit Langote <amitlangote09@gmail.com> wrote:
>
> Going over the last few emails, it seems that David held off from
> committing the patch, because of the lack of confidence in its
> robustness.   With the patch, a sub-partitioned child's partial
> Append's child paths will *always* be pulled up into the parent's set
> of partial child paths thus preventing the nesting of Appends, which
> the run-time pruning code can't currently cope with.  The lack of
> confidence seems to be due to the fact that always pulling up a
> sub-Append's child paths into the parent partial Append's child paths
> *may* cause the latter to behave wrongly under parallelism and the new
> code structure will prevent add_partial_path() from being the
> arbitrator of whether such a path is really the best in terms of cost.
>
> If we can't be confident in that approach, maybe we should consider
> making the run-time pruning code cope with nested Appends?

I've been thinking about this and my thoughts are:

There are other cases where we don't pullup sub-Merge Append nodes
anyway, so I think we should just make run-time pruning work for more
than just the top-level Append/Merge Append.

The case I'm thinking about is the code added in 959d00e9dbe for
ordered Append scans.  It's possible a sub-partitioned table has
partitions which don't participate in the same ordering.  We need to
keep the sub-Merge Append in those cases... well, at least when
there's more than 1 partition remaining after plan-time pruning.

I've attached the patch which, pending a final look, I'm proposing to
commit for master only.  I just don't quite think this is a bug fix,
and certainly, due to the invasiveness of the proposed patch, that
means no backpatch.

I fixed all the stuff about the incorrectly set partitioned_rels list.
What I ended up with there is making it accumulate_append_subpath's
job to also pullup the sub-paths partitioned_rels fields when pulling
up a nested Append/MergeAppend.   If there's no pullup, there then we
don't care about the sub-path's partitioned_rels.  That's for it to
deal with.  With that, I think that run-time pruning will only get the
RT indexes for partitions that we actually have sub-paths for. That's
pretty good, so I added an Assert() to verify that in
make_partitionedrel_pruneinfo().  (I hope I don't regret that later)

This does mean we need to maintain a different partitioned_rels list
for each Append path we consider.  So there's a number (six) of these
now between add_paths_to_append_rel() and
generate_orderedappend_paths().  To try to minimise the impact of
that, I've changed the field so that instead of being a List of
IntLists, it's just a List of Relids.  The top-level list just
contains a single element until you get a UNION ALL that selects from
a partitioned table on each side of the union.  Merging sub-path
partitioned rel RT indexes into the existing element is now pretty
cheap as it just uses bms_add_members() rather the list_concat we'd
have had to have used if it was still a List of IntLists.

After fixing up how partitioned_rels is built, I saw there were no
usages of RelOptInfo.partitioned_child_rels, so I got rid of it.

I did another couple of little cleanups and wrote some regression
tests to test all this.

Overall I'm fairly happy with this, especially getting rid of a
partitioned table related field from RelOptInfo.

David

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Dmitry Dolgov
Дата:
Сообщение: Re: Index Skip Scan (new UniqueKeys)
Следующее
От: "Hou, Zhijie"
Дата:
Сообщение: Fix typo in src/backend/utils/cache/lsyscache.c