Parallelize correlated subqueries that execute within each worker

Поиск
Список
Период
Сортировка
От James Coleman
Тема Parallelize correlated subqueries that execute within each worker
Дата
Msg-id CAAaqYe_x1u3V4uPiv=dJ=k2EJ7txhdq6yexJDkYZ1x1pu0QwcQ@mail.gmail.com
обсуждение исходный текст
Ответы Re: Parallelize correlated subqueries that execute within each worker
Re: Parallelize correlated subqueries that execute within each worker
Список pgsql-hackers
In a bug report back in November [1] a subthread explored why parallel
query is excluded any time we have "Plan nodes which reference a
correlated SubPlan". Amit's understanding was that the reasoning had
to do with inability to easily pass (potentially variable length)
Param values between workers.

However a decent-sized subset of this kind of query doesn't actually
require that we communicate between workers. If the Subplan executes
per-tuple within the worker then there's no reason I can see why it
needs to be marked parallel unsafe. Amit concurred but noted that
identifying that subset of plans is the difficult part (as is usually
the case!)

At the time I'd started work on an approach to handle this case and
hoped to "post about it in a new thread later this week." That didn't
happen, but here we are now, and I finally have this patch cleaned up
enough to share.

The basic idea is that we need to track (both on nodes and relations)
not only whether that node or rel is parallel safe but also whether
it's parallel safe assuming params are rechecked in the using context.
That allows us to delay making a final decision until we have
sufficient context to conclude that a given usage of a Param is
actually parallel safe or unsafe.

The first patch in this series was previously posted in the thread
"Consider parallel for lateral subqueries with limit" [2] and is
required as a precursor for various test cases to work here.

The second patch implements the core of the series. It results in
parallel query being possible for subplans that execute entirely
within the context of a parallel worker for cases where that subplan
is in the target, a LATERAL JOIN, or the WHERE and ORDER BY clauses.

The final patch notes several places where we set e.g.
rel->consider_parallel but setting the corresponding new value
rel->consider_parallel_recheckng_params wasn't yet necessary. It shows
opportunity either for further improvement or concluding certain cases
can't benefit and should be left unchanged.

James

1: https://www.postgresql.org/message-id/CAAaqYe_vihKjc%2B8LuQa49EHW4%2BKfefb3wHqPYFnCuUqozo%2BLFg%40mail.gmail.com
2: https://www.postgresql.org/message-id/flat/CAAaqYe_HEkmLwf_1iEHxXwQOWiRyiFd%3DuOu6kwj3sWYdVd1-zA%40mail.gmail.com

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Draft back-branch release notes are up
Следующее
От: "Joel Jacobson"
Дата:
Сообщение: [PATCH] Add configure cache checkpoints before fatal checks