Thread: Consider parallel for lateral subqueries with limit

Consider parallel for lateral subqueries with limit

From: James Coleman
Date:
I've been investigating parallelizing certain correlated subqueries,
and during that work stumbled across the fact that
set_rel_consider_parallel disallows parallel query on what seems like
a fairly simple case.

Consider this query:

select t.unique1
from tenk1 t
join lateral (select t.unique1 from tenk1 offset 0) l on true;

Current set_rel_consider_parallel sets consider_parallel=false on the
subquery rel because it has a limit/offset. That restriction makes a
lot of sense when we have a subquery whose results conceptually need
to be "shared" (or at least be the same) across multiple workers
(indeed the relevant comment in that function notes that cases where
we could prove a unique ordering would also qualify, but punts on
implementing that due to complexity). But if the subquery is LATERAL,
then no such conceptual restriction applies.

If we change the code slightly to allow considering parallel query
even in the face of LIMIT/OFFSET for LATERAL subqueries, then our
query above changes from the following plan:

 Nested Loop
   Output: t.unique1
   ->  Gather
         Output: t.unique1
         Workers Planned: 2
         ->  Parallel Index Only Scan using tenk1_unique1 on public.tenk1 t
               Output: t.unique1
   ->  Gather
         Output: NULL::integer
         Workers Planned: 2
         ->  Parallel Index Only Scan using tenk1_hundred on public.tenk1
               Output: NULL::integer

to this plan:

 Gather
   Output: t.unique1
   Workers Planned: 2
   ->  Nested Loop
         Output: t.unique1
         ->  Parallel Index Only Scan using tenk1_unique1 on public.tenk1 t
               Output: t.unique1
         ->  Index Only Scan using tenk1_hundred on public.tenk1
               Output: NULL::integer

The code change itself is quite simple (1 line). As far as I can tell
we don't need to expressly check parallel safety of the limit/offset
expressions; that appears to happen elsewhere (and that makes sense
since the RTE_RELATION case doesn't check those clauses either).

If I'm missing something about the safety of this (or any other
issue), I'd appreciate the feedback.

James

Attachments

Re: Consider parallel for lateral subqueries with limit

From: James Coleman
Date:
On Mon, Nov 30, 2020 at 7:00 PM James Coleman <jtc331@gmail.com> wrote:
>
> I've been investigating parallelizing certain correlated subqueries,
> and during that work stumbled across the fact that
> set_rel_consider_parallel disallows parallel query on what seems like
> a fairly simple case.
>
> Consider this query:
>
> select t.unique1
> from tenk1 t
> join lateral (select t.unique1 from tenk1 offset 0) l on true;
>
> Current set_rel_consider_parallel sets consider_parallel=false on the
> subquery rel because it has a limit/offset. That restriction makes a
> lot of sense when we have a subquery whose results conceptually need
> to be "shared" (or at least be the same) across multiple workers
> (indeed the relevant comment in that function notes that cases where
> we could prove a unique ordering would also qualify, but punts on
> implementing that due to complexity). But if the subquery is LATERAL,
> then no such conceptual restriction applies.
>
> If we change the code slightly to allow considering parallel query
> even in the face of LIMIT/OFFSET for LATERAL subqueries, then our
> query above changes from the following plan:
>
>  Nested Loop
>    Output: t.unique1
>    ->  Gather
>          Output: t.unique1
>          Workers Planned: 2
>          ->  Parallel Index Only Scan using tenk1_unique1 on public.tenk1 t
>                Output: t.unique1
>    ->  Gather
>          Output: NULL::integer
>          Workers Planned: 2
>          ->  Parallel Index Only Scan using tenk1_hundred on public.tenk1
>                Output: NULL::integer
>
> to this plan:
>
>  Gather
>    Output: t.unique1
>    Workers Planned: 2
>    ->  Nested Loop
>          Output: t.unique1
>          ->  Parallel Index Only Scan using tenk1_unique1 on public.tenk1 t
>                Output: t.unique1
>          ->  Index Only Scan using tenk1_hundred on public.tenk1
>                Output: NULL::integer
>
> The code change itself is quite simple (1 line). As far as I can tell
> we don't need to expressly check parallel safety of the limit/offset
> expressions; that appears to happen elsewhere (and that makes sense
> since the RTE_RELATION case doesn't check those clauses either).
>
> If I'm missing something about the safety of this (or any other
> issue), I'd appreciate the feedback.

Note that near the end of grouping planner we have a similar check:

if (final_rel->consider_parallel && root->query_level > 1 &&
        !limit_needed(parse))

guarding copying the partial paths from the current rel to the final
rel. I haven't managed to come up with a test case that exposes that
though since simple examples like the one above get converted into a
JOIN, so we're not in grouping_planner for a subquery. Making the
subquery above correlated results in us getting to that point, but it
isn't currently marked as parallel safe for other reasons (because it
has params), so that's not a useful test. I'm not sure if there are
cases where we can't convert to a join but also don't involve params;
I haven't thought about it a lot though.
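The correlated variant I have in mind is something like this (a sketch
only, using the same regression-test table as before):

select t.unique1
from tenk1 t
join lateral (select unique1 from tenk1 t2
              where t2.unique2 = t.unique2 offset 0) l on true;

Here the offset 0 keeps the subquery from being pulled up, while the
cross-reference in the WHERE clause means the subquery has params,
which (as noted) is what currently keeps it from being marked parallel
safe.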

James



Re: Consider parallel for lateral subqueries with limit

From: "Brian Davis"
Date:
> Note that near the end of grouping planner we have a similar check:
>
> if (final_rel->consider_parallel && root->query_level > 1 &&
>         !limit_needed(parse))
> 
> guarding copying the partial paths from the current rel to the final
> rel. I haven't managed to come up with a test case that exposes that

Played around with this a bit, here's a non-correlated subquery that gets us to that if statement

DROP TABLE IF EXISTS foo;
CREATE TABLE foo (bar int);

INSERT INTO foo (bar)
SELECT
  g
FROM
  generate_series(1, 10000) AS g;


SELECT
  (
    SELECT
      bar
    FROM
      foo
    LIMIT 1
  ) AS y
FROM
  foo;


I also was thinking about the LATERAL part.

I couldn't think of any reason why the uncorrelated subquery's results would need to be shared and therefore the same,
when we'll be "looping" over each row of the source table, running the subquery anew for each, conceptually.

But then I tried this...

test=# CREATE TABLE foo (bar int);
CREATE TABLE
test=#
test=# INSERT INTO foo (bar)
test-# SELECT
test-#   g
test-# FROM
test-#   generate_series(1, 10) AS g;
INSERT 0 10
test=#
test=#
test=# SELECT
test-#   foo.bar,
test-#   lat.bar
test-# FROM
test-#   foo JOIN LATERAL (
test(#     SELECT
test(#       bar
test(#     FROM
test(#       foo AS foo2
test(#     ORDER BY
test(#       random()
test(#     LIMIT 1
test(#   ) AS lat ON true;
 bar | bar
-----+-----
   1 |   7
   2 |   7
   3 |   7
   4 |   7
   5 |   7
   6 |   7
   7 |   7
   8 |   7
   9 |   7
  10 |   7
(10 rows)


As you can see, random() is only called once.  If postgres were supposed to be running the subquery for each source
row, conceptually, it would be a mistake to cache the results of a volatile function like random().

The docs say: "When a FROM item contains LATERAL cross-references, evaluation proceeds as follows: for each row of the
FROM item providing the cross-referenced column(s), or set of rows of multiple FROM items providing the columns, the
LATERAL item is evaluated using that row or row set's values of the columns. The resulting row(s) are joined as usual
with the rows they were computed from. This is repeated for each row or set of rows from the column source table(s)."

They don't say what happens with LATERAL when there aren't cross-references though.  As we expect, adding one does show
random() being called once for each source row.

test=# SELECT
test-#   foo.bar,
test-#   lat.bar
test-# FROM
test-#   foo JOIN LATERAL (
test(#     SELECT
test(#       bar
test(#     FROM
test(#       foo AS foo2
test(#     WHERE
test(#       foo2.bar < foo.bar + 100000
test(#     ORDER BY
test(#       random()
test(#     LIMIT 1
test(#   ) AS lat ON true;
 bar | bar
-----+-----
   1 |   5
   2 |   8
   3 |   3
   4 |   4
   5 |   5
   6 |   5
   7 |   1
   8 |   3
   9 |   7
  10 |   3
(10 rows)

It seems like to keep the same behavior that exists today, results of LATERAL subqueries would need to be the same if
they aren't correlated, and so you couldn't run them in parallel with a limit if the order wasn't guaranteed.  But I'll
be the first to admit that it's easy enough for me to miss a key piece of logic on something like this, so I could be
way off base too.



Re: Consider parallel for lateral subqueries with limit

From: James Coleman
Date:
Brian's email didn't keep the relevant headers, and so didn't show up
as a reply, so I've pasted it here and am replying in this original
thread. You can find the original at [1].

On Sun, Dec 6, 2020 at 7:34 PM Brian Davis <brian@brianlikespostgres.com> wrote:
>
> > Note that near the end of grouping planner we have a similar check:
> >
> > if (final_rel->consider_parallel && root->query_level > 1 &&
> >         !limit_needed(parse))
> >
> > guarding copying the partial paths from the current rel to the final
> > rel. I haven't managed to come up with a test case that exposes that
>
> Played around with this a bit, here's a non-correlated subquery that gets us to that if statement
>
> DROP TABLE IF EXISTS foo;
> CREATE TABLE foo (bar int);
>
> INSERT INTO foo (bar)
> SELECT
>   g
> FROM
>   generate_series(1, 10000) AS g;
>
>
> SELECT
>   (
>     SELECT
>       bar
>     FROM
>       foo
>     LIMIT 1
>   ) AS y
> FROM
>   foo;

Thanks. That triggers the parallel "if" case for limit -- did any
additional GUCs need to be modified to get that result? I assume a
parallel plan isn't chosen regardless (even if you remove the
limit_needed check)?

> I also was thinking about the LATERAL part.
>
> I couldn't think of any reason why the uncorrelated subquery's results would need to be shared and therefore the
> same, when we'll be "looping" over each row of the source table, running the subquery anew for each, conceptually.
>
> But then I tried this...
>
> test=# CREATE TABLE foo (bar int);
> CREATE TABLE
> test=#
> test=# INSERT INTO foo (bar)
> test-# SELECT
> test-#   g
> test-# FROM
> test-#   generate_series(1, 10) AS g;
> INSERT 0 10
> test=#
> test=#
> test=# SELECT
> test-#   foo.bar,
> test-#   lat.bar
> test-# FROM
> test-#   foo JOIN LATERAL (
> test(#     SELECT
> test(#       bar
> test(#     FROM
> test(#       foo AS foo2
> test(#     ORDER BY
> test(#       random()
> test(#     LIMIT 1
> test(#   ) AS lat ON true;
>  bar | bar
> -----+-----
>    1 |   7
>    2 |   7
>    3 |   7
>    4 |   7
>    5 |   7
>    6 |   7
>    7 |   7
>    8 |   7
>    9 |   7
>   10 |   7
> (10 rows)
>
>
> As you can see, random() is only called once.  If postgres were supposed to be running the subquery for each source
> row, conceptually, it would be a mistake to cache the results of a volatile function like random().

This was genuinely surprising to me. I think one could argue that this
is just an optimization (after all -- if there is no correlation, then
running it once is conceptually/safely the same as running it multiple
times), but that doesn't seem to hold water with the volatile function
in play.

Of course, given the volatile function we'd never parallelize this
anyway. But we still have to consider the case where the result is
otherwise ordered differently between workers (just by virtue of disk
order, for example).

I've tried the above query using tenk1 from the regress tests to get a
bit more data, and, after modifying several GUCs, I can force parallel
plans. However, in no case can I get it to execute that uncorrelated
lateral in multiple workers. That makes me suspect there's another
check in play here ensuring the lateral subquery is executed for each
group, and that in the uncorrelated case that rule still holds -- it's
just a single group.
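For reference, the kind of GUC settings I'm using to force parallel
plans is roughly the following (a sketch; the exact values aren't
important):

set parallel_setup_cost = 0;
set parallel_tuple_cost = 0;
set min_parallel_table_scan_size = 0;
set max_parallel_workers_per_gather = 2;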

> The docs say: "When a FROM item contains LATERAL cross-references, evaluation proceeds as follows: for each row of
> the FROM item providing the cross-referenced column(s), or set of rows of multiple FROM items providing the columns,
> the LATERAL item is evaluated using that row or row set's values of the columns. The resulting row(s) are joined as
> usual with the rows they were computed from. This is repeated for each row or set of rows from the column source
> table(s)."
>
> They don't say what happens with LATERAL when there aren't cross-references though.  As we expect, adding one does
> show random() being called once for each source row.

If my theory above is correct then it's implicit that the row set is
the whole previous FROM group.

> test=# SELECT
> test-#   foo.bar,
> test-#   lat.bar
> test-# FROM
> test-#   foo JOIN LATERAL (
> test(#     SELECT
> test(#       bar
> test(#     FROM
> test(#       foo AS foo2
> test(#     WHERE
> test(#       foo2.bar < foo.bar + 100000
> test(#     ORDER BY
> test(#       random()
> test(#     LIMIT 1
> test(#   ) AS lat ON true;
>  bar | bar
> -----+-----
>    1 |   5
>    2 |   8
>    3 |   3
>    4 |   4
>    5 |   5
>    6 |   5
>    7 |   1
>    8 |   3
>    9 |   7
>   10 |   3
> (10 rows)
>
> It seems like to keep the same behavior that exists today, results of LATERAL subqueries would need to be the same
> if they aren't correlated, and so you couldn't run them in parallel with a limit if the order wasn't guaranteed. But
> I'll be the first to admit that it's easy enough for me to miss a key piece of logic on something like this, so I
> could be way off base too.

If it weren't for the volatile function in the example, I think I
could argue we could change the behavior (i.e., my theorizing above
about this originally just being an optimization)... but yes, it seems
like this behavior shouldn't change. I can't seem to make it break
with the patch, though.

While I haven't actually tracked down to guarantee this is handled
elsewhere, a thought experiment -- I think -- shows it must be so.
Here's why: suppose we don't have a limit here, but the query return
order is different in different backends. Then we would have the same
problem you bring up. In that case this code is already setting
consider_parallel=true on the rel. So I don't think we're changing any
behavior here.

James

1: https://www.postgresql.org/message-id/a50766a4-a927-41c4-984c-76e513b6d1c4%40www.fastmail.com



Re: Consider parallel for lateral subqueries with limit

From: Greg Nancarrow
Date:
On Tue, Dec 8, 2020 at 10:46 AM James Coleman <jtc331@gmail.com> wrote:
>
> While I haven't actually tracked down to guarantee this is handled
> elsewhere, a thought experiment -- I think -- shows it must be so.
> Here's why: suppose we don't have a limit here, but the query return
> order is different in different backends. Then we would have the same
> problem you bring up. In that case this code is already setting
> consider_parallel=true on the rel. So I don't think we're changing any
> behavior here.
>

AFAICS, the patch seems very reasonable and specifically targets
lateral subqueries with limit/offset. It seems like the uncorrelated
case is the only real concern.
I generally agree that the current patch is probably not changing any
behavior in the uncorrelated case (and like yourself, haven't yet
found a case for which it breaks), but I'm not sure Brian's concerns
can be ruled out entirely.

How about a minor update to the patch to make it slightly more
restrictive, to exclude the case when there are no lateral
cross-references, so we'll be allowing parallelism only when we know
the lateral subquery will be evaluated anew for each source row?
I was thinking of the following patch modification:

BEFORE:
-                if (limit_needed(subquery))
+               if (!rte->lateral && limit_needed(subquery))

AFTER:
-                if (limit_needed(subquery))
+               if ((!rte->lateral || bms_is_empty(rel->lateral_relids)) &&
+                     limit_needed(subquery))
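
To illustrate the distinction using Brian's earlier queries (a sketch
of how the modified test would classify them):

-- LATERAL keyword but no cross-reference: rel->lateral_relids is
-- empty, so the subquery stays excluded from parallel consideration
select foo.bar, lat.bar
from foo join lateral (
  select bar from foo as foo2 order by random() limit 1
) as lat on true;

-- genuine cross-reference: rel->lateral_relids is non-empty, so the
-- limit alone no longer disqualifies the subquery (random() is of
-- course still parallel restricted, but that's a separate check)
select foo.bar, lat.bar
from foo join lateral (
  select bar from foo as foo2
  where foo2.bar < foo.bar + 100000
  order by random() limit 1
) as lat on true;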


Thoughts?

Regards,
Greg Nancarrow
Fujitsu Australia



Re: Consider parallel for lateral subqueries with limit

From: James Coleman
Date:
On Thu, May 27, 2021 at 9:01 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Tue, Dec 8, 2020 at 10:46 AM James Coleman <jtc331@gmail.com> wrote:
> >
> > While I haven't actually tracked down to guarantee this is handled
> > elsewhere, a thought experiment -- I think -- shows it must be so.
> > Here's why: suppose we don't have a limit here, but the query return
> > order is different in different backends. Then we would have the same
> > problem you bring up. In that case this code is already setting
> > consider_parallel=true on the rel. So I don't think we're changing any
> > behavior here.
> >
>
> AFAICS, the patch seems very reasonable and specifically targets
> lateral subqueries with limit/offset. It seems like the uncorrelated
> case is the only real concern.
> I generally agree that the current patch is probably not changing any
> behavior in the uncorrelated case (and like yourself, haven't yet
> found a case for which it breaks), but I'm not sure Brian's concerns
> can be ruled out entirely.
>
> How about a minor update to the patch to make it slightly more
> restrictive, to exclude the case when there are no lateral
> cross-references, so we'll be allowing parallelism only when we know
> the lateral subquery will be evaluated anew for each source row?
> I was thinking of the following patch modification:
>
> BEFORE:
> -                if (limit_needed(subquery))
> +               if (!rte->lateral && limit_needed(subquery))
>
> AFTER:
> -                if (limit_needed(subquery))
> +               if ((!rte->lateral || bms_is_empty(rel->lateral_relids)) &&
> +                     limit_needed(subquery))
>
>
> Thoughts?

Apologies for the delayed response; this seems fine to me. I've
attached patch v2.

Thanks,
James Coleman

Attachments

Re: Consider parallel for lateral subqueries with limit

From: James Coleman
Date:
On Fri, Jul 16, 2021 at 3:16 PM James Coleman <jtc331@gmail.com> wrote:
>
> On Thu, May 27, 2021 at 9:01 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
> >
> > On Tue, Dec 8, 2020 at 10:46 AM James Coleman <jtc331@gmail.com> wrote:
> > >
> > > While I haven't actually tracked down to guarantee this is handled
> > > elsewhere, a thought experiment -- I think -- shows it must be so.
> > > Here's why: suppose we don't have a limit here, but the query return
> > > order is different in different backends. Then we would have the same
> > > problem you bring up. In that case this code is already setting
> > > consider_parallel=true on the rel. So I don't think we're changing any
> > > behavior here.
> > >
> >
> > AFAICS, the patch seems very reasonable and specifically targets
> > lateral subqueries with limit/offset. It seems like the uncorrelated
> > case is the only real concern.
> > I generally agree that the current patch is probably not changing any
> > behavior in the uncorrelated case (and like yourself, haven't yet
> > found a case for which it breaks), but I'm not sure Brian's concerns
> > can be ruled out entirely.
> >
> > How about a minor update to the patch to make it slightly more
> > restrictive, to exclude the case when there are no lateral
> > cross-references, so we'll be allowing parallelism only when we know
> > the lateral subquery will be evaluated anew for each source row?
> > I was thinking of the following patch modification:
> >
> > BEFORE:
> > -                if (limit_needed(subquery))
> > +               if (!rte->lateral && limit_needed(subquery))
> >
> > AFTER:
> > -                if (limit_needed(subquery))
> > +               if ((!rte->lateral || bms_is_empty(rel->lateral_relids)) &&
> > +                     limit_needed(subquery))
> >
> >
> > Thoughts?
>
> Apologies for the delayed response; this seems fine to me. I've
> attached patch v2.

Greg,

Do you believe this is now ready for committer?

Thanks,
James Coleman



Re: Consider parallel for lateral subqueries with limit

From: Greg Nancarrow
Date:
On Thu, Nov 4, 2021 at 12:49 AM James Coleman <jtc331@gmail.com> wrote:
>
> Greg,
>
> Do you believe this is now ready for committer?
>

The patch LGTM.
I have set the status to "Ready for Committer".

Regards,
Greg Nancarrow
Fujitsu Australia



Re: Consider parallel for lateral subqueries with limit

From: Tom Lane
Date:
Greg Nancarrow <gregn4422@gmail.com> writes:
> The patch LGTM.
> I have set the status to "Ready for Committer".

I don't really see why this patch is even a little bit safe.
The argument for it seems to be that a lateral subquery will
necessarily be executed in such a way that each complete iteration
of the subquery, plus joining to its outer rel, happens within a
single worker ... but where is the guarantee of that?  Once
you've marked the rel as parallel-safe, the planner is free to
consider all sorts of parallel join structures.  I'm afraid this
would be easily broken as soon as you look at cases with three or
more rels.  Or maybe even just two.  The reason for the existing
restriction boils down to this structure being unsafe:

    Gather
        NestLoop
            Scan ...
            Limit
                Scan ...

and I don't see how the existence of a lateral reference
makes it any safer.

            regards, tom lane



Re: Consider parallel for lateral subqueries with limit

From: James Coleman
Date:
On Tue, Jan 4, 2022 at 5:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Greg Nancarrow <gregn4422@gmail.com> writes:
> > The patch LGTM.
> > I have set the status to "Ready for Committer".
>
> I don't really see why this patch is even a little bit safe.
> The argument for it seems to be that a lateral subquery will
> necessarily be executed in such a way that each complete iteration
> of the subquery, plus joining to its outer rel, happens within a
> single worker ... but where is the guarantee of that?  Once
> you've marked the rel as parallel-safe, the planner is free to
> consider all sorts of parallel join structures.  I'm afraid this
> would be easily broken as soon as you look at cases with three or
> more rels.  Or maybe even just two.  The reason for the existing
> restriction boils down to this structure being unsafe:
>
>     Gather
>         NestLoop
>             Scan ...
>             Limit
>                 Scan ...
>
> and I don't see how the existence of a lateral reference
> makes it any safer.

Thanks for taking a look. I'm not following how the structure you
posited is inherently unsafe when it's a lateral reference. That
limit/scan (if lateral) has to be being executed per tuple in the
outer scan, right? And if it's a unique execution per tuple, then
consistency across tuples (that are in different workers) can't be a
concern.

Is there a scenario I'm missing where lateral can currently be
executed in that way in that structure (or a different one)?

Thanks,
James Coleman



Re: Consider parallel for lateral subqueries with limit

From: James Coleman
Date:
On Tue, Jan 4, 2022 at 9:59 PM James Coleman <jtc331@gmail.com> wrote:
>
> On Tue, Jan 4, 2022 at 5:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > Greg Nancarrow <gregn4422@gmail.com> writes:
> > > The patch LGTM.
> > > I have set the status to "Ready for Committer".
> >
> > I don't really see why this patch is even a little bit safe.
> > The argument for it seems to be that a lateral subquery will
> > necessarily be executed in such a way that each complete iteration
> > of the subquery, plus joining to its outer rel, happens within a
> > single worker ... but where is the guarantee of that?  Once
> > you've marked the rel as parallel-safe, the planner is free to
> > consider all sorts of parallel join structures.  I'm afraid this
> > would be easily broken as soon as you look at cases with three or
> > more rels.  Or maybe even just two.  The reason for the existing
> > restriction boils down to this structure being unsafe:
> >
> >     Gather
> >         NestLoop
> >             Scan ...
> >             Limit
> >                 Scan ...
> >
> > and I don't see how the existence of a lateral reference
> > makes it any safer.
>
> Thanks for taking a look. I'm not following how the structure you
> posited is inherently unsafe when it's a lateral reference. That
> limit/scan (if lateral) has to be being executed per tuple in the
> outer scan, right? And if it's a unique execution per tuple, then
> consistency across tuples (that are in different workers) can't be a
> concern.
>
> Is there a scenario I'm missing where lateral can currently be
> executed in that way in that structure (or a different one)?

To expand on this:

Suppose lateral is not in play. Then if we have a plan like:

    Gather
        NestLoop
            Scan X
            Limit
                Scan Y

Because we have the result "X join Limit(Y)"  we need "Limit(Y)" to be
consistent across all of the possible executions of "Limit(Y)" (i.e.,
in each worker it executes in). That means (absent infrastructure for
guaranteeing a unique ordering) we obviously can't parallelize the
inner side of the join as the limit may be applied in different ways
in each worker's execution.

Now suppose lateral is in play. Then (given the same plan) instead of
our result being "X join Limit(Y)" the result is "X join Limit(Y sub
X)", that is, each row in X is joined to a unique invocation of
"Limit(Y)". In this case we are already conceivably getting different
results for each execution of the subquery "Limit(Y)" even if we're
not running those executions across multiple workers. Whether we've
optimized such a subquery into running a single execution per row in X
or have managed to optimize it in such a way that a single execution
of "Limit(Y)" may be shared by multiple rows in X makes no difference
because there are no relational guarantees that that is the case
(conceptually each row in X gets its own results for "Limit(Y)").

I've not been able to come up with a scenario where this doesn't hold
-- even if part of the join or the subquery execution happens in a
different worker. I believe that for there to be a parallel safety
problem here you'd need to have a given subquery execution for a row
in X be executed multiple times. Alternatively I've been trying to
reason about whether there might be a scenario where a 3rd rel is
involved (i.e., the inner rel is itself a join rel), but as far as I
can tell the moment we end up with a join structure such that the
lateral rel is the one with the limit we'd be back to being safe again
for the reasons claimed earlier.

If there's something obvious (or not so obvious) I'm missing, I'd
appreciate a counterexample, but I'm currently unable to falsify my
original claim that the lateral reference is a fundamental difference
that allows us to consider this parallel safe.

Thanks,
James Coleman



Re: Consider parallel for lateral subqueries with limit

From: Tom Lane
Date:
James Coleman <jtc331@gmail.com> writes:
> On Tue, Jan 4, 2022 at 9:59 PM James Coleman <jtc331@gmail.com> wrote:
>> On Tue, Jan 4, 2022 at 5:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> I don't really see why this patch is even a little bit safe.

> Suppose lateral is not in play. Then if we have a plan like:

>     Gather
>         NestLoop
>             Scan X
>             Limit
>                 Scan Y

> Because we have the result "X join Limit(Y)"  we need "Limit(Y)" to be
> consistent across all of the possible executions of "Limit(Y)" (i.e.,
> in each worker it executes in). That means (absent infrastructure for
> guaranteeing a unique ordering) we obviously can't parallelize the
> inner side of the join as the limit may be applied in different ways
> in each worker's execution.

> Now suppose lateral is in play. Then (given the same plan) instead of
> our result being "X join Limit(Y)" the result is "X join Limit(Y sub
> X)", that is, each row in X is joined to a unique invocation of
> "Limit(Y)".

This argument seems to be assuming that Y is laterally dependent on X,
but the patch as written will take *any* lateral dependency as a
get-out-of-jail-free card.  If what we have is "Limit(Y sub Z)" where
Z is somewhere else in the query tree, it's not apparent to me that
your argument holds.

But more generally, I don't think you've addressed the fundamental
concern, which is that a query involving Limit is potentially
nondeterministic (if it lacks a fully-deterministic ORDER BY),
so that different workers could get different answers from it if
they're using a plan type that permits that to happen.  (See the
original discussion that led to 75f9c4ca5, at [1].)  I do not see
how a lateral dependency removes that hazard.  The bug report that
started the original discussion hit the problem because it
generated a plan like

Gather
   ->  Hash Semi Join
         ->  Parallel Seq Scan
         ->  Hash
               ->  Limit
                     ->  Seq Scan

We didn't make the submitter drill down far enough to verify
exactly why he got nondeterministic results from the Limit, but
I suppose the reason was that the table was big enough to trigger
"synchronize_seqscans" behavior, allowing different workers to
read different parts of that table.  Now, that particular case
didn't have any lateral dependency, but if there was one it'd
just have resulted in changing the hash join to a nestloop join,
and the nondeterminism hazard would be exactly the same AFAICS.

> In this case we are already conceivably getting different
> results for each execution of the subquery "Limit(Y)" even if we're
> not running those executions across multiple workers.

That seems to be about the same argument Andres made initially
in the old thread, but we soon shot that down as not being the
level of guarantee we want to provide.  There's nothing in the
SQL standard that says that
    select * from events where account in
        (select account from events
         where data->>'page' = 'success.html' limit 3);
(the original problem query) shall execute the sub-query
only once, but people expect it to act that way.

If you want to improve this area, my feeling is that it'd be
better to look into what was speculated about in the old
thread: LIMIT doesn't create nondeterminism if the query has
an ORDER BY that imposes a unique row ordering, ie
order-by-primary-key.  We didn't have planner infrastructure
that would allow checking that cheaply in 2018, but maybe
there is some now?
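
For illustration -- supposing, purely hypothetically, that the events
table had a primary-key column "id" -- the sub-query becomes
deterministic once the ordering is unique:

    select * from events where account in
        (select account from events
         where data->>'page' = 'success.html'
         order by id limit 3);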

            regards, tom lane

[1] https://www.postgresql.org/message-id/flat/153417684333.10284.11356259990921828616%40wrigleys.postgresql.org



Re: Consider parallel for lateral subqueries with limit

From: Jacob Champion
Date:
This entry has been waiting on author input for a while (our current
threshold is roughly two weeks), so I've marked it Returned with
Feedback.

Once you think the patchset is ready for review again, you (or any
interested party) can resurrect the patch entry by visiting

    https://commitfest.postgresql.org/38/2851/

and changing the status to "Needs Review", and then changing the
status again to "Move to next CF". (Don't forget the second step;
hopefully we will have streamlined this in the near future!)

Thanks,
--Jacob



Re: Consider parallel for lateral subqueries with limit

From: James Coleman
Date:
On Tue, Mar 1, 2022 at 5:35 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> But more generally, I don't think you've addressed the fundamental
> concern, which is that a query involving Limit is potentially
> nondeterministic (if it lacks a fully-deterministic ORDER BY),
> so that different workers could get different answers from it if
> they're using a plan type that permits that to happen.  (See the
> original discussion that led to 75f9c4ca5, at [1].)  I do not see
> how a lateral dependency removes that hazard.
> ...
> > In this case we are already conceivably getting different
> > results for each execution of the subquery "Limit(Y)" even if we're
> > not running those executions across multiple workers.
>
> That seems to be about the same argument Andres made initially
> in the old thread, but we soon shot that down as not being the
> level of guarantee we want to provide.  There's nothing in the
> SQL standard that says that
>     select * from events where account in
>         (select account from events
>          where data->>'page' = 'success.html' limit 3);
> (the original problem query) shall execute the sub-query
> only once, but people expect it to act that way.

I understand that particular case being a bit of a gotcha (i.e., what
we would naturally expect exceeds what the spec says).

But in the case where there's correlation via LATERAL we already don't
guarantee unique executions for a given set of params into the lateral
subquery execution, right? For example, suppose we have:

  select *
  from foo
  left join lateral (
    select n
    from bar
    where bar.a = foo.a
    limit 1
  ) on true

and suppose that foo.a (in the table scan) returns these values in
order: 1, 2, 1. In that case we'll execute the lateral subquery 3
separate times rather than attempting to order the results of foo.a
such that we can re-execute the subquery only when the param changes
to a new unique value (and we definitely don't cache the results to
guarantee unique subquery executions).
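
A minimal setup for that scenario, purely as a sketch (the column
names match the query above):

create table foo (a int);
create table bar (a int, n int);
insert into foo values (1), (2), (1);

With that data the lateral subquery runs three separate times, once
per row of foo, even though two of the rows share a = 1.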

So, assuming that we can guarantee we're talking about the proper
lateral reference (not something unrelated as you pointed out earlier)
then I don't believe we're actually changing even implicit guarantees
about how the query executes. I think that's probably true both in
theory (I claim that once someone declares a lateral join they're
definitionally expecting a subquery execution per outer row) and in
practice (e.g., your hypothesis about synchronized seq scans would
apply here also to subsequent executions of the subquery).

Is there still something I'm missing about the concern you have?

> If you want to improve this area, my feeling is that it'd be
> better to look into what was speculated about in the old
> thread: LIMIT doesn't create nondeterminism if the query has
> an ORDER BY that imposes a unique row ordering, ie
> order-by-primary-key.  We didn't have planner infrastructure
> that would allow checking that cheaply in 2018, but maybe
> there is some now?

Assuming what I argued earlier holds true for lateral [at minimum when
correlated] subquery execution, then I believe your suggestion here is
orthogonal and would expand the use cases even more. For example, if
we were able to guarantee a unique result set (including order), then
we could allow parallelizing subqueries even if they're not lateral
and correlated.

James Coleman



Re: Consider parallel for lateral subqueries with limit

From: Robert Haas
Date:
On Mon, Sep 19, 2022 at 3:58 PM James Coleman <jtc331@gmail.com> wrote:
> But in the case where there's correlation via LATERAL we already don't
> guarantee unique executions for a given set of params into the lateral
> subquery execution, right? For example, suppose we have:
>
>   select *
>   from foo
>   left join lateral (
>     select n
>     from bar
>     where bar.a = foo.a
>     limit 1
>   ) on true
>
> and suppose that foo.a (in the table scan) returns these values in
> order: 1, 2, 1. In that case we'll execute the lateral subquery 3
> separate times rather than attempting to order the results of foo.a
> such that we can re-execute the subquery only when the param changes
> to a new unique value (and we definitely don't cache the results to
> guarantee unique subquery executions).

I think this is true, but I don't really understand why we should
focus on LATERAL here. What we really need, and I feel like we've
talked about this before, is a way to reason about where parameters
are set and used. Your sample query gets a plan like this:

 Nested Loop Left Join  (cost=0.00..1700245.00 rows=10000 width=8)
   ->  Seq Scan on foo  (cost=0.00..145.00 rows=10000 width=4)
   ->  Limit  (cost=0.00..170.00 rows=1 width=4)
         ->  Seq Scan on bar  (cost=0.00..170.00 rows=1 width=4)
               Filter: (foo.a = a)

If this were to occur inside a larger plan tree someplace, it would be
OK to insert a Gather node above the Nested Loop node without doing
anything further, because then the parameter that stores foo.a would
be both set and used in the worker. If you wanted to insert a Gather
at any other place in this plan, things get more complicated. But just
because you have LATERAL doesn't mean that you have this problem,
because if you delete the "limit 1" then the subqueries get flattened
together and the parameter disappears, and if you delete the lateral
reference (i.e. WHERE foo.a = bar.a) then there's still a subquery but
it no longer refers to an outer parameter. And on the flip side just
because you don't have LATERAL doesn't mean that you don't have this
problem. e.g. the query could instead be:

select *, (select n from bar where bar.a = foo.a limit 1) from foo;

...which I think is pretty much equivalent to your formulation and has
the same problem as far as parallel query as your formulation but does
not involve the LATERAL keyword.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Consider parallel for lateral subqueries with limit

From: James Coleman
Date:
On Mon, Sep 19, 2022 at 4:29 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Mon, Sep 19, 2022 at 3:58 PM James Coleman <jtc331@gmail.com> wrote:
> > But in the case where there's correlation via LATERAL we already don't
> > guarantee unique executions for a given set of params into the lateral
> > subquery execution, right? For example, suppose we have:
> >
> >   select *
> >   from foo
> >   left join lateral (
> >     select n
> >     from bar
> >     where bar.a = foo.a
> >     limit 1
> >   ) on true
> >
> > and suppose that foo.a (in the table scan) returns these values in
> > order: 1, 2, 1. In that case we'll execute the lateral subquery 3
> > separate times rather than attempting to order the results of foo.a
> > such that we can re-execute the subquery only when the param changes
> > to a new unique value (and we definitely don't cache the results to
> > guarantee unique subquery executions).
>
> I think this is true, but I don't really understand why we should
> focus on LATERAL here. What we really need, and I feel like we've
> talked about this before, is a way to reason about where parameters
> are set and used.

Yes, over in the thread "Parallelize correlated subqueries that
execute within each worker" [1], which was meant to build on top of
this work (though it is technically separate). Your bringing it up
here too is making me wonder if we can combine the two and, instead of
always allowing subqueries with LIMIT (like the other patch does),
delay final determination of parallel safety of rels and paths
(perhaps, as that thread is discussing, until gather/gather merge path
creation).

Side note: I'd kinda like a way (maybe a development GUC or even a
compile time option) to have EXPLAIN show where params are set. If
something like that exists, egg on my face I guess, but I'd love to
know about it.

> Your sample query gets a plan like this:
>
>  Nested Loop Left Join  (cost=0.00..1700245.00 rows=10000 width=8)
>    ->  Seq Scan on foo  (cost=0.00..145.00 rows=10000 width=4)
>    ->  Limit  (cost=0.00..170.00 rows=1 width=4)
>          ->  Seq Scan on bar  (cost=0.00..170.00 rows=1 width=4)
>                Filter: (foo.a = a)
>
> If this were to occur inside a larger plan tree someplace, it would be
> OK to insert a Gather node above the Nested Loop node without doing
> anything further, because then the parameter that stores foo.a would
> be both set and used in the worker. If you wanted to insert a Gather
> at any other place in this plan, things get more complicated. But just
> because you have LATERAL doesn't mean that you have this problem,
> because if you delete the "limit 1" then the subqueries get flattened
> together and the parameter disappears,

For future reference in this email thread when you remove the "limit
1" this is the plan you get:

 Merge Right Join  (cost=372.18..815.71 rows=28815 width=8)
   Merge Cond: (bar.a = foo.a)
   ->  Sort  (cost=158.51..164.16 rows=2260 width=8)
         Sort Key: bar.a
         ->  Seq Scan on bar  (cost=0.00..32.60 rows=2260 width=8)
   ->  Sort  (cost=179.78..186.16 rows=2550 width=4)
         Sort Key: foo.a
         ->  Seq Scan on foo  (cost=0.00..35.50 rows=2550 width=4)

Just to make sure I'm following: by "doesn't mean that you have this
problem" you mean "doesn't mean you have this limitation on parallel
query"?

The "flattening out" (conversion to join between two table scans
executed a single time each) is an interesting case because I consider
that to be "just" a performance optimization, and therefore I don't
think anything about the guarantees a user expects should change. But
interestingly: it does end up providing stronger guarantees about the
query results than it would if the conversion didn't happen (the
subquery results in only a single table scan whereas without the
change a scan per outer row means it's *possible* to get different
results across different outer rows even with the same join key
value).

> and if you delete the lateral
> reference (i.e. WHERE foo.a = bar.a) then there's still a subquery but
> it no longer refers to an outer parameter. And on the flip side just
> because you don't have LATERAL doesn't mean that you don't have this
> problem. e.g. the query could instead be:
>
> select *, (select n from bar where bar.a = foo.a limit 1) from foo;
>
> ...which I think is pretty much equivalent to your formulation and has
> the same problem as far as parallel query as your formulation but does
> not involve the LATERAL keyword.

Yes, that's a good point too. I need to play with these examples and
confirm whether lateral_relids gets set in that case. IIRC when that's
set isn't exactly the same as whether or not the LATERAL keyword is
used, and I should clarify that my claims here are meant to be about
when we execute it that way regardless of the keyword usage. The
keyword usage I'd assumed just made it easier to talk about, but maybe
you're implying that it's actually generating confusion.

James Coleman

1: https://www.postgresql.org/message-id/CA%2BTgmoYXm2NCLt1nikWfYj1_r3%3DfsoNCHCtDVdN7X1uX_xuXgw%40mail.gmail.com



Re: Consider parallel for lateral subqueries with limit

From: James Coleman
Date:
On Thu, Sep 22, 2022 at 5:19 PM James Coleman <jtc331@gmail.com> wrote:
>
> On Mon, Sep 19, 2022 at 4:29 PM Robert Haas <robertmhaas@gmail.com> wrote:
> >
> > On Mon, Sep 19, 2022 at 3:58 PM James Coleman <jtc331@gmail.com> wrote:
> > > But in the case where there's correlation via LATERAL we already don't
> > > guarantee unique executions for a given set of params into the lateral
> > > subquery execution, right? For example, suppose we have:
> > >
> > >   select *
> > >   from foo
> > >   left join lateral (
> > >     select n
> > >     from bar
> > >     where bar.a = foo.a
> > >     limit 1
> > >   ) on true
> > >
> > > and suppose that foo.a (in the table scan) returns these values in
> > > order: 1, 2, 1. In that case we'll execute the lateral subquery 3
> > > separate times rather than attempting to order the results of foo.a
> > > such that we can re-execute the subquery only when the param changes
> > > to a new unique value (and we definitely don't cache the results to
> > > guarantee unique subquery executions).
> >
> > I think this is true, but I don't really understand why we should
> > focus on LATERAL here. What we really need, and I feel like we've
> > talked about this before, is a way to reason about where parameters
> > are set and used.
>
> Yes, over in the thread "Parallelize correlated subqueries that
> execute within each worker" [1] which was meant to build on top of
> this work (though is technically separate). Your bringing it up here
> too is making me wonder if we can combine the two and instead of
> always allowing subqueries with LIMIT instead (like the other patch
> does) delay final determination of parallel safety of rels and paths
> (perhaps, as that thread is discussing, until gather/gather merge path
> creation).

Upon further investigation and thought I believe it *might* be
possible to do what I'd wondered about above (delay the final
determination of parallel safety of the rel until later on in planning
by marking rels as tentatively safe and re-evaluating that as we go),
as my original patch on the referenced thread did. But that thread
also ended up with a much simpler proposed approach that still moves
the final parallel safety determination to later in the planner: it
checks (in generate_gather_paths and generate_useful_gather_paths)
whether that point in the plan tree supplies the params required by
the partial path.

So the current approach in that other thread is orthogonal to (if
complementary in some queries) the current question, which is "must we
immediately disallow parallelism on a rel that has a limit?"

Tom's concern about my arguments about special casing lateral was:

> This argument seems to be assuming that Y is laterally dependent on X,
> but the patch as written will take *any* lateral dependency as a
> get-out-of-jail-free card.  If what we have is "Limit(Y sub Z)" where
> Z is somewhere else in the query tree, it's not apparent to me that
> your argument holds.

I do see now that there was a now obvious flaw in the original patch:
rte->lateral may well be set to true even in cases where there's no
actual lateral dependency. That being said I don't see a need to
determine if the current subtree provides the required lateral
dependency, for the following reasons:

1. We don't want to set rel->consider_parallel to false immediately
because that will then poison everything higher in the tree, despite
the fact that it may well be that it's only higher up the plan tree
that the lateral dependency is provided. Regardless of the level in
the plan tree at which the param is provided we are going to execute
the subquery (even in serial execution) once per outer tuple at the
point in the join tree where the lateral join lies.
2. We're *not* at this point actually checking parallel safety of a
given path (i.e., is the path parallel safe at this point given the
params currently provided), we're only checking to see if the rel
itself should be entirely excluded from consideration for parallel
plans at any point in the future.
3. We already know that the question of whether or not a param is
provided can't be the concern here since it isn't under consideration
in the existing code path. That is, if a subquery doesn't have a limit
then this particular check won't determine that the subquery's
existence should result in setting rel->consider_parallel to false
because of any params it may or may not contain.
4. It is already the case that a subplan using exec params in the
where clause will not be considered parallel safe in path generation.

I believe the proper rule for when to exclude subqueries with limit
clauses is (as in the attached patch) to exclude them only when
rel->lateral_relids is empty. Here are several examples of queries and
plans showing how this code plays out, to attempt to validate that
hypothesis:

 select *
  from foo
  left join lateral (
    select n
    from bar
    where bar.a = foo.a
    limit 1
  ) on true;

 Nested Loop Left Join  (cost=0.00..8928.05 rows=2550 width=8)
   ->  Seq Scan on foo  (cost=0.00..35.50 rows=2550 width=4)
   ->  Limit  (cost=0.00..3.48 rows=1 width=4)
         ->  Seq Scan on bar  (cost=0.00..38.25 rows=11 width=4)
               Filter: (a = foo.a)

This was my base case for the past few emails. It hits the modified
code path with rte->lateral = true and
bms_num_members(rel->lateral_relids) = 1. The patch would allow the
subquery rel's consider_parallel to be set to true; however, with this
patch alone we still won't put the limit subquery within the
parallelized portion of the plan because of the exec param used in the
where clause.

 select *
  from foo
  left join lateral (
    select foo.a
    from bar
    limit 1
  ) on true;

 Nested Loop Left Join  (cost=0.00..97.78 rows=2550 width=8)
   ->  Seq Scan on foo  (cost=0.00..35.50 rows=2550 width=4)
   ->  Limit  (cost=0.00..0.01 rows=1 width=4)
         ->  Seq Scan on bar  (cost=0.00..32.60 rows=2260 width=4)

In this case the lateral reference is only in the target list of the
subquery, and this form is where the patch kicks in to allow placing
the gather node above the whole join tree (thus executing the limit
subquery within each worker).

select *, (select n from bar where bar.a = foo.a limit 1) from foo;

 Seq Scan on foo  (cost=0.00..8902.55 rows=2550 width=8)
   SubPlan 1
     ->  Limit  (cost=0.00..3.48 rows=1 width=4)
           ->  Seq Scan on bar  (cost=0.00..38.25 rows=11 width=4)
                 Filter: (a = foo.a)

Robert wondered if this was effectively the same thing, and it
definitely, in my opinion, ought to be the same--in terms of the
results you expect--as my original example. However this example
doesn't appear to hit this code path at all. We also already
parallelize this form of query (both when the outer tuple reference is
in the subquery's target list and when it's in the subquery's where
clause).

 select *
  from foo
  left join lateral (
    select n
    from bar
    where bar.a = foo.a
  ) on true;

 Merge Right Join  (cost=372.18..815.71 rows=28815 width=8)
   Merge Cond: (bar.a = foo.a)
   ->  Sort  (cost=158.51..164.16 rows=2260 width=8)
         Sort Key: bar.a
         ->  Seq Scan on bar  (cost=0.00..32.60 rows=2260 width=8)
   ->  Sort  (cost=179.78..186.16 rows=2550 width=4)
         Sort Key: foo.a
         ->  Seq Scan on foo  (cost=0.00..35.50 rows=2550 width=4)

Removing the limit results in the correlated subquery being rewritten
as a join. Because this rewrite removes the correlation (at least in
execution) this query also doesn't hit the modified code path at all.
I do find this one interesting because it's one of these examples of
how we already provide different guarantees around correlated subquery
execution (even when executing serially): when there's a limit here
the subquery executes multiple times (which means, for example, that
the results of the scan may be returned in a different order) but
without the limit it executes a single time (so the order is
guaranteed).

 select *
  from foo
  left join lateral (
    select n
    from bar
  ) on true;

 Nested Loop Left Join  (cost=0.00..140795.50 rows=5763000 width=8)
   ->  Seq Scan on foo  (cost=0.00..35.50 rows=2550 width=4)
   ->  Seq Scan on bar  (cost=0.00..32.60 rows=2260 width=4)

If we remove the correlation but retain the lateral keyword we also
don't hit this code path. By the way, the plan is also the same if you
remove the useless lateral keyword in this query.

 select *
  from foo
  where foo.a in (
    select bar.a from bar limit 1
 );

 Hash Semi Join  (cost=0.03..42.37 rows=13 width=4)
   Hash Cond: (foo.a = bar.a)
   ->  Seq Scan on foo  (cost=0.00..35.50 rows=2550 width=4)
   ->  Hash  (cost=0.01..0.01 rows=1 width=4)
         ->  Limit  (cost=0.00..0.01 rows=1 width=4)
               ->  Seq Scan on bar  (cost=0.00..32.60 rows=2260 width=4)

This is the query form Tom referenced from a bug report that
originally brought in the current code that excludes all subqueries
with limits from parallelization. This form does indeed hit the code
block in question, but rte->lateral is false and
bms_num_members(rel->lateral_relids) is 0 so it is unaffected by this
patch.

 select *
 from foo
 left join (
   select n
   from bar
   limit 1
 ) on true;

 Nested Loop Left Join  (cost=0.00..67.39 rows=2550 width=8)
   ->  Seq Scan on foo  (cost=0.00..35.50 rows=2550 width=4)
   ->  Materialize  (cost=0.00..0.02 rows=1 width=4)
         ->  Limit  (cost=0.00..0.01 rows=1 width=4)
               ->  Seq Scan on bar  (cost=0.00..32.60 rows=2260 width=4)

This query also reaches the line of code changed in this patch with
rte->lateral = false and bms_num_members(rel->lateral_relids) = 0.
The interesting thing about this query is
that if you set enable_material = off the materialize node goes away,
and we do not attempt to rewrite the query into something that would
execute the subquery only once, so this is a case where we already
don't provide the theoretically strongest possible guarantees, though
one could reasonably argue it's a bit artificial with materialization
turned off.

In summary we already allow parallelizing this type of execution
pattern if the subquery is in the select clause; we apply stricter
standards to all subqueries in from clauses. I believe the prohibition
on parallelizing subqueries with limits in from clauses was
unnecessarily restrictive when it was added. When we have lateral
dependencies we execute the query in the same way we would when the
subquery is in the target list (i.e., per tuple at that point in the
join tree), and so we should be able to parallelize those subqueries
in the from clause just like we already do when they are in the target
list.

In the attached series the first patch adds a bunch of new tests to
show a bunch of permutations of queries. Most of the added test
queries don't end up changing with the 2nd patch applied (the actual
code changes)  so that you can easily see the narrow scope of what's
affected. I don't envision most of the tests sticking around if this
is committed, but hopefully it provides a helpful way to evaluate the
semantics of the change I'm proposing.

Thanks,
James Coleman

Attachments

Re: Consider parallel for lateral subqueries with limit

From: Robert Haas
Date:
On Thu, Sep 22, 2022 at 5:19 PM James Coleman <jtc331@gmail.com> wrote:
> > Your sample query gets a plan like this:
> >
> >  Nested Loop Left Join  (cost=0.00..1700245.00 rows=10000 width=8)
> >    ->  Seq Scan on foo  (cost=0.00..145.00 rows=10000 width=4)
> >    ->  Limit  (cost=0.00..170.00 rows=1 width=4)
> >          ->  Seq Scan on bar  (cost=0.00..170.00 rows=1 width=4)
> >                Filter: (foo.a = a)
> >
> > If this were to occur inside a larger plan tree someplace, it would be
> > OK to insert a Gather node above the Nested Loop node without doing
> > anything further, because then the parameter that stores foo.a would
> > be both set and used in the worker. If you wanted to insert a Gather
> > at any other place in this plan, things get more complicated. But just
> > because you have LATERAL doesn't mean that you have this problem,
> > because if you delete the "limit 1" then the subqueries get flattened
> > together and the parameter disappears,
>
> For future reference in this email thread when you remove the "limit
> 1" this is the plan you get:
>
>  Merge Right Join  (cost=372.18..815.71 rows=28815 width=8)
>    Merge Cond: (bar.a = foo.a)
>    ->  Sort  (cost=158.51..164.16 rows=2260 width=8)
>          Sort Key: bar.a
>          ->  Seq Scan on bar  (cost=0.00..32.60 rows=2260 width=8)
>    ->  Sort  (cost=179.78..186.16 rows=2550 width=4)
>          Sort Key: foo.a
>          ->  Seq Scan on foo  (cost=0.00..35.50 rows=2550 width=4)
>
> Just to make sure I'm following: by "doesn't mean that you have this
> problem" you mean "doesn't mean you have this limitation on parallel
> query"?

I'm talking specifically about whether there's a parameter. The
problem here isn't created by LATERAL, but by parameters. In the
nested loop plan, there's a parameter that's storing foo.a, and the
storage location that holds that parameter value is in backend-private
memory, so you can't set the value in the leader and then use it in a
worker, and that restricts where you can insert a Gather node. But in
the Merge Join plan (or if you got a Hash Join plan) there is no
parameter. So the fact that parameter storage isn't shared between
leaders and workers does not matter.

> Yes, that's a good point too. I need to play with these examples and
> confirm whether lateral_relids gets set in that case. IIRC when that's
> set isn't exactly the same as whether or not the LATERAL keyword is
> used, and I should clarify that my claims here are meant to be about
> when we execute it that way regardless of the keyword usage. The
> keyword usage I'd assumed just made it easier to talk about, but maybe
> you're implying that it's actually generating confusion.

Yes, I think so.

Stepping back a bit, commit 75f9c4ca5a8047d7a9cfbc7d51a610933d04dc7f
introduced the code that is at issue here, and it took the position
that limit/offset should be marked parallel-restricted because they're
nondeterministic. I'm not sure that's really quite the right thing.
The original idea behind having things be parallel-restricted was that
there might be stuff which depends on state in the leader that is not
replicated to or shared with the workers, and thus it must be executed
in the leader to get the right answer. Here, however, there is no such
problem. Something like LIMIT/OFFSET that is nondeterministic is
perfectly safe to execute in a worker, or in the leader. It's just not
safe to execute the same computation more than once and assume that we
will get the same answer each time. But that's really a different
problem.

And the problem that you've run into here, I think, is that if a Limit
node is getting fed a parameter from higher up in the query plan, then
it's not really the same computation being performed every time, and
thus the non-determinism doesn't matter, and thus the
parallel-restriction goes away, but the code doesn't know that.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Consider parallel for lateral subqueries with limit

From: James Coleman
Date:
On Mon, Sep 26, 2022 at 10:37 AM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Thu, Sep 22, 2022 at 5:19 PM James Coleman <jtc331@gmail.com> wrote:
> > > Your sample query gets a plan like this:
> > >
> > >  Nested Loop Left Join  (cost=0.00..1700245.00 rows=10000 width=8)
> > >    ->  Seq Scan on foo  (cost=0.00..145.00 rows=10000 width=4)
> > >    ->  Limit  (cost=0.00..170.00 rows=1 width=4)
> > >          ->  Seq Scan on bar  (cost=0.00..170.00 rows=1 width=4)
> > >                Filter: (foo.a = a)
> > >
> > > If this were to occur inside a larger plan tree someplace, it would be
> > > OK to insert a Gather node above the Nested Loop node without doing
> > > anything further, because then the parameter that stores foo.a would
> > > be both set and used in the worker. If you wanted to insert a Gather
> > > at any other place in this plan, things get more complicated. But just
> > > because you have LATERAL doesn't mean that you have this problem,
> > > because if you delete the "limit 1" then the subqueries get flattened
> > > together and the parameter disappears,
> >
> > For future reference in this email thread when you remove the "limit
> > 1" this is the plan you get:
> >
> >  Merge Right Join  (cost=372.18..815.71 rows=28815 width=8)
> >    Merge Cond: (bar.a = foo.a)
> >    ->  Sort  (cost=158.51..164.16 rows=2260 width=8)
> >          Sort Key: bar.a
> >          ->  Seq Scan on bar  (cost=0.00..32.60 rows=2260 width=8)
> >    ->  Sort  (cost=179.78..186.16 rows=2550 width=4)
> >          Sort Key: foo.a
> >          ->  Seq Scan on foo  (cost=0.00..35.50 rows=2550 width=4)
> >
> > Just to make sure I'm following: by "doesn't mean that you have this
> > problem" you mean "doesn't mean you have this limitation on parallel
> > query"?
>
> I'm talking specifically about whether there's a parameter. The
> problem here isn't created by LATERAL, but by parameters. In the
> nested loop plan, there's a parameter that's storing foo.a, and the
> storage location that holds that parameter value is in backend-private
> memory, so you can't set the value in the leader and then use it in a
> worker, and that restricts where you can insert a Gather node. But in
> the Merge Join plan (or if you got a Hash Join plan) there is no
> parameter. So the fact that parameter storage isn't shared between
> leaders and workers does not matter.

Ah, yes, right.

> > Yes, that's a good point too. I need to play with these examples and
> > confirm whether lateral_relids gets set in that case. IIRC when that's
> > set isn't exactly the same as whether or not the LATERAL keyword is
> > used, and I should clarify that my claims here are meant to be about
> > when we execute it that way regardless of the keyword usage. The
> > keyword usage I'd assumed just made it easier to talk about, but maybe
> > you're implying that it's actually generating confusion.
>
> Yes, I think so.
>
> Stepping back a bit, commit 75f9c4ca5a8047d7a9cfbc7d51a610933d04dc7f
> introduced the code that is at issue here, and it took the position
> that limit/offset should be marked parallel-restricted because they're
> nondeterministic. I'm not sure that's really quite the right thing.
> The original idea behind having things be parallel-restricted was that
> there might be stuff which depends on state in the leader that is not
> replicated to or shared with the workers, and thus it must be executed
> in the leader to get the right answer. Here, however, there is no such
> problem. Something like LIMIT/OFFSET that is nondeterministic is
> perfectly safe to execute in a worker, or in the leader. It's just not
> safe to execute the same computation more than once and assume that we
> will get the same answer each time. But that's really a different
> problem.
>
> And the problem that you've run into here, I think, is that if a Limit
> node is getting fed a parameter from higher up in the query plan, then
> it's not really the same computation being performed every time, and
> thus the non-determinism doesn't matter, and thus the
> parallel-restriction goes away, but the code doesn't know that.

Correct. I think the simplest description of the proper limitation is
that we can't parallelize when either:
1. The param comes from outside the worker, or
2. The "same" param value from the same tuple is computed in multiple
workers (as distinct from the same value from different outer tuples).
Or, to put it positively, we can parallelize when:
1. There are no params, or
2. The param is uniquely generated and used inside a single worker.

I believe the followup email I sent (with an updated patch) details
the correctness of that approach; I'd be interested in your feedback
on that (I recognize it's quite a long email, though...) if you get a
chance.

Thanks for your review on this so far,
James Coleman



Re: Consider parallel for lateral subqueries with limit

From: "Gregory Stark (as CFM)"
Date:
This patch has been "Needs Review" and bouncing from CF to CF. It
actually looks like it got quite a bit of design discussion and while
James Coleman seems convinced of its safety it doesn't sound like Tom
Lane and Robert Haas are yet and it doesn't look like that's going to
happen in this CF.

I think I'm going to mark this Returned With Feedback on the basis
that it did actually get a significant amount of discussion and no
further review seems to be forthcoming. Perhaps a more rigorous
approach to proving the correctness, or an easier-to-reason-about set
of constraints, would make it easier to reach a consensus?

Once you think the patchset is ready for review again, you (or any
interested party) can resurrect the patch entry by visiting

    https://commitfest.postgresql.org/42/2851/

and changing the status to "Needs Review", and then changing the
status again to "Move to next CF". (Don't forget the second step;
hopefully we will have streamlined this in the near future!)

-- 
Gregory Stark
As Commitfest Manager