Re: Oversight in reparameterize_path_by_child leading to executor crash

Поиск
Список
Период
Сортировка
От Ashutosh Bapat
Тема Re: Oversight in reparameterize_path_by_child leading to executor crash
Дата
Msg-id CAExHW5vJpm4t9JU+6FF6TcqM3Z3D2SHcQerabtm_wBhUhQTmoQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Oversight in reparameterize_path_by_child leading to executor crash  (Richard Guo <guofenglinux@gmail.com>)
Список pgsql-hackers
On Wed, Aug 23, 2023 at 11:07 AM Richard Guo <guofenglinux@gmail.com> wrote:
>
>
> On Tue, Aug 22, 2023 at 10:39 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>
>> Richard Guo <guofenglinux@gmail.com> writes:
>> > I'm wondering if we can instead adjust the 'inner_req_outer' in
>> > create_nestloop_path before we perform the check to work around this
>> > issue for the back branches, something like
>> > +   /*
>> > +    * Adjust the parameterization information, which refers to the topmost
>> > +    * parent.
>> > +    */
>> > +   inner_req_outer =
>> > +       adjust_child_relids_multilevel(root, inner_req_outer,
>> > +                                      outer_path->parent->relids,
>> > +                                      outer_path->parent->top_parent_relids);
>>
>> Mmm ... at the very least you'd need to not do that when top_parent_relids
>> is unset.
>
>
> Hmm, adjust_child_relids_multilevel would just return the given relids
> unchanged when top_parent_relids is unset, so I think it should be safe
> to call it even for non-other rel.
>
>>
>> ...  But I think the risk/reward ratio for messing with this in the
>> back branches is unattractive in any case: to fix a corner case that
>> apparently nobody uses in the field, we risk breaking any number of
>> mainstream parameterized-path cases.  I'm content to commit the v5 patch
>> (or a successor) into HEAD, but at this point I'm not sure I even want
>> to risk it in v16, let alone perform delicate surgery to get it to work
>> in older branches.  I think we ought to go with the "tablesample scans
>> can't be reparameterized" approach in v16 and before.
>
>
> If we go with the "tablesample scans can't be reparameterized" approach
> in the back branches, I'm a little concerned that what if we find more
> cases in the futrue where we need modify RTEs for reparameterization.
> So I spent some time seeking and have managed to find one: there might
> be lateral references in a scan path's restriction clauses, and
> currently reparameterize_path_by_child fails to adjust them.
>
> regression=# explain (verbose, costs off)
> select count(*) from prt1 t1 left join lateral
>     (select t1.b as t1b, t2.* from prt2 t2) s
>     on t1.a = s.b where s.t1b = s.a;
>                               QUERY PLAN
> ----------------------------------------------------------------------
>  Aggregate
>    Output: count(*)
>    ->  Append
>          ->  Nested Loop
>                ->  Seq Scan on public.prt1_p1 t1_1
>                      Output: t1_1.a, t1_1.b
>                ->  Index Scan using iprt2_p1_b on public.prt2_p1 t2_1
>                      Output: t2_1.b
>                      Index Cond: (t2_1.b = t1_1.a)
>                      Filter: (t1.b = t2_1.a)
>          ->  Nested Loop
>                ->  Seq Scan on public.prt1_p2 t1_2
>                      Output: t1_2.a, t1_2.b
>                ->  Index Scan using iprt2_p2_b on public.prt2_p2 t2_2
>                      Output: t2_2.b
>                      Index Cond: (t2_2.b = t1_2.a)
>                      Filter: (t1.b = t2_2.a)
>          ->  Nested Loop
>                ->  Seq Scan on public.prt1_p3 t1_3
>                      Output: t1_3.a, t1_3.b
>                ->  Index Scan using iprt2_p3_b on public.prt2_p3 t2_3
>                      Output: t2_3.b
>                      Index Cond: (t2_3.b = t1_3.a)
>                      Filter: (t1.b = t2_3.a)
> (24 rows)
>
> The clauses in 'Filter' are not adjusted correctly.  Var 't1.b' still
> refers to the top parent.
>
> For this query it would just give wrong results.
>
> regression=# set enable_partitionwise_join to off;
> SET
> regression=# select count(*) from prt1 t1 left join lateral
>     (select t1.b as t1b, t2.* from prt2 t2) s
>     on t1.a = s.b where s.t1b = s.a;
>  count
> -------
>    100
> (1 row)
>
> regression=# set enable_partitionwise_join to on;
> SET
> regression=# select count(*) from prt1 t1 left join lateral
>     (select t1.b as t1b, t2.* from prt2 t2) s
>     on t1.a = s.b where s.t1b = s.a;
>  count
> -------
>      5
> (1 row)
>
>
> For another query it would error out during planning.
>
> regression=# explain (verbose, costs off)
> select count(*) from prt1 t1 left join lateral
>     (select t1.b as t1b, t2.* from prt2 t2) s
>     on t1.a = s.b where s.t1b = s.b;
> ERROR:  variable not found in subplan target list
>
> To fix it we may need to modify RelOptInfos for Path, BitmapHeapPath,
> ForeignPath and CustomPath, and modify IndexOptInfos for IndexPath.  It
> seems that that is not easily done without postponing reparameterization
> of paths until createplan.c.

Maybe I am missing something here but why aren't we copying these
restrictinfos in the path somewhere?  I have a similar question for
tablesample clause as well.

--
Best Wishes,
Ashutosh Bapat



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Yugo NAGATA
Дата:
Сообщение: Re: pgbench: allow to exit immediately when any client is aborted
Следующее
От: Nathan Bossart
Дата:
Сообщение: Re: add timing information to pg_upgrade