Re: Question about pull_up_sublinks_qual_recurse

Поиск
Список
Период
Сортировка
От Andy Fan
Тема Re: Question about pull_up_sublinks_qual_recurse
Дата
Msg-id CAKU4AWoBQRw1_m524tLDnQyi8Rj2uzdy+Dsf32mC8YgsxkqABw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Question about pull_up_sublinks_qual_recurse  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers


On Sat, Oct 15, 2022 at 3:27 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andy Fan <zhihui.fan1213@gmail.com> writes:
> After some more self review,  I find my proposal has the following side
> effects.

Yeah, I do not think this works at all.  .... 
The discussion of outer join
reordering in optimizer/README says that that doesn't work, and while
I'm too lazy to construct an example right now, I believe it's true.

I came to this topic again recently and have finally figured out the
reason.  It looks to me that semi join is slightly different with outer
join in this case.

The following test case can show why we have to
pull_up_sublinks_qual_recurse to either LHS or RHS rather than the
JoinExpr.

create table t1(a int, b int, c int);
create table t2(a int, b int, c int);
create table t3(a int, b int, c int);

insert into t1 select 1, 1, 2;
insert into t2 select 1, 2, 1;
insert into t2 select 1, 1, 2;
insert into t3 select 1, 1, 2;

select * from t1
where exists (select 1 from t2
-- below references to t1 and t2 at the same time
where exists (select 1 from t3
   where t1.c = t2.c and t2.b = t3.b)  
and t1.a = t2.a);

which can be transformed to

SELECT * FROM t1 SEMI JOIN t2
ON t1.a = t2.a
AND exists (select 1 from t3
where t1.c = t2.c
    and t2.b = t3.b)

Here the semantics of the query is return the rows in T1 iff there is a
row in t2 matches the whole clause (t1.a = t2.a AND exists..);

But If we transform it to

SELECT * FROM (t1 SEMI JOIN t2
ON t1.a = t2.a) SEMI JOIN t3
on t1.c = t2.c and t2.b = t3.b;

The scan on T2 would stop if ONLY (t1.a = t2.a) matches and the following
rows will be ignored. However the matched rows may doesn't match the
exists clause! So in the above example, the correct result set will be 1 
row. If we pull up the sublink above the JoinExpr, no row would be found.

The attached is just a comment and a test case to help understand why we
have to do things like this.
 
--
Best Regards
Andy Fan
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andres Freund
Дата:
Сообщение: Re: Save a few bytes in pg_attribute
Следующее
От: Richard Guo
Дата:
Сообщение: Re: Fix typo plgsql to plpgsql.