Discussion: Support "Right Semi Join" plan shapes

Support "Right Semi Join" plan shapes

From: Richard Guo
Date:
In thread [1], which discussed 'Right Anti Join', Tom once mentioned
'Right Semi Join'.  After a preliminary investigation I think it is
beneficial and can be implemented with a very small change.  With 'Right
Semi Join', what we want is to keep just the first match for each
inner tuple.  For HashJoin, after scanning the hash bucket for matches
to the current outer tuple, we just need to check whether the inner tuple
has already been marked as matched and skip it if so.  For MergeJoin, we
can do it by not restoring the inner scan to the marked tuple in
EXEC_MJ_TESTOUTER when the new outer tuple == marked tuple.
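
A minimal sketch of the HashJoin case described above, written as a toy chained hash table rather than PostgreSQL executor code. All names here (ToyInnerTuple, ToyHashTable, toy_build, toy_right_semi_join) are hypothetical and are not taken from the patch; the only idea carried over is that an inner tuple is emitted on its first match and skipped once its match flag is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NBUCKETS 8

/* Hypothetical inner (hashed) tuple: a join key, a payload, a match flag. */
typedef struct ToyInnerTuple
{
    int         key;
    int         payload;
    bool        matched;        /* set once the tuple has been emitted */
    struct ToyInnerTuple *next; /* next tuple in the same bucket */
} ToyInnerTuple;

typedef struct ToyHashTable
{
    ToyInnerTuple *buckets[TOY_NBUCKETS];
} ToyHashTable;

static unsigned
toy_hash(int key)
{
    return (unsigned) key % TOY_NBUCKETS;
}

/* Build phase: link every inner tuple into its bucket, flags cleared. */
static void
toy_build(ToyHashTable *ht, ToyInnerTuple *tuples, size_t n)
{
    for (size_t i = 0; i < TOY_NBUCKETS; i++)
        ht->buckets[i] = NULL;
    for (size_t i = 0; i < n; i++)
    {
        unsigned    b = toy_hash(tuples[i].key);

        tuples[i].matched = false;
        tuples[i].next = ht->buckets[b];
        ht->buckets[b] = &tuples[i];
    }
}

/*
 * Probe phase: for each outer key, scan the bucket.  An inner tuple is
 * emitted on its first match only; later matches see the flag and skip it.
 * Returns the number of payloads written to out.
 */
static size_t
toy_right_semi_join(ToyHashTable *ht, const int *outer_keys, size_t nouter,
                    int *out, size_t out_cap)
{
    size_t      nout = 0;

    assert(ht != NULL && out != NULL);
    for (size_t i = 0; i < nouter; i++)
    {
        ToyInnerTuple *t = ht->buckets[toy_hash(outer_keys[i])];

        for (; t != NULL; t = t->next)
        {
            if (t->key != outer_keys[i])
                continue;       /* same bucket, different key */
            if (t->matched)
                continue;       /* already emitted for an earlier outer */
            t->matched = true;
            if (nout < out_cap)
                out[nout++] = t->payload;
        }
    }
    return nout;
}
```

With inner keys {1, 2, 2, 3} and outer keys {2, 2, 9, 1, 2}, this emits three inner tuples: both tuples with key 2, then the one with key 1. The repeated outer key 2 produces no duplicates, and the inner tuple with key 3 is never emitted.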

As that thread is already too long, I am forking a new thread and
attaching a patch for discussion.  The patch implements 'Right Semi Join'
for HashJoin.

[1] https://www.postgresql.org/message-id/CAMbWs4_eChX1bN%3Dvj0Uzg_7iz9Uivan%2BWjjor-X87L-V27A%2Brw%40mail.gmail.com

Thanks
Richard
Attachments

Re: Support "Right Semi Join" plan shapes

From: Richard Guo
Date:

On Tue, Apr 18, 2023 at 5:07 PM Richard Guo <guofenglinux@gmail.com> wrote:
> In thread [1], which discussed 'Right Anti Join', Tom once mentioned
> 'Right Semi Join'.  After a preliminary investigation I think it is
> beneficial and can be implemented with a very small change.  With 'Right
> Semi Join', what we want is to keep just the first match for each
> inner tuple.  For HashJoin, after scanning the hash bucket for matches
> to the current outer tuple, we just need to check whether the inner tuple
> has already been marked as matched and skip it if so.  For MergeJoin, we
> can do it by not restoring the inner scan to the marked tuple in
> EXEC_MJ_TESTOUTER when the new outer tuple == marked tuple.
>
> As that thread is already too long, I am forking a new thread and
> attaching a patch for discussion.  The patch implements 'Right Semi Join'
> for HashJoin.

The cfbot reports that this patch no longer applies, so I have rebased it
as v2.

Thanks
Richard
Attachments

Re: Support "Right Semi Join" plan shapes

From: Richard Guo
Date:

On Thu, Aug 10, 2023 at 3:24 PM Richard Guo <guofenglinux@gmail.com> wrote:
> The cfbot reports that this patch no longer applies, so I have rebased it
> as v2.

Attached is another rebase over the latest master.  Any feedback is
appreciated.

Thanks
Richard
Attachments

Re: Support "Right Semi Join" plan shapes

From: vignesh C
Date:
On Wed, 1 Nov 2023 at 11:25, Richard Guo <guofenglinux@gmail.com> wrote:
>
>
> On Thu, Aug 10, 2023 at 3:24 PM Richard Guo <guofenglinux@gmail.com> wrote:
>>
>> The cfbot reports that this patch no longer applies, so I have rebased it
>> as v2.
>
>
> Attached is another rebase over the latest master.  Any feedback is
> appreciated.

One of the tests in CFBot has failed at [1] with:
-   Relations: (public.ft1 t1) SEMI JOIN (public.ft2 t2)
-   Remote SQL: SELECT r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6,
r1.c7, r1.c8 FROM "S 1"."T 1" r1 WHERE ((r1."C 1" < 20)) AND EXISTS
(SELECT NULL FROM "S 1"."T 1" r3 WHERE ((r3."C 1" > 10)) AND
((date(r3.c5) = '1970-01-17'::date)) AND ((r3.c3 = r1.c3))) ORDER BY
r1."C 1" ASC NULLS LAST
-(4 rows)
+   Sort Key: t1.c1
+   ->  Foreign Scan
+         Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
+         Relations: (public.ft1 t1) SEMI JOIN (public.ft2 t2)
+         Remote SQL: SELECT r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5,
r1.c6, r1.c7, r1.c8 FROM "S 1"."T 1" r1 WHERE ((r1."C 1" < 20)) AND
EXISTS (SELECT NULL FROM "S 1"."T 1" r3 WHERE ((r3."C 1" > 10)) AND
((date(r3.c5) = '1970-01-17'::date)) AND ((r3.c3 = r1.c3)))
+(7 rows)

More details are available at [2].

[1] - https://cirrus-ci.com/task/4868751326183424
[2] - https://api.cirrus-ci.com/v1/artifact/task/4868751326183424/testrun/build/testrun/postgres_fdw/regress/regression.diffs

Regards,
Vignesh



Re: Support "Right Semi Join" plan shapes

From: Richard Guo
Date:

On Sun, Jan 7, 2024 at 3:03 PM vignesh C <vignesh21@gmail.com> wrote:
> One of the tests in CFBot has failed at [1] with:
> -   Relations: (public.ft1 t1) SEMI JOIN (public.ft2 t2)
> -   Remote SQL: SELECT r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6,
> r1.c7, r1.c8 FROM "S 1"."T 1" r1 WHERE ((r1."C 1" < 20)) AND EXISTS
> (SELECT NULL FROM "S 1"."T 1" r3 WHERE ((r3."C 1" > 10)) AND
> ((date(r3.c5) = '1970-01-17'::date)) AND ((r3.c3 = r1.c3))) ORDER BY
> r1."C 1" ASC NULLS LAST
> -(4 rows)
> +   Sort Key: t1.c1
> +   ->  Foreign Scan
> +         Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
> +         Relations: (public.ft1 t1) SEMI JOIN (public.ft2 t2)
> +         Remote SQL: SELECT r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5,
> r1.c6, r1.c7, r1.c8 FROM "S 1"."T 1" r1 WHERE ((r1."C 1" < 20)) AND
> EXISTS (SELECT NULL FROM "S 1"."T 1" r3 WHERE ((r3."C 1" > 10)) AND
> ((date(r3.c5) = '1970-01-17'::date)) AND ((r3.c3 = r1.c3)))
> +(7 rows)

Thanks.  I looked into it and have figured out why the plan differs.
With this patch the SEMI JOIN that is pushed down to the remote server
is now implemented using JOIN_RIGHT_SEMI, whereas previously it was
implemented using JOIN_SEMI.  Consequently, this changes the costs of
two paths: the path with the sort pushed down to the remote server, and
the path with the sort added atop the foreign join.  In the end the
latter wins by a slim margin.

I think we can simply update the expected file to fix this plan diff, as
attached.

Thanks
Richard
Attachments

Re: Support "Right Semi Join" plan shapes

From: wenhui qiu
Date:
Hi vignesh C,

I saw that this patch has passed the CI run (https://cirrus-ci.com/build/6109321080078336). Can we push it?

Best wishes

On Tue, Jan 9, 2024 at 18:49, Richard Guo <guofenglinux@gmail.com> wrote:
>
> On Sun, Jan 7, 2024 at 3:03 PM vignesh C <vignesh21@gmail.com> wrote:
>> One of the tests in CFBot has failed at [1] with:
>> -   Relations: (public.ft1 t1) SEMI JOIN (public.ft2 t2)
>> -   Remote SQL: SELECT r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5, r1.c6,
>> r1.c7, r1.c8 FROM "S 1"."T 1" r1 WHERE ((r1."C 1" < 20)) AND EXISTS
>> (SELECT NULL FROM "S 1"."T 1" r3 WHERE ((r3."C 1" > 10)) AND
>> ((date(r3.c5) = '1970-01-17'::date)) AND ((r3.c3 = r1.c3))) ORDER BY
>> r1."C 1" ASC NULLS LAST
>> -(4 rows)
>> +   Sort Key: t1.c1
>> +   ->  Foreign Scan
>> +         Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
>> +         Relations: (public.ft1 t1) SEMI JOIN (public.ft2 t2)
>> +         Remote SQL: SELECT r1."C 1", r1.c2, r1.c3, r1.c4, r1.c5,
>> r1.c6, r1.c7, r1.c8 FROM "S 1"."T 1" r1 WHERE ((r1."C 1" < 20)) AND
>> EXISTS (SELECT NULL FROM "S 1"."T 1" r3 WHERE ((r3."C 1" > 10)) AND
>> ((date(r3.c5) = '1970-01-17'::date)) AND ((r3.c3 = r1.c3)))
>> +(7 rows)
>
> Thanks.  I looked into it and have figured out why the plan differs.
> With this patch the SEMI JOIN that is pushed down to the remote server
> is now implemented using JOIN_RIGHT_SEMI, whereas previously it was
> implemented using JOIN_SEMI.  Consequently, this changes the costs of
> two paths: the path with the sort pushed down to the remote server, and
> the path with the sort added atop the foreign join.  In the end the
> latter wins by a slim margin.
>
> I think we can simply update the expected file to fix this plan diff, as
> attached.
>
> Thanks
> Richard

Re: Support "Right Semi Join" plan shapes

From: vignesh C
Date:
On Mon, 22 Jan 2024 at 11:27, wenhui qiu <qiuwenhuifx@gmail.com> wrote:
>
> Hi vignesh C, I saw that this patch has passed the CI run (https://cirrus-ci.com/build/6109321080078336). Can we push it?

If your review and testing turned up no comments, let's mark it as
"ready for committer".

Regards,
Vignesh



Re: Support "Right Semi Join" plan shapes

From: wenhui qiu
Date:
Hi vignesh C,

Many thanks, I have marked it as "ready for committer".

Best wishes

On Tue, Jan 23, 2024 at 10:56, vignesh C <vignesh21@gmail.com> wrote:
> On Mon, 22 Jan 2024 at 11:27, wenhui qiu <qiuwenhuifx@gmail.com> wrote:
>>
>> Hi vignesh C, I saw that this patch has passed the CI run (https://cirrus-ci.com/build/6109321080078336). Can we push it?
>
> If your review and testing turned up no comments, let's mark it as
> "ready for committer".
>
> Regards,
> Vignesh

Re: Support "Right Semi Join" plan shapes

From: Alena Rybakina
Date:
Hi! Thank you for your work on this subject.

I have reviewed your patch and I think it would be better to add an Assert
for JOIN_RIGHT_SEMI to the ExecMergeJoin and ExecNestLoop functions to
prevent the use of RIGHT_SEMI with these join methods (NestedLoop and
MergeJoin).
Mostly I'm suggesting this because of set_join_pathlist_hook, which is
invoked from the add_paths_to_joinrel function and allows you to create
a custom node. What do you think?
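
The suggestion above can be illustrated with a toy model of the two layers involved: a planner-side rule that decides which join methods may implement a right-semi join, and an executor-side assertion that re-checks the same rule in case a plan was built by code that bypassed it. This is only a sketch under that assumption; the types and functions below (ToyJoinType, ToyJoinMethod, ToyPlan, toy_method_supports, toy_exec_join) are hypothetical stand-ins, not PostgreSQL definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins; these are not the PostgreSQL enums. */
typedef enum ToyJoinType
{
    TOY_JOIN_INNER,
    TOY_JOIN_SEMI,
    TOY_JOIN_RIGHT_SEMI
} ToyJoinType;

typedef enum ToyJoinMethod
{
    TOY_NESTLOOP,
    TOY_MERGEJOIN,
    TOY_HASHJOIN
} ToyJoinMethod;

typedef struct ToyPlan
{
    ToyJoinMethod method;
    ToyJoinType jointype;
} ToyPlan;

/* Planner-side rule: in this toy, only hash join handles right-semi. */
static bool
toy_method_supports(ToyJoinMethod method, ToyJoinType jointype)
{
    if (jointype == TOY_JOIN_RIGHT_SEMI)
        return method == TOY_HASHJOIN;
    return true;
}

/*
 * Executor-side entry point.  The assertion is redundant when every plan
 * comes from a planner that honors toy_method_supports(); it only fires
 * for plans constructed some other way.
 */
static void
toy_exec_join(const ToyPlan *plan)
{
    assert(toy_method_supports(plan->method, plan->jointype));
    /* ... the actual join work would go here ... */
}
```

Whether the extra assertion is worth having depends on how much the executor should trust its input, which is the point under discussion in this thread.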

-- 
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: Support "Right Semi Join" plan shapes

From: wenhui qiu
Date:
Hi Alena Rybakina,

I saw that this code snippet in the patch also disables mergejoin; I think
it has the same effect:

+    /*
+     * For now we do not support RIGHT_SEMI join in mergejoin.
+     */
+    if (jointype == JOIN_RIGHT_SEMI)
+    {
+        *mergejoin_allowed = false;
+        return NIL;
+    }
+

Regards


On Tue, Jan 30, 2024 at 14:51, Alena Rybakina <lena.ribackina@yandex.ru> wrote:
> Hi! Thank you for your work on this subject.
>
> I have reviewed your patch and I think it would be better to add an Assert
> for JOIN_RIGHT_SEMI to the ExecMergeJoin and ExecNestLoop functions to
> prevent the use of RIGHT_SEMI with these join methods (NestedLoop and
> MergeJoin).
> Mostly I'm suggesting this because of set_join_pathlist_hook, which is
> invoked from the add_paths_to_joinrel function and allows you to create
> a custom node. What do you think?
>
> --
> Regards,
> Alena Rybakina
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company

Re: Support "Right Semi Join" plan shapes

From: wenhui qiu
Date:
Hi Richard,

The last commitfest for v17 is now starting. Can you respond to Alena Rybakina's points?


Regards

On Thu, 8 Feb 2024 at 13:50, wenhui qiu <qiuwenhuifx@gmail.com> wrote:
> Hi Alena Rybakina,
>
> I saw that this code snippet in the patch also disables mergejoin; I think
> it has the same effect:
>
> +    /*
> +     * For now we do not support RIGHT_SEMI join in mergejoin.
> +     */
> +    if (jointype == JOIN_RIGHT_SEMI)
> +    {
> +        *mergejoin_allowed = false;
> +        return NIL;
> +    }
> +
>
> Regards
>
> On Tue, Jan 30, 2024 at 14:51, Alena Rybakina <lena.ribackina@yandex.ru> wrote:
>> Hi! Thank you for your work on this subject.
>>
>> I have reviewed your patch and I think it would be better to add an Assert
>> for JOIN_RIGHT_SEMI to the ExecMergeJoin and ExecNestLoop functions to
>> prevent the use of RIGHT_SEMI with these join methods (NestedLoop and
>> MergeJoin).
>> Mostly I'm suggesting this because of set_join_pathlist_hook, which is
>> invoked from the add_paths_to_joinrel function and allows you to create
>> a custom node. What do you think?
>>
>> --
>> Regards,
>> Alena Rybakina
>> Postgres Professional: http://www.postgrespro.com
>> The Russian Postgres Company

Re: Support "Right Semi Join" plan shapes

From: Richard Guo
Date:

On Mon, Mar 4, 2024 at 10:33 AM wenhui qiu <qiuwenhuifx@gmail.com> wrote:
> Hi Richard,
> The last commitfest for v17 is now starting. Can you respond to Alena Rybakina's points?

Thanks for the reminder.  Will do that soon.

Thanks
Richard

Re: Support "Right Semi Join" plan shapes

From: Richard Guo
Date:

On Tue, Jan 30, 2024 at 2:51 PM Alena Rybakina <lena.ribackina@yandex.ru> wrote:
> I have reviewed your patch and I think it would be better to add an Assert
> for JOIN_RIGHT_SEMI to the ExecMergeJoin and ExecNestLoop functions to
> prevent the use of RIGHT_SEMI with these join methods (NestedLoop and
> MergeJoin).

Hmm, I don't see why this is necessary.  The planner should already
guarantee that we won't have nestloops/mergejoins with right-semi joins.

Thanks
Richard

Re: Support "Right Semi Join" plan shapes

From: wenhui qiu
Date:
Hi Richard,

Agreed, +1. I think it can be pushed now.

Richard

On Tue, 5 Mar 2024 at 10:44, Richard Guo <guofenglinux@gmail.com> wrote:
>
> On Tue, Jan 30, 2024 at 2:51 PM Alena Rybakina <lena.ribackina@yandex.ru> wrote:
>> I have reviewed your patch and I think it would be better to add an Assert
>> for JOIN_RIGHT_SEMI to the ExecMergeJoin and ExecNestLoop functions to
>> prevent the use of RIGHT_SEMI with these join methods (NestedLoop and
>> MergeJoin).
>
> Hmm, I don't see why this is necessary.  The planner should already
> guarantee that we won't have nestloops/mergejoins with right-semi joins.
>
> Thanks
> Richard

Re: Support "Right Semi Join" plan shapes

From: Alena Rybakina
Date:

To be honest, I didn't see that in the code. Could you tell me where those checks are, please?

On 05.03.2024 05:44, Richard Guo wrote:
>
> On Tue, Jan 30, 2024 at 2:51 PM Alena Rybakina <lena.ribackina@yandex.ru> wrote:
>> I have reviewed your patch and I think it would be better to add an Assert
>> for JOIN_RIGHT_SEMI to the ExecMergeJoin and ExecNestLoop functions to
>> prevent the use of RIGHT_SEMI with these join methods (NestedLoop and
>> MergeJoin).
>
> Hmm, I don't see why this is necessary.  The planner should already
> guarantee that we won't have nestloops/mergejoins with right-semi joins.
>
> Thanks
> Richard
-- 
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: Support "Right Semi Join" plan shapes

From: wenhui qiu
Date:

Hi Alena Rybakina,

For merge join:

+    /*
+     * For now we do not support RIGHT_SEMI join in mergejoin.
+     */
+    if (jointype == JOIN_RIGHT_SEMI)
+    {
+        *mergejoin_allowed = false;
+        return NIL;
+    }
+

Thanks


On Wed, 6 Mar 2024 at 04:10, Alena Rybakina <lena.ribackina@yandex.ru> wrote:
>
> To be honest, I didn't see that in the code. Could you tell me where those checks are, please?
>
> On 05.03.2024 05:44, Richard Guo wrote:
>>
>> On Tue, Jan 30, 2024 at 2:51 PM Alena Rybakina <lena.ribackina@yandex.ru> wrote:
>>> I have reviewed your patch and I think it would be better to add an Assert
>>> for JOIN_RIGHT_SEMI to the ExecMergeJoin and ExecNestLoop functions to
>>> prevent the use of RIGHT_SEMI with these join methods (NestedLoop and
>>> MergeJoin).
>>
>> Hmm, I don't see why this is necessary.  The planner should already
>> guarantee that we won't have nestloops/mergejoins with right-semi joins.
>>
>> Thanks
>> Richard
> --
> Regards,
> Alena Rybakina
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company

Re: Support "Right Semi Join" plan shapes

From: Alena Rybakina
Date:


On 06.03.2024 05:23, wenhui qiu wrote:
>
> Hi Alena Rybakina,
>
> For merge join:
>
> +    /*
> +     * For now we do not support RIGHT_SEMI join in mergejoin.
> +     */
> +    if (jointype == JOIN_RIGHT_SEMI)
> +    {
> +        *mergejoin_allowed = false;
> +        return NIL;
> +    }
> +
>
> Thanks


Yes, I see it, thank you. Sorry for the noise.
-- 
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: Support "Right Semi Join" plan shapes

From: Richard Guo
Date:
Here is another rebase with a commit message to help review.  I also
tweaked some comments.

Thanks
Richard
Attachments

Re: Support "Right Semi Join" plan shapes

From: wenhui qiu
Date:
Hi Richard,

Thank you so much for your tireless work on this. I see the new version of the patch improves some of the comments. I think it can be committed in July.


Thanks

On Thu, 25 Apr 2024 at 11:28, Richard Guo <guofenglinux@gmail.com> wrote:
> Here is another rebase with a commit message to help review.  I also
> tweaked some comments.
>
> Thanks
> Richard