Обсуждение: Block nested loop join

Поиск
Список
Период
Сортировка

Block nested loop join

От
"Bramandia Ramadhana"
Дата:
<div dir="ltr">Hi all,<br /><br />I am new to postgresql. I am currently doing research to optimize the query
performanceof RDBMS, specifically postgresql. Hence, I am currently reading out the code to understand the
implementationof various query evaluation algorithm in postgresql.<br /><br />Currently, I am investigating the nested
loopjoin algorithm in nodeNestloop.c. After reading the code, my understanding is that it performs simple nested loop
join(not block nested loop join). Is this true? Does postgresql support block nested loop join? If it does, where is
it?I might miss some buffering/caching mechanism.<br /><br />I appreciate any helps/advice.<br /><br />Regards,<br
/><br/>Bramandia R.<br /><br /></div> 

Re: Block nested loop join

От
Heikki Linnakangas
Дата:
Bramandia Ramadhana wrote:
> Currently, I am investigating the nested loop join algorithm in
> nodeNestloop.c. After reading the code, my understanding is that it performs
> simple nested loop join (not block nested loop join). Is this true?

Yep.

> Does postgresql support block nested loop join? 

Nope.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: Block nested loop join

От
Gregory Stark
Дата:
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:

>> Does postgresql support block nested loop join? 
>
> Nope.

We do support Hash Join though so I think the only difference is that we can't
use the hash join for cartesian joins.

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com Ask me about EnterpriseDB's PostGIS support!


Re: Block nested loop join

От
"Bramandia Ramadhana"
Дата:
<div dir="ltr">Thanks for the clarifications.<br /><br />Just for curiosity, is there any reason of not having block
nested-loopjoin implementation? Is it rarely useful?<br /><br />As far as I am aware of, in the case of cross product
oftwo tables, block nested-loop join is the most efficient algorithm.<br /><br />Regards,<br /><br />Bramandia R.<br
/><br/><div class="gmail_quote">On Fri, Oct 10, 2008 at 3:30 PM, Gregory Stark <span dir="ltr"><<a
href="mailto:stark@enterprisedb.com">stark@enterprisedb.com</a>></span>wrote:<br /><blockquote class="gmail_quote"
style="border-left:1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div class="Ih2E3d"><br
/>Heikki Linnakangas <<a
href="mailto:heikki.linnakangas@enterprisedb.com">heikki.linnakangas@enterprisedb.com</a>>writes:<br /><br />
>>Does postgresql support block nested loop join?<br /> ><br /> > Nope.<br /><br /></div>We do support Hash
Jointhough so I think the only difference is that we can't<br /> use the hash join for cartesian joins.<br /><br />
--<br/>  Gregory Stark<br /><div class="Ih2E3d">  EnterpriseDB          <a href="http://www.enterprisedb.com"
target="_blank">http://www.enterprisedb.com</a><br/></div>  Ask me about EnterpriseDB's PostGIS support!<br
/></blockquote></div><br/></div> 

Re: Block nested loop join

От
Gregory Stark
Дата:
"Bramandia Ramadhana" <bramandia@gmail.com> writes:

> Thanks for the clarifications.
>
> Just for curiosity, is there any reason of not having block nested-loop join
> implementation? Is it rarely useful?

Oh, actually it occurs to me that we do implement something analogous to a
degenerate block nested loop where we materialize one side of the nested loop.

It looks "backward" since the materialized side is the "inner" side of the
loop but it's basically equivalent to a block nested loop with the entire
outer side in a single block.

So the use case of a real block nested loop would be doing a cartesian join of
two large tables where neither fits in RAM. That does seem like it might be
kind of narrow given how large the output would be.

But we won't know unless someone does the experiment.

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com Ask me about EnterpriseDB's Slony Replication
support!


Re: Block nested loop join

От
Tom Lane
Дата:
Gregory Stark <stark@enterprisedb.com> writes:
> So the use case of a real block nested loop would be doing a cartesian join of
> two large tables where neither fits in RAM. That does seem like it might be
> kind of narrow given how large the output would be.

Yeah.  If you have a hashable join condition then our existing batched
hash join code should be roughly equivalent to this method.  So the use
case is joining a couple of large tables with an un-hashable,
un-indexable join condition (or none at all, ie cross product) and that
just isn't something we hear people wanting to do a lot.  I can't really
see why we'd bother maintaining extra code for block nested loop.
        regards, tom lane


Re: Block nested loop join

От
Gregory Stark
Дата:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Gregory Stark <stark@enterprisedb.com> writes:
>> So the use case of a real block nested loop would be doing a cartesian join of
>> two large tables where neither fits in RAM. That does seem like it might be
>> kind of narrow given how large the output would be.
>
> Yeah.  If you have a hashable join condition then our existing batched
> hash join code should be roughly equivalent to this method.  So the use
> case is joining a couple of large tables with an un-hashable,
> un-indexable join condition (or none at all, ie cross product) and that
> just isn't something we hear people wanting to do a lot.  I can't really
> see why we'd bother maintaining extra code for block nested loop.

Hm, I hadn't thought of other non-hashable join conditions.

I wonder how much code it would be though if we just hacked hash join to
support returning the full cartesian product. Ie, instead of doing a hash
lookup do a full scan of the hash and recheck the join condition if any for
every combination.

That seems like it would be a pretty small change to HashJoin and would
effectively support precisely this join type.

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com Ask me about EnterpriseDB's On-Demand Production
Tuning


Re: Block nested loop join

От
"Bramandia Ramadhana"
Дата:
<div dir="ltr">Dear All,<br /><br /> I took a look at the source code for hash join this morning and I realized that
theblock nested loop join is somewhat similar to that. <br /><br /> Thanks for the discussions. <br /><br /> Bramandia
R.<br/><br /><div class="gmail_quote">On Fri, Oct 10, 2008 at 8:19 PM, Tom Lane <span dir="ltr"><<a
href="mailto:tgl@sss.pgh.pa.us">tgl@sss.pgh.pa.us</a>></span>wrote:<br /><blockquote class="gmail_quote"
style="border-left:1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div
class="Ih2E3d">GregoryStark <<a href="mailto:stark@enterprisedb.com">stark@enterprisedb.com</a>> writes:<br />
>So the use case of a real block nested loop would be doing a cartesian join of<br /> > two large tables where
neitherfits in RAM. That does seem like it might be<br /> > kind of narrow given how large the output would be.<br
/><br/></div>Yeah.  If you have a hashable join condition then our existing batched<br /> hash join code should be
roughlyequivalent to this method.  So the use<br /> case is joining a couple of large tables with an un-hashable,<br />
un-indexablejoin condition (or none at all, ie cross product) and that<br /> just isn't something we hear people
wantingto do a lot.  I can't really<br /> see why we'd bother maintaining extra code for block nested loop.<br /><br />
                      regards, tom lane<br /></blockquote></div><br /></div>