On Thu, Feb 19, 2009 at 1:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> [ after re-reading the code a bit ]
>
> Robert Haas <robertmhaas@gmail.com> writes:
>> Cool. On the topic of documentation, I find the following comment in
>> joinrels.c rather impenetrable:
>
>> /*
>> * Do these steps only if we actually have a regular semijoin,
>> * as opposed to a case where we should unique-ify the RHS.
>> */
>
> The point here is that a semijoin ordinarily requires forming the whole
> LHS relation (ie, min_lefthand) before applying the special join rule.
> However, if you unique-ify the RHS then it's a regular inner join and
> you don't have to obey that restriction, ie, you can join to just part
> of min_lefthand. Now ordinarily that's not an amazingly good idea but
> there are important special cases where it is. IIRC the case that
> motivated this was
>
> SELECT FROM a, b WHERE (a.x, b.y) IN (SELECT c1, c2 FROM c)
>
> If you do this as a semijoin then you are forced to form the cartesian
> product of a*b before you can semijoin to c. If you uniqueify c
> then you can join it to a first and then b using regular joins (possibly
> indexscans on a.x and then b.y), or b and then a.
>
> So join_is_legal allows such a join order, and the code in make_join_rel
> has to be careful not to claim that "a semijoin c" is a valid way of
> forming that join.

Gotcha.

> I'll change the comment. Does this help?
>
> /*
> * We might have a normal semijoin, or a case where we don't have
> * enough rels to do the semijoin but can unique-ify the RHS and
> * then do an innerjoin. In the latter case we can't apply
> * JOIN_SEMI joining.
> */

It's an improvement, but your example above is so helpful in
understanding what is going on here that it might be worth explicitly
mentioning it in the comment, maybe something like this:

/*
 * In a case like the following, we don't have enough rels to plan this
 * as a semijoin, but we don't give up completely, because it might be
 * possible to unique-ify the RHS and perform part of the join at this
 * level.
 *
 * SELECT FROM a, b WHERE (a.x, b.y) IN (SELECT c1, c2 FROM c)
 */
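
For what it's worth, here is a toy sketch of the two strategies for that
query, just to illustrate why unique-ifying c makes the plain inner joins
safe. It is ordinary Python over in-memory lists, not anything from the
planner; the function names and the sample rows are made up for the
example.

```python
# Toy relations; the data and all names below are hypothetical.
a = [1, 2, 3]                              # values of a.x
b = [10, 20]                               # values of b.y
c = [(1, 10), (1, 10), (2, 20), (3, 99)]   # (c1, c2) rows, one duplicate


def semijoin_plan(a_rows, b_rows, c_rows):
    """Form the whole LHS (cartesian product a*b), then semijoin to c."""
    c_keys = set(c_rows)  # a semijoin only asks "is there a match?"
    return [(x, y) for x in a_rows for y in b_rows if (x, y) in c_keys]


def join_c_first_plan(a_rows, b_rows, c_rows):
    """Inner-join c to a on x first, then join that result to b on y."""
    a_c = [(x, c2) for (c1, c2) in c_rows for x in a_rows if x == c1]
    return [(x, y) for (x, c2) in a_c for y in b_rows if y == c2]


semi = sorted(semijoin_plan(a, b, c))
# Unique-ify the RHS before using regular inner joins.
uniq = sorted(join_c_first_plan(a, b, set(c)))
# Without unique-ifying, the duplicate row in c leaks into the output.
raw = sorted(join_c_first_plan(a, b, c))

print(semi)
print(uniq)
print(raw)
```

With these rows the semijoin plan and the unique-ified plan return the
same two rows, while joining the raw c first returns (1, 10) twice. That
extra row is why the inner-join ordering is only valid once the RHS has
been made unique.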
...Robert