Dennis Haney <davh@diku.dk> writes:
> But this limited optimization makes me wonder, why the limitation to
> optimizing '='?
In the first place, you wouldn't get any improvement anyway if the
combining operator is not '=' --- if it isn't, then merge and hash join
aren't applicable and so you're gonna end up with a nestloop anyhow,
which is no better than what the executor will do with a subselect.
In the second place, what the code is doing is dependent on an understanding
of the semantics of IN; I'm not sure it's applicable to, say,WHERE outervar > ANY (SELECT innervar FROM ...)
and it's definitely not applicable toWHERE outervar > ALL (SELECT innervar FROM ...)
In particular, the optimization paths that involve unique-ifying the
subselect output and then using it as the outer side of a join would
definitely not work for these sorts of things.
> And why must the lefthand of the sublink be a variable of the upper query?
Otherwise the expression isn't a join and I don't think the semantics are
the same as the code is expecting.
> Then I don't understand why it gives two different execution plans?
They look the same to me, other than that a different join rule is
needed (because after all IN is not the same thing as a straight join).
regards, tom lane