Martijn van Oosterhout wrote:
> That's because they're not equivalent. IN/NOT IN have special semantics
> w.r.t. NULLs that make them a bit more difficult to optimise. OUTER
> JOINs, on the other hand, are easier, since in a join condition anything =
> NULL evaluates to NULL, which is treated as FALSE.
Which is why Hash IN Joins were added, presumably. But there's nothing
analogous for NOT IN, I guess; perhaps there can't be.
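
To make the NULL issue concrete, here is a minimal sketch of why NOT IN and an outer-join rewrite are not interchangeable. It runs on SQLite through Python's sqlite3 module purely as a convenient stand-in, not on PostgreSQL, and the tables a and b are made up for illustration. The three-valued logic it relies on is standard SQL, so the same results would be expected elsewhere, but treat it as an illustration rather than a statement about any particular planner.

```python
import sqlite3

# Hypothetical tables "a" and "b"; SQLite is used only as a stand-in engine.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER);
    CREATE TABLE b (id INTEGER);
    INSERT INTO a VALUES (1), (2), (3);
    INSERT INTO b VALUES (1), (NULL);
""")

# NOT IN: for id 2 and 3 the comparison against the NULL in b is unknown,
# so the predicate is NULL rather than TRUE and the rows are filtered out.
not_in = conn.execute(
    "SELECT a.id FROM a WHERE a.id NOT IN (SELECT b.id FROM b) ORDER BY a.id"
).fetchall()

# Outer-join rewrite: "a.id = NULL" simply fails to match in the join
# condition, so ids 2 and 3 survive as unmatched rows.
anti_join = conn.execute(
    "SELECT a.id FROM a LEFT JOIN b ON a.id = b.id "
    "WHERE b.id IS NULL ORDER BY a.id"
).fetchall()

print("NOT IN:    ", not_in)
print("outer join:", anti_join)
```

A single NULL in b is enough to make NOT IN return no rows at all, while the join form still returns the unmatched ids, which is the difference the rewrite would have to preserve.
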
> I think there's been some discussion about teaching the planner about
> columns that cannot be NULL (like primary keys) thus allowing it to
> perform this transformation safely. I don't know if anyone has done it
> though...
Yeah, I've noticed cases where I've thought "Ah, the planner doesn't
know that column can't be null". Similarly, it has seemed to me that
knowing that a column was UNIQUE could have made for a better plan,
although I can't think of any examples off-hand. Maybe it was a case
where I saw it using a Hash aggregate on a unique column and thought it
could just use the index, although that may not make sense either.
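
As a rough sketch of the "columns that cannot be NULL" point: once both columns are declared NOT NULL, the NOT IN query and the outer-join form return the same rows, which is what would make the transformation safe. Again this uses SQLite via Python's sqlite3 as a stand-in, with hypothetical tables a and b. It only shows that the results agree, and says nothing about whether any given planner actually performs the rewrite.

```python
import sqlite3

# Hypothetical tables; both id columns are declared NOT NULL.
# The outer column matters too: a NULL a.id would be dropped by NOT IN
# (against a non-empty subquery) but kept by the outer-join form.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER NOT NULL);
    CREATE TABLE b (id INTEGER NOT NULL);
    INSERT INTO a VALUES (1), (2), (3);
    INSERT INTO b VALUES (1);
""")

# The constraint guarantees that no NULL can ever reach b.id.
try:
    conn.execute("INSERT INTO b VALUES (NULL)")
    null_rejected = False
except sqlite3.IntegrityError:
    null_rejected = True

not_in_rows = conn.execute(
    "SELECT a.id FROM a WHERE a.id NOT IN (SELECT b.id FROM b) ORDER BY a.id"
).fetchall()

anti_join_rows = conn.execute(
    "SELECT a.id FROM a LEFT JOIN b ON a.id = b.id "
    "WHERE b.id IS NULL ORDER BY a.id"
).fetchall()

print("NULL rejected:", null_rejected)
print("NOT IN:       ", not_in_rows)
print("outer join:   ", anti_join_rows)
```
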
- John D. Burger
MITRE