On Mon, Mar 7, 2011 at 2:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Jon Nelson <jnelson+pgsql@jamponi.net> writes:
>> I was hoping that somebody could help me understand the differences
>> between three plans.
>> All of the plans are updating a table using a second table, and should
>> be logically equivalent.
>> Two of the plans use joins, and one uses an exists subquery.
>> One of the plans uses row constructors and IS NOT DISTINCT FROM. It is
>> this plan which has really awful performance.
>> Clearly it is due to the nested loop, but why would the planner choose
>> that approach?
>
> IS NOT DISTINCT FROM pretty much disables all optimizations: it can't be
> an indexqual, merge join qual, or hash join qual. So it's not
> surprising that you get a sucky plan for it. Possibly somebody will
> work on improving that someday.
>
> As for your other questions, what PG version are you using? Because I
> do get pretty much the same plan (modulo a plain join versus a semijoin)
> for the first two queries, when using 9.0 or later. And the results of
> ANALYZE are only approximate, so you shouldn't be surprised at all if a
> rowcount estimate is off by a couple percent. Most of the time, you
> should be happy if it's within a factor of 2 of reality.
Sorry - I had stated in the original post that I was using 8.4.5 on 64
bit openSUSE and CentOS 5.5, and had forgotten to carry that
information over into the second post.
What is the difference between a plain join and a semi join?
--
Jon