Re: [HACKERS] having and union in v7beta

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: [HACKERS] having and union in v7beta
Дата
Msg-id 6366.951937607@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: [HACKERS] having and union in v7beta  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
I wrote:
> I compiled up current code with profiling enabled (make PROFILE=-pg
> if you want to try it), and found that actually nearly all of the
> runtime for
>     select * from comuni union select * from comuni
> is spent in the sort step; so I was on the wrong track in guessing
> that there might be a performance problem in the new Unique coding.

Wait a second.  Stop the presses.  I see what's going on here.

6.5.*:
play=> explain select * from comuni union select * from comuni;
NOTICE:  QUERY PLAN:

Seq Scan on comuni  (cost=512.00 rows=10000 width=84)

7.0beta1:
regression=# explain select * from comuni union select * from comuni;
NOTICE:  QUERY PLAN:

Unique  (cost=149.66..184.66 rows=200 width=84) ->  Sort  (cost=149.66..149.66 rows=2000 width=84)       ->  Append
(cost=0.00..40.00rows=2000 width=84)               ->  Seq Scan on comuni  (cost=0.00..20.00 rows=1000 width=84)
      ->  Seq Scan on comuni  (cost=0.00..20.00 rows=1000 width=84)
 


7.0beta1's behavior is actually "correct", in the sense that it yields
the SQL-approved result: the UNION implies a DISTINCT pass over its
result, according to SQL, and 7.0beta1 is giving you a DISTINCT result.
6.5 is failing to generate the DISTINCT operation, because it
incorrectly simplifies "select foo union select foo" into "select foo"
if the two select queries are identical.  (There is a TODO item for this.)
So that's why 6.5 is a lot faster.  But it gives the wrong answer.

*However*, we have not fixed the bug that causes "select foo union
select foo" to be incorrectly simplified --- the UNION code is still
applying cnfify.  (Which it probably shouldn't, but I haven't wanted
to touch that code until I have the time to rewrite it completely.)
The reason 7.0beta1 generates the "right" answer is that it has a
recently-introduced bug in the comparison routines that causes it to
think the two select subqueries aren't the same.

I just fixed that bug, with the result that current CVS code is now back
to mis-simplifying this query.  (Yes, this is a step forward --- that
bug could have caused the system to unify two queries that AREN'T the
same, which would definitely be a bad thing...)

So, thanks!  You did indeed identify a bug!  But you should expect that
this query *will* get slower when we fix the other bug ;-).  You should
use a less silly test case for UNION if you want to make realistic
performance comparisons across versions.
        regards, tom lane


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: [HACKERS] Re: bit types
Следующее
От: "Ross J. Reedstrom"
Дата:
Сообщение: Re: [HACKERS] Re: bit types