Re: PostgreSQL runs a query much slower than BDE and MySQL

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: PostgreSQL runs a query much slower than BDE and MySQL
Дата
Msg-id 29246.1155768717@sss.pgh.pa.us
обсуждение исходный текст
Ответ на PostgreSQL runs a query much slower than BDE and MySQL  ("Peter Hardman" <peter@ssbg.zetnet.co.uk>)
Ответы Re: PostgreSQL runs a query much slower than BDE and MySQL  ("Peter Hardman" <peter@ssbg.zetnet.co.uk>)
Re: PostgreSQL runs a query much slower than BDE and MySQL  (Scott Lamb <slamb@slamb.org>)
Список pgsql-performance
"Peter Hardman" <peter@ssbg.zetnet.co.uk> writes:
> I'm in the process of migrating a Paradox 7/BDE 5.01 database from single-user
> Paradox to a web based interface to either MySQL or PostgreSQL.
> The query I run is:

> /* Select all sheep who's most recent transfer was into the subject flock */
> SELECT DISTINCT f1.regn_no, f1.transfer_date as date_in
> FROM SHEEP_FLOCK f1 JOIN
>     /* The last transfer date for each sheep */
>     (SELECT f.regn_no, MAX(f.transfer_date) as last_xfer_date
>     FROM  SHEEP_FLOCK f
>     GROUP BY f.regn_no) f2
> ON f1.regn_no = f2.regn_no
> WHERE f1.flock_no = '1359'
> AND f1.transfer_date = f2.last_xfer_date

This seems pretty closely related to this recent thread:
http://archives.postgresql.org/pgsql-performance/2006-08/msg00220.php
in which the OP is doing a very similar kind of query in almost exactly
the same way.

I can't help thinking that there's probably a better way to phrase this
type of query in SQL, though it's not jumping out at me what that is.

What I find interesting though is that it sounds like both MSSQL and
Paradox know something we don't about how to optimize it.  PG doesn't
have any idea how to do the above query without forming the full output
of the sub-select, but I suspect that the commercial DBs know a
shortcut; perhaps they are able to automatically derive a restriction
in the subquery similar to what you did by hand.  Does Paradox have
anything comparable to EXPLAIN that would give a hint about the query
plan they are using?

Also, just as in the other thread, I'm thinking that a seqscan+hash
aggregate would be a better idea than this bit:

>                    ->  GroupAggregate  (cost=0.00..3924.91 rows=33676 width=13) (actual time=0.324..473.131
rows=38815loops=1) 
>                          ->  Index Scan using sheep_flock_pkey on sheep_flock f (cost=0.00..3094.95 rows=81802
width=13)(actual time=0.295..232.156) 

Possibly you need to raise work_mem to get it to consider the hash
aggregation method.

BTW, are you *sure* you are testing PG 8.1?  The "Subquery Scan f2" plan
node looks unnecessary to me, and I'd have expected 8.1 to drop it out.
8.0 and before would have left it in the plan though.  This doesn't make
all that much difference performance-wise in itself, but it does make me
wonder what you are testing.

            regards, tom lane

В списке pgsql-performance по дате отправления:

Предыдущее
От: "Peter Hardman"
Дата:
Сообщение: Re: PostgreSQL runs a query much slower than BDE and MySQL
Следующее
От: Steve Poe
Дата:
Сообщение: Re: Postgresql Performance on an HP DL385 and