Обсуждение: Query Performance...

Поиск
Список
Период
Сортировка

Query Performance...

От
jhood@hmcon.com (Jeffrey Hood)
Дата:
We have recently converted from MS SQLServer 7 to Postgres 7.1...  The
following query runs perfectly fine in SQL Server with any where
clause attached, but in Postgres runs fine until you put a where
clause that has *no* criteria for Patients... then it runs forever...
the table sizes are approx:

Patients:   1.5M rows
Dr:         2000 rows
Dr Groups:  500 rows
Rx:         750000 rows

All of the fields that are joined and used in where clauses are
indexed, and the query that has problems is:

SELECT
Pharm.Name, Pharm.City, Pharm.Phone,
Dr.LastName, Dr.City, Dr.Phone,
DrGroup.Name,
P.LastName, P.FirstName,P.DOB,
Rx.Medication, Rx.Unit, Rx.Qty
FROM Rx
INNER JOIN Pharm ON Rx.PharmID = Pharm.PharmID
INNER JOIN Dr ON Rx.DrID = Dr.DrID
INNER JOIN Patient P ON Rx.PatientID = P.PatientID
LEFT OUTER JOIN DrGroup ON Dr.DrGroupID = DrGroup.DrGroupID

I figure that there must be some other way to get it to run...

Any help is appreciated...

JH

Re: Query Performance...

От
Martijn van Oosterhout
Дата:
On Wed, Jul 17, 2002 at 01:31:29PM -0700, Jeffrey Hood wrote:
> We have recently converted from MS SQLServer 7 to Postgres 7.1...  The
> following query runs perfectly fine in SQL Server with any where
> clause attached, but in Postgres runs fine until you put a where
> clause that has *no* criteria for Patients... then it runs forever...
> the table sizes are approx:

EXPLAIN ANALYSE output please.
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> There are 10 kinds of people in the world, those that can do binary
> arithmetic and those that can't.

Re: Query Performance...

От
Tom Lane
Дата:
jhood@hmcon.com (Jeffrey Hood) writes:
> We have recently converted from MS SQLServer 7 to Postgres 7.1...  The
> following query runs perfectly fine in SQL Server with any where
> clause attached, but in Postgres runs fine until you put a where
> clause that has *no* criteria for Patients... then it runs forever...

Define "runs forever".  AFAICT this query is going to give 750K rows of
output, if there's no WHERE clause.  How long do you expect that to
take?  How long does it actually take?  Is SQL Server really any faster
given the same query with no WHERE?

            regards, tom lane

Re: Query Performance...

От
jhood@hmcon.com (Jeffrey Hood)
Дата:
Here is the output from an explain...

explain
select r.dateissued, r.medication, p.lastname, p.dob
from rx r
inner join patient p on r.patientid = p.patientid
where r.dateissued between '7/13/02' and '7/14/02'
and lower (p.lastname) = 'may';

Merge Join  (cost=42343.82..42498.98 rows=248957 width=64)
  ->  Sort  (cost=8590.77..8590.77 rows=2515 width=32)
        ->  Index Scan using idx_rx_date_issued on rx r
(cost=0.00..8448.70 rows=2515 width=32)
  ->  Sort  (cost=33753.05..33753.05 rows=9897 width=32)
        ->  Index Scan using idx_patient_last_name on patient p
(cost=0.00..32959.90 rows=9897 width=32)

explain
select r.dateissued, r.medication, p.lastname, p.dob
from rx r
inner join patient p on r.patientid = p.patientid
where r.dateissued between '7/13/02' and '7/14/02';

Merge Join  (cost=237899.24..250302.47 rows=24895698 width=64)
  ->  Sort  (cost=8590.77..8590.77 rows=2515 width=32)
        ->  Index Scan using idx_rx_date_issued on rx r
(cost=0.00..8448.70 rows=2515 width=32)
  ->  Sort  (cost=229308.47..229308.47 rows=989743 width=32)
        ->  Seq Scan on patient p  (cost=0.00..35256.43 rows=989743
width=32)

The first would return around 4 rows, the second around 1000...

How does one get rid of the table scan on patient in the second...???

Thanks,
JH

Re: Query Performance...

От
Tom Lane
Дата:
jhood@hmcon.com (Jeffrey Hood) writes:
> explain
> select r.dateissued, r.medication, p.lastname, p.dob
> from rx r
> inner join patient p on r.patientid = p.patientid
> where r.dateissued between '7/13/02' and '7/14/02';

> Merge Join  (cost=237899.24..250302.47 rows=24895698 width=64)
>   ->  Sort  (cost=8590.77..8590.77 rows=2515 width=32)
>         ->  Index Scan using idx_rx_date_issued on rx r
> (cost=0.00..8448.70 rows=2515 width=32)
>   ->  Sort  (cost=229308.47..229308.47 rows=989743 width=32)
>         ->  Seq Scan on patient p  (cost=0.00..35256.43 rows=989743
> width=32)

> How does one get rid of the table scan on patient in the second...???

The only *possible* alternative to a seqscan on patient would be to use
a nestloop with inner indexscan on patient.patientid, and I'm not at all
clear that that'd be faster than the seqscan --- it would depend on how
many rows are actually returned by the rx scan.

Have you got an index on patientid?  If you set enable_seqscan = off,
does the plan change?

            regards, tom lane

Re: Query Performance...

От
jhood@hmcon.com (Jeffrey Hood)
Дата:
> > explain
> > select r.dateissued, r.medication, p.lastname, p.dob
> > from rx r
> > inner join patient p on r.patientid = p.patientid
> > where r.dateissued between '7/13/02' and '7/14/02';
>
> > Merge Join  (cost=237899.24..250302.47 rows=24895698 width=64)
> >   ->  Sort  (cost=8590.77..8590.77 rows=2515 width=32)
> >         ->  Index Scan using idx_rx_date_issued on rx r
> > (cost=0.00..8448.70 rows=2515 width=32)
> >   ->  Sort  (cost=229308.47..229308.47 rows=989743 width=32)
> >         ->  Seq Scan on patient p  (cost=0.00..35256.43 rows=989743
> > width=32)
>
> > How does one get rid of the table scan on patient in the second...???
>
> The only *possible* alternative to a seqscan on patient would be to use
> a nestloop with inner indexscan on patient.patientid, and I'm not at all
> clear that that'd be faster than the seqscan --- it would depend on how
> many rows are actually returned by the rx scan.
>
> Have you got an index on patientid?  If you set enable_seqscan = off,
> does the plan change?

There is an index on patientid... and setting enable_seqscan = off and
running explain shows that the index will be used, but running the
query, it takes the same time (45 sec) that it does with the scan...
and the same query on SQLServer and mysql runs in .5 sec... (they both
always use the index...)

Thanks,
JH