Обсуждение: Query Performance...
We have recently converted from MS SQLServer 7 to Postgres 7.1... The following query runs perfectly fine in SQL Server with any where clause attached, but in Postgres runs fine until you put a where clause that has *no* criteria for Patients... then it runs forever... the table sizes are approx: Patients: 1.5M rows Dr: 2000 rows Dr Groups: 500 rows Rx: 750000 rows All of the fields that are joined and used in where clauses are indexed, and the query that has problems is: SELECT Pharm.Name, Pharm.City, Pharm.Phone, Dr.LastName, Dr.City, Dr.Phone, DrGroup.Name, P.LastName, P.FirstName,P.DOB, Rx.Medication, Rx.Unit, Rx.Qty FROM Rx INNER JOIN Pharm ON Rx.PharmID = Pharm.PharmID INNER JOIN Dr ON Rx.DrID = Dr.DrID INNER JOIN Patient P ON Rx.PatientID = P.PatientID LEFT OUTER JOIN DrGroup ON Dr.DrGroupID = DrGroup.DrGroupID I figure that there must be some other way to get it to run... Any help is appreciated... JH
On Wed, Jul 17, 2002 at 01:31:29PM -0700, Jeffrey Hood wrote: > We have recently converted from MS SQLServer 7 to Postgres 7.1... The > following query runs perfectly fine in SQL Server with any where > clause attached, but in Postgres runs fine until you put a where > clause that has *no* criteria for Patients... then it runs forever... > the table sizes are approx: EXPLAIN ANALYSE output please. -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > There are 10 kinds of people in the world, those that can do binary > arithmetic and those that can't.
jhood@hmcon.com (Jeffrey Hood) writes: > We have recently converted from MS SQLServer 7 to Postgres 7.1... The > following query runs perfectly fine in SQL Server with any where > clause attached, but in Postgres runs fine until you put a where > clause that has *no* criteria for Patients... then it runs forever... Define "runs forever". AFAICT this query is going to give 750K rows of output, if there's no WHERE clause. How long do you expect that to take? How long does it actually take? Is SQL Server really any faster given the same query with no WHERE? regards, tom lane
Here is the output from an explain... explain select r.dateissued, r.medication, p.lastname, p.dob from rx r inner join patient p on r.patientid = p.patientid where r.dateissued between '7/13/02' and '7/14/02' and lower (p.lastname) = 'may'; Merge Join (cost=42343.82..42498.98 rows=248957 width=64) -> Sort (cost=8590.77..8590.77 rows=2515 width=32) -> Index Scan using idx_rx_date_issued on rx r (cost=0.00..8448.70 rows=2515 width=32) -> Sort (cost=33753.05..33753.05 rows=9897 width=32) -> Index Scan using idx_patient_last_name on patient p (cost=0.00..32959.90 rows=9897 width=32) explain select r.dateissued, r.medication, p.lastname, p.dob from rx r inner join patient p on r.patientid = p.patientid where r.dateissued between '7/13/02' and '7/14/02'; Merge Join (cost=237899.24..250302.47 rows=24895698 width=64) -> Sort (cost=8590.77..8590.77 rows=2515 width=32) -> Index Scan using idx_rx_date_issued on rx r (cost=0.00..8448.70 rows=2515 width=32) -> Sort (cost=229308.47..229308.47 rows=989743 width=32) -> Seq Scan on patient p (cost=0.00..35256.43 rows=989743 width=32) The first would return around 4 rows, the second around 1000... How does one get rid of the table scan on patient in the second...??? Thanks, JH
jhood@hmcon.com (Jeffrey Hood) writes: > explain > select r.dateissued, r.medication, p.lastname, p.dob > from rx r > inner join patient p on r.patientid = p.patientid > where r.dateissued between '7/13/02' and '7/14/02'; > Merge Join (cost=237899.24..250302.47 rows=24895698 width=64) > -> Sort (cost=8590.77..8590.77 rows=2515 width=32) > -> Index Scan using idx_rx_date_issued on rx r > (cost=0.00..8448.70 rows=2515 width=32) > -> Sort (cost=229308.47..229308.47 rows=989743 width=32) > -> Seq Scan on patient p (cost=0.00..35256.43 rows=989743 > width=32) > How does one get rid of the table scan on patient in the second...??? The only *possible* alternative to a seqscan on patient would be to use a nestloop with inner indexscan on patient.patientid, and I'm not at all clear that that'd be faster than the seqscan --- it would depend on how many rows are actually returned by the rx scan. Have you got an index on patientid? If you set enable_seqscan = off, does the plan change? regards, tom lane
> > explain > > select r.dateissued, r.medication, p.lastname, p.dob > > from rx r > > inner join patient p on r.patientid = p.patientid > > where r.dateissued between '7/13/02' and '7/14/02'; > > > Merge Join (cost=237899.24..250302.47 rows=24895698 width=64) > > -> Sort (cost=8590.77..8590.77 rows=2515 width=32) > > -> Index Scan using idx_rx_date_issued on rx r > > (cost=0.00..8448.70 rows=2515 width=32) > > -> Sort (cost=229308.47..229308.47 rows=989743 width=32) > > -> Seq Scan on patient p (cost=0.00..35256.43 rows=989743 > > width=32) > > > How does one get rid of the table scan on patient in the second...??? > > The only *possible* alternative to a seqscan on patient would be to use > a nestloop with inner indexscan on patient.patientid, and I'm not at all > clear that that'd be faster than the seqscan --- it would depend on how > many rows are actually returned by the rx scan. > > Have you got an index on patientid? If you set enable_seqscan = off, > does the plan change? There is an index on patientid... and setting enable_seqscan = off and running explain shows that the index will be used, but running the query, it takes the same time (45 sec) that it does with the scan... and the same query on SQLServer and mysql runs in .5 sec... (they both always use the index...) Thanks, JH