Re: Performance problem with semi-large tables
From | PFC
Subject | Re: Performance problem with semi-large tables
Date |
Msg-id | opsldw48czth1vuj@musicbox
In reply to | Performance problem with semi-large tables ("Ken Egervari" <ken@upfactor.com>)
List | pgsql-performance
> select s.*, ss.*
> from shipment s, shipment_status ss, release_code r
> where s.current_status_id = ss.id
> and ss.release_code_id = r.id
> and r.filtered_column = '5'
> order by ss.date desc
> limit 100;
>
> Release code is just a very small table of 8 rows by looking at the
> production data, hence the 0.125 filter ratio. However, the data
> distribution is not normal since the filtered column actually pulls out
> about 54% of the rows in shipment_status when it joins. Postgres seems
> to be doing a sequential scan to pull out all of these rows. Next, it
> joins approx 17550 rows to shipment. Since this query has a limit, it
> only returns the first 100, which seems like a waste.

Well, Postgres does what you asked. It will be slow because you have a full table join, and LIMIT does not change this: all the rows have to be sorted before the first 100 can be returned. The date column is in shipment_status, so you should first get the 100 shipment_status rows you need and only then join to shipment. This avoids the big join:

SELECT s.*, ss.*
FROM (SELECT *
      FROM shipment_status
      WHERE release_code_id IN (SELECT r.id
                                FROM release_code r
                                WHERE r.filtered_column = '5')
      ORDER BY date DESC
      LIMIT 100) AS ss,
     shipment s
WHERE s.current_status_id = ss.id
ORDER BY ss.date DESC
LIMIT 100;

Is this better?
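A minimal sketch of the idea, runnable against SQLite from Python: the table names follow the thread, but all row values are invented, and it assumes every shipment_status row has exactly one matching shipment. It checks that pushing ORDER BY ... LIMIT into a subquery before the join returns the same rows as the flat three-way join.

```python
# Hypothetical demonstration of the "limit before joining" rewrite.
# Schema and column names are taken from the thread; the data is made up.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE release_code (id INTEGER PRIMARY KEY, filtered_column TEXT);
CREATE TABLE shipment_status (id INTEGER PRIMARY KEY,
                              release_code_id INTEGER,
                              date TEXT);
CREATE TABLE shipment (id INTEGER PRIMARY KEY, current_status_id INTEGER);
""")
# 8 release codes, half of them matching the filter
cur.executemany("INSERT INTO release_code VALUES (?, ?)",
                [(i, '5' if i % 2 else '0') for i in range(1, 9)])
# 500 statuses spread over the codes, with unique sortable date strings
cur.executemany("INSERT INTO shipment_status VALUES (?, ?, ?)",
                [(i, (i % 8) + 1, f"2005-01-{i % 28 + 1:02d} {i:04d}")
                 for i in range(1, 501)])
# one shipment pointing at each status (1:1, so the join drops no rows)
cur.executemany("INSERT INTO shipment VALUES (?, ?)",
                [(i, i) for i in range(1, 501)])

# Original flat join: sort everything, then take 100
flat = cur.execute("""
    SELECT s.id, ss.id
    FROM shipment s, shipment_status ss, release_code r
    WHERE s.current_status_id = ss.id
      AND ss.release_code_id = r.id
      AND r.filtered_column = '5'
    ORDER BY ss.date DESC LIMIT 100""").fetchall()

# Rewrite: take the top 100 statuses first, then join
nested = cur.execute("""
    SELECT s.id, ss.id
    FROM (SELECT * FROM shipment_status
          WHERE release_code_id IN (SELECT id FROM release_code
                                    WHERE filtered_column = '5')
          ORDER BY date DESC LIMIT 100) AS ss,
         shipment s
    WHERE s.current_status_id = ss.id
    ORDER BY ss.date DESC LIMIT 100""").fetchall()

print(flat == nested)  # both forms return the same 100 rows
```

Note the assumption that the inner LIMIT is safe: because every shipment_status row joins to exactly one shipment here, pre-limiting the statuses cannot lose rows that the flat query would have kept. If the join could drop statuses, the inner LIMIT would need to be larger.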