WHERE vs ORDER BY vs LIMIT why not using the correct index?
От | D. Dante Lorenso |
---|---|
Тема | WHERE vs ORDER BY vs LIMIT why not using the correct index? |
Дата | |
Msg-id | 4783FC05.5000907@lorenso.com обсуждение исходный текст |
Ответы |
Re: WHERE vs ORDER BY vs LIMIT why not using the correct index?
|
Список | pgsql-general |
All, I have a simple query: SELECT tcq_id FROM queue q WHERE q.status = 'D' ORDER BY tcq_id ASC LIMIT 1; What I'm trying to do is find all the items in the queue which have a status of 'D' and then select the oldest item first. My data is very skewed such that there are not many records matching the WHERE clause. status my_count D 4 Q 6 S 20 P 74 F 3294 C 291206 However, when I explain that query, it's doing an index scan on the 'tcq_id' primary key column instead of using the index on 'status': ------------------------------ EXPLAIN SELECT tcq_id FROM queue q WHERE q.status = 'D' ORDER BY tcq_id ASC LIMIT 1; Limit (cost=0.00..40.09 rows=1 width=8) -> Index Scan using queue_pkey on queue q (cost=0.00..59059.80 rows=1473 width=8) Filter: (status = 'D'::bpchar) ------------------------------ But then when I take out the ORDER BY clause, PostgreSQL will use a different (and proper) index: ------------------------------ EXPLAIN SELECT tcq_id FROM transcode_queue q WHERE q.status = 'D' LIMIT 1; Limit (cost=0.00..3.81 rows=1 width=8) -> Index Scan using queue_idx_status on queue q (cost=0.00..5618.07 rows=1473 width=8) Index Cond: (status = 'D'::bpchar) ------------------------------ I don't understand why the ORDER BY condition would be affecting my WHERE criteria. Shouldn't the ordering be done after the filter is first applied? I'm wanting: "find the 4 rows where status = 'D' then order those by tcq_id and return the first one." But postgresql seems to be choosing: "order all records by the tcq_id then scan them sequentially and find the first one matching status = 'D'". How can I influence the planner's decision while keeping my ORDER BY clause? After furthing testing, maybe it's not the ORDER BY but the LIMIT that is causing the poor planner choice? I tried to do this: ------------------------------ SELECT tcq_id FROM ( SELECT tcq_id FROM queue q WHERE q.status = 'D' ) x ORDER BY x.tcq_id ASC LIMIT 1; Limit (cost=0.00..40.09 rows=1 width=8) -> Index Scan using queue_pkey on queue q (cost=0.00..59059.80 rows=1473 width=8) Filter: (status = 'D'::bpchar) ------------------------------ But this results in another wrong index choice. So, I removed the LIMIT clause and now it does use the right index: ------------------------------ SELECT tcq_id FROM ( SELECT tcq_id FROM queue q WHERE q.status = 'D' ) x ORDER BY x.tcq_id ASC; Sort (cost=4314.36..4318.05 rows=1473 width=8) Sort Key: q.tcq_id -> Bitmap Heap Scan on queue q (cost=35.71..4236.85 rows=1473 width=8) Recheck Cond: (status = 'D'::bpchar) -> Bitmap Index Scan on queue_idx_status (cost=0.00..35.34 rows=1473 width=0) Index Cond: (status = 'D'::bpchar) ------------------------------ Can someone shed some insight here and help me understand what's going on? -- Dante
В списке pgsql-general по дате отправления:
Следующее
От: Ivan Sergio BorgonovoДата:
Сообщение: benchmarking update/insert and random record update