Re: SELECT MIN, MAX took longer time than SELECT
От | K C Lau |
---|---|
Тема | Re: SELECT MIN, MAX took longer time than SELECT |
Дата | |
Msg-id | 6.2.1.2.0.20060121211310.08cc91c8@localhost обсуждение исходный текст |
Ответ на | Re: SELECT MIN, MAX took longer time than SELECT COUNT, MIN, MAX (Tom Lane <tgl@sss.pgh.pa.us>) |
Список | pgsql-performance |
I have worked round the issue by using 2 separate queries with the LIMIT construct. LogSN and create_time are indeed directly correlated, both monotonously increasing, occasionally with multiple LogSN's having the same create_time. What puzzles me is why the query with COUNT, MIN, MAX uses idx_logtime for the scan, but the query without the COUNT uses pk_log and takes much longer. If it had chosen idx_logtime instead, then it should have returned immediately for both MIN and MAX. Best regards, KC. At 02:51 06/01/21, Tom Lane wrote: >"Jim C. Nasby" <jnasby@pervasive.com> writes: > > On Fri, Jan 20, 2006 at 12:35:36PM +0800, K C Lau wrote: > > Here's the problem... the estimate for the backwards index scan is *way* > > off: > > >> -> Limit (cost=0.00..1.26 rows=1 width=4) (actual > >> time=200032.928..200032.931 rows=1 loops=1) > >> -> Index Scan Backward using pk_log on > >> log (cost=0.00..108047.11 rows=86089 width=4) (actual > >> time=200032.920..200032.920 rows=1 loops=1) > >> Filter: (((create_time)::text < '2005/10/19'::text) AND > >> (logsn IS NOT NULL)) > >> Total runtime: 200051.701 ms > >It's more subtle than you think. The estimated rowcount is the >estimated number of rows fetched if the indexscan were run to >completion, which it isn't because the LIMIT cuts it off after the >first returned row. That estimate is not bad (we can see from the >aggregate plan that the true value would have been 106708, assuming >that the "logsn IS NOT NULL" condition isn't filtering anything). > >The real problem is that it's taking quite a long time for the scan >to reach the first row with create_time < 2005/10/19, which is not >too surprising if logsn is strongly correlated with create_time ... >but in the absence of any cross-column statistics the planner has >no very good way to know that. (Hm ... but both of them probably >also show a strong correlation to physical order ... we could look >at that maybe ...) The default assumption is that the two columns >aren't correlated and so it should not take long to hit the first such >row, which is why the planner likes the indexscan/limit plan. > > regards, tom lane
В списке pgsql-performance по дате отправления: