Re: new correlation metric
От | Ron Mayer |
---|---|
Тема | Re: new correlation metric |
Дата | |
Msg-id | 49056D70.7030802@cheapcomplexdevices.com обсуждение исходный текст |
Ответ на | new correlation metric (Jeff Davis <pgsql@j-davis.com>) |
Ответы |
Re: new correlation metric
|
Список | pgsql-hackers |
Jeff Davis wrote: > Currently, we use correlation to estimate the I/O costs of an index > scan. However, this has some problems: It certainly helps some cases. Without the patch, the little test script below ends up picking the third fastest plan (a seq-scan) instead of a faster bitmapscan, or an even faster-than-that indexscan for the query below. With the patch, it finds the fastest index scan. Without Patch Estimated_cost Actual_Time Index Scan 39638.36 331ms Bitmap Scan 22218.43 415ms Seq Scan 20125.83 595ms With Patch Estimated_cost Actual_Time Index Scan 17684.18 333ms Bitmap Scan 22110.60 400ms Seq Scan 20117.51 573ms I was somewhat surprised that the bitmap cost estimates didn't also change much. Wouldn't the estimated # of data blocks read for the bitmap be roughly the same as for the index? And yes, I know that table's a contrived one that is almost ideal for this patch - but I have some large clustered-by-zip address tables where I can find queries that show similar results. Back in 8.0 I cared a lot since I had a number of real-world queries picking Seq-Scans instead of Index-Scans. With 8.3, though, AFAICT the vast majority of my similar real-world queries pick the bitmap scans which in practice are pretty close in speed to the index scans. ====================================================================== -- [1] Test script variation from this 2005 thread: -- http://archives.postgresql.org/pgsql-hackers/2005-02/msg00298.php create temporary table tmp1mil as select * from (select generate_series as a from generate_series(0,9))as a, (select generate_series as b from generate_series(0,9)) as b, (selectgenerate_series as c from generate_series(0,9)) as c, (select generate_series as d from generate_series(0,9))as d, (select generate_series as e from generate_series(0,9)) as e, (selectgenerate_series as f from generate_series(0,9)) as f order by a,b,c,d,e,f; create index tmp1mil__c on tmp1mil(c); vacuum analyze tmp1mil; select * from pg_stats where tablename='tmp1mil'; \timing explain select count(*) from tmp1mil where c<5; select count(*) from tmp1mil where c<5; select count(*) from tmp1mil where c<5; -- 615 ms seqscan set enable_seqscan = false; explain select count(*) from tmp1mil where c<5; select count(*) from tmp1mil where c<5; select count(*) from tmp1mil where c<5; -- 425 ms bitmapscan set enable_bitmapscan to false; explain select count(*) from tmp1mil where c<5; select count(*) from tmp1mil where c<5; select count(*) from tmp1mil where c<5; -- 342 ms indexscan
В списке pgsql-hackers по дате отправления:
Предыдущее
От: Martijn van OosterhoutДата:
Сообщение: Re: WIP patch: convert SQL-language functions to return tuplestores