new correlation metric
От | Jeff Davis |
---|---|
Тема | new correlation metric |
Дата | |
Msg-id | 1225010283.21241.33.camel@localhost.localdomain обсуждение исходный текст |
Ответы |
Re: new correlation metric
Re: new correlation metric |
Список | pgsql-hackers |
Currently, we use correlation to estimate the I/O costs of an index scan. However, this has some problems: http://archives.postgresql.org/pgsql-hackers/2008-01/msg01084.php I worked with Nathan Boley to come up with what we think is a better metric for measuring this cost. It is based on the number of times in the ordered sample that you have to physically backtrack (i.e. the data value increases, but the physical position is earlier). For example, if the table's physical order is 6 7 8 9 10 1 2 3 4 5 then the new metric will see one backtrack -- after reading 5 it backtracks to the beginning to read 6. The old code would have produced a correlation near -0.5. The new code will yield a "correlation" of 8/9. The old code would have produced a run_cost of 1/4*min_IO_cost + 3/4*max_IO_cost, while the new code produces a run_cost of 8/9*min_IO_cost + 1/9*max_IO_cost. Attached is a patch. We replaced "correlation", but left the name the same (even though that's now a misnomer). We can change it if needed. Comments? Regards, Jeff Davis
Вложения
В списке pgsql-hackers по дате отправления: