Re: Should we update the random_page_cost default value?
От | Andres Freund |
---|---|
Тема | Re: Should we update the random_page_cost default value? |
Дата | |
Msg-id | ke6nnme7eiwf3m2whrxwj6s2zbzkoex2tvwohd6bcshs4mkdva@e5irn76rhcrj обсуждение исходный текст |
Ответ на | Re: Should we update the random_page_cost default value? (Laurenz Albe <laurenz.albe@cybertec.at>) |
Список | pgsql-hackers |
Hi, On 2025-10-06 07:34:53 +0200, Laurenz Albe wrote: > However, my practical experience is that PostgreSQL tends to favor > sequential scans too much. Often, that happens together with a parallel > plan, and I find PostgreSQL with the default configuration prefer a plan > with two parallel workers performing a sequential scan with a ridiculously > selective (correctly estimated!) filter condition like 500 rows out of a > million over an index scan that is demonstrably faster. > > I have no artificial reproducer for that, and I admit that I didn't hunt > down the reason why the planner might prefer such a plan. I just tell > people to lower random_page_cost, and the problem goes away. So I am > clearly fighting symptoms. Often, an alternative solution is to set > max_parallel_workers_per_gather to 0, which seems to suggest that perhaps > the suggestion from [1] is more interesting than I thought. I've seen this quite often too. IIRC in the cases I've actually analyzed in any depth it came down to a few root causes: 1) Fast start plans (where the planner though that a sequential scan will find a matching tuple quickly, but doesn't, leading to scanning most of the table). I frankly think we should just disable these, they're a very low confidence bet with high costs in the case of a loss. 2) Not taking the likelihood of data already being cached into account leads to preferring sequential scans due to seq_page_cost, even though the index scan would not have required any IO 3) Our costing for the cost of predicate evaluation is extremely poor. Among the reasons are - There is no difference in cost between common operators, despite significant real evaluation cost. E.g. int and text operators are not close in evaluation cost. - IIRC we disregard the cost of potentially needing to detoast completely, despite that very easily becoming the determining factor - Tuple deforming cost. It's a lot more CPU intensive to deform column 105 than column 5, often the index might avoid needing to do the more epensive deforming, but we don't take that into account. This often leads to under-estimating the CPU cost of seqscans. Greetings, Andres Freund
В списке pgsql-hackers по дате отправления: