Re: verbose cost estimate
From | Greg Stark |
---|---|
Subject | Re: verbose cost estimate |
Date | |
Msg-id | CAM-w4HO8FZckhf1AAzx=8RzRcJSJ3jDg_kz+K4yvE9JgDrgHEw@mail.gmail.com |
In reply to | Re: verbose cost estimate (Tomas Vondra <tomas.vondra@2ndquadrant.com>) |
Responses |
Re: verbose cost estimate
(Tomas Vondra <tomas.vondra@2ndquadrant.com>)
|
List | pgsql-hackers |
On Mon, 9 Dec 2019 at 17:14, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>
> On Sat, Dec 07, 2019 at 11:34:12AM -0500, Tom Lane wrote:
> >Justin Pryzby <pryzby@telsasoft.com> writes:
> >> Jeff said:
> >>> |What would I find very useful is a verbosity option to get the cost
> >>> |estimates expressed as a multiplier of each *_cost parameter, rather than
> >>> |just as a scalar.
> >
> >> It seems to me that's "just" a matter of redefining Cost and fixing
> >> everything that breaks:
> >
> >> struct Cost {
> >>     double seq, rand;
> >>     double cpu_tuple, cpu_index_tuple, cpu_oper;
> >>     double parallel_setup; // This is probably always in startup_cost and never in run_cost
> >>     double parallel_tuple; // This is probably always in run_cost and never in startup_cost
> >>     double disable;
> >> };
> >
> >> I'm perhaps 50% done with that - is there some agreement that's a desirable
> >> goal and a good way to do it ?
> >
> >No, I think this will get rejected out of hand. The implications for
> >the planner's speed and memory consumption seem quite unacceptable
> >for the size of the benefit. What you're showing above probably
> >doubles the size of most Paths, and the added cycles in hot-spots
> >like add_path seem pretty daunting.
>
> Yeah, that's an issue. But I have to admit my main issue with this
> proposal is that I have no idea how I'd interpret this Cost. I mean,
> what do the fields express for different types of paths? How do they
> contribute to the actual cost of that path?

What I think users would be able to do with this info is understand which
parameter to tweak to raise the estimated cost of the node. Everyone knows
that if you see an index scan being used but taking longer than a sequential
scan, you might try raising random_page_cost. But I rarely see people tweaking
the more "exotic" parameters like cpu_operator_cost or cpu_index_tuple_cost,
and when they do they aren't really sure which nodes they're affecting...
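[Editor's sketch, not from the thread: one way to answer Tomas's "how would I interpret this Cost" question is that each field counts units of one cost parameter, and the familiar scalar cost is just the dot product of those counts with the GUC values. The field names below follow Justin's proposed struct; the defaults are PostgreSQL's shipped GUC defaults; the example node is invented.]

```python
# Map each proposed Cost field to its GUC's default value (PostgreSQL defaults).
GUC_DEFAULTS = {
    "seq": 1.0,               # seq_page_cost
    "rand": 4.0,              # random_page_cost
    "cpu_tuple": 0.01,        # cpu_tuple_cost
    "cpu_index_tuple": 0.005, # cpu_index_tuple_cost
    "cpu_oper": 0.0025,       # cpu_operator_cost
    "parallel_setup": 1000.0, # parallel_setup_cost
    "parallel_tuple": 0.1,    # parallel_tuple_cost
}

def scalar_cost(units, gucs=GUC_DEFAULTS):
    """Collapse per-parameter unit counts back into the planner's scalar cost."""
    return sum(units.get(name, 0.0) * value for name, value in gucs.items())

# Hypothetical index scan: 10 random page fetches, 300 index tuples
# examined, 300 heap tuples processed.
node_units = {"rand": 10, "cpu_index_tuple": 300, "cpu_tuple": 300}
print(scalar_cost(node_units))  # ≈ 44.5 = 10*4.0 + 300*0.005 + 300*0.01
```

With the vector exposed, a user can see at a glance that this node's cost is dominated by the `rand` term, so random_page_cost is the knob that moves it.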
I remember planning to do a very similar thing back in the 8.3 era and never
getting around to it. You could imagine even storing these for the overall
plan in the logs and building a large matrix of actual execution values
versus these broken-out individual costs. Then it becomes a standard linear
optimization problem to find the optimal values for each parameter to
minimize inaccurate plan estimates (and to identify cases where there are
outliers).

-- 
greg
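[Editor's sketch, not from the thread: the "large matrix" idea above is essentially an ordinary least-squares fit. Rows are plan nodes, columns are per-parameter unit counts, and the target vector is measured runtime; the data below is fabricated for illustration, and NumPy's `lstsq` stands in for whatever solver one would actually use.]

```python
import numpy as np

# Each row: units of (seq_page, random_page, cpu_tuple) consumed by one
# plan node, as the decomposed Cost vector would report them.
units = np.array([
    [100.0,   0.0, 1000.0],
    [  0.0,  50.0,  200.0],
    [300.0,  10.0, 5000.0],
    [ 20.0, 120.0,  800.0],
])

# Measured runtimes for the same nodes; in practice these would come from
# EXPLAIN ANALYZE output collected in the logs. Here they are synthesized
# from a known "true" parameter vector so the fit can be checked.
true_params = np.array([0.8, 3.5, 0.012])
runtimes = units @ true_params

# Least squares recovers the per-unit cost of each parameter; large
# residuals would flag the outlier plans Greg mentions.
fit, residuals, rank, _ = np.linalg.lstsq(units, runtimes, rcond=None)
print(fit)  # close to true_params
```

On real data the fit would not be exact, and the residual per node is exactly the "inaccurate plan estimate" signal worth minimizing.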