Re: An Idea for planner hints

Поиск
Список
Период
Сортировка
От Mark Dilger
Тема Re: An Idea for planner hints
Дата
Msg-id 44DA36D7.9020902@markdilger.com
обсуждение исходный текст
Ответ на Re: An Idea for planner hints  ("Jim C. Nasby" <jnasby@pervasive.com>)
Список pgsql-hackers
Jim C. Nasby wrote:
> Been suggested before... the problem is actually doing something useful
> with all that data that's collected, as well as how to collect it
> without greatly impacting the system.

Identifying the best plan by means of actually running multiple plans and timing 
them is useful.  That would be the point.

As far as "without greatly impacting the system", I don't think that is a real 
concern.  The whole idea is to greatly impact the system *once*, sometime when 
the DBA doesn't mind impacting the system (like before you go live on a 
production network, or between midnight and 3 AM, or whatever), and then store 
the best plan for future use.

The planner trades-off the desire to find the best plan and the need to find a 
plan quickly.  It also chooses a plan based on statistics and not based on 
actual runtimes (because there is a chicken-and-egg problem: how do you know 
which plan has the smallest runtime without running it?), so the chosen plan 
that looks best based on statistics might not actually be best.

The idea I'm proposing circumvents the whole trade-off problem by explicitly 
choosing to do something that makes the planner run really slowly and take a 
really long time.  But it doesn't do it "at runtime", in the sense that you 
don't do it for each query.  You just do it once up front and be done with it. 
Of course, this is only useful for people with reasonably static queries and 
reasonably static table statistics, so that a good plan found up-front continues 
to be a good plan as it is repeatedly used.

My personal motivation is that I have tables whose statistics are quite static.  The data itself changes, but the
statisticaldistribution from which the data 
 
is pulled is unchanging, so the table statistics end up about the same even as 
the data itself is added and deleted.  On top of that, the planner keeps 
choosing the wrong plan, which I know to be true because I can make individual 
queries run faster by structuring them in ways that the planner can't see 
through and "optimize" away the particular plan that I am effectively giving it.  But this is a PITA for me, especially
sinceI don't always know what the best 
 
plan might be and have to try them all until I find the right one.  (With the 
added complexity that I can't always figure out how to trick the planner into 
choosing a specific plan, and hence can't test it.)  It would be *so much 
easier* to have an option to tell the planner to try them all.

mark

> On Tue, Aug 08, 2006 at 08:23:05AM -0700, Mark Dilger wrote:
>> If this feature I'm proposing already exists, sorry for the waste of 
>> bandwidth, and could someone please point me to it? :)
>>
>> What if there were a mode that told postgres to do an exhaustive search (or 
>> if not exhaustive, then much more extensive search) of all plans (or many 
>> plans), trying each plan, reporting the performance of each, and discarding 
>> the query results, much like "explain analyze" does.  Postgres could then 
>> dump the best plan in machine readable (and semi-human readable) form which 
>> the planner could parse and use at some later date in lieu of a SQL query.
>>
>> This would allow people with reasonably static table statistics (where the 
>> best plan is not likely to change) to spend upfront cycles investigating 
>> the best plan and then embed that plan in their business logic.  Since the 
>> stored plan is both written-by and read-by postgres, it can get quite 
>> complicated without putting a burden on humans to read and write such 
>> complicated things.  It would also remove the risk that the planner will 
>> occasionally (due to its nondeterministic workings) choose a really bad 
>> plan and stall a production system.
>>
>> mark


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: 8.2 features status
Следующее
От: Joshua Reich
Дата:
Сообщение: Re: An Idea for planner hints