Re: An Idea for planner hints

Поиск
Список
Период
Сортировка
От Mark Dilger
Тема Re: An Idea for planner hints
Дата
Msg-id 44EB5351.5000704@markdilger.com
обсуждение исходный текст
Ответ на Re: An Idea for planner hints  (Peter Eisentraut <peter_e@gmx.net>)
Ответы Re: An Idea for planner hints
Список pgsql-hackers
Peter Eisentraut wrote:
> Jim C. Nasby wrote:
>>> Meet EXPLAIN ANALYZE.
>> Which does no good for apps that you don't control the code on. Even 
>> if you do control the code, you have to find a way to stick EXPLAIN
>> ANALYZE in  front of every query, and figure out how to deal with
>> what's comming back.
> 
> It would not be hard to create an "auto explain analyze" mode that 
> implicitly runs EXPLAIN ANALYZE along with every query and logs the 
> result.  On its face, it sounds like an obviously great idea.  I just 
> don't see how you would put that to actual use, unless you want to read 
> server logs all day long.  Grepping for query duration and using the 
> statistics views are much more manageable tuning methods.  In my view 
> anyway.
> 
>> Going back to the original discussion though, there's no reason this
>> needs to involve EXPLAIN ANALYZE. All we want to know is what columns
>> the planner is dealing with as a set rather than individually.
> 
> This would log a whole bunch of column groups, since every moderately 
> interesting query uses a column in combination with some other column, 
> but you still won't know which ones you want the planner to optimize.
> 
> To get that piece of information, you'd need to do something like 
> principal component analysis over the column groups thus identified.  
> Which might be a fun thing to do.  But for the moment I think it's 
> better to stick to declaring the interesting pairs/groups manually.
> 

If the system logs which cross-table join statistics it didn't have for 
cross-table joins that it actually performed, it won't log the really 
interesting stuff.

What is interesting are the plans that it didn't chose on account of guessing 
that they were too expensive, when in reality the cross-table statistics were 
such that they were not too expensive.  This case might not be the common case, 
but it is the interesting case.  We are trying to get the planner to notice 
cheap plans that don't look cheap unless you have the cross-table statistics. 
So you have a chicken-and-egg problem here unless the system attempts (or 
outputs without actually attempting) what appear to be sub-optimal plans in 
order to determine how bad they really are.

I proposed something like this quite a bit up-thread.  I was hoping we could 
have a mode in which the system would run the second, third, fourth, ... best 
plans rather than just the best looking one, and then determine from actual 
runtime statistics which was best.  (The proposal also included the ability to 
output the best plan and read that in at a later time in lieu of a SQL query, 
but that part of it can be ignored if you like.)  The posting didn't generate 
much response, so I'm not sure what people thought of it.  The only major 
problem I see is getting the planner to keep track of alternate plans.  I don't 
know the internals of it very well, but I think the genetic query optimizer 
doesn't have a concept of "runner-up #1", "runner-up #2", etc., which it would 
need to have.

mark


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Greg Stark
Дата:
Сообщение: Re: Tricky bugs in concurrent index build
Следующее
От: Alvaro Herrera
Дата:
Сообщение: Re: [PATCHES] COPY view