Обсуждение: PROPOSAL: geqo improvement

Поиск
Список
Период
Сортировка

PROPOSAL: geqo improvement

От
"marcin mank"
Дата:
Hello, List.

There are cases when GEQO returns a very bad plan in some rare
executions of a query. To decrease likehood of this happening, I
propose:

When GEQO detects that what it found is in fact a miserable plan  it
restarts the search. Simple math shows that if the probability of a
bad plan found in one 'go' is p, the overall probability of a bad plan
is p^N  .

GEQO would decide that the plan is bad when the calculated cost of the
plan would exceed the time spent planning so far a fixed number of
times (100 ? a configurable parameter ?) .
I think a function infering cost from time spent could be calculated
from cpu_operator_cost - or is there a better way?

As a safety mean I wish to limit the number of replannings to a fixed
value (10? 20? a configurable parameter?)

If I introduce some configuration variables, I plan to infer the
defaults from geqo_effort (no concrete plan for this now).

An alternative to restarting the search might be just extending it -
running the main loop of geqo() function longer.  I plan restarting
because I`m afraid the real reason for getting bad plans could be that
the algorithm is getting into some local minimum and can`t get out. I
will explore that more.

If there is agreement to do this, it looks simple enough that I
volunteer to implement it. Please tell me what is the deadline for
this to make into 8.4 .

What I lack is good test cases to verify the solution.

Greetings
Marcin Mańk

Re: PROPOSAL: geqo improvement

От
"Robert Haas"
Дата:
2009/1/4 marcin mank <marcin.mank@gmail.com>:
> GEQO would decide that the plan is bad when the calculated cost of the
> plan would exceed the time spent planning so far a fixed number of
> times (100 ? a configurable parameter ?) .
> I think a function infering cost from time spent could be calculated
> from cpu_operator_cost - or is there a better way?

It sounds like you're proposing to compare the time spent planning to
the estimated execution time.  AFAICS, those things are unrelated, so
I'm not sure what you hope to figure out by comparing them.

> An alternative to restarting the search might be just extending it -
> running the main loop of geqo() function longer.  I plan restarting
> because I`m afraid the real reason for getting bad plans could be that
> the algorithm is getting into some local minimum and can`t get out. I
> will explore that more.

It sounds like you may have some concrete queries that suffer from
this problem.  It might be helpful to post the queries and the good
and bad plans.  It may be that the problem can be fixed with some
tuning of the existing parameters.

> If there is agreement to do this, it looks simple enough that I
> volunteer to implement it. Please tell me what is the deadline for
> this to make into 8.4 .

The deadline for the final CommitFest was November 1st, so I think it
is too late for 8.4.

...Robert


Re: PROPOSAL: geqo improvement

От
"marcin mank"
Дата:
> It sounds like you're proposing to compare the time spent planning to
> the estimated execution time.  AFAICS, those things are unrelated, so
> I'm not sure what you hope to figure out by comparing them.

The idea is: If we are to spend a LOT of resources executing the
query, we might as well burn some cycles in hope of finding a better
plan.


> It sounds like you may have some concrete queries that suffer from
> this problem.  It might be helpful to post the queries and the good
> and bad plans.  It may be that the problem can be fixed with some
> tuning of the existing parameters.

Actually, no. This is my random thought based on observing some
threads where people get bad plans due to GEQO.


> The deadline for the final CommitFest was November 1st, so I think it
> is too late for 8.4.

ugh.. too bad. I`m still interested anyway :)

Marcin