Discussion: Shouldn't we have a way to avoid "risky" plans?

Shouldn't we have a way to avoid "risky" plans?

From: Josh Berkus
Date:
Folks,

Yet more evidence that we need some way to assess query plans which are
high-risk and avoid them (or have Yet Another GUC):

 Merge Join  (cost=29.16..1648.00 rows=382 width=78) (actual
time=57215.167..57215.216 rows=1 loops=1)
   Merge Cond: (rn.node_id = device_nodes.node_id)
   ->  Nested Loop  (cost=0.00..11301882.40 rows=6998 width=62) (actual
time=57209.291..57215.030 rows=112 loops=1)
         Join Filter: (node_ep.node_id = rn.node_id)
         ->  Nested Loop  (cost=0.00..11003966.85 rows=90276 width=46)
(actual time=0.027..52792.422 rows=90195 loops=1)
               ->  Index Scan using ix_ne_ns on node_ep
(cost=0.00..1545943.45 rows=32606992 width=26) (actual
time=0.010..7787.043 rows=32606903 loops=1)
               ->  Index Scan using ix_nefp_eid on ep_fp
(cost=0.00..0.28 rows=1 width=20) (actual time=0.001..0.001 rows=0
loops=32606903)
                     Index Cond: (ep_fp.ep_id = node_ep.ep_id)
         ->  Materialize  (cost=0.00..5.30 rows=220 width=16) (actual
time=0.000..0.019 rows=220 loops=90195)
               ->  Seq Scan on mytable rn  (cost=0.00..4.20 rows=220
width=16) (actual time=0.008..0.043 rows=220 loops=1)
   ->  Sort  (cost=28.18..28.21 rows=12 width=16) (actual
time=0.164..0.165 rows=10 loops=1)
         Sort Key: device_nodes.node_id
         Sort Method:  quicksort  Memory: 25kB
         ->  Index Scan using ix_dn_did on device_nodes
(cost=0.00..27.96 rows=12 width=16) (actual time=0.086..0.134 rows=10
loops=1)
               Index Cond: (dev_id = 18165)
 Total runtime: 57215.329 ms


AFAICT, what's happening in this query is that PostgreSQL's statistics
on the device_nodes and several other tables are slightly out of date
(as in 5% of the table).  Thus it thinks that nothing will match the
list of node_ids in "mytable", and that it can exit the merge join early
and ignore the whole huge cost of the join plan.  This particular form
of out-of-dateness will be fixed in 9.1 (it's due to values being higher
than the highest histogram bucket in pg_stat), but not all forms will be.
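
A rough way to see this particular kind of staleness from SQL (a sketch
only; the table and column names are taken from the plan above, and
pg_stats is the view that exposes the histogram the planner works from):

    SELECT s.histogram_bounds,
           (SELECT max(node_id) FROM device_nodes) AS actual_max
    FROM pg_stats s
    WHERE s.tablename = 'device_nodes'
      AND s.attname = 'node_id';

If actual_max sits well beyond the last histogram bound, the planner is
in exactly the territory described above.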

It really seems like we should be able to detect an obvious high-risk
situation like this one.  Or maybe we're just being too optimistic about
discarding subplans?

BTW, the optimal plan for this query (post-analyze) is this one:

 Nested Loop  (cost=0.00..213068.26 rows=12 width=78) (actual
time=0.374..0.514 rows=1 loops=1)
   Join Filter: (device_nodes.node_id = rn.node_id)
   ->  Seq Scan on mytable rn  (cost=0.00..4.20 rows=220 width=16)
(actual time=0.013..0.050 rows=220 loops=1)
   ->  Materialize  (cost=0.00..213024.49 rows=12 width=62) (actual
time=0.001..0.002 rows=1 loops=220)
         ->  Nested Loop  (cost=0.00..213024.43 rows=12 width=62)
(actual time=0.077..0.278 rows=1 loops=1)
               ->  Nested Loop  (cost=0.00..211740.04 rows=4428
width=42) (actual time=0.070..0.269 rows=1 loops=1)
                     ->  Index Scan using ix_dn_did on device_nodes
(cost=0.00..51.92 rows=13 width=16) (actual time=0.058..0.115 rows=10
loops=1)
                           Index Cond: (dev_id = 18165)
                     ->  Index Scan using ix_ne_ns on node_ep
(cost=0.00..16137.45 rows=11700 width=26) (actual time=0.014..0.014
rows=0 loops=10)
                           Index Cond: (node_ep.node_id =
device_nodes.node_id)
               ->  Index Scan using ix_nefp_eid on ep_fp
(cost=0.00..0.28 rows=1 width=20) (actual time=0.006..0.007 rows=1 loops=1)
                      Index Cond: (ep_fp.ep_id = node_ep.ep_id)


--
-- Josh Berkus
   PostgreSQL Experts Inc.
   http://www.pgexperts.com

Re: Shouldn't we have a way to avoid "risky" plans?

From: Claudio Freire
Date:
On Wed, Mar 23, 2011 at 2:12 PM, Josh Berkus <josh@agliodbs.com> wrote:
> Folks,
>
>...
> It really seems like we should be able to detect an obvious high-risk
> situation like this one.  Or maybe we're just being too optimistic about
> discarding subplans?

Why not let the GEQO learn from past mistakes?

If a post-mortem analysis of queries could somehow be done and accounted
for, then these kinds of mistakes would be a one-time occurrence.

Ideas:
 *  you estimate cost IFF there's no past experience.
 *  if rowcount estimates miss by much, a correction cache could be
populated with extra (volatile - ie in shared memory) statistics
 *  or, if rowcount estimates miss by much, autoanalyze could be scheduled
 *  consider plan bailout: execute a tempting plan, if it takes too
long or its effective cost rises well above the expected cost, bail
to a safer plan
 *  account for worst-case performance when evaluating plans

Re: Shouldn't we have a way to avoid "risky" plans?

From: Josh Berkus
Date:
On 3/23/11 10:35 AM, Claudio Freire wrote:
>  *  consider plan bailout: execute a tempting plan, if it takes too
> long or its effective cost rises well above the expected cost, bail
> to a safer plan

That would actually solve this particular case.  It would still require
us to have some definition of "safer" though.

--
                                  -- Josh Berkus
                                     PostgreSQL Experts Inc.
                                     http://www.pgexperts.com

Re: Shouldn't we have a way to avoid "risky" plans?

From: Claudio Freire
Date:
On Wed, Mar 23, 2011 at 5:29 PM, Josh Berkus <josh@agliodbs.com> wrote:
> On 3/23/11 10:35 AM, Claudio Freire wrote:
>>  *  consider plan bailout: execute a tempting plan, if it takes too
>> long or its effective cost rises well above the expected cost, bail
>> to a safer plan
>
> That would actually solve this particular case.  It would still require
> us to have some definition of "safer" though.

In my head, safer = better worst-case performance.

Re: Shouldn't we have a way to avoid "risky" plans?

From: Justin Pitts
Date:
On Wed, Mar 23, 2011 at 1:12 PM, Josh Berkus <josh@agliodbs.com> wrote:
> AFAICT, what's happening in this query is that PostgreSQL's statistics
> on the device_nodes and several other tables are slightly out of date
> (as in 5% of the table).

What about some manner of query feedback mechanism ( along the lines
of what explain analyze yields ) to track "stats staleness" in
general?

I probably misunderstand the costs of executing explain analyze.
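
Something much cruder is already visible from userspace (a sketch, not
the in-planner feedback suggested above): pg_stat_user_tables keeps
cumulative per-table write counts and the last (auto)analyze times, which
at least hint at how stale a table's statistics might be.

    SELECT relname,
           n_tup_ins + n_tup_upd + n_tup_del AS modifications,
           last_analyze,
           last_autoanalyze
    FROM pg_stat_user_tables
    ORDER BY modifications DESC;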

Re: Shouldn't we have a way to avoid "risky" plans?

From: Tom Lane
Date:
Claudio Freire <klaussfreire@gmail.com> writes:
> On Wed, Mar 23, 2011 at 5:29 PM, Josh Berkus <josh@agliodbs.com> wrote:
>> On 3/23/11 10:35 AM, Claudio Freire wrote:
>>>  *  consider plan bailout: execute a tempting plan, if it takes too
>>> long or its effective cost rises well above the expected cost, bail
>>> to a safer plan

>> That would actually solve this particular case.  It would still require
>> us to have some definition of "safer" though.

> In my head, safer = better worst-case performance.

If the planner starts operating on the basis of worst case rather than
expected-case performance, the complaints will be far more numerous than
they are today.

            regards, tom lane

Re: Shouldn't we have a way to avoid "risky" plans?

From: Claudio Freire
Date:
On Wed, Mar 23, 2011 at 6:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Claudio Freire <klaussfreire@gmail.com> writes:
>> In my head, safer = better worst-case performance.
>
> If the planner starts operating on the basis of worst case rather than
> expected-case performance, the complaints will be far more numerous than
> they are today.

I imagine so; that's why, if you put my comment in context, I was talking
about picking a safer plan only when the "better on average" one fails
miserably.

Re: Shouldn't we have a way to avoid "risky" plans?

From: Josh Berkus
Date:
> If the planner starts operating on the basis of worst case rather than
> expected-case performance, the complaints will be far more numerous than
> they are today.

Yeah, I don't think that's the way to go.  The other thought I had was
to accumulate a "risk" stat the same as we accumulate a "cost" stat.

However, I'm thinking that I'm overengineering what seems to be a fairly
isolated problem, in that we might simply need to adjust the costing on
this kind of a plan.

Also, can I say that the cost figures in this plan are extremely
confusing?  Is it really necessary to show them the way we do?

Merge Join  (cost=29.16..1648.00 rows=382 width=78) (actual
time=57215.167..57215.216 rows=1 loops=1)
   Merge Cond: (rn.node_id = device_nodes.node_id)
   ->  Nested Loop  (cost=0.00..11301882.40 rows=6998 width=62) (actual
time=57209.291..57215.030 rows=112 loops=1)
         Join Filter: (node_ep.node_id = rn.node_id)
         ->  Nested Loop  (cost=0.00..11003966.85 rows=90276 width=46)
(actual time=0.027..52792.422 rows=90195 loops=1)

The first time I saw the above, I thought we had some kind of glibc math
bug on the host system.  Costs are supposed to accumulate upwards.

--
                                  -- Josh Berkus
                                     PostgreSQL Experts Inc.
                                     http://www.pgexperts.com

Re: Shouldn't we have a way to avoid "risky" plans?

From: Віталій Тимчишин
Date:


2011/3/23 Tom Lane <tgl@sss.pgh.pa.us>
Claudio Freire <klaussfreire@gmail.com> writes:
> On Wed, Mar 23, 2011 at 5:29 PM, Josh Berkus <josh@agliodbs.com> wrote:
>> On 3/23/11 10:35 AM, Claudio Freire wrote:
>>>  *  consider plan bailout: execute a tempting plan, if it takes too
>>> long or its effective cost rises well above the expected cost, bail
>>> to a safer plan

>> That would actually solve this particular case.  It would still require
>> us to have some definition of "safer" though.

> In my head, safer = better worst-case performance.

If the planner starts operating on the basis of worst case rather than
expected-case performance, the complaints will be far more numerous than
they are today.

This could be GUC-controllable, like plan_safety=0..1 with a low default value. It could influence the costs of plans where the cost changes dramatically with small table changes and/or the statistics are uncertain. It could also be used as a direct "hint" for such dangerous queries by changing the GUC for a session/single query.


--
Best regards,
 Vitalii Tymchyshyn

Re: Shouldn't we have a way to avoid "risky" plans?

From: Merlin Moncure
Date:
2011/3/24 Віталій Тимчишин <tivv00@gmail.com>:
> 2011/3/23 Tom Lane <tgl@sss.pgh.pa.us>
>>
>> Claudio Freire <klaussfreire@gmail.com> writes:
>> > On Wed, Mar 23, 2011 at 5:29 PM, Josh Berkus <josh@agliodbs.com> wrote:
>> >> On 3/23/11 10:35 AM, Claudio Freire wrote:
>> >>>  *  consider plan bailout: execute a tempting plan, if it takes too
>> >>> long or its effective cost rises well above the expected cost, bail
>> >>> to a safer plan
>>
>> >> That would actually solve this particular case.  It would still require
>> >> us to have some definition of "safer" though.
>>
>> > In my head, safer = better worst-case performance.
>>
>> If the planner starts operating on the basis of worst case rather than
>> expected-case performance, the complaints will be far more numerous than
>> they are today.
>>
>> This could be GUC-controllable, like plan_safety=0..1 with a low default
>> value. It could influence the costs of plans where the cost changes
>> dramatically with small table changes and/or the statistics are uncertain.
>> It could also be used as a direct "hint" for such dangerous queries by
>> changing the GUC for a session/single query.

ISTM if you add statistics miss and 'risk margin' to the things the
planner would have to consider while generating a plan, you are
greatly increasing the number of plan paths that would have to be
considered for any non trivial query.

merlin

Re: Shouldn't we have a way to avoid "risky" plans?

From: Nathan Boley
Date:
>> This could be GUC-controllable, like plan_safety=0..1 with a low default
>> value. It could influence the costs of plans where the cost changes
>> dramatically with small table changes and/or the statistics are uncertain.
>> It could also be used as a direct "hint" for such dangerous queries by
>> changing the GUC for a session/single query.
>
> ISTM if you add statistics miss and 'risk margin' to the things the
> planner would have to consider while generating a plan, you are
> greatly increasing the number of plan paths that would have to be
> considered for any non trivial query.


FWIW, these ideas are well established in the statistical community.

Currently, we are essentially using "maximum likelihood estimators".
We estimate a bunch of selectivities by choosing what is most likely,
plug them in to our objective function, and then minimize based upon
the plugged in values. In a wide variety of cases, MLE's can be shown
to be "asymptotically" optimal. That is, as our sample distribution
approaches the true distribution, the best we can possibly do is to
use the MLE. This is pretty sensible - if we actually knew all of the
selectivities then the results aren't really random anymore. However,
they often perform very poorly with small sample sizes - particularly
if the loss function is very sensitive to relatively small
fluctuations in the parameter estimates ( which postgres certainly is
- switching from a hash join to a nest-loop can be disastrous ).

Using the estimate that minimizes the "worst-case" performance is
precisely a minimax estimator. There, the goal is to minimize the risk
function ( iow, plan cost ) under the worst possible conditions.
Wikipedia has a pretty good treatment - just think "plan cost"
whenever you see "risk".

Another approach, that hasn't been suggested yet, is some Bayesian
update method. There, rather than calculating a specific parameter
value ( like ndistinct ), you try to store the entire distribution and
choose the plan that minimizes cost averaged over all of the possible
parameter values.
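
Written out (illustrative notation, with p ranging over candidate plans
and s_i over the possible selectivities), the two decision rules are:

    \text{minimax:}\quad p^{*} = \arg\min_{p}\; \max_{i}\; \mathrm{cost}(p \mid s_i)

    \text{Bayes:}\quad  p^{*} = \arg\min_{p}\; \sum_{i} \Pr(s_i)\,\mathrm{cost}(p \mid s_i)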

Example: ( please excuse the unrealistic numbers )

For instance, rather than estimate the selectivity of the join (
relation1.col1 = relation2.col1 ) to be 0.01, we would say it is 0.1
w/ probability 0.2 and 0.001 with probability 0.8. So, here is how we
would choose the plan now:

cost( nestloop | selectivity = 0.01 ) = 1
cost( hashjoin | selectivity = 0.01 ) = 2
cost( mergejoin | selectivity = 0.01 ) = 50

Here would be the bayesian approach:

cost( nestloop | selectivity = 0.001 ) = 0.1
cost( hashjoin | selectivity = 0.001 ) = 1
cost( mergejoin | selectivity = 0.001 ) = 50

cost( nestloop | selectivity = 0.1 ) = 10
cost( hashjoin | selectivity = 0.1 ) = 3
cost( mergejoin | selectivity = 0.1 ) = 50

So, the bayesian costs are:

nestloop: 0.1*0.8 + 10*0.2 = 2.08
hashjoin: 1.0*0.8 + 3*0.2 = 1.4
mergejoin: 50*0.8 + 50*0.2 = 50

so the hashjoin would be chosen.

For completeness, the minimax costs would be:

nestloop: max( 0.1, 10 )
hashjoin: max( 1, 3   )
mergejoin: max( 50, 50 )

So, again, the hashjoin is chosen.
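
The same arithmetic as a throwaway query (numbers copied from the example
above), for anyone who wants to play with the weights:

    WITH costs(plan_type, prob, cost) AS (
        VALUES ('nestloop',  0.8,  0.1), ('nestloop',  0.2, 10.0),
               ('hashjoin',  0.8,  1.0), ('hashjoin',  0.2,  3.0),
               ('mergejoin', 0.8, 50.0), ('mergejoin', 0.2, 50.0)
    )
    SELECT plan_type,
           sum(prob * cost) AS bayes_cost,   -- expected cost over the distribution
           max(cost)        AS minimax_cost  -- worst case over the distribution
    FROM costs
    GROUP BY plan_type
    ORDER BY bayes_cost;

Both columns point at the hashjoin here, matching the hand calculation.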

I obviously have a bias towards the Bayesian approach, but it's not
because I expect it to necessarily perform a whole lot better but,
rather, it reduces to the other two approaches. If we want the current
behavior, then simply store the MLE selectivity with probability 1. If
we want the minimax estimate, choose the worst possible value. Or
anything in between.

Also, ( not that I have even close to the experience / expertise to
make this claim - so take this with a grain of salt ) it seems that
the code changes would be substantial but pretty straightforward and
easy to localize. Rather than passing a selectivity, pass a pair of
arrays with selectivities and probabilities. Initially, we could keep
the current estimates ( every passed array would be of length one )
and then make changes as problems appear ( like Josh's ).

I hope my little estimation procedure tutorial has been a little
helpful, please feel free to contact me off list if you have
questions/want references.

Best,
Nathan Boley

Re: Shouldn't we have a way to avoid "risky" plans?

From: Claudio Freire
Date:
On Thu, Mar 24, 2011 at 5:30 PM, Nathan Boley <npboley@gmail.com> wrote:
> Another approach, that hasn't been suggested yet, is some Bayesian
> update method. There, rather than calculating a specific parameter
> value ( like ndistinct ), you try to store the entire distribution and
> choose the plan that minimizes cost averaged over all of the possible
> parameter values.

I've done similar stuff for work; you don't have to go all the way to
storing complete probability distributions. Usually a simple
likelihood range is enough.

In essence, instead of having a scalar MLE for plan cost, you
implement a "ranged" estimator, that estimates the most-likely range
of plan costs, with mean and standard deviation from mean.

This essentially gives a risk value, since risky plans will have very
large standard deviations from the mean.

> Also, ( not that I have even close to the experience / expertise to
> make this claim - so take this with a grain of salt ) it seems that
> the code changes would be substantial but pretty straightforward and
> easy to localize. Rather than passing a selectivity, pass a pair of
> arrays with selectivities and probabilities.

If you approximate the probability distributions as I outlined above,
it's even simpler. Approximate, but simpler - and since you retain the
original cost estimations in the form of mean cost values, you can
easily tune the GEQO to perform as it currently does (ignore variance)
or with a safety margin (account for variance).
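
One way to write that safety margin down (a sketch tying this to the
plan_safety GUC idea upthread; mu and sigma stand for the estimated mean
and standard deviation of a node's cost, and k would be derived from the
GUC):

    \mathrm{cost}_{\mathrm{effective}} \;=\; \mu_{\mathrm{cost}} + k\,\sigma_{\mathrm{cost}}

With k = 0 this reduces to today's expected-cost behaviour; larger k
penalizes high-variance (risky) plans.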


About issues like these being uncommon - I disagree.

I routinely have to work around query inefficiencies because GEQO does
something odd - and since postgres gives me too few tools to tweak
plans (increase statistics, use subqueries, rephrase joins, no direct
tool before CTEs which are rather new), it becomes an art form, and it
becomes very unpredictable and an administrative burden. Out of the
blue, statistics change, queries that worked fine start to perform
poorly, and sites go down.

If GEQO could detect unsafe plans and work around them automatically,
it would be a major improvement.

Granted, though, this should be approached with care.

Re: Shouldn't we have a way to avoid "risky" plans?

From: Vitalii Tymchyshyn
Date:
On 24.03.11 20:41, Merlin Moncure wrote:
> 2011/3/24 Віталій Тимчишин<tivv00@gmail.com>:
>>
>> This could be GUC-controllable, like plan_safety=0..1 with a low default
>> value. It could influence the costs of plans where the cost changes
>> dramatically with small table changes and/or the statistics are uncertain.
>> It could also be used as a direct "hint" for such dangerous queries by
>> changing the GUC for a session/single query.
> ISTM if you add statistics miss and 'risk margin' to the things the
> planner would have to consider while generating a plan, you are
> greatly increasing the number of plan paths that would have to be
> considered for any non trivial query.
Why so? I simply change the cost estimation functions. This won't change
the number of paths.

Best regards, Vitalii Tymchyshyn.

Re: Shouldn't we have a way to avoid "risky" plans?

From: Tom Lane
Date:
Vitalii Tymchyshyn <tivv00@gmail.com> writes:
> On 24.03.11 20:41, Merlin Moncure wrote:
>> ISTM if you add statistics miss and 'risk margin' to the things the
>> planner would have to consider while generating a plan, you are
>> greatly increasing the number of plan paths that would have to be
>> considered for any non trivial query.

> Why so? I simply change the cost estimation functions. This won't change
> the number of paths.

If you have multiple figures of merit, that means you have to keep more
paths, with consequent slowdown when it comes to choosing which path to
use at higher join levels.

As an example, we used to keep only the paths with best total cost.
When we started to optimize LIMIT, we had to keep around paths with best
startup cost too, in case that made for the best combined result at a
higher join level.  If you're going to consider "risk" while choosing
paths, that means you'll start keeping paths you would have discarded
before, while not necessarily getting rid of any other paths.  The only
way to avoid that would be to have a completely brain-dead notion of
risk that wasn't affected by how the path is used at a higher join
level, and I'm pretty sure that that wouldn't solve anybody's problem.

Any significant expansion of the planner's fundamental cost model *will*
make it slower.  By a lot.  Rather than going into this with fantasies
of "it won't cost anything", you should be worrying about how to keep
the speed penalty to factor-of-two rather than factor-of-ten.

            regards, tom lane

Re: Shouldn't we have a way to avoid "risky" plans?

From: Tom Lane
Date:
Josh Berkus <josh@agliodbs.com> writes:
>> If the planner starts operating on the basis of worst case rather than
>> expected-case performance, the complaints will be far more numerous than
>> they are today.

> Yeah, I don't think that's the way to go.  The other thought I had was
> to accumulate a "risk" stat the same as we accumulate a "cost" stat.

> However, I'm thinking that I'm overengineering what seems to be a fairly
> isolated problem, in that we might simply need to adjust the costing on
> this kind of a plan.

mergejoinscansel doesn't currently try to fix up the histogram bounds by
consulting indexes.  At the time I was afraid of the costs of doing
that, and I still am; but it would be a way to address this issue.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Branch: master Release: REL9_0_BR [40608e7f9] 2010-01-04 02:44:40 +0000

    When estimating the selectivity of an inequality "column > constant" or
    "column < constant", and the comparison value is in the first or last
    histogram bin or outside the histogram entirely, try to fetch the actual
    column min or max value using an index scan (if there is an index on the
    column).  If successful, replace the lower or upper histogram bound with
    that value before carrying on with the estimate.  This limits the
    estimation error caused by moving min/max values when the comparison
    value is close to the min or max.  Per a complaint from Josh Berkus.

    It is tempting to consider using this mechanism for mergejoinscansel as well,
    but that would inject index fetches into main-line join estimation not just
    endpoint cases.  I'm refraining from that until we can get a better handle
    on the costs of doing this type of lookup.


            regards, tom lane

Re: Shouldn't we have a way to avoid "risky" plans?

From: Vitalii Tymchyshyn
Date:
On 25.03.11 16:12, Tom Lane wrote:
> Vitalii Tymchyshyn<tivv00@gmail.com>  writes:
>
>> Why so? I simply change the cost estimation functions. This won't change
>> the number of paths.
> If you have multiple figures of merit, that means you have to keep more
> paths, with consequent slowdown when it comes to choosing which path to
> use at higher join levels.
>
> As an example, we used to keep only the paths with best total cost.
> When we started to optimize LIMIT, we had to keep around paths with best
> startup cost too, in case that made for the best combined result at a
> higher join level.  If you're going to consider "risk" while choosing
> paths, that means you'll start keeping paths you would have discarded
> before, while not necessarily getting rid of any other paths.  The only
> way to avoid that would be to have a completely brain-dead notion of
> risk that wasn't affected by how the path is used at a higher join
> level, and I'm pretty sure that that wouldn't solve anybody's problem.
>
> Any significant expansion of the planner's fundamental cost model *will*
> make it slower.  By a lot.  Rather than going into this with fantasies
> of "it won't cost anything", you should be worrying about how to keep
> the speed penalty to factor-of-two rather than factor-of-ten.
But I am not talking about a model change; it's more like a formula change.
Introducing LIMIT added one variable where the outer plan could influence
inner plan selection.
I am talking simply about the cost calculation for a given node. Now the cost
is based on the statistical expected value; the proposal is (something like)
to take the maximum cost over an n% probability range near the expected value.
This, of course, will make calculations slower, but won't add any degree
of freedom to the calculations.
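
In symbols (illustrative notation: \hat{s} is the expected selectivity and
I_n an interval around it covering n% of the probability mass):

    \mathrm{cost}_{\mathrm{node}} \;=\; \max_{s \in I_n} \mathrm{cost}(s),
    \qquad \Pr(s \in I_n) \ge n\%,\quad \hat{s} \in I_n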

Best regards, Vitalii Tymchyshyn



Re: Shouldn't we have a way to avoid "risky" plans?

From: Scott Carey
Date:

On 3/23/11 2:08 PM, "Claudio Freire" <klaussfreire@gmail.com> wrote:

>On Wed, Mar 23, 2011 at 6:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Claudio Freire <klaussfreire@gmail.com> writes:
>>> In my head, safer = better worst-case performance.
>>
>> If the planner starts operating on the basis of worst case rather than
>> expected-case performance, the complaints will be far more numerous than
>> they are today.
>
>I imagine so; that's why, if you put my comment in context, I was talking
>about picking a safer plan only when the "better on average" one fails
>miserably.

Postgres' assumption about what is 'better on average' is wrong in the
presence of nonlinear relationships between various statistics and
execution time anyway.

AVG(f(x)) != f(AVG(x))

In English, the fastest plan for the average (most likely) case is not
always the fastest plan on average.  It works very well for many cases,
but falls down in others.
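
A tiny worked illustration of that inequality (made-up numbers): take
f(x) = x^2 and let x be 1 or 3 with equal probability.

    f(AVG(x)) = f(2)      = 4
    AVG(f(x)) = (1 + 9)/2 = 5

When f is convex like this, planning for the average input systematically
underestimates the average runtime.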

Many of the 'why is this query slow' and 'I wish there were hints'
problems I see here that are not user error seem related to this.  The
approaches discussed by Nathan Boley and Claudio Freire in this thread
could significantly mitigate many of the issues I have seen when wrestling
with the planner.

Re: Shouldn't we have a way to avoid "risky" plans?

From: Nathan Boley
Date:
> mergejoinscansel doesn't currently try to fix up the histogram bounds by
> consulting indexes.  At the time I was afraid of the costs of doing
> that, and I still am; but it would be a way to address this issue.
>

Another cheaper but less accurate way to deal with this is to note
that we are trying to estimate the max of the population by using the
max of the sample, which obviously has a negative bias. If we could
correct the bias ( through the bootstrap, or an analytical correction
under some parametric assumptions ( ie, the distribution is uniform in
the last bucket ) ) , then we should get better estimates at the cost
of some analyze time. But this wouldn't even deal with Josh's
particular problem, since it's due to out of date stats rather than
sampling error...
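
For the uniform-in-the-last-bucket assumption the correction has a
familiar closed form (the standard unbiased estimator for the endpoint of
a uniform distribution): if the last bucket starts at a and holds n
sampled values with maximum x_(n), then

    \hat{\theta} \;=\; a + \frac{n+1}{n}\,\bigl(x_{(n)} - a\bigr)

i.e. the observed maximum is inflated slightly to offset its negative bias.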

-Nathan

Re: Shouldn't we have a way to avoid "risky" plans?

From: Joshua Berkus
Date:
> mergejoinscansel doesn't currently try to fix up the histogram bounds
> by
> consulting indexes. At the time I was afraid of the costs of doing
> that, and I still am; but it would be a way to address this issue.

Oh?  Hmmm.  I have a ready-made test case for the benefit case on this.   However, I'm not clear on how we would test
the costs.

But this type of query plan is clearly pathological, and is experienced by users as a performance regression since 8.3.
I now have the user doing analyzes of fairly large tables 2/hour to avoid the problem.  So I don't think we can leave
it alone.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco

Re: Shouldn't we have a way to avoid "risky" plans?

From: Robert Haas
Date:
On Thu, Mar 24, 2011 at 4:30 PM, Nathan Boley <npboley@gmail.com> wrote:
> Another approach, that hasn't been suggested yet, is some Bayesian
> update method. There, rather than calculating a specific parameter
> value ( like ndistinct ), you try to store the entire distribution and
> choose the plan that minimizes cost averaged over all of the possible
> parameter values.
>
> Example: ( please excuse the unrealistic numbers )
>
> For instance, rather than estimate the selectivity of the join (
> relation1.col1 = relation2.col1 ) to be 0.01, we would say it is 0.1
> w/ probability 0.2 and 0.001 with probability 0.8. So, here is how we
> would choose the plan now:
>
> cost( nestloop | selectivity = 0.01 ) = 1
> cost( hashjoin | selectivity = 0.01 ) = 2
> cost( mergejoin | selectivity = 0.01 ) = 50
>
> Here would be the bayesian approach:
>
> cost( nestloop | selectivity = 0.001 ) = 0.1
> cost( hashjoin | selectivity = 0.001 ) = 1
> cost( mergejoin | selectivity = 0.001 ) = 50
>
> cost( nestloop | selectivity = 0.1 ) = 10
> cost( hashjoin | selectivity = 0.1 ) = 3
> cost( mergejoin | selectivity = 0.1 ) = 50
>
> So, the bayesian costs are:
>
> nestloop: 0.1*0.8 + 10*0.2 = 2.08
> hashjoin: 1.0*0.8 + 3*0.2 = 1.4
> mergejoin: 50*0.8 + 50*0.2 = 50
>
> so the hashjoin would be chosen.
>
> For completeness, the minimax costs would be:
>
> nestloop: max( 0.1, 10 )
> hashjoin: max( 1, 3   )
> mergejoin: max( 50, 50 )
>
> So, again, the hashjoin is chosen.
>
> I obviously have a bias towards the Bayesian approach, but it's not
> because I expect it to necessarily perform a whole lot better but,
> rather, it reduces to the other two approaches. If we want the current
> behavior, then simply store the MLE selectivity with probability 1. If
> we want the minimax estimate, choose the worst possible value. Or
> anything in between.

This is a very elegant suggestion for this problem, and I really like
it.  It elegantly models the concept we're trying to capture here,
which is that we're sometimes just guessing how things really are, and
if it turns out that we're way off, we may be stuck in a
pathologically bad plan.

One limitation of this method is that it is difficult to apply more
than locally.  Consider:

SELECT * FROM foo, bar WHERE foo.id  = bar.id AND some_crazy_function(foo.id)

The best method of joining foo and bar is likely going to depend on
the percentage of rows in foo for which some_crazy_function(foo.id)
returns true, and we have no idea what that is.  We could represent
that by kicking out a range of probabilistic *cost* estimates for each
path over foo, but that adds a lot of code complexity.  And compute
time  - because (I think) now we've made it so that more paths have to
percolate all the way up through the join tree.

It's perhaps also worth looking at our old nemesis:

SELECT * FROM foo WHERE a = 1 ORDER BY b LIMIT 1

What I really want to do here is have the planner be able to reflect
the fact that an index scan over b may be very expensive or very cheap
depending on how lucky we get applying the a = 1 predicate, but I'm
not quite sure how to make that work.

It seems like the time when this would help the most without costing
too much or requiring excessively invasive surgery is the case where
the join selectivity itself is uncertain.  We can estimate that fairly
accurately as far as MCVs go, but after that it does get very murky.
Still, my gut feeling is that many (not all) of the worst problems
actually bubble up from under the join, rather than happening at that
level.

Thoughts?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Shouldn't we have a way to avoid "risky" plans?

From: Robert Haas
Date:
On Fri, Mar 25, 2011 at 10:24 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Josh Berkus <josh@agliodbs.com> writes:
>>> If the planner starts operating on the basis of worst case rather than
>>> expected-case performance, the complaints will be far more numerous than
>>> they are today.
>
>> Yeah, I don't think that's the way to go.  The other thought I had was
>> to accumulate a "risk" stat the same as we accumulate a "cost" stat.
>
>> However, I'm thinking that I'm overengineering what seems to be a fairly
>> isolated problem, in that we might simply need to adjust the costing on
>> this kind of a plan.
>
> mergejoinscansel doesn't currently try to fix up the histogram bounds by
> consulting indexes.  At the time I was afraid of the costs of doing
> that, and I still am; but it would be a way to address this issue.

Apparently, this is a pain point for the MySQL query planner - not so
much for merge joins, which I don't think are supported in any of the
major forks anyway - but the planner's desire to go estimate things by
probing the indexes.  IIRC, the MariaDB guys are looking into adding
persistent statistics to address this problem.  That doesn't
necessarily mean that we shouldn't do this, but it probably does mean
that we should be awfully careful about it.

Another thought is that we might want to consider reducing
autovacuum_analyze_scale_factor.  The root of the original problem
seems to be that the table had some data churn but not enough to cause
an ANALYZE.  Now, if the data churn is random, auto-analyzing after
10% churn might be reasonable, but a lot of data churn is non-random,
and ANALYZE is fairly cheap.  I'm just shooting in the dark here; I
might be all wet.  I think part of the problem is that the AV launcher
isn't very smart about looking at the overall picture.  It'd be nice,
for example, to be able to be more aggressive when the system is quiet
and to be a bit more careful when the system is saturated, but it's a
bit tricky to think about how to make that work, or exactly what the
heuristics should be.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Shouldn't we have a way to avoid "risky" plans?

From: Josh Berkus
Date:
On 4/19/11 7:29 AM, Robert Haas wrote:
> Another thought is that we might want to consider reducing
> autovacuum_analyze_scale_factor.  The root of the original problem
> seems to be that the table had some data churn but not enough to cause
> an ANALYZE.  Now, if the data churn is random, auto-analyzing after
> 10% churn might be reasonable, but a lot of data churn is non-random,
> and ANALYZE is fairly cheap.

I wouldn't reduce the defaults for PostgreSQL; this is something you do
on specific tables.

For example, on very large tables I've been known to set
analyze_scale_factor to 0 and analyze_threshold to 5000.
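
The per-table knobs referred to here are storage parameters (a sketch
with a hypothetical table name; these are the autovacuum_* reloptions):

    ALTER TABLE some_big_table SET (
        autovacuum_analyze_scale_factor = 0,
        autovacuum_analyze_threshold    = 5000
    );

With the scale factor at 0, autoanalyze fires once roughly 5000 rows have
changed, regardless of the table's size.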

And don't assume that analyzing is always cheap.  If you have an 800GB
table, most of which is very cold data, and have statistics set to 5000
for some columns, accessing many of the older blocks could take a while.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

Re: Shouldn't we have a way to avoid "risky" plans?

From: Robert Haas
Date:
On Apr 19, 2011, at 9:50 PM, Josh Berkus <josh@agliodbs.com> wrote:
> For example, on very large tables I've been known to set
> analyze_scale_factor to 0 and analyze_threshold to 5000.

Huh?  Why?

...Robert

Re: Shouldn't we have a way to avoid "risky" plans?

From: Jim Nasby
Date:
On Mar 24, 2011, at 5:23 PM, Claudio Freire wrote:
> I routinely have to work around query inefficiencies because GEQO does
> something odd - and since postgres gives me too few tools to tweak
> plans (increase statistics, use subqueries, rephrase joins, no direct
> tool before CTEs which are rather new), it becomes an art form, and it
> becomes very unpredictable and an administrative burden. Out of the
> blue, statistics change, queries that worked fine start to perform
> poorly, and sites go down.
>
> If GEQO could detect unsafe plans and work around them automatically,
> it would be a major improvement.

This isn't limited to GEQO queries either. Every few months we'll have what should be a very fast query suddenly become
far slower. Still on the order of seconds, but when you're running several of those a second and they normally take
fractions of a second, this kind of performance degradation can easily bring a server to its knees. Every time this has
happened, the solution has been to re-analyze a fairly large table; even with a default stats target of 1000 it's very easy
for one bad analyze to ruin your day.
--
Jim C. Nasby, Database Architect                   jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net