Обсуждение: Re: [PERFORM] Hints proposal

Поиск
Список
Период
Сортировка

Re: [PERFORM] Hints proposal

От
Tom Lane
Дата:
[ trying once again to push this thread over to -hackers where it belongs ]

Arjen van der Meijden <acmmailing@tweakers.net> writes:
> On 12-10-2006 21:07 Jeff Davis wrote:
>> On Thu, 2006-10-12 at 19:15 +0200, Csaba Nagy wrote:
>> To formalize the proposal a litte, you could have syntax like:
>> CREATE HINT [FOR USER username] MATCHES regex APPLY HINT some_hint;
>>
>> Where "some_hint" would be a hinting language perhaps like Jim's, except
>> not guaranteed to be compatible between versions of PostgreSQL. The
>> developers could change the hinting language at every release and people
>> can just re-write the hints without changing their application.

Do you have any idea how much push-back there would be to that?  In
practice we'd be bound by backwards-compatibility concerns for the hints
too.

> There are some disadvantages of not writing the hints in a query. But of
> course there are disadvantages to do as well ;)

> One I can think of is that it can be very hard to define which hint
> should apply where. Especially in complex queries, defining at which
> point exaclty you'd like your hint to work is not a simple matter,
> unless you can just place a comment right at that position.

The problems that you are seeing all come from the insistence that a
hint should be textually associated with a query.  Using a regex is a
little better than putting it right into the query, but the only thing
that really fixes is not having the hints directly embedded into
client-side code.  It's still wrong at the conceptual level.

The right way to think about it is to ask why is the planner not picking
the right plan to start with --- is it missing a statistical
correlation, or are its cost parameters wrong for a specific case, or
is it perhaps unable to generate the desired plan at all?  (If the
latter, no amount of hinting is going to help.)  If it's a statistics or
costing problem, I think the right thing is to try to fix it with hints
at that level.  You're much more likely to fix the behavior across a
class of queries than you will be with a hint textually matched to a
specific query.

            regards, tom lane

Re: [PERFORM] Hints proposal

От
Jeff Davis
Дата:
On Thu, 2006-10-12 at 17:28 -0400, Tom Lane wrote:
> [ trying once again to push this thread over to -hackers where it belongs ]
>
> Arjen van der Meijden <acmmailing@tweakers.net> writes:
> > On 12-10-2006 21:07 Jeff Davis wrote:
> >> On Thu, 2006-10-12 at 19:15 +0200, Csaba Nagy wrote:
> >> To formalize the proposal a litte, you could have syntax like:
> >> CREATE HINT [FOR USER username] MATCHES regex APPLY HINT some_hint;
> >>
> >> Where "some_hint" would be a hinting language perhaps like Jim's, except
> >> not guaranteed to be compatible between versions of PostgreSQL. The
> >> developers could change the hinting language at every release and people
> >> can just re-write the hints without changing their application.
>
> Do you have any idea how much push-back there would be to that?  In
> practice we'd be bound by backwards-compatibility concerns for the hints
> too.
>

No, I don't have any idea, except that it would be less push-back than
changing a language that's embedded in client code. Also, I see no
reason to think that a hint would not be obsolete upon a new release
anyway.

> The problems that you are seeing all come from the insistence that a
> hint should be textually associated with a query.  Using a regex is a
> little better than putting it right into the query, but the only thing

"Little better" is all I was going for. I was just making the
observation that we can separate two concepts:
(1) Embedding code in the client's queries, which I see as very
undesirable and unnecessary
(2) Providing very specific hints

which at least gives us a place to talk about the debate more
reasonably.

> that really fixes is not having the hints directly embedded into
> client-side code.  It's still wrong at the conceptual level.
>

I won't disagree with that. I will just say it's no more wrong than
applying the same concept in addition to embedding the hints in client
queries.

> The right way to think about it is to ask why is the planner not picking
> the right plan to start with --- is it missing a statistical
> correlation, or are its cost parameters wrong for a specific case, or
> is it perhaps unable to generate the desired plan at all?  (If the
> latter, no amount of hinting is going to help.)  If it's a statistics or
> costing problem, I think the right thing is to try to fix it with hints
> at that level.  You're much more likely to fix the behavior across a
> class of queries than you will be with a hint textually matched to a
> specific query.
>

Agreed.

Regards,
    Jeff Davis


Re: [PERFORM] Hints proposal

От
Christopher Browne
Дата:
Quoth pgsql@j-davis.com (Jeff Davis):
> On Thu, 2006-10-12 at 17:28 -0400, Tom Lane wrote:
>> [ trying once again to push this thread over to -hackers where it belongs ]
>> 
>> Arjen van der Meijden <acmmailing@tweakers.net> writes:
>> > On 12-10-2006 21:07 Jeff Davis wrote:
>> >> On Thu, 2006-10-12 at 19:15 +0200, Csaba Nagy wrote:
>> >> To formalize the proposal a litte, you could have syntax like:
>> >> CREATE HINT [FOR USER username] MATCHES regex APPLY HINT some_hint;
>> >> 
>> >> Where "some_hint" would be a hinting language perhaps like
>> >> Jim's, except not guaranteed to be compatible between versions
>> >> of PostgreSQL. The developers could change the hinting language
>> >> at every release and people can just re-write the hints without
>> >> changing their application.
>> 
>> Do you have any idea how much push-back there would be to that?  In
>> practice we'd be bound by backwards-compatibility concerns for the
>> hints too.
>
> No, I don't have any idea, except that it would be less push-back
> than changing a language that's embedded in client code. Also, I see
> no reason to think that a hint would not be obsolete upon a new
> release anyway.

I see *plenty* of reason.

1.  Suppose the scenario where Hint h was useful hasn't been affected   by *any* changes in how the query planner works
inthe new   version, it *obviously* continues to be necessary.
 

2.  If Version n+0.1 hasn't resolved all/most cases where Hint h was   useful in Version n, then people will entirely
reasonablyexpect   for Hint h to continue to be in effect in version n+0.1
 

3.  Suppose support for Hint h is introduced in PostgreSQL version   n, and an optimization that makes it obsolete does
notarrive   until version n+0.3, which is quite possible.  That hint has been   carried forward for 2 versions already,
longenough for client   code that contains it to start to ossify.  (After all, if   developers get promoted to new
projectsevery couple of years,   two versions is plenty of time for the original programmer to    be gone...)
 

That's not just one good reason, but three.

>> The problems that you are seeing all come from the insistence that a
>> hint should be textually associated with a query.  Using a regex is a
>> little better than putting it right into the query, but the only thing
>
> "Little better" is all I was going for. I was just making the
> observation that we can separate two concepts:
> (1) Embedding code in the client's queries, which I see as very
> undesirable and unnecessary
> (2) Providing very specific hints
>
> which at least gives us a place to talk about the debate more
> reasonably.

It seems to me that there is a *LOT* of merit in trying to find
alternatives to embedding code into client queries, to be sure.

>> that really fixes is not having the hints directly embedded into
>> client-side code.  It's still wrong at the conceptual level.
>
> I won't disagree with that. I will just say it's no more wrong than
> applying the same concept in addition to embedding the hints in client
> queries.
>
>> The right way to think about it is to ask why is the planner not
>> picking the right plan to start with --- is it missing a
>> statistical correlation, or are its cost parameters wrong for a
>> specific case, or is it perhaps unable to generate the desired plan
>> at all?  (If the latter, no amount of hinting is going to help.)
>> If it's a statistics or costing problem, I think the right thing is
>> to try to fix it with hints at that level.  You're much more likely
>> to fix the behavior across a class of queries than you will be with
>> a hint textually matched to a specific query.
>
> Agreed.

That's definitely a useful way to look at the issue, which seems to be
lacking in many of the cries for hints.

Perhaps I'm being unfair, but it often seems that people demanding
hinting systems are uninterested in why the planner is getting things
wrong.  Yes, they have an immediate problem (namely the wrong plan
that is getting generated) that they want to resolve.

But I'm not sure that you can get anything out of hinting without
coming close to answering "why the planner got it wrong."
-- 
"cbbrowne","@","gmail.com"
http://linuxfinances.info/info/lsf.html
"Optimization hinders evolution."  -- Alan Perlis


Re: [PERFORM] Hints proposal

От
Jeff Davis
Дата:
On Thu, 2006-10-12 at 23:12 -0400, Christopher Browne wrote:
> > No, I don't have any idea, except that it would be less push-back
> > than changing a language that's embedded in client code. Also, I see
> > no reason to think that a hint would not be obsolete upon a new
> > release anyway.
> 
> I see *plenty* of reason.
> 
> 1.  Suppose the scenario where Hint h was useful hasn't been affected
>     by *any* changes in how the query planner works in the new
>     version, it *obviously* continues to be necessary.
> 
> 2.  If Version n+0.1 hasn't resolved all/most cases where Hint h was
>     useful in Version n, then people will entirely reasonably expect
>     for Hint h to continue to be in effect in version n+0.1
> 

Fair enough. I had considered those situations, but a lot of people are
talking about "I need a better plan now, can't wait for planner
improvements". Also, even if the hint is still useful, I would think
that on a new version you'd want to test to see how useful it still is.

> 3.  Suppose support for Hint h is introduced in PostgreSQL version
>     n, and an optimization that makes it obsolete does not arrive
>     until version n+0.3, which is quite possible.  That hint has been
>     carried forward for 2 versions already, long enough for client
>     code that contains it to start to ossify.  (After all, if
>     developers get promoted to new projects every couple of years,
>     two versions is plenty of time for the original programmer to 
>     be gone...)

Ok, that is a good reason. But it's not helped at all by putting the
hints in the queries themselves.

> > "Little better" is all I was going for. I was just making the
> > observation that we can separate two concepts:
> > (1) Embedding code in the client's queries, which I see as very
> > undesirable and unnecessary
> > (2) Providing very specific hints
> >
> > which at least gives us a place to talk about the debate more
> > reasonably.
> 
> It seems to me that there is a *LOT* of merit in trying to find
> alternatives to embedding code into client queries, to be sure.
> 

I think almost any alternative to client query hints is worth
considering.

> >> The right way to think about it is to ask why is the planner not
> >> picking the right plan to start with --- is it missing a
> >> statistical correlation, or are its cost parameters wrong for a
> >> specific case, or is it perhaps unable to generate the desired plan
> >> at all?  (If the latter, no amount of hinting is going to help.)
> >> If it's a statistics or costing problem, I think the right thing is
> >> to try to fix it with hints at that level.  You're much more likely
> >> to fix the behavior across a class of queries than you will be with
> >> a hint textually matched to a specific query.
> >
> > Agreed.
> 
> That's definitely a useful way to look at the issue, which seems to be
> lacking in many of the cries for hints.
> 
> Perhaps I'm being unfair, but it often seems that people demanding
> hinting systems are uninterested in why the planner is getting things
> wrong.  Yes, they have an immediate problem (namely the wrong plan
> that is getting generated) that they want to resolve.
> 
> But I'm not sure that you can get anything out of hinting without
> coming close to answering "why the planner got it wrong."

Right. And it's not always easy to determine why the planner got it
wrong without making it execute other plans through hinting :)

Note: I'll restate this just to be clear. I'm not advocating an overly-
specific, band-aid style hinting language. My only real concern is that
if one appears, I would not like it to appear in the client's queries. 

Same goes for more general kinds of hints. We don't want a bunch of
client queries to contain comments like "table foo has a
random_page_cost of 1.1". That belongs in the system catalogs.

Regards,Jeff Davis