Обсуждение: Re: [PERFORM] optimizing immutable vs. stable function calls?

Поиск
Список
Период
Сортировка

Re: [PERFORM] optimizing immutable vs. stable function calls?

От
Tom Lane
Дата:
Karl Czajkowski <karlcz@isi.edu> writes:
> The query planner does not seem to
> recognize that it can eliminate redundant calls to a STABLE function.

No, it doesn't.

> In my case, the function call does not take any arguments and is thus
> trivially independent of row data, and appears in a WHERE clause being
> compared to constants. Why wouldn't the optimizer treat this case the
> same as IMMUTABLE?

"The same as IMMUTABLE" would be to reduce the function to a constant at
plan time, which would be the wrong thing.  It would be valid to execute
it only once at query start, but there's no built-in mechanism for that.

But you could force it by putting it in a sub-SELECT, that is if you
don't like the performance of

      SELECT ... slow_stable_function() ...

try this:

      SELECT ... (SELECT slow_stable_function()) ...

That works because it's an uncorrelated sub-query, which gets evaluated
just once per run.  But the overhead associated with that mechanism is
high enough that forcing it automatically for every stable function would
be a loser.  I'd recommend doing it only where it *really* matters.

            regards, tom lane


Re: [PERFORM] optimizing immutable vs. stable function calls?

От
Karl Czajkowski
Дата:
On Jan 18, Tom Lane modulated:
> "The same as IMMUTABLE" would be to reduce the function to a constant at
> plan time, which would be the wrong thing.  It would be valid to execute
> it only once at query start, but there's no built-in mechanism for that.
>

That's what I was afraid of... right, we'd like for it to partially
evaluate these and then finish the plan.

Is there a correctness hazard with pretending our function is
IMMUTABLE, even though we will change the underlying config parameter
in the same connection?  It would be very bad if we changed our
parameter to reflect a different web client identity context, but then
somehow got cached plans based on the previous setting...

> But you could force it by putting it in a sub-SELECT, that is if you
> don't like the performance of
>
>       SELECT ... slow_stable_function() ...
>
> try this:
>
>       SELECT ... (SELECT slow_stable_function()) ...
>

Ha, you might recall a while back I was having problems with
round-tripping our RLS policies because I had tried such sub-selects
which return arrays and the internal format lost the casts needed to
get the correct parse when reloading a dump... :-)


karl



Re: [PERFORM] optimizing immutable vs. stable function calls?

От
"David G. Johnston"
Дата:
On Wed, Jan 18, 2017 at 3:54 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Karl Czajkowski <karlcz@isi.edu> writes:
> The query planner does not seem to
> recognize that it can eliminate redundant calls to a STABLE function.

No, it doesn't.

> In my case, the function call does not take any arguments and is thus
> trivially independent of row data, and appears in a WHERE clause being
> compared to constants. Why wouldn't the optimizer treat this case the
> same as IMMUTABLE?

"The same as IMMUTABLE" would be to reduce the function to a constant at
plan time, which would be the wrong thing.  It would be valid to execute
it only once at query start, but there's no built-in mechanism for that.

​I'm feeling a bit dense here but even after having read a number of these kinds of interchanges I still can't get it to stick.  I think part of the problem is this sentence from the docs:


(Stable): "​This category allows the optimizer to optimize multiple calls of the function to a single call"

I read that sentence (and the surrounding paragraph) and wonder why then doesn't it do so in this case.

If PostgreSQL cannot execute it only once at query start then all this talk about optimization seems misleading.  At worse there should be an sentence explaining when the optimizations noted in that paragraph cannot occur - and probably examples of both as well since its not clear when it can occur.

Some TLC to the docs here would be welcomed.

David J.

Re: [PERFORM] optimizing immutable vs. stable function calls?

От
Tom Lane
Дата:
"David G. Johnston" <david.g.johnston@gmail.com> writes:
> ​I'm feeling a bit dense here but even after having read a number of these
> kinds of interchanges I still can't get it to stick.  I think part of the
> problem is this sentence from the docs:
> https://www.postgresql.org/docs/current/static/xfunc-volatility.html

> (Stable): "​This category allows the optimizer to optimize multiple calls
> of the function to a single call"

> I read that sentence (and the surrounding paragraph) and wonder why then
> doesn't it do so in this case.

It says "allows", it doesn't say "requires".

The reason we have this category is that without it, it would be formally
invalid to optimize an expression involving a non-immutable function into
an index comparison value, because in that context the function is indeed
only evaluated once (before the comparison value is fed into the index
machinery).  But there isn't a mechanism for that behavior outside of
index scans.

> If PostgreSQL cannot execute it only once at query start then all this talk
> about optimization seems misleading.  At worse there should be an sentence
> explaining when the optimizations noted in that paragraph cannot occur -
> and probably examples of both as well since its not clear when it can occur.

If you want an exact definition of when things will happen or not happen,
start reading the source code.  I'm loath to document small optimizer
details since they change all the time.

            regards, tom lane


Re: [PERFORM] optimizing immutable vs. stable function calls?

От
Tom Lane
Дата:
Karl Czajkowski <karlcz@isi.edu> writes:
> Is there a correctness hazard with pretending our function is
> IMMUTABLE, even though we will change the underlying config parameter
> in the same connection?

You could probably get away with that if you never ever use prepared
queries (beware that almost anything in plpgsql is a prepared query).
It's a trick that's likely to bite you eventually though.

            regards, tom lane


Re: [PERFORM] optimizing immutable vs. stable function calls?

От
"David G. Johnston"
Дата:
On Wed, Jan 18, 2017 at 4:23 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
"David G. Johnston" <david.g.johnston@gmail.com> writes:
> ​I'm feeling a bit dense here but even after having read a number of these
> kinds of interchanges I still can't get it to stick.  I think part of the
> problem is this sentence from the docs:
> https://www.postgresql.org/docs/current/static/xfunc-volatility.html

> (Stable): "​This category allows the optimizer to optimize multiple calls
> of the function to a single call" 
 
> If PostgreSQL cannot execute it only once at query start then all this talk
> about optimization seems misleading.  At worse there should be an sentence
> explaining when the optimizations noted in that paragraph cannot occur -
> and probably examples of both as well since its not clear when it can occur.

If you want an exact definition of when things will happen or not happen,
start reading the source code.  I'm loath to document small optimizer
details since they change all the time.

​That would not be a productive exercise for me, or most people who just want
some idea of what to expect in terms of behavior when they write and use a 
Stable function (Immutable and Volatile seem fairly easy to reason about).

Is there anything fatally wrong with the following comprehension?

"""
A STABLE function cannot modify the database and is guaranteed to 
return the same results given the same arguments for all rows 
within a single statement.

This category allows the optimizer to take an expression of the form
(indexed_column = stable_function(...)) and evaluate stable_function(...)
once at the beginning of the query and use the result to scan 
the index. (Since an index scan will evaluate the comparison 
value only once, not once at each row, it is not valid to use a VOLATILE
 function in an index scan condition).  ?Note that should an index scan not be
chosen for the plan the function will be invoked once-per-row?

Expressions of the forms (constant = stable_function()), 
and (SELECT stable_function() FROM generate_series(1,5)) are not presently 
optimized to a single per-query evaluation.  To obtain the equivalent you 
can invoke the function in a sub-query or CTE and reference the result 
wherever it is needed.
"""

It probably isn't perfect but if the average user isn't going to benefit from
anything besides "index_column = function()" with an index plan then the
false hope that is being held due to the use of "allows + in particular" 
should probably be dispelled.

Thanks!

David J.

Re: [PERFORM] optimizing immutable vs. stable function calls?

От
Karl Czajkowski
Дата:
On Jan 18, Tom Lane modulated:
> Karl Czajkowski <karlcz@isi.edu> writes:
> > Is there a correctness hazard with pretending our function is
> > IMMUTABLE, even though we will change the underlying config parameter
> > in the same connection?
>
> You could probably get away with that if you never ever use prepared
> queries (beware that almost anything in plpgsql is a prepared query).
> It's a trick that's likely to bite you eventually though.
>

That sounds unnerving. I think I need to play it safe. :-/

Does the plan cache disappear with each connection/backend process?
Or is there also a risk of plans being shared between backends?

Would it be invasive or a small hack to have something like
"transaction-immutable" which can be precomputed during planning, like
immutable, but then must discard those plans at the end of the
transaction...?


karl



Re: [PERFORM] optimizing immutable vs. stable function calls?

От
Jim Nasby
Дата:
On 1/18/17 6:09 PM, David G. Johnston wrote:
> ​That would not be a productive exercise for me, or most people who just
> want
> some idea of what to expect in terms of behavior when they write and use a
> Stable function (Immutable and Volatile seem fairly easy to reason about).

Yeah, this isn't an uncommon question for users to have, and "read the
code" isn't a great answer.

If there's a README or comment block that describes this, that might be
a reasonable compromise.

It would certainly be useful to document how to push the planner in the
right direction as well. I didn't realize that SELECT ... (SELECT
slow_stable_function()) was a thing until reading this thread.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)


Re: [PERFORM] optimizing immutable vs. stable function calls?

От
Merlin Moncure
Дата:
On Mon, Jan 23, 2017 at 9:10 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> On 1/18/17 6:09 PM, David G. Johnston wrote:
>>
>> That would not be a productive exercise for me, or most people who just
>> want
>> some idea of what to expect in terms of behavior when they write and use a
>> Stable function (Immutable and Volatile seem fairly easy to reason about).
>
>
> Yeah, this isn't an uncommon question for users to have, and "read the code"
> isn't a great answer.
>
> If there's a README or comment block that describes this, that might be a
> reasonable compromise.
>
> It would certainly be useful to document how to push the planner in the
> right direction as well. I didn't realize that SELECT ... (SELECT
> slow_stable_function()) was a thing until reading this thread.

Totally agree.

There are other odd cases as well, mostly relating to SQL inlining
(for example, marking a function IMMUTABLE can cause it to fall out of
inlining you would get by giving no designation).  If you documented
all the rules, I think you'd find the rules are a bit odd and could be
simplified.

merlin