Обсуждение: right way of using case-expressions in plpgsql functions

Поиск
Список
Период
Сортировка

right way of using case-expressions in plpgsql functions

От
Victor Dobrovolsky
Дата:
I understood from documentation that case expression can be
1) optimized by planner
2) may compute its subexpressions in advance, in case of presence aggregation functions in them, for example.

The question is - how it is combined with generic prepared plans in pl/pgsql.
How can I deduct - when using case-expression is "safe" regarding query goals with parameters in pl/pgsql and when is not.
There are two cases - expression in select list and expression in where clause.

in where clause:
suppose I have a sql-function like this:

CREATE OR REPLACE FUNCTION nvl_in_where(pf anycompatible, px anycompatible, py anycompatible)
 RETURNS boolean
 LANGUAGE sql
 STABLE
AS $function$
select (case when px is not null then pf = px
        else pf is not distinct from py
end);
$function$
;


and then I use it in some pl/pgsql function:
 
 
CREATE OR REPLACE FUNCTION plsql_query_function(in pn numeric )
 RETURNS boolean
 LANGUAGE plpgsql
 STABLE
AS $function$
Declare
  sr record;  
Begin

   For sr in Select tbl.p1,tbl.p2
             From tbl
Where
  nvl_in_where(tbl.p1, pn, tbl.p1)
   Loop
     -- do some logic with sr ...
-- ...
     Null;
   end loop;
end;
$function$
;

If execute this query individually with fixed value of parameter $1, the query plan
would be like

Select tbl.p1,tbl.p2
From tbl
Where
  tbl.p1 = pn::numeric
;

or    

Select tbl.p1,tbl.p2
From tbl
Where
  tbl.p1 is not distinct from tbl.p1
;

depending if pn is null or  not.

The documentation states that after some executions of such functions the plan should become generic.
What is a generic plan for such a case and how would it work?
If it is safe to use function like nvl_in_where in where clause of queries in plpgsql function or,
 maybe, I should only use that query only with the execute statement in plpgsql?
 
 in select list:
 
 suppose, i need something like this :
 
 select case $1
        when '1'::numeric then tbl.p1
when '2'::numeric then tbl.p2
        ...
        end as cresult
 from tbl ...      

Can I be sure, that this expression would not be "optimised" in generic plan just to
select
     tbl.some_lucky_fied
from tbl
Can I use this type of expression in the select list regarding generic plans
or should I prefer dynamic execution  for such type queries in plpgsql?

Thanks in advance.

PS
The question looks like from a novice, and , indeed, I am.

 
 

Re: right way of using case-expressions in plpgsql functions

От
"David G. Johnston"
Дата:
On Sunday, October 15, 2023, Victor Dobrovolsky <booby.stager@gmail.com> wrote:

select (case when px is not null then pf = px
        else pf is not distinct from py
end);


Every single time this function is called “px is not null” will be evaluated and then one of the two branches will be evaluated.  Nothing the optimizer does will change that.  The planner for the function internals does not know whether px will or will not be null on any given invocation.

David J.

Re: right way of using case-expressions in plpgsql functions

От
Tom Lane
Дата:
"David G. Johnston" <david.g.johnston@gmail.com> writes:
> On Sunday, October 15, 2023, Victor Dobrovolsky <booby.stager@gmail.com>
> wrote:
>> select (case when px is not null then pf = px
>> else pf is not distinct from py
>> end);

> Every single time this function is called “px is not null” will be
> evaluated and then one of the two branches will be evaluated.  Nothing the
> optimizer does will change that.  The planner for the function internals
> does not know whether px will or will not be null on any given invocation.

Not necessarily --- I think the SQL-language function will get inlined
and then there would be opportunity for const-simplification if a
known value is available for px in the outer function.

At least in the px-not-null case, having "pf = px" rather than an
impenetrable CASE condition will probably be enough better for
optimization that the plancache would never choose to switch to a
generic plan.  However, that might not be true for the other case,
since we aren't terribly smart about optimizing NOT DISTINCT conditions.
So the performance you get might well vary depending on which case
occurs more often during the first few query runs.

On the whole though, the entire question seems like solving the wrong
problem.  If you have to resort to this kind of thing to get your
answers, it's past time to rethink your data representation.

            regards, tom lane



Re: right way of using case-expressions in plpgsql functions

От
Victor Dobrovolsky
Дата:
First of all, thanks everyone for the answers.

вс, 15 окт. 2023 г. в 20:08, Tom Lane <tgl@sss.pgh.pa.us>:
"David G. Johnston" <david.g.johnston@gmail.com> writes:
> On Sunday, October 15, 2023, Victor Dobrovolsky <booby.stager@gmail.com>
> wrote:
>> select (case when px is not null then pf = px
>> else pf is not distinct from py
>> end);

> Every single time this function is called “px is not null” will be
> evaluated and then one of the two branches will be evaluated.  Nothing the
> optimizer does will change that.  The planner for the function internals
> does not know whether px will or will not be null on any given invocation.

Not necessarily --- I think the SQL-language function will get inlined
and then there would be opportunity for const-simplification if a
known value is available for px in the outer function.

At least in the px-not-null case, having "pf = px" rather than an
impenetrable CASE condition will probably be enough better for
optimization that the plancache would never choose to switch to a
generic plan.  However, that might not be true for the other case,
since we aren't terribly smart about optimizing NOT DISTINCT conditions.
So the performance you get might well vary depending on which case
occurs more often during the first few query runs.

Ok. I am a "man coming from Oracle-sql" .
The first case is reminiscence of using oracle nvl "sql-function" in form of
Select * 
from tbl
Where tbl.somefield = nvl(:parameter, tbl.somefield)
;
In such a case Oracle will produce generic plan like this:
Filter :parameter is not null
  Select * from tbl
  Where tbl.somefield = :parameter -- index scan if possible
Union All
Filter :parameter is null
  Select * from tbl
  Where 1=1  -- table full scan guaranteed
;
Here aligned left Filter conditions assured, that only one branch of union all will be really taken in each particular execution of query.
 
Regarding the "pf is not distinct from py" condition - I saw that it was translated to "not (pf is distinct from py)"
 which is totally correct from a mathematical point of view.
But maybe it would be useful to have an independent translation of the statement that "pf is identical to pf",
to eliminate the condition totally at last.... 
(don't pay too much attention, this is a superficial newbie opinion, anyway)

In total - should I manually divide these cases in the plpgsql function if I like to avoid any prepared statements caveats, or should I use "execute"-statements, if I am lazy enough for that,
or, there is nothing to complain in terms of "generic plan"...

On the whole though, the entire question seems like solving the wrong
problem.  If you have to resort to this kind of thing to get your
answers, it's past time to rethink your data representation.
Definitely Yes.
But...
My goal is "to translate" some application "as fast as possible", using "as few structure transformations as possible".
From that - "short and dirty translation" - point of view - should I prefer to divide that 
$$
Select 
case $1
when '1'::numeric then tbl.p1
when '2'::numeric then tbl.p2
...
end as cresult
$$

expression into:
$$
...
if $1 = '1'::numeric then (query_1)
elsif when '2'::numeric then (query_2)
 ...
end if;
...
$$
If I do not want to use an execute statement for that?

Thank you.

Re: right way of using case-expressions in plpgsql functions

От
Ron
Дата:
On 10/15/23 11:19, Victor Dobrovolsky wrote:
[snip]
The documentation states that after some executions of such functions the plan should become generic.
What is a generic plan for such a case and how would it work?

It's highly dependent on the query

When I see this happen (after we notice that a procedure starts taking a long time), the query planner flips from a custom plan to a generic plan after about the fifth execution in a session of a function/procedure.

This will make it calculate the plan every time:
set plan_cache_mode = force_custom_plan;

--
Born in Arizona, moved to Babylonia.

Re: right way of using case-expressions in plpgsql functions

От
Tom Lane
Дата:
Victor Dobrovolsky <booby.stager@gmail.com> writes:
> From that - "short and dirty translation" - point of view - should I prefer
> to divide that
> $$
> Select
> case $1
> when '1'::numeric then tbl.p1
> when '2'::numeric then tbl.p2
> ...
> end as cresult
> $$

> expression into:
> $$
> ...
> if $1 = '1'::numeric then (query_1)
> elsif when '2'::numeric then (query_2)
>  ...
> end if;
> ...
> $$

Yeah, that would probably be a preferable approach if you're hoping
for significantly different query plans for the two cases.  PG does
not have the sort of run-time plan choice mechanism that you're
describing for Oracle.

            regards, tom lane



Re: right way of using case-expressions in plpgsql functions

От
Victor Dobrovolsky
Дата:
>Yeah, that would probably be a preferable approach if you're hoping
>for significantly different query plans for the two cases.

Thank you. 
My goal is to find out some basic rules that could help me to navigate issues like this.
Regarding generalized plans in general, and the use of the case-expressions in particular.


пн, 16 окт. 2023 г. в 00:15, Tom Lane <tgl@sss.pgh.pa.us>:
Victor Dobrovolsky <booby.stager@gmail.com> writes:
> From that - "short and dirty translation" - point of view - should I prefer
> to divide that
> $$
> Select
> case $1
> when '1'::numeric then tbl.p1
> when '2'::numeric then tbl.p2
> ...
> end as cresult
> $$

> expression into:
> $$
> ...
> if $1 = '1'::numeric then (query_1)
> elsif when '2'::numeric then (query_2)
>  ...
> end if;
> ...
> $$

Yeah, that would probably be a preferable approach if you're hoping
for significantly different query plans for the two cases.  PG does
not have the sort of run-time plan choice mechanism that you're
describing for Oracle.

                        regards, tom lane

Re: right way of using case-expressions in plpgsql functions

От
Victor Dobrovolsky
Дата:
Thank you. I'll take it.

пн, 16 окт. 2023 г. в 00:20, Ron <ronljohnsonjr@gmail.com>:
On 10/15/23 11:19, Victor Dobrovolsky wrote:
[snip]
The documentation states that after some executions of such functions the plan should become generic.
What is a generic plan for such a case and how would it work?

It's highly dependent on the query

When I see this happen (after we notice that a procedure starts taking a long time), the query planner flips from a custom plan to a generic plan after about the fifth execution in a session of a function/procedure.

This will make it calculate the plan every time:
set plan_cache_mode = force_custom_plan;

--
Born in Arizona, moved to Babylonia.