Обсуждение: Strange performance degregation in sql function (PG11.1)
Hi all,
I recently experienced a performance degradation in an operational system that I can't explain.  I had a function wrapper for a aggregate query that was performing well using the expected indexes with the approximate structure as shown below.
create or replace function example_function(param1 int, param2 int) returns setof custom_type as
$$
    select * from big_table where col1 = param1 and col2 = param2;
$$ language sql;
After creating two new indexes on this table to support a different use case during a migration, this unchanged function reduced in performance by several orders of magnitude.  Running the query inside the function manually on the console however worked as expected and the query plan did not appear to have changed.  On a hunch I changed the structure of the function to the structure below and immediately the query performance returned to the expected baseline.
create or replace function example_function(param1 int, param2 int) returns setof custom_type as
$$
BEGIN
    return query execute format($query$
        select * from big_table where col1 = %1$L and col2 = %1$
    $query$,param1,param2);
END;
$$ language plpgsql;
The source data itself did not change during the time when I noticed this issue.  Can anyone explain or guess what could have caused this degradation?  The only other maintenance that I attempted was 'DISCARD PLANS;' which did not help.
Best regards,
On 5/18/19 4:17 AM, Alastair McKinley wrote: > Hi all, > > I recently experienced a performance degradation in an operational > system that I can't explain. I had a function wrapper for a aggregate > query that was performing well using the expected indexes with the > approximate structure as shown below. > > create or replace function example_function(param1 int, param2 int) > returns setof custom_type as > $$ > select * from big_table where col1 = param1 and col2 = param2; > $$ language sql; > > After creating two new indexes on this table to support a different use > case during a migration, this unchanged function reduced in performance Postgres version? Was the migration from one Postgres version to another? Did you run ANALYZE after migration? More below. > by several orders of magnitude. Running the query inside the function > manually on the console however worked as expected and the query plan > did not appear to have changed. On a hunch I changed the structure of > the function to the structure below and immediately the query > performance returned to the expected baseline. Can you provide the EXPLAIN ANALYZE for each case. If you are worried about the information revealed maybe use the anonymization available here: https://explain.depesz.com/ Using EXECUTE will override the plan caching in plpgsql. > > create or replace function example_function(param1 int, param2 int) > returns setof custom_type as > $$ > BEGIN > return query execute format($query$ > select * from big_table where col1 = %1$L and col2 = %1$ > $query$,param1,param2); > END; > $$ language plpgsql; > > The source data itself did not change during the time when I noticed > this issue. Can anyone explain or guess what could have caused this > degradation? The only other maintenance that I attempted was 'DISCARD > PLANS;' which did not help. > > Best regards, > > Alastair -- Adrian Klaver adrian.klaver@aklaver.com
>>>>> "Alastair" == Alastair McKinley <a.mckinley@analyticsengines.com> writes: Alastair> Hi all, Alastair> I recently experienced a performance degradation in an Alastair> operational system that I can't explain. I had a function Alastair> wrapper for a aggregate query that was performing well using Alastair> the expected indexes with the approximate structure as shown Alastair> below. Alastair> create or replace function example_function(param1 int, param2 int) returns setof custom_type as Alastair> $$ Alastair> select * from big_table where col1 = param1 and col2 = param2; Alastair> $$ language sql; This function isn't inlinable due to missing a STABLE qualifier; that's a pretty big issue. Without inlining, the function will be run only with generic plans, which means that the decision about index usage will be made without knowledge of the parameter values. Was your actual function inlinable? See https://wiki.postgresql.org/wiki/Inlining_of_SQL_functions You can get the query plan of a non-inlined function using the auto_explain module (with its log_nested_statements option). The query plan of non-inlined function calls is not otherwise shown by EXPLAIN. Alastair> After creating two new indexes on this table to support a Alastair> different use case during a migration, this unchanged Alastair> function reduced in performance by several orders of Alastair> magnitude. Running the query inside the function manually on Alastair> the console however worked as expected and the query plan did Alastair> not appear to have changed. But when you run it manually, you'll get a custom plan, based on the parameter values. Alastair> On a hunch I changed the structure of the function to the Alastair> structure below and immediately the query performance Alastair> returned to the expected baseline. Alastair> create or replace function example_function(param1 int, param2 int) returns setof custom_type as Alastair> $$ Alastair> BEGIN Alastair> return query execute format($query$ Alastair> select * from big_table where col1 = %1$L and col2 = %1$ Alastair> $query$,param1,param2); Alastair> END; Alastair> $$ language plpgsql; Using EXECUTE in plpgsql will get you a custom plan every time (though you really should have used EXECUTE USING rather than interpolating the parameters into the query string). I suggest looking into the inlining question first. -- Andrew (irc:RhodiumToad)
Hi Andrew,
Thanks for your in-depth response.  I found that adding the stable qualifier didn't solve the issue unfortunately.  I actually encountered the same issue (or at least extremely similar) today and made a bit more progress on defining it.  
I have a carefully indexed and optimised query that runs in about 2 seconds with this structure and signature.
    create or replace function example_function(param1 int, param2 int) returns setof custom_type as
    $$
        with a_few_ctes ()
        select * from big_table where col1 = param1 and col2 = param2;
    $$ language sql stable;
This function works as expected when executed from a psql client.  I am calling this function via another framework (Postgrest) that executes the function using the following pattern:
    with args as (
        select json_to_record($1) as (param1 int,param2 int)
    ),
    output as (
        select * 
        from example_function(
            param1 := (select param1 from args),
            param2 := (select param2 from args)
        )
    )
    select * from output;
Running this query with the args coming from the CTE resulted in my query running for many tens of minutes before I gave up.  Changing the underlying function to use plpgsql fixes the issue and the query runs as expected.  Both versions work as expected when called directly, but the SQL version does not when called with the args coming from the CTE as shown above.
The new function signature is
    create or replace function example_function(param1 int, param2 int) returns setof custom_type as
    $$
        with a_few_ctes ()
        return query select * from big_table where col1 = param1 and col2 = param2;
    $$ language plpgsql stable;
I haven't been able to check the bad query plan with auto_explain as the query doesn't seem to finish.
So to summarise, changing a stable SQL function to a stable plpgsql function when called with function args from a CTE fixes a huge performance issue of uncertain origin.  I hope someone can offer some suggestions as this has really confused me.
Best regards,
Alastair
From: Andrew Gierth <andrew@tao11.riddles.org.uk>
Sent: 19 May 2019 03:48
To: Alastair McKinley
Cc: pgsql-general@lists.postgresql.org
Subject: Re: Strange performance degregation in sql function (PG11.1) 
Sent: 19 May 2019 03:48
To: Alastair McKinley
Cc: pgsql-general@lists.postgresql.org
Subject: Re: Strange performance degregation in sql function (PG11.1)
>>>>> "Alastair" == Alastair McKinley <a.mckinley@analyticsengines.com> writes:
Alastair> Hi all,
Alastair> I recently experienced a performance degradation in an
Alastair> operational system that I can't explain. I had a function
Alastair> wrapper for a aggregate query that was performing well using
Alastair> the expected indexes with the approximate structure as shown
Alastair> below.
Alastair> create or replace function example_function(param1 int, param2 int) returns setof custom_type as
Alastair> $$
Alastair> select * from big_table where col1 = param1 and col2 = param2;
Alastair> $$ language sql;
This function isn't inlinable due to missing a STABLE qualifier; that's
a pretty big issue.
Without inlining, the function will be run only with generic plans,
which means that the decision about index usage will be made without
knowledge of the parameter values.
Was your actual function inlinable? See
https://wiki.postgresql.org/wiki/Inlining_of_SQL_functions
You can get the query plan of a non-inlined function using the
auto_explain module (with its log_nested_statements option). The query
plan of non-inlined function calls is not otherwise shown by EXPLAIN.
Alastair> After creating two new indexes on this table to support a
Alastair> different use case during a migration, this unchanged
Alastair> function reduced in performance by several orders of
Alastair> magnitude. Running the query inside the function manually on
Alastair> the console however worked as expected and the query plan did
Alastair> not appear to have changed.
But when you run it manually, you'll get a custom plan, based on the
parameter values.
Alastair> On a hunch I changed the structure of the function to the
Alastair> structure below and immediately the query performance
Alastair> returned to the expected baseline.
Alastair> create or replace function example_function(param1 int, param2 int) returns setof custom_type as
Alastair> $$
Alastair> BEGIN
Alastair> return query execute format($query$
Alastair> select * from big_table where col1 = %1$L and col2 = %1$
Alastair> $query$,param1,param2);
Alastair> END;
Alastair> $$ language plpgsql;
Using EXECUTE in plpgsql will get you a custom plan every time (though
you really should have used EXECUTE USING rather than interpolating the
parameters into the query string).
I suggest looking into the inlining question first.
--
Andrew (irc:RhodiumToad)
Alastair> Hi all,
Alastair> I recently experienced a performance degradation in an
Alastair> operational system that I can't explain. I had a function
Alastair> wrapper for a aggregate query that was performing well using
Alastair> the expected indexes with the approximate structure as shown
Alastair> below.
Alastair> create or replace function example_function(param1 int, param2 int) returns setof custom_type as
Alastair> $$
Alastair> select * from big_table where col1 = param1 and col2 = param2;
Alastair> $$ language sql;
This function isn't inlinable due to missing a STABLE qualifier; that's
a pretty big issue.
Without inlining, the function will be run only with generic plans,
which means that the decision about index usage will be made without
knowledge of the parameter values.
Was your actual function inlinable? See
https://wiki.postgresql.org/wiki/Inlining_of_SQL_functions
You can get the query plan of a non-inlined function using the
auto_explain module (with its log_nested_statements option). The query
plan of non-inlined function calls is not otherwise shown by EXPLAIN.
Alastair> After creating two new indexes on this table to support a
Alastair> different use case during a migration, this unchanged
Alastair> function reduced in performance by several orders of
Alastair> magnitude. Running the query inside the function manually on
Alastair> the console however worked as expected and the query plan did
Alastair> not appear to have changed.
But when you run it manually, you'll get a custom plan, based on the
parameter values.
Alastair> On a hunch I changed the structure of the function to the
Alastair> structure below and immediately the query performance
Alastair> returned to the expected baseline.
Alastair> create or replace function example_function(param1 int, param2 int) returns setof custom_type as
Alastair> $$
Alastair> BEGIN
Alastair> return query execute format($query$
Alastair> select * from big_table where col1 = %1$L and col2 = %1$
Alastair> $query$,param1,param2);
Alastair> END;
Alastair> $$ language plpgsql;
Using EXECUTE in plpgsql will get you a custom plan every time (though
you really should have used EXECUTE USING rather than interpolating the
parameters into the query string).
I suggest looking into the inlining question first.
--
Andrew (irc:RhodiumToad)
čt 23. 5. 2019 v 23:38 odesílatel Alastair McKinley <a.mckinley@analyticsengines.com> napsal:
Hi Andrew,Thanks for your in-depth response. I found that adding the stable qualifier didn't solve the issue unfortunately. I actually encountered the same issue (or at least extremely similar) today and made a bit more progress on defining it.I have a carefully indexed and optimised query that runs in about 2 seconds with this structure and signature.create or replace function example_function(param1 int, param2 int) returns setof custom_type as$$with a_few_ctes ()select * from big_table where col1 = param1 and col2 = param2;$$ language sql stable;This function works as expected when executed from a psql client. I am calling this function via another framework (Postgrest) that executes the function using the following pattern:with args as (select json_to_record($1) as (param1 int,param2 int)),output as (select *from example_function(param1 := (select param1 from args),param2 := (select param2 from args)))select * from output;Running this query with the args coming from the CTE resulted in my query running for many tens of minutes before I gave up. Changing the underlying function to use plpgsql fixes the issue and the query runs as expected. Both versions work as expected when called directly, but the SQL version does not when called with the args coming from the CTE as shown above.The new function signature iscreate or replace function example_function(param1 int, param2 int) returns setof custom_type as$$with a_few_ctes ()return query select * from big_table where col1 = param1 and col2 = param2;$$ language plpgsql stable;I haven't been able to check the bad query plan with auto_explain as the query doesn't seem to finish.So to summarise, changing a stable SQL function to a stable plpgsql function when called with function args from a CTE fixes a huge performance issue of uncertain origin. I hope someone can offer some suggestions as this has really confused me.
SQL functions are fast when they are inlined, but when are not inlined, then they are significantly slower than plpgsql.
I am not sure, long time I didn't see this code. If I remember well, SQL functions doesn't cache plans - so creates and lost plans every time.
Best regards,AlastairFrom: Andrew Gierth <andrew@tao11.riddles.org.uk>
Sent: 19 May 2019 03:48
To: Alastair McKinley
Cc: pgsql-general@lists.postgresql.org
Subject: Re: Strange performance degregation in sql function (PG11.1)>>>>> "Alastair" == Alastair McKinley <a.mckinley@analyticsengines.com> writes:
Alastair> Hi all,
Alastair> I recently experienced a performance degradation in an
Alastair> operational system that I can't explain. I had a function
Alastair> wrapper for a aggregate query that was performing well using
Alastair> the expected indexes with the approximate structure as shown
Alastair> below.
Alastair> create or replace function example_function(param1 int, param2 int) returns setof custom_type as
Alastair> $$
Alastair> select * from big_table where col1 = param1 and col2 = param2;
Alastair> $$ language sql;
This function isn't inlinable due to missing a STABLE qualifier; that's
a pretty big issue.
Without inlining, the function will be run only with generic plans,
which means that the decision about index usage will be made without
knowledge of the parameter values.
Was your actual function inlinable? See
https://wiki.postgresql.org/wiki/Inlining_of_SQL_functions
You can get the query plan of a non-inlined function using the
auto_explain module (with its log_nested_statements option). The query
plan of non-inlined function calls is not otherwise shown by EXPLAIN.
Alastair> After creating two new indexes on this table to support a
Alastair> different use case during a migration, this unchanged
Alastair> function reduced in performance by several orders of
Alastair> magnitude. Running the query inside the function manually on
Alastair> the console however worked as expected and the query plan did
Alastair> not appear to have changed.
But when you run it manually, you'll get a custom plan, based on the
parameter values.
Alastair> On a hunch I changed the structure of the function to the
Alastair> structure below and immediately the query performance
Alastair> returned to the expected baseline.
Alastair> create or replace function example_function(param1 int, param2 int) returns setof custom_type as
Alastair> $$
Alastair> BEGIN
Alastair> return query execute format($query$
Alastair> select * from big_table where col1 = %1$L and col2 = %1$
Alastair> $query$,param1,param2);
Alastair> END;
Alastair> $$ language plpgsql;
Using EXECUTE in plpgsql will get you a custom plan every time (though
you really should have used EXECUTE USING rather than interpolating the
parameters into the query string).
I suggest looking into the inlining question first.
--
Andrew (irc:RhodiumToad)