Обсуждение: Invoking SQL function while doing CREATE OR REPLACE on it
I have a Postgres 14.4 cluster (AWS Aurora) to which I connect from my application using JDBC. I use liquibase for schema management - not only tables, but also a bunch of SQL stored procedures and functions. Basically,there is one liquibase changeSet that runs last and executes a set of SQL files which contain stored proceduresand functions. CREATE OR REPLACE FUNCTION bar(arg1 text, arg2 text) RETURNS record LANGUAGE "plpgsql" AS ' BEGIN // function body END; '; These functions / procedures are replaced ONLY when there is a change in one / more SQL files which are part of this changeSet.(runOnChange: true). Whenever I do a rolling deployment of my application (say, with a change in the function body of bar()), liquibase will executethe CREATE OR REPLACE FUNCTION bar() as part of a transaction. In the few milliseconds while bar() is being replaced, there are other ongoing transactions (from other replicas of my application)which are continuously trying to invoke bar(). Only in this tiny time window, few transactions fail with the following error: ERROR: function bar(arg1 => text, arg2 => text) does not exist Hint: No function matches the given name and argument types. You might need to add explicit type casts. Position: 4 : errorCode = 42883 I don't want any of these transactions to fail (we do not have any proper way of re-trying them from the application layer). This is seen as affecting availability. However, it is acceptable for these transactions to BLOCK (for a few hundred ms) while the SQL function body is being replaced,and then proceed with invocation. Is there a way to safely modify a stored function / procedure in PostgreSQL while that function / procedure is being invokedcontinuously by multiple transactions? Thanks!
> On 03/05/2023 20:17 CEST Nagendra Mahesh (namahesh) <namahesh@cisco.com> wrote: > > I have a Postgres 14.4 cluster (AWS Aurora) to which I connect from my > application using JDBC. > > I use liquibase for schema management - not only tables, but also a bunch of > SQL stored procedures and functions. Basically, there is one liquibase > changeSet that runs last and executes a set of SQL files which contain stored > procedures and functions. > > CREATE OR REPLACE FUNCTION bar(arg1 text, arg2 text) RETURNS record LANGUAGE "plpgsql" AS ' > BEGIN > // function body > END; > '; > > These functions / procedures are replaced ONLY when there is a change in one / > more SQL files which are part of this changeSet. (runOnChange: true). > > Whenever I do a rolling deployment of my application (say, with a change in > the function body of bar()), liquibase will execute the CREATE OR REPLACE FUNCTION bar() > as part of a transaction. > > In the few milliseconds while bar() is being replaced, there are other ongoing > transactions (from other replicas of my application) which are continuously > trying to invoke bar(). > > Only in this tiny time window, few transactions fail with the following error: > > ERROR: function bar(arg1 => text, arg2 => text) does not exist > Hint: No function matches the given name and argument types. You might need to add explicit type casts. > Position: 4 : errorCode = 42883 CREATE OR REPLACE FUNCTION should be atomic and cannot change the function signature. I don't see how a function cannot exist at some point in this case. Are you sure that Liquibase is not dropping the function before re-creating it? If Liquibase drops and re-creates the function in separate transactions, the transactions trying to execute that function may find it dropped when using the read committed isolation level. There's also a race condition bug in v14.4 that may be relevant. It got fixed in v14.5. See "Fix race condition when checking transaction visibility" in https://www.postgresql.org/docs/14/release-14-5.html. -- Erik
Erik Wienhold <ewie@ewie.name> writes: > On 03/05/2023 20:17 CEST Nagendra Mahesh (namahesh) <namahesh@cisco.com> wrote: >> I have a Postgres 14.4 cluster (AWS Aurora) to which I connect from my >> application using JDBC. Aurora uses, I believe, some other storage engine entirely than community Postgres has. >> Only in this tiny time window, few transactions fail with the following error: >> ERROR: function bar(arg1 => text, arg2 => text) does not exist >> Hint: No function matches the given name and argument types. You might need to add explicit type casts. > There's also a race condition bug in v14.4 that may be relevant. It got fixed > in v14.5. See "Fix race condition when checking transaction visibility" in > https://www.postgresql.org/docs/14/release-14-5.html. That race could easily explain this symptom: during the update, there are two versions of the function's pg_proc row, and it could be that both of them appear "dead" to an onlooker transaction if one of them is inspected during the race window. Then the onlooker would find no live version of the row and report that it doesn't exist. But having said that, it's not clear to me whether Aurora's storage engine shares this bug with community PG, or you're seeing some independent bug of theirs that happens to have a similar symptom. It's even less clear whether AWS would have applied the fix yet if it is a shared bug. You really need to discuss this with AWS support. regards, tom lane
Thanks for the insight, Erik. Turns out, it was indeed liquibase which was: - dropping the functions in one transaction (inadvertently as the result of a DROP TYPE CASCADE) - immediately followed by the next transaction which did a CREATE OR REPLACE FUNCTION which recreated these functions The tiny gap in time after the first transaction commit, before the second transaction could commit was when the functionactually ceased to exist. So, the invocations during that time failed. Thanks again. From: Erik Wienhold <ewie@ewie.name> Sent: Thursday, May 4, 2023 2:33 AM To: Nagendra Mahesh (namahesh) <namahesh@cisco.com>; pgsql-general@postgresql.org <pgsql-general@postgresql.org> Subject: Re: Invoking SQL function while doing CREATE OR REPLACE on it > On 03/05/2023 20:17 CEST Nagendra Mahesh (namahesh) <namahesh@cisco.com> wrote: > > I have a Postgres 14.4 cluster (AWS Aurora) to which I connect from my > application using JDBC. > > I use liquibase for schema management - not only tables, but also a bunch of > SQL stored procedures and functions. Basically, there is one liquibase > changeSet that runs last and executes a set of SQL files which contain stored > procedures and functions. > > CREATE OR REPLACE FUNCTION bar(arg1 text, arg2 text) RETURNS record LANGUAGE "plpgsql" AS ' > BEGIN > // function body > END; > '; > > These functions / procedures are replaced ONLY when there is a change in one / > more SQL files which are part of this changeSet. (runOnChange: true). > > Whenever I do a rolling deployment of my application (say, with a change in > the function body of bar()), liquibase will execute the CREATE OR REPLACE FUNCTION bar() > as part of a transaction. > > In the few milliseconds while bar() is being replaced, there are other ongoing > transactions (from other replicas of my application) which are continuously > trying to invoke bar(). > > Only in this tiny time window, few transactions fail with the following error: > > ERROR: function bar(arg1 => text, arg2 => text) does not exist > Hint: No function matches the given name and argument types. You might need to add explicit type casts. > Position: 4 : errorCode = 42883 CREATE OR REPLACE FUNCTION should be atomic and cannot change the function signature. I don't see how a function cannot exist at some point in this case. Are you sure that Liquibase is not dropping the function before re-creating it? If Liquibase drops and re-creates the function in separate transactions, the transactions trying to execute that function may find it dropped when using the read committed isolation level. There's also a race condition bug in v14.4 that may be relevant. It got fixed in v14.5. See "Fix race condition when checking transaction visibility" in https://www.postgresql.org/docs/14/release-14-5.html. -- Erik