Обсуждение: COMMIT IN STORED PROCEDURE WHILE IN A LOOP

Поиск
Список
Период
Сортировка

COMMIT IN STORED PROCEDURE WHILE IN A LOOP

От
Ravi Krishna
Дата:
AWS Aurora based on PG 13

I am writing a sproc to copy a schema into another.  Here is the relevant portion of the code.
Basically I want to commit after every table is created.  In big schemas with hundreds of table I do not want to run
entireoperation in one transaction. 

I am getting error at COMMIT -> cannot commit while a subtransaction is active.
Is commit not possible in a loop


 

BEGIN

  FOR object IN

    SELECT table_name::text FROM information_schema.TABLES WHERE table_schema = source_schema

          AND table_type = 'BASE TABLE'

  LOOP


      buffer := dest_schema || '.' || object;

      BEGIN

                    sql_stmt := 'CREATE TABLE ' || buffer || ' (LIKE ' || source_schema || '.' || object;

                    sql_stmt :=  sql_stmt || ' INCLUDING COMMENTS INCLUDING IDENTITY INCLUDING CONSTRAINTS INCLUDING
DEFAULTSINCLUDING GENERATED INCLUDING INDEXES)' ; 

                    --RAISE NOTICE '%' , sql_stmt ;

                    execute sql_stmt ;

                    COMMIT;

                    sql_stmt := 'INSERT INTO ' || buffer || ' OVERRIDING SYSTEM VALUE SELECT * FROM ' || source_schema
||'.' || object ; 

                    --RAISE NOTICE '%' , sql_stmt ;

                    execute sql_stmt ;

                    COMMIT;

            EXCEPTION

            WHEN OTHERS THEN

            GET STACKED DIAGNOSTICS text_var1 = MESSAGE_TEXT,

                                                                        text_var2 = PG_EXCEPTION_DETAIL,

                                                                        text_var3 = PG_EXCEPTION_HINT;

             print_msg := 'ERROR:->' || text_var1 || '|' || text_var2 || '|' || text_var3 ;

             RAISE NOTICE '%' , print_msg ;

            END ;

  END LOOP;

END;




Re: COMMIT IN STORED PROCEDURE WHILE IN A LOOP

От
Christophe Pettus
Дата:

> On Oct 18, 2022, at 13:14, Ravi Krishna <s_ravikrishna@aol.com> wrote:
>
> I am getting error at COMMIT -> cannot commit while a subtransaction is active.
> Is commit not possible in a loop

You can commit in a loop, but not in BEGIN / END block that has an exception handler: that creates a subtransaction for
theduration of the BEGIN / END. 


Re: COMMIT IN STORED PROCEDURE WHILE IN A LOOP

От
Bryn Llewellyn
Дата:
> xof@thebuild.com wrote:
>
>> s_ravikrishna@aol.com wrote:
>>
>> I am getting error at COMMIT -> cannot commit while a subtransaction is active...
>
> You can commit in a loop, but not in BEGIN / END block that has an exception handler: that creates a subtransaction
forthe duration of the BEGIN / END. 

This surprised me when I first started to use PG (after all those years with ORCL). There are some use cases where this
limitationis a nuisance. For example, if you want to implement the famous retry loop (with exponential backoff) for an
errorthat is provoked at commit time... well, you have to do it in client-side code. 

Could the limitation be lifted by making tractable internal implementation changes? Or is it rooted in profoundly deep
featuresof the architecture—meaning that it could never be lifted? 


Re: COMMIT IN STORED PROCEDURE WHILE IN A LOOP

От
Ravi Krishna
Дата:
> You can commit in a loop, but not in BEGIN / END block that has an exception handler:
> that creates a subtransaction for the duration of the BEGIN / END.

The reason I have to deal with error exception is that I want to ignore failure on a table and move on to next table. 

I thought I can trick it by moving the part of the code which executes dynamic sql and has exception handler in it,
to a separate procedure.  This way I can keep the main procedure without exception handler.  It failed too with
the same error.  So that means this restriction is not at lexical level, but at execution level.
This is so disappointing.  It seems there is no way I can copy each table and commit it individually and also capture error.

Re: COMMIT IN STORED PROCEDURE WHILE IN A LOOP

От
Christophe Pettus
Дата:

> On Oct 18, 2022, at 14:29, Ravi Krishna <s_ravikrishna@aol.com> wrote:
>
> > You can commit in a loop, but not in BEGIN / END block that has an exception handler:
> > that creates a subtransaction for the duration of the BEGIN / END.
>
> The reason I have to deal with error exception is that I want to ignore failure on a table and move on to next table.


Rather than have a loop inside the BEGIN / END, you could put the BEGIN EXCEPTION END inside the loop, catch the error,
storethe important parts of the exception in a variable, and then do the COMMIT after the END statement but before the
nextiteration of the loop.  A bit messier, but it gets the job done. 


Re: COMMIT IN STORED PROCEDURE WHILE IN A LOOP

От
Tom Lane
Дата:
Bryn Llewellyn <bryn@yugabyte.com> writes:
> xof@thebuild.com wrote:
>> You can commit in a loop, but not in BEGIN / END block that has an exception handler: that creates a subtransaction
forthe duration of the BEGIN / END. 

> This surprised me when I first started to use PG (after all those years
> with ORCL).

Really?  BEGIN with an exception block is a subtransaction because it's
defined to roll back to the database state as of the start of the block
if an exception occurs.  COMMIT in the middle fundamentally conflicts
with that, I should think.  Does Oracle interpret that differently?

            regards, tom lane



Re: COMMIT IN STORED PROCEDURE WHILE IN A LOOP

От
Christophe Pettus
Дата:

> On Oct 18, 2022, at 14:15, Bryn Llewellyn <bryn@yugabyte.com> wrote:
> Could the limitation be lifted by making tractable internal implementation changes? Or is it rooted in profoundly
deepfeatures of the architecture—meaning that it could never be lifted? 

That is a very good question.  One of the issues (as I understand it) is to be able to continue to use the same
connectionin the event of an error.  Here's the scenario: 

1. SELECT * FROM my_func();

2. my_func gets an error.  The connection is now in an error state.

3. So, catching the exception doesn't do you much good, because future operations will get an error.

This problem is solved by wrapping it in a savepoint, since rolling back to the savepoint will undo the error state of
theconnection. 

There might be a way forward, but it's hard to get the proper semantics of an exception handler without doing something
likethat.  


Re: COMMIT IN STORED PROCEDURE WHILE IN A LOOP

От
Laurenz Albe
Дата:
On Tue, 2022-10-18 at 17:33 -0400, Tom Lane wrote:
> Bryn Llewellyn <bryn@yugabyte.com> writes:
> > xof@thebuild.com wrote:
> > > You can commit in a loop, but not in BEGIN / END block that has an exception handler: that creates a
subtransactionfor the duration of the BEGIN / END.
 
> 
> > This surprised me when I first started to use PG (after all those years
> > with ORCL).
> 
> Really?  BEGIN with an exception block is a subtransaction because it's
> defined to roll back to the database state as of the start of the block
> if an exception occurs.  COMMIT in the middle fundamentally conflicts
> with that, I should think.  Does Oracle interpret that differently?

Looks like Oracle doesn't care much about that:

SQL> CREATE TABLE mytab (id integer CHECK (id > 0));

Table created.

SQL> CREATE PROCEDURE committest IS
  2  BEGIN
  3     INSERT INTO mytab VALUES (42);
  4     COMMIT;
  5     INSERT INTO mytab VALUES (-42);
  6  EXCEPTION
  7     WHEN OTHERS THEN
  8        NULL;
  9  END;
 10  /

Procedure created.

SQL> CALL committest();

Call completed.

SQL> SELECT * FROM mytab;

    ID
----------
    42

I looks like Oracle allows you to randomly interfere with its transaction handling.
If you run commit and then enter an exception handler, it simply doesn't rollback.

Yours,
Laurenz Albe



Re: COMMIT IN STORED PROCEDURE WHILE IN A LOOP

От
gogala.mladen@gmail.com
Дата:
On Tue, 2022-10-18 at 17:33 -0400, Tom Lane wrote:
Bryn Llewellyn <bryn@yugabyte.com> writes:
You can commit in a loop, but not in BEGIN / END block that has an exception handler: that creates a subtransaction for the duration of the BEGIN / END.

This surprised me when I first started to use PG (after all those years
with ORCL).

Really?  BEGIN with an exception block is a subtransaction because it's
defined to roll back to the database state as of the start of the block
if an exception occurs.  COMMIT in the middle fundamentally conflicts
with that, I should think.  Does Oracle interpret that differently?

                        regards, tom lane



Hi Tom,
Yes, Oracle does interpret that differently. Bryn may correct me if I'm wrong, but Oracle creates an implicit save point when it encounters BEGIN. Exception handler doesn't necessarily roll things back. Oracle behavior is not standard and PgSQL adheres to SQL standard better than Oracle. However, being as pervasive as it is, Oracle is de facto  standard.
Also, Oracle has something called "autonomous transaction" which, in effect, means that a session can have two concurrent transactions open, which is also non-standard:


Amazon, lead by Kevin Closson, the guy who has famously designed Oracle Exadata among other things, even came up with the recipe how to migrate it to Postgres:


I am a bit skeptical toward that recipe and I usually prefer programming solutions with opening another thread and sending a message. BTW, speaking of Kevin, he has also written pgio, which is a PostgreSQL version of his SLOB package. Kevin is the only retired Oracle ACE in existence. BTW, Bryn also used to be an Oracle ACE.
Regards

Re: COMMIT IN STORED PROCEDURE WHILE IN A LOOP

От
gogala.mladen@gmail.com
Дата:
On Tue, 2022-10-18 at 14:31 -0700, Christophe Pettus wrote:

Rather than have a loop inside the BEGIN / END, you could put the BEGIN EXCEPTION END inside the loop, catch the error, store the important parts of the exception in a variable, and then do the COMMIT after the END statement but before the next iteration of the loop.  A bit messier, but it gets the job done.

Commit within a loop is an extremely bad idea. Commit is an expensive operation which includes incrementing the XID, which is global and not local to the process. There is also a WAL write which has to be waited on. Every commit implies at least one write operation. If that was not the case, Postgres wouldn't be ACID compliant. There would be problem with the "D". Commit within a loop will have many adverse effects on the performance. Here is what Tom Kyte, who used to be a celebrity in the world of Oracle, said on this topic:


Surprisingly enough, the argument in the article is mostly portable, doesn't depend on the database type at all. Basically, transaction is a logical unit of work. If 1000 rows need to be updated, it's better to update them in a single transaction than in 1000 transactions. That is particularly true for Postgres which doesn't have problems with the undo tablespace and ORA-1555 "snapshot too old" error because of the different architecture. Also, Oracle has a secret optimization: it doesn't wait for commit, if the commit is issued within PL/SQL loop. Your idea solves the syntactic problem with commits within PLPG/SQL  loops but it doesn't solve other problems that such programming causes. Commit within loop is a very bad idea. The best way to resolve the problems with commit within the loop is to remove the programmer trying to do that from the project. In my humble opinion, programmers who do stuff like that should suffer unusual and cruel punishment.

PS:
----
I am sure, that if Tom Kyte would dare to make a suggestion on this list, there would be someone who would try to explain "the Postgres way" to him in a condescending manner. I've seen that before.

Re: COMMIT IN STORED PROCEDURE WHILE IN A LOOP

От
Christophe Pettus
Дата:

> On Oct 18, 2022, at 19:18, gogala.mladen@gmail.com wrote:
>
> Commit within a loop is an extremely bad idea.

This is an over-generalization.  There are many use-cases for this (if there were not, procedures wouldn't have been
nearlyas important a feature). 

For example, if you are processing a large update (in the hundreds of thousands or more of rows), you often want to
commitregularly so that other processes don't have to wait for the whole thing to finish due to row-level locks, and to
givevacuum a chance to deal with the dead tuples.  Similarly, while inserting one row at a time and committing is
usuallynot a great idea, it can make sense to do large inserts in batches. 

Applications do this kind of thing all the time, very successfully; it was just that the loop was in the application
ratherthan in the procedure. 

High commit rates happen all the time, and they don't break PostgreSQL.  For example, an IoT application collecting
sensordata and doing many inserts per second is also doing many commits per second, since each bare INSERT is in its
owntransaction.  PostgreSQL handles it just fine. 


Re: COMMIT IN STORED PROCEDURE WHILE IN A LOOP

От
Mladen Gogala
Дата:
Comments in-line.

On Tue, 2022-10-18 at 21:02 -0700, Christophe Pettus wrote:


On Oct 18, 2022, at 19:18, gogala.mladen@gmail.com wrote:

Commit within a loop is an extremely bad idea.

This is an over-generalization.  There are many use-cases for this (if there were not, procedures wouldn't have been nearly as important a feature).

For example, if you are processing a large update (in the hundreds of thousands or more of rows), you often want to commit regularly so that other processes don't have to wait for the whole thing to finish due to row-level locks, and to give vacuum a chance to deal with the dead tuples.  Similarly, while inserting one row at a time and committing is usually not a great idea, it can make sense to do large inserts in batches.

That depends. Multiple commits will slow down the processing. If the goal of the exercise is to let update complete as quickly as possible, then it will be a single commit. If the goal is to enable normal processing and let the enormous update complete in its due time, then you'll do what you describe.



Applications do this kind of thing all the time, very successfully; it was just that the loop was in the application rather than in the procedure.

High commit rates happen all the time, and they don't break PostgreSQL.  For example, an IoT application collecting sensor data and doing many inserts per second is also doing many commits per second, since each bare INSERT is in its own transaction.  PostgreSQL handles it just fine.

Point of my post is that the business logic, in your case it's IoT sensors, determines what is transaction and when to commit. Advice like "commit often and commit early", to paraphrase the famous Chicago mayor, is easy to find but I would take it with grain of salt.
Regards
-- 
Mladen Gogala
Database Consultant

Re: COMMIT IN STORED PROCEDURE WHILE IN A LOOP

От
Ron
Дата:
On 10/19/22 08:06, Mladen Gogala wrote:
[snip]


Applications do this kind of thing all the time, very successfully; it was just that the loop was in the application rather than in the procedure.

High commit rates happen all the time, and they don't break PostgreSQL.  For example, an IoT application collecting sensor data and doing many inserts per second is also doing many commits per second, since each bare INSERT is in its own transaction.  PostgreSQL handles it just fine.

Point of my post is that the business logic, in your case it's IoT sensors, determines what is transaction and when to commit. Advice like "commit often and commit early", to paraphrase the famous Chicago mayor, is easy to find but I would take it with grain of salt.

In the normal course of operation (i.e, not when bulk loading), you should commit at the end of every "business transaction".  We've committed after X business  when running stovepipe "batch" jobs processing input files.  In those cases, though, we had to track progress through the file; in the case of a rollback, the application had to go back to the last input file "save point" and start over.

--
Angular momentum makes the world go 'round.

Re: COMMIT IN STORED PROCEDURE WHILE IN A LOOP

От
Bryn Llewellyn
Дата:
It seems that I made a thread-discipline error when I asked a question that had nothing to do with the frequency, or
thecost, of committing when I saw this reply (paraphrased for brevity here) from Christophe: 

> You [cannot] commit in [a] BEGIN / END [block statement] that has an exception handler [because] that creates a
subtransactionfor the duration of the [block statement]. 

I asked this in response (again, paraphrased for brevity):

> Could the limitation be lifted...? [Or is the limitation] rooted in profoundly deep features of the architecture?

Sorry that I caused some distraction. Anyway, Tom replied immediately. He said:

> BEGIN with an exception block is a subtransaction because it's defined to roll back to the database state as of the
startof the block if an exception occurs. COMMIT in the middle fundamentally conflicts with that, I should think. 


Thanks, Tom. It's clear to me now that the present PG paradigm will never, ever change.

So my conclusion stands for this use case: I'm using "serializable" isolation (which luxury ORCL doesn't afford me); I
knowthat I can get a "cannot serialize" error at "commit" time. Here, I cannot take appropriate action within my
PL/pgSQLcode and hide the whole story of what this is about from client code. Rather, I must explain the business to
theauthors of the next tier, and teach them when, and how, retry is appropriate. 

Tom asked, too, if ORCL has a different paradigm... Briefly, yes—and radically so. But (all of you) do please feel free
toskip over my sketch here if it doesn't interest you. 
— — — — — — — — — — — — — — — — — — — — — — — — — — — — — —
«
The real account of which this is a sketch is of no interest at all if you work only with PG and have never used ORCL.
Butif you need to switch, intellectually, from one to the other—and maybe need to do this all the time because your job
dutiesspan both systems—then it *is* interesting. 

ORCL's PL/SQL is compiled (and optimized) at "create procedure" time. The new source code replaces the old in the
catalog.And if it failed to compile, then you're left with an invalid unit that you cannot execute. Compilation errors
arerecorded in the catalog too. Further, static dependencies (proc upon proc, proc upon table, etc) are also recorded
inthe catalog. This is intimately connected with the hard distinction between static and dynamic SQL. The latter simply
passesthe text on "as is" into the byte code to be dealt with at run-time. Only statements like "select", "insert",
"update",delete" and a few others can be static SQL. Table creation and the like must be dynamic SQL. This probably
shocksthose who move from PG to ORCL because you cannot, for example, create a table and then operate on it with static
SQLin the same procedure. 

In particular, for the present discussion, the PL/SQL block statement is a pure lexical device. (This is the case in
PL/SQL'sprogenitor, ADA. And that's where all that stuff about DIANA, that the PL/SQL programmer eventually comes to
hearabout, comes from.) All memory that you had a block statement in the source is lost in the compiled so-called byte
codethat gets interpreted at run time. On the other hand, every call from PL/SQL to SQL is done in its own
subtransaction—andif it fails, then that single statement is atomically rolled back. The effect of all the SQLs to
date,at this moment, remains intact—but uncommitted. (Of course, you might have issued "commit"(s) programmatically. So
I'mtalking about SQLs that were done since the most recent "commit".)  

Significantly, the failure of a call from PL/SQL to SQL raises an exception—so (as well as the single-statement
rollback)you now have an in-flight exception that flies up through successive scopes in search of a matching handler.
Ifit remains unhandled at the last moment before the top-level PL/SQL "call" is due to finish, then a "rollback" is
automaticallyissued. But if a handler *is* found, well... the exception is dead and you can carry on. Like everything
elsein programming, the code author must work out what "safe" is. (It could be to turn an insert that fails 'cos a
uniquekey is violated into an update.) In ORCL, just as in PG, writing "when others than null" is held to be stupid.
Andcode examples that do this are deemed to be not worthy of discussion. 

Though the paradigms are different, each allows you properly to implement mission-critical applications. It's rather
likeEnglish and Chinese. Astonishingly different. But each supports all that you need to let people communicate about
mundanedaily business, science, philosophy, epistemology, and so on.) 
»







Re: COMMIT IN STORED PROCEDURE WHILE IN A LOOP

От
Thomas Kellerer
Дата:
gogala.mladen@gmail.com schrieb am 19.10.2022 um 01:46:
> Amazon, lead by Kevin Closson, the guy who has famously designed
> Oracle Exadata among other things, even came up with the recipe how
> to migrate it to Postgres:
>
> https://aws.amazon.com/blogs/database/migrating-oracle-autonomous-transactions-to-postgresql/


The workaround to use dblink to simulate autonomous transactions is nothing new,
and has been floating around for quite a while.

Here is a blog post from 2012 (6 years before the Amazon recipe was published)

https://raghavt.blog/autonomous-transaction-in-postgresql-9-1/

and another one from 2016

https://blog.dalibo.com/2016/09/08/Pragma_Autonoumous_transactions.html




Re: COMMIT IN STORED PROCEDURE WHILE IN A LOOP

От
Benedict Holland
Дата:
This seems like a bad use of a stored procedure. Why wouldn't you spin up 40 clients with a table name and run it across 40 connections? But also, I don't like loops in stored procedures. Working with loops in a set based system hurts me but it's personal preference.

Like, I could write this in python in less than 4 hours with an extremely simple sproc. How often are you running table deletions like this to require a sproc?

Thanks,
Ben

On Wed, Oct 19, 2022, 5:39 PM Thomas Kellerer <shammat@gmx.net> wrote:
gogala.mladen@gmail.com schrieb am 19.10.2022 um 01:46:
> Amazon, lead by Kevin Closson, the guy who has famously designed
> Oracle Exadata among other things, even came up with the recipe how
> to migrate it to Postgres:
>
> https://aws.amazon.com/blogs/database/migrating-oracle-autonomous-transactions-to-postgresql/


The workaround to use dblink to simulate autonomous transactions is nothing new,
and has been floating around for quite a while.

Here is a blog post from 2012 (6 years before the Amazon recipe was published)

https://raghavt.blog/autonomous-transaction-in-postgresql-9-1/

and another one from 2016

https://blog.dalibo.com/2016/09/08/Pragma_Autonoumous_transactions.html