Discussion: Serializable transaction restart/re-execute

Serializable transaction restart/re-execute

From:
Filipe Pina
Date:
Hello,

I come from a GTM background and one of the transactional features there is “Transaction Restarts”.

Transaction restart is when we have two concurrent processes reading/writing to the same region/table of the database: the last process to commit will “see” that the database is not the same as it was when the transaction started and goes back to the beginning of the transactional code and re-executes it.

The closest I found to this in PGSQL is the Serializable transaction isolation mode, and it does seem to work well except that it simply throws an error (serialization_failure) instead of restarting.

I’m trying to make use of this exception to implement restartable functions, and I have all the examples and conditions mentioned in a question on SO (without any answer so far…):

http://stackoverflow.com/questions/29372202/postgresql-generic-handler-for-serialization-failure

So basically I have two questions:
- the restartable “wrapper” function never gets its “DB view” refreshed once it restarts; I assume it’s because of the outer transaction (at function level), so it never re-reads the new values and keeps failing with serialization_failure. Any way to solve this?
- the ideal would be to be able to define this at database level so I wouldn’t have to implement wrappers for all functions: implementing a “serialization_failure” generic handler that would simply re-call the function that threw that exception (up to a number of tries). Is this possible without going into the pgsql source code?

Thanks,
Filipe

Re: Serializable transaction restart/re-execute

From:
Bill Moran
Date:
On Fri, 3 Apr 2015 15:35:14 +0100
Filipe Pina <filipe.pina@impactzero.pt> wrote:

> Hello,
>
> I come from a GTM background and one of the transactional features there is “Transaction Restarts”.
>
> Transaction restart is when we have two concurrent processes reading/writing to the same region/table of the database: the last process to commit will “see” that the database is not the same as it was when the transaction started and goes back to the beginning of the transactional code and re-executes it.
>
> The closest I found to this in PGSQL is the Serializable transaction isolation mode, and it does seem to work well except that it simply throws an error (serialization_failure) instead of restarting.
>
> I’m trying to make use of this exception to implement restartable functions, and I have all the examples and conditions mentioned in a question on SO (without any answer so far…):
>
> http://stackoverflow.com/questions/29372202/postgresql-generic-handler-for-serialization-failure
>
> So basically I have two questions:
> - the restartable “wrapper” function never gets its “DB view” refreshed once it restarts; I assume it’s because of the outer transaction (at function level), so it never re-reads the new values and keeps failing with serialization_failure. Any way to solve this?
> - the ideal would be to be able to define this at database level so I wouldn’t have to implement wrappers for all functions: implementing a “serialization_failure” generic handler that would simply re-call the function that threw that exception (up to a number of tries). Is this possible without going into the pgsql source code?

I suspect that savepoints will accomplish what you want:
http://www.postgresql.org/docs/9.4/static/sql-savepoint.html
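
For illustration, a minimal sketch of the SAVEPOINT pattern at the plain-SQL level, using the account table that appears later in the thread (assumed to exist). As the follow-ups point out, SAVEPOINT cannot be issued from inside a plpgsql function, and rolling back to a savepoint does not refresh a serializable transaction's snapshot, so this by itself does not cure a serialization failure:

BEGIN ISOLATION LEVEL SERIALIZABLE;
SAVEPOINT before_update;
UPDATE account SET balance = balance + 10 WHERE id = 1;
-- if the statement above errors, the client can issue
--   ROLLBACK TO SAVEPOINT before_update;
-- and retry or do something else without losing the rest of the transaction
RELEASE SAVEPOINT before_update;
COMMIT;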

--
Bill Moran


Re: Serializable transaction restart/re-execute

From:
Filipe Pina
Date:
Hi Bill, thanks for the quick reply.

I had read about SAVEPOINTs but I also read I cannot use them in PLPGSQL and
should use BEGIN/END blocks and EXCEPTIONs.

Did you check the URL I mentioned? I have the code I used there:

CREATE OR REPLACE FUNCTION myretest() RETURNS integer AS $$
DECLARE
    tries integer := 5;
BEGIN
    WHILE TRUE LOOP
        BEGIN -- nested block for exception
            RETURN mytest();
        EXCEPTION
            WHEN SQLSTATE '40001' THEN
                IF tries > 0 THEN
                    tries := tries - 1;
                    RAISE NOTICE 'Restart! % left', tries;
                ELSE
                    RAISE EXCEPTION 'NO RESTARTS LEFT';
                END IF;
        END;
    END LOOP;
END
$$
LANGUAGE plpgsql;

But it doesn't work: every iteration fails with serialization_failure,
probably because the outer transaction is not rolled back, and I'm not sure
how to write this in a way I can roll it back and still have control of the
LOOP.

I find it hard to believe that PGSQL has this amazing "serializable"
isolation method but not a standard way to take advantage of it to
automatically "restart" the failed transactions...

On Fri, Apr 3, 2015 at 11:07 PM, Bill Moran <wmoran@potentialtech.com> wrote:
On Fri, 3 Apr 2015 15:35:14 +0100
Filipe Pina <filipe.pina@impactzero.pt> wrote:

> Hello,
>
> I come from a GTM background and one of the transactional features there is “Transaction Restarts”.
>
> Transaction restart is when we have two concurrent processes reading/writing to the same region/table of the database: the last process to commit will “see” that the database is not the same as it was when the transaction started and goes back to the beginning of the transactional code and re-executes it.
>
> The closest I found to this in PGSQL is the Serializable transaction isolation mode, and it does seem to work well except that it simply throws an error (serialization_failure) instead of restarting.
>
> I’m trying to make use of this exception to implement restartable functions, and I have all the examples and conditions mentioned in a question on SO (without any answer so far…):
>
> http://stackoverflow.com/questions/29372202/postgresql-generic-handler-for-serialization-failure
>
> So basically I have two questions:
> - the restartable “wrapper” function never gets its “DB view” refreshed once it restarts; I assume it’s because of the outer transaction (at function level), so it never re-reads the new values and keeps failing with serialization_failure. Any way to solve this?
> - the ideal would be to be able to define this at database level so I wouldn’t have to implement wrappers for all functions: implementing a “serialization_failure” generic handler that would simply re-call the function that threw that exception (up to a number of tries). Is this possible without going into the pgsql source code?

I suspect that savepoints will accomplish what you want:
http://www.postgresql.org/docs/9.4/static/sql-savepoint.html

--
Bill Moran

Re: Serializable transaction restart/re-execute

From:
Bill Moran
Date:
On Mon, 6 Apr 2015 10:41:25 +0100
Filipe Pina <fopina@impactzero.pt> wrote:

> Hi Bill, thanks for the quick reply.
>
> I had read about SAVEPOINTs but I also read I cannot use them in PLPGSQL and
> should use BEGIN/END blocks and EXCEPTIONs.
>
> Did you check the URL I mentioned?

Yes, I did:
http://stackoverflow.com/questions/29372202/postgresql-generic-handler-for-serialization-failure

...

> But it doesn't work: every iteration fails with serialization_failure,
> probably because the outer transaction is not rolled back, and I'm not sure
> how to write this in a way I can roll it back and still have control of the
> LOOP.

Probably one of your issues is that there is no such thing as an
"outer" transaction. There's just a transaction. There is no nesting
of transactions, so the belief that there is an outer transaction
that can somehow be manipulated independently of some other transaction
is leading you to try things that will never work.

I wasn't aware that SAVEPOINTs didn't work in pl/pgsql, thanks for
educating me on that point.

> I find it hard to believe that PGSQL has this amazing "serializable"
> isolation method but not a standard way to take advantage of it to
> automatically "restart" the failed transactions...

I've been over this ground before. You're thinking in such a
micro case that you haven't realized the inherent difficulty of
restarting large transactions with lots of data modification.
An RDBMS may have many tables updated within a transaction, and
transactions may do data processing completely outside of the
database, which means the only way to ensure consistency is to
notify the controlling process of the problem so it can decide
how best to respond.

So ... I dug into your problem a little more, and I think the
problem is that you're trying too hard to replicate GTM design
paradigms instead of learning the way that PostgreSQL is designed
to work.

If I were creating the functions you describe, I would ditch the
second one and simply have this:

CREATE OR REPLACE FUNCTION mytest() RETURNS integer AS $$
    update account set balance = balance+10 where id=1 RETURNING balance;
$$
LANGUAGE SQL;

of course, it's unlikely that you'll ever want to wrap such a
simple query in a function, so I'm supposing that you'd want
to do something else with the old value of balance before
updating it, in which case:

CREATE OR REPLACE FUNCTION mytest() RETURNS integer AS $$
DECLARE
    cc integer;
BEGIN
    SELECT INTO cc balance FROM account WHERE id = 1 FOR UPDATE;

    RAISE NOTICE 'Balance: %', cc;
    perform pg_sleep(3);

    update account set balance = cc+10 where id=1 RETURNING balance INTO cc;

    return cc;
END
$$
LANGUAGE plpgsql;

The FOR UPDATE ensures that no other process can modify the
row while this one is sleeping.
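
A quick two-session illustration of that behavior, under the default READ COMMITTED isolation level and with a hypothetical account table (the table name and starting balance are assumed for the example):

-- setup assumed for the example
CREATE TABLE account (id integer PRIMARY KEY, balance integer NOT NULL);
INSERT INTO account (id, balance) VALUES (1, 100);

-- Session A:
SELECT mytest();   -- locks the row, prints 'Balance: 100', sleeps 3s, sets balance to 110

-- Session B, started while A is still sleeping:
SELECT mytest();   -- blocks on the FOR UPDATE lock until A commits,
                   -- then reads the fresh value 110 and sets balance to 120

-- Both increments are applied (no lost update), and no error is raised.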

Now, I understand that you don't want to do row locking,
but this is (again) an insistence on your part on trying to
force PostgreSQL to do things the way GTM did instead of
understanding the RDBMS way of doing things.

Unlearn.

Keep in mind that mytest() might be called as part of a much
larger transaction that does many other things, and you can't
simply roll that back and restart it within mytest() since
mytest() doesn't know everything else that happened.

In your case, you're trying to look at mytest() as something
that will always be used in a specific way where the
aforementioned problem won't be encountered, but you cannot
guarantee that, and it doesn't hold true for all functions.

In general, it's inappropriate for a function to be able to manipulate
a transaction beyond aborting it. And the abort has to bubble up so
that other statements involved in the transaction are also notified.

--
Bill Moran


Re: Serializable transaction restart/re-execute

From:
Filipe Pina
Date:
Thank you very much for such detailed response.

Indeed I'm thinking too much "GTM" instead of actually changing the mindset, but the problem with LOCKs (which are also available in GTM) is that the developer has to remember to lock what they want to update, and if they don't, integrity/consistency issues might come up (or even data loss, which would be worse).

Serializable isolation would make sure developers don't need to remember that while keeping consistency (though losing some performance), and they also wouldn't have to worry about deadlocks (as serializable uses soft locks).

But I guess strong code review and code re-design (where needed) should be able to produce an even better solution, that's true.

From what I have read so far, I can't find a way to cause these "restarts" (besides patching pgsql itself, which I wouldn't know where to start with).

Thanks once again.


On Mon, Apr 6, 2015 at 12:42 PM, Bill Moran <wmoran@potentialtech.com> wrote:
On Mon, 6 Apr 2015 10:41:25 +0100
Filipe Pina <fopina@impactzero.pt> wrote:

> Hi Bill, thanks for the quick reply.
>
> I had read about SAVEPOINTs but I also read I cannot use them in PLPGSQL and
> should use BEGIN/END blocks and EXCEPTIONs.
>
> Did you check the URL I mentioned?

Yes, I did:
http://stackoverflow.com/questions/29372202/postgresql-generic-handler-for-serialization-failure

...

> But it doesn't work: every iteration fails with serialization_failure,
> probably because the outer transaction is not rolled back, and I'm not sure
> how to write this in a way I can roll it back and still have control of the
> LOOP.

Probably one of your issues is that there is no such thing as an
"outer" transaction. There's just a transaction. There is no nesting
of transactions, so the belief that there is an outer transaction
that can somehow be manipulated independently of some other transaction
is leading you to try things that will never work.

I wasn't aware that SAVEPOINTs didn't work in pl/pgsql, thanks for
educating me on that point.

> I find it hard to believe that PGSQL has this amazing "serializable"
> isolation method but not a standard way to take advantage of it to
> automatically "restart" the failed transactions...

I've been over this ground before. You're thinking in such a
micro case that you haven't realized the inherent difficulty of
restarting large transactions with lots of data modification.
An RDBMS may have many tables updated within a transaction, and
transactions may do data processing completely outside of the
database, which means the only way to ensure consistency is to
notify the controlling process of the problem so it can decide
how best to respond.

So ... I dug into your problem a little more, and I think the
problem is that you're trying too hard to replicate GTM design
paradigms instead of learning the way that PostgreSQL is designed
to work.

If I were creating the functions you describe, I would ditch the
second one and simply have this:

CREATE OR REPLACE FUNCTION mytest() RETURNS integer AS $$
    update account set balance = balance+10 where id=1 RETURNING balance;
$$
LANGUAGE SQL;

of course, it's unlikely that you'll ever want to wrap such a
simple query in a function, so I'm supposing that you'd want
to do something else with the old value of balance before
updating it, in which case:

CREATE OR REPLACE FUNCTION mytest() RETURNS integer AS $$
DECLARE
    cc integer;
BEGIN
    SELECT INTO cc balance FROM account WHERE id = 1 FOR UPDATE;

    RAISE NOTICE 'Balance: %', cc;
    perform pg_sleep(3);

    update account set balance = cc+10 where id=1 RETURNING balance INTO cc;

    return cc;
END
$$
LANGUAGE plpgsql;

The FOR UPDATE ensures that no other process can modify the
row while this one is sleeping.

Now, I understand that you don't want to do row locking,
but this is (again) an insistence on your part on trying to
force PostgreSQL to do things the way GTM did instead of
understanding the RDBMS way of doing things.

Unlearn.

Keep in mind that mytest() might be called as part of a much
larger transaction that does many other things, and you can't
simply roll that back and restart it within mytest() since
mytest() doesn't know everything else that happened.

In your case, you're trying to look at mytest() as something
that will always be used in a specific way where the
aforementioned problem won't be encountered, but you cannot
guarantee that, and it doesn't hold true for all functions.

In general, it's inappropriate for a function to be able to manipulate
a transaction beyond aborting it. And the abort has to bubble up so
that other statements involved in the transaction are also notified.

--
Bill Moran

Re: Serializable transaction restart/re-execute

From:
Kevin Grittner
Date:
Filipe Pina <filipe.pina@impactzero.pt> wrote:

> I come from a GTM background and one of the transactional
> features there is “Transaction Restarts”.
>
> Transaction restart is when we have two concurrent processes
> reading/writing to the same region/table of the database, the
> last process to commit will “see” that the database is not the
> same as it was when the transaction started and goes back to the
> beginning of the transactional code and re-executes it.
>
> The closest I found to this in PGSQL is the Serializable
> transaction isolation mode and it does seem to work well except
> it simply throws an error (serialization_failure) instead of
> restarting.

Right, serializable transactions provide exactly what you are
looking for as long as you use some framework that starts the
transaction over when it receives an error with a SQLSTATE of 40001
or 40P01.

> I’m trying to make use of this exception to implement restartable
> functions and I have all the examples and conditions mentioned
> here in a question in SO (without any answer so far…):
> http://stackoverflow.com/questions/29372202/postgresql-generic-handler-for-serialization-failure

Perhaps once we've sorted out the issue here I can post an answer
there for the benefit of anyone finding the SO question.

> So basically I have two questions:
> - the restartable “wrapper” function never gets its “DB view”
> refreshed once it restarts, I assume it’s because of the outer
> transaction (at function level) so it never re-reads the new
> values and keeps failing with serialization_failure.. Any way to
> solve this?

In PostgreSQL a function always runs in the context of a
transaction.  You can't start a new transaction within the context
of a "wrapper" function.  That would require a slightly different
feature, which is commonly called a "stored procedure" -- something
which doesn't exist in PostgreSQL.  Therefore, you need to put the
logic to manage the restart into code which submits the transaction
to the database.  Fortunately, there are many connectors for that
-- Java, perl, python, tcl, ODBC, etc.  There is even a connector
for making a separate connection to a PostgreSQL database within
PostgreSQL procedural language, which might allow you to do
something like what you want:

http://www.postgresql.org/docs/current/static/dblink.html

> - the ideal would be to be able to define this at database level
> so I wouldn’t have to implement wrappers for all functions..

I have seen this done in various "client" frameworks.  Clearly it
is a bad idea to spread this testing around to all locations where
the application is logically dealing with the database, but there
are many good reasons to route all database requests through one
"accessor" method (or at least a very small number of them), and
most frameworks provide a way to deal with this at that layer.
(For example, in Spring you would want to create a transaction
manager using dependency injection.)

> Implementing a “serialization_failure” generic handler that would
> simply re-call the function that threw that exception (up to a
> number of tries). Is this possible without going into pgsql
> source code?

Yes, but only from the "client" side of a database connection --
although that client code could itself run inside the database by way
of dblink, as described above.  That probably belongs in some language
you are using for your application logic, but if you really wanted
to you could use plpgsql and dblink.  It's hard for me to see a
case where that would actually be a good idea, but it is an option.
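
For concreteness, a minimal, untested sketch of that plpgsql-plus-dblink option. It assumes the dblink extension is installed, that the backend can reconnect to its own database without a password, and that mytest() is the function to be retried; the name myretest_dblink and the retry limit are made up for the example, and the way dblink re-raises the remote SQLSTATE should be verified before relying on this:

CREATE EXTENSION IF NOT EXISTS dblink;

CREATE OR REPLACE FUNCTION myretest_dblink() RETURNS integer AS $$
DECLARE
    result integer;
    tries  integer := 5;
BEGIN
    -- open a second connection back into the same database
    PERFORM dblink_connect('retry_conn', 'dbname=' || current_database());
    LOOP
        BEGIN
            PERFORM dblink_exec('retry_conn', 'BEGIN ISOLATION LEVEL SERIALIZABLE');
            -- run the real work on the remote connection, in its own transaction
            SELECT t.value INTO result
              FROM dblink('retry_conn', 'SELECT mytest()') AS t(value integer);
            PERFORM dblink_exec('retry_conn', 'COMMIT');
            EXIT;
        EXCEPTION WHEN serialization_failure THEN  -- SQLSTATE 40001
            PERFORM dblink_exec('retry_conn', 'ROLLBACK');
            tries := tries - 1;
            IF tries <= 0 THEN
                PERFORM dblink_disconnect('retry_conn');
                RAISE;
            END IF;
        END;
    END LOOP;
    PERFORM dblink_disconnect('retry_conn');
    RETURN result;
END
$$ LANGUAGE plpgsql;

Note that the remote transaction cannot see any uncommitted work of the calling transaction, and the two connections can block or even deadlock against each other, which is part of why this is rarely a good idea.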

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Serializable transaction restart/re-execute

From:
Filipe Pina
Date:
Hi Kevin, thank you very much for reply.

We plan to have a middleware/gateway in our full solution so we could have the restart logic there but that would only apply to external interface calls.

We plan to have a few "backend processes" that we want to run directly in pgsql, and those would not have "restarts".

dblink does sound like a decent option/workaround, but I'm guessing everything points toward focusing on locks instead of relying on some hacky serialization-failure restart implementation.

If you post this reply on the SO post, which I found quite helpful and insightful, I'll definitely accept it as the answer. If you have the time to elaborate on a working example using dblink, that would definitely be a nice bonus :)

Thank you once again

On Mon, Apr 6, 2015 at 3:22 PM, Kevin Grittner <kgrittn@ymail.com> wrote:
Filipe Pina <filipe.pina@impactzero.pt> wrote:

> I come from a GTM background and one of the transactional
> features there is “Transaction Restarts”.
>
> Transaction restart is when we have two concurrent processes
> reading/writing to the same region/table of the database, the
> last process to commit will “see” that the database is not the
> same as it was when the transaction started and goes back to the
> beginning of the transactional code and re-executes it.
>
> The closest I found to this in PGSQL is the Serializable
> transaction isolation mode and it does seem to work well except
> it simply throws an error (serialization_failure) instead of
> restarting.

Right, serializable transactions provide exactly what you are
looking for as long as you use some framework that starts the
transaction over when it receives an error with a SQLSTATE of 40001
or 40P01.

> I’m trying to make use of this exception to implement restartable
> functions and I have all the examples and conditions mentioned
> here in a question in SO (without any answer so far…):
> http://stackoverflow.com/questions/29372202/postgresql-generic-handler-for-serialization-failure

Perhaps once we've sorted out the issue here I can post an answer
there for the benefit of anyone finding the SO question.

> So basically I have two questions:
> - the restartable “wrapper” function never gets its “DB view”
> refreshed once it restarts, I assume it’s because of the outer
> transaction (at function level) so it never re-reads the new
> values and keeps failing with serialization_failure.. Any way to
> solve this?

In PostgreSQL a function always runs in the context of a
transaction.  You can't start a new transaction within the context
of a "wrapper" function.  That would require a slightly different
feature, which is commonly called a "stored procedure" -- something
which doesn't exist in PostgreSQL.  Therefore, you need to put the
logic to manage the restart into code which submits the transaction
to the database.  Fortunately, there are many connectors for that
-- Java, perl, python, tcl, ODBC, etc.  There is even a connector
for making a separate connection to a PostgreSQL database within
PostgreSQL procedural language, which might allow you to do
something like what you want:

http://www.postgresql.org/docs/current/static/dblink.html

> - the ideal would be to be able to define this at database level
> so I wouldn’t have to implement wrappers for all functions..

I have seen this done in various "client" frameworks.  Clearly it
is a bad idea to spread this testing around to all locations where
the application is logically dealing with the database, but there
are many good reasons to route all database requests through one
"accessor" method (or at least a very small number of them), and
most frameworks provide a way to deal with this at that layer.
(For example, in Spring you would want to create a transaction
manager using dependency injection.)

> Implementing a “serialization_failure” generic handler that would
> simply re-call the function that threw that exception (up to a
> number of tries). Is this possible without going into pgsql
> source code?

Yes, but only from the "client" side of a database connection --
although that client code could itself run inside the database by way
of dblink, as described above.  That probably belongs in some language
you are using for your application logic, but if you really wanted
to you could use plpgsql and dblink.  It's hard for me to see a
case where that would actually be a good idea, but it is an option.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Serializable transaction restart/re-execute

From:
Jim Nasby
Date:
On 4/6/15 6:42 AM, Bill Moran wrote:
> CREATE OR REPLACE FUNCTION mytest() RETURNS integer AS $$
>      update account set balance = balance+10 where id=1 RETURNING balance;
> $$
> LANGUAGE SQL;
>
> of course, it's unlikely that you'll ever want to wrap such a
> simple query in a function, so I'm supposing that you'd want
> to do something else with the old value of balance before
> updating it, in which case:
>
> CREATE OR REPLACE FUNCTION mytest() RETURNS integer AS $$
> DECLARE
>      cc integer;
> BEGIN
>      SELECT INTO cc balance FROM account WHERE id = 1 FOR UPDATE;
>
>      RAISE NOTICE 'Balance: %', cc;
>      perform pg_sleep(3);
>
>      update account set balance = cc+10 where id=1 RETURNING balance INTO cc;
>
>      return cc;
> END
> $$
> LANGUAGE plpgsql;
>
> The FOR UPDATE ensures that no other process can modify the
> row while this one is sleeping.
>
> Now, I understand that you don't want to do row locking,
> but this is (again) an insistence on your part on trying to
> force PostgreSQL to do things the way GTM did instead of
> understanding the RDBMS way of doing things.

Actually, the entire point of SERIALIZABLE is to avoid the need to mess
around with FOR UPDATE and similar. It's a trade-off. If you have a
large application that has lots of DML paths the odds of getting
explicit locking correct drop rapidly to zero. That's where SERIALIZABLE
shines; you just turn it on and stop worrying about locking.
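
"Turning it on" for every new transaction is just a configuration setting; for example (database name mydb assumed):

-- make SERIALIZABLE the default for one database (new sessions pick it up):
ALTER DATABASE mydb SET default_transaction_isolation TO 'serializable';

-- or per session / per transaction:
SET default_transaction_isolation TO 'serializable';
BEGIN ISOLATION LEVEL SERIALIZABLE;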

The downside of course is that you need to be ready to deal with a
serialization failure.

I *think* what Filipe was looking for is some way to have Postgres
magically re-try a serialization failure. While theoretically possible
(at least to a degree), that's actually a really risky thing. The whole
reason you would need any of this is if you're using a pattern where you:

1 BEGIN SERIALIZABLE;
2 Get data from database
3 Do something with that data
4 Put data back in database

If you get a serialization failure, it's because someone modified the
data underneath you, which means you can't simply repeat step 4, you
have to ROLLBACK and go back to step 1. If you design your app with that
in mind it's not a big deal. If you don't... ugh. :)
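
A minimal two-session sketch of that failure mode, reusing the hypothetical account table (id 1, balance 100) from earlier in the thread:

-- Session A:
BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT balance FROM account WHERE id = 1;   -- reads 100

-- Session B, concurrently:
BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT balance FROM account WHERE id = 1;   -- also reads 100

-- Both compute 100 + 10 outside the database.

-- Session A:
UPDATE account SET balance = 110 WHERE id = 1;
COMMIT;

-- Session B:
UPDATE account SET balance = 110 WHERE id = 1;
-- ERROR:  could not serialize access due to concurrent update
ROLLBACK;
-- B cannot simply repeat its UPDATE; it has to start over from step 1,
-- re-read the new balance (110) and redo its computation.
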
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


Re: Serializable transaction restart/re-execute

From:
Filipe Pina
Date:
Exactly, my point was not to repeat just step 4 but the whole sequence.

Upon a serialization failure exception it would re-read the data from the database, perform the same processing with that data and save it. And step 2 is the part that fails in my "restart wrapper" function in the code I posted on stackoverflow: it doesn't read the NEW data from the database, but the old data, resulting once again in serialization_failure.

We're now actually considering moving all business logic of the project to a gateway (in Django or Java), leaving postgres in SERIALIZABLE and just for data storage. We were trying to avoid that, as we assume there will be some performance impact from taking processing out of the DB and going through the connector, but we will do some testing to be able to measure that impact.



On Tue, Apr 7, 2015 at 10:43 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
On 4/6/15 6:42 AM, Bill Moran wrote:
CREATE OR REPLACE FUNCTION mytest() RETURNS integer AS $$
     update account set balance = balance+10 where id=1 RETURNING balance;
$$
LANGUAGE SQL;

of course, it's unlikely that you'll ever want to wrap such a
simple query in a function, so I'm supposing that you'd want
to do something else with the old value of balance before
updating it, in which case:

CREATE OR REPLACE FUNCTION mytest() RETURNS integer AS $$
DECLARE
     cc integer;
BEGIN
     SELECT INTO cc balance FROM account WHERE id = 1 FOR UPDATE;

     RAISE NOTICE 'Balance: %', cc;
     perform pg_sleep(3);

     update account set balance = cc+10 where id=1 RETURNING balance INTO cc;

     return cc;
END
$$
LANGUAGE plpgsql;

The FOR UPDATE ensures that no other process can modify the
row while this one is sleeping.

Now, I understand that you don't want to do row locking,
but this is (again) an insistence on your part on trying to
force PostgreSQL to do things the way GTM did instead of
understanding the RDBMS way of doing things.

Actually, the entire point of SERIALIZABLE is to avoid the need to mess around with FOR UPDATE and similar. It's a trade-off. If you have a large application that has lots of DML paths the odds of getting explicit locking correct drop rapidly to zero. That's where SERIALIZABLE shines; you just turn it on and stop worrying about locking.

The downside of course is that you need to be ready to deal with a serialization failure.

I *think* what Filipe was looking for is some way to have Postgres magically re-try a serialization failure. While theoretically possible (at least to a degree), that's actually a really risky thing. The whole reason you would need any of this is if you're using a pattern where you:

1 BEGIN SERIALIZABLE;
2 Get data from database
3 Do something with that data
4 Put data back in database

If you get a serialization failure, it's because someone modified the data underneath you, which means you can't simply repeat step 4, you have to ROLLBACK and go back to step 1. If you design your app with that in mind it's not a big deal. If you don't... ugh. :)
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com