Discussion: database-level lockdown


database-level lockdown

From:
Filipe Pina
Date:

I have a Django+PostgreSQL application.
For data integrity, PostgreSQL is set up for serializable transactions, so I'm retrying each transaction (in a generic class) as:

def _execute_wrapper(pair):
    # `utils` is presumably django.db.utils (Django's wrappers around psycopg2 errors)
    obj, item = pair
    from time import sleep
    last_error = None
    for tt in xrange(obj._max_retry):
        try:
            obj.execute(item)
            return (None, tt)
        except (utils.OperationalError) as exc:
            last_error = exc
            obj.handle_error(exc)
            sleep(obj._retry_wait)
        except Exception as exc:
            obj.handle_error(exc)
            raise exc
    return (last_error, tt)

Default _max_retry is 5 and _retry_wait is 0.1s.

It will try 5 times to execute each instruction (in case of OperationalError) and on the last attempt it will raise the last error it received, aborting.

Now my problem is that aborting on the last try (on a restartable error - OperationalError code 40001) is not an option... It simply needs to get through, locking out whatever other processes and queries it needs.

So I've thought about using a generic lock (at the Django level, not PG) that every transaction would check, but that would be a problem once the Django application runs in multiple instances (behind a load balancer)...

So, I've thought of pg_advisory_lock() to have the lock at the database level, but then I'd need some way to make sure Django checks that lock in every database request (a select, an update, a function call, etc.)..

Every request would need to get the lock and release it immediately (not the _xact version of the lock, which is released only at the end of the transaction). Then the retry wrapper, on the 5th try, would take the lock and not release it immediately..
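Roughly what I have in mind (just a sketch, assuming a raw psycopg2 cursor; the key value and helper names are made up, not an existing API):

GLOBAL_GATE = 1   # arbitrary bigint key shared by every client

def touch_gate(cur):
    # every ordinary request passes through the gate: wait if someone holds it,
    # then release straight away so normal concurrency is unaffected
    cur.execute("SELECT pg_advisory_lock(%s)", (GLOBAL_GATE,))
    cur.execute("SELECT pg_advisory_unlock(%s)", (GLOBAL_GATE,))

def hold_gate_for_final_try(cur):
    # the 5th try takes the transaction-scoped variant instead; it is released
    # automatically at COMMIT/ROLLBACK, so everyone else waits until it finishes
    cur.execute("SELECT pg_advisory_xact_lock(%s)", (GLOBAL_GATE,))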

I'm open to any option, it's just that a transaction going on its 5th try of an OperationalError really has to go through, even if it has to fully lock down the database to every other transaction..

Thanks,

Filipe

Re: database-level lockdown

From:
Tom Lane
Date:
Filipe Pina <filipe.pina@impactzero.pt> writes:
> It will try 5 times to execute each instruction (in case of
> OperationError) and in the last one it will raise the last error it
> received, aborting.

> Now my problem is that aborting for the last try (on a restartable
> error - OperationalError code 40001) is not an option... It simply
> needs to get through, locking whatever other processes and queries it
> needs.

I think you need to reconsider your objectives.  What if two or more
transactions are repeatedly failing and retrying, perhaps because they
conflict?  They can't all forcibly win.

            regards, tom lane


Re: database-level lockdown

From:
Filipe Pina
Date:
Exactly, that’s why there’s a limit on the retry number. On the last try I wanted something like full lockdown to make sure the transaction will not fail due to serialization failure (if no other processes are touching the database, it can’t happen).

So if two transactions were retrying over and over, the first one to reach max_retries would activate that “global lock”, making the other one wait, and then the second one would also be able to successfully commit...

> On 11/06/2015, at 20:27, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Filipe Pina <filipe.pina@impactzero.pt> writes:
>> It will try 5 times to execute each instruction (in case of
>> OperationError) and in the last one it will raise the last error it
>> received, aborting.
>
>> Now my problem is that aborting for the last try (on a restartable
>> error - OperationalError code 40001) is not an option... It simply
>> needs to get through, locking whatever other processes and queries it
>> needs.
>
> I think you need to reconsider your objectives.  What if two or more
> transactions are repeatedly failing and retrying, perhaps because they
> conflict?  They can't all forcibly win.
>
>             regards, tom lane



Re: database-level lockdown

From:
Filipe Pina
Date:
So, as database-level locks do not exist (as per https://wiki.postgresql.org/wiki/Lock_database), I've thought of having a function that would lock *every* table in the database (far from the same thing, but it would probably work for me).

Something like:

CREATE OR REPLACE FUNCTION lockdown()
    RETURNS void AS $$
DECLARE
  t information_schema.tables.table_name%TYPE;
BEGIN
  FOR t IN SELECT table_name
             FROM information_schema.tables
            WHERE table_schema = 'public'
              AND table_type = 'BASE TABLE'
  LOOP
    -- quote_ident() copes with mixed-case or reserved-word table names
    EXECUTE 'LOCK TABLE ' || quote_ident(t);
  END LOOP;
END
$$
LANGUAGE plpgsql;

But the problem is that calling the function starts its own transaction and once it returns, locks are removed..

Basically the workflow is (pseudo code coming from Django/python/psycopg2/external, not within pgsql):

function generic_function_restarter(developer_function) {
  # try 4 times to execute developer function and if all of them fail
  # (due to SQLSTATE 40001 serialization failures),
  # lock database and execute one last time
  for i in [1..4] {
    try {
      call developer_function()
      return 'success'
    }
    except SQLSTATE_40001 {
      continue
    }
    except other_error {
      return other_error
    }
  }

  # only reaches here if all tries failed with SQLSTATE_40001
  try {
    START TRANSACTION
    call lockdown()
    call developer_function()
    COMMIT TRANSACTION
    return 'success'
  }
  except any_error {
    # implicit ROLLBACK
    return any_error
  }
}

So, my problem here is that "call lockdown()" will place the locks and remove them upon returning... Is it possible to execute a function without creating a subtransaction?

I could place the locks from the adapter directly at the outer transaction level but I have the feeling that performance would be worse...
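What I mean by placing the locks from the adapter would be roughly this (sketch only, assuming psycopg2; the DSN is made up):

import psycopg2

conn = psycopg2.connect("dbname=your_database")   # hypothetical DSN
with conn:                     # psycopg2 commits the block as one transaction
    with conn.cursor() as cur:
        cur.execute("SELECT table_name"
                    "  FROM information_schema.tables"
                    " WHERE table_schema = 'public'"
                    "   AND table_type = 'BASE TABLE'")
        for (table,) in cur.fetchall():
            # these locks persist until the outer COMMIT/ROLLBACK
            cur.execute('LOCK TABLE "%s" IN ACCESS EXCLUSIVE MODE' % table)
        # ... call developer_function() here, then the block commits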

Thanks,
Filipe

On Fri, Jun 12, 2015 at 5:25, Filipe Pina <filipe.pina@impactzero.pt> wrote:
Exactly, that’s why there’s a limit on the retry number. On the last try I wanted something like full lockdown to make sure the transaction will not fail due to serialiazation failure (if no other processes are touching the database, it can’t happen). So if two transactions were retrying over and over, the first one to reach max_retries would activate that “global lock” making the other one wait and then the second one would also be able to successfully commit...
On 11/06/2015, at 20:27, Tom Lane <tgl@sss.pgh.pa.us> wrote: Filipe Pina <filipe.pina@impactzero.pt> writes:
It will try 5 times to execute each instruction (in case of OperationError) and in the last one it will raise the last error it received, aborting.
Now my problem is that aborting for the last try (on a restartable error - OperationalError code 40001) is not an option... It simply needs to get through, locking whatever other processes and queries it needs.
I think you need to reconsider your objectives. What if two or more transactions are repeatedly failing and retrying, perhaps because they conflict? They can't all forcibly win. regards, tom lane

Re: database-level lockdown

From:
Melvin Davidson
Date:
Wouldn't it be easier just to prevent connections to the database
 while your transaction is executed?
EG:
<Connect to your_database>
UPDATE pg_database
  SET datallowconn = FALSE
 WHERE datname = 'your_database' ;
 START TRANSACTION;
<Do your_transaction>
  COMMIT;
UPDATE pg_database
  SET datallowconn = TRUE
 WHERE datname = 'your_database' ;


On Fri, Jul 3, 2015 at 1:25 PM, Filipe Pina <filipe.pina@impactzero.pt> wrote:
So, as database level locks do not exist (as per https://wiki.postgresql.org/wiki/Lock_database), I've thought of having a function that would lock *every* table in the database (far from the same, but would probably work for me).

Something like:

CREATE OR REPLACE FUNCTION lockdown()
    RETURNS void AS $$
DECLARE
  t information_schema.tables.table_name%TYPE;
BEGIN
  FOR t in SELECT table_name
    FROM information_schema.tables
  WHERE table_schema='public'
   AND table_type='BASE TABLE'
  LOOP
    EXECUTE 'LOCK ' || t;
  END LOOP;
END
$$
LANGUAGE plpgsql;

But the problem is that calling the function starts its own transaction and once it returns, locks are removed..

Basically the workflow is (pseudo code coming from Django/python/psycopg2/external, not within pgsql):

function generic_function_restarter(developer_function) {
  # try 4 times to execute developer function and if all of them fail
  # (due to SQLSTATE 40001 serialization failures),
  # lock database and execute one last time
  for 1 in [1..4] {
    try {
      call developer_function()
      return 'success'
    }
    except SQLSTATE_40001 {
      continue
    }
    except other_error {
      return other_error
    }
      
  # only reaches here if all tries failed with SQLSTATE_40001
  try {
    START TRANSACTION
    call lockdown()
    call developer_function()
    COMMIT TRANSACTION
    return 'success'
  }
  except any_error {
    # implicit ROLLBACK
    return any_error
  }
}

So, my problem here is that "call lockdown()" will place the locks and remove them upon returning... Is it possible to execute a function without creating a subtransaction?

I could place the locks from the adapter directly at the outter transaction level but I have the feeling that performance would be worse...

Thanks,
Filipe

On Sex, Jun 12, 2015 at 5:25 , Filipe Pina <filipe.pina@impactzero.pt> wrote:
Exactly, that’s why there’s a limit on the retry number. On the last try I wanted something like full lockdown to make sure the transaction will not fail due to serialiazation failure (if no other processes are touching the database, it can’t happen). So if two transactions were retrying over and over, the first one to reach max_retries would activate that “global lock” making the other one wait and then the second one would also be able to successfully commit...
On 11/06/2015, at 20:27, Tom Lane <tgl@sss.pgh.pa.us> wrote: Filipe Pina <filipe.pina@impactzero.pt> writes:
It will try 5 times to execute each instruction (in case of OperationError) and in the last one it will raise the last error it received, aborting.
Now my problem is that aborting for the last try (on a restartable error - OperationalError code 40001) is not an option... It simply needs to get through, locking whatever other processes and queries it needs.
I think you need to reconsider your objectives. What if two or more transactions are repeatedly failing and retrying, perhaps because they conflict? They can't all forcibly win. regards, tom lane



--
Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.

Re: database-level lockdown

From:
Filipe Pina
Date:
Thanks for the suggestion. I read that some people do use that strategy for
maintenance sometimes, but it's not feasible in this scenario.

I would have to disallow new connections AND kill all existing connections
(as there would be an existing connection pool), but this won't have the
same impact as using LOCKs..
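Just to illustrate what that would mean (a rough sketch, assuming PostgreSQL 9.2+ and a superuser psycopg2 connection; names are made up), every other session gets terminated rather than merely paused:

import psycopg2

admin = psycopg2.connect("dbname=postgres")     # hypothetical superuser connection
admin.autocommit = True
with admin.cursor() as cur:
    # block new connections to the application database
    cur.execute("UPDATE pg_database SET datallowconn = FALSE "
                "WHERE datname = 'your_database'")
    # kill everything already connected (the pool), sparing only ourselves;
    # the connection running the critical transaction would have to be
    # opened beforehand and excluded here as well
    cur.execute("SELECT pg_terminate_backend(pid) FROM pg_stat_activity "
                "WHERE datname = 'your_database' AND pid <> pg_backend_pid()")
    # ... critical transaction runs on that pre-opened connection ...
    cur.execute("UPDATE pg_database SET datallowconn = TRUE "
                "WHERE datname = 'your_database'")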

Terminating all sessions will break every other transaction (except for the one doing it). Locking the database would just put all the others on hold.
As we're talking about quick/instant operations, being on hold will have some impact on performance but won't cause anything to abort..

I really can't find any other solution for what I need (in short: make sure
no transactions are left out due to serialization failures)


On 03/07/2015, at 19:00, Melvin Davidson <melvin6925@gmail.com> wrote:

Wouldn't it be easier just to prevent connections to the database
 while your transaction is executed?
EG:
<Connect to your_database>
UPDATE pg_database
  SET datallowconn FALSE
 WHERE datname = 'your_database' ;
 START TRANSACTION;
<Do your_transaction>
  COMMIT;
UPDATE pg_database
  SET datallowconn TRUE
 WHERE datname = 'your_database' ;


On Fri, Jul 3, 2015 at 1:25 PM, Filipe Pina <filipe.pina@impactzero.pt>
wrote:

> So, as database level locks do not exist (as per
> https://wiki.postgresql.org/wiki/Lock_database), I've thought of having a
> function that would lock *every* table in the database (far from the same,
> but would probably work for me).
>
> Something like:
>
> CREATE OR REPLACE FUNCTION lockdown()
>     RETURNS void AS $$
> DECLARE
>   t information_schema.tables.table_name%TYPE;
> BEGIN
>   FOR t in SELECT table_name
>     FROM information_schema.tables
>   WHERE table_schema='public'
>    AND table_type='BASE TABLE'
>   LOOP
>     EXECUTE 'LOCK ' || t;
>   END LOOP;
> END
> $$
> LANGUAGE plpgsql;
>
> But the problem is that calling the function starts its own transaction
> and once it returns, locks are removed..
>
> Basically the workflow is (pseudo code coming from
> Django/python/psycopg2/external, not within pgsql):
>
> function generic_function_restarter(developer_function) {
>   # try 4 times to execute developer function and if all of them fail
>   # (due to SQLSTATE 40001 serialization failures),
>   # lock database and execute one last time
>   for 1 in [1..4] {
>     try {
>       call developer_function()
>       return 'success'
>     }
>     except SQLSTATE_40001 {
>       continue
>     }
>     except other_error {
>       return other_error
>     }
>
>   # only reaches here if all tries failed with SQLSTATE_40001
>   try {
>     START TRANSACTION
>     call lockdown()
>     call developer_function()
>     COMMIT TRANSACTION
>     return 'success'
>   }
>   except any_error {
>     # implicit ROLLBACK
>     return any_error
>   }
> }
>
> So, my problem here is that "call lockdown()" will place the locks and
> remove them upon returning... Is it possible to execute a function without
> creating a subtransaction?
>
> I could place the locks from the adapter directly at the outter
> transaction level but I have the feeling that performance would be worse...
>
> Thanks,
> Filipe
>
> On Sex, Jun 12, 2015 at 5:25 , Filipe Pina <filipe.pina@impactzero.pt>
> wrote:
>
> Exactly, that’s why there’s a limit on the retry number. On the last try I
> wanted something like full lockdown to make sure the transaction will not
> fail due to serialiazation failure (if no other processes are touching the
> database, it can’t happen). So if two transactions were retrying over and
> over, the first one to reach max_retries would activate that “global lock”
> making the other one wait and then the second one would also be able to
> successfully commit...
>
> On 11/06/2015, at 20:27, Tom Lane <tgl@sss.pgh.pa.us> wrote: Filipe Pina <
> filipe.pina@impactzero.pt> writes:
>
> It will try 5 times to execute each instruction (in case of
> OperationError) and in the last one it will raise the last error it
> received, aborting.
>
>  Now my problem is that aborting for the last try (on a restartable error
> - OperationalError code 40001) is not an option... It simply needs to get
> through, locking whatever other processes and queries it needs.
>
> I think you need to reconsider your objectives. What if two or more
> transactions are repeatedly failing and retrying, perhaps because they
> conflict? They can't all forcibly win. regards, tom lane
>
>


--
*Melvin Davidson*
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.

Re: database-level lockdown

From:
John R Pierce
Date:
On 7/4/2015 10:49 AM, Filipe Pina wrote:
>
> Terminating all sessions will break every other transaction (except
> for the one doing it). Locking database will put all the other on hold.
> As we're talking about quick/instant operations on hold will have
> impact on performance but won't cause anything to abort..

you can't lock something that's already in use by a transaction, as that
transaction already has locks on it.

better would be to design your workflow so it all can be done
transactionally and get away from this batch processing model.



--
john r pierce, recycling bits in santa cruz



Re: database-level lockdown

From:
Adrian Klaver
Date:
On 07/04/2015 10:49 AM, Filipe Pina wrote:
> Thanks for the suggestion. I read that some people do use that strategy
> for maintenance sometimes but it's no feasible in this scenario.
>
> I would have to disallow new connections AND kill all existing
> connections (as there would be an existing connection pool), but this
> won't have the same impact as using LOCKs..
>
> Terminating all sessions will break every other transaction (except for
> the one doing it). Locking database will put all the other on hold.
> As we're talking about quick/instant operations on hold will have impact
> on performance but won't cause anything to abort..
>
> I really can't find any other solution for what I need (in short: make
> sure no transactions are left out due to serialization failures)

Which would seem to indicate you have painted yourself into a corner.
The idea of locking an entire database to get one transaction to commit
seems a little extreme to me.

What is this transaction trying to do and why is it necessary that it
commit at all costs?

>
>
> On 03/07/2015, at 19:00, Melvin Davidson <melvin6925@gmail.com
> <mailto:melvin6925@gmail.com>> wrote:
>


--
Adrian Klaver
adrian.klaver@aklaver.com



Re: database-level lockdown

From:
"Greg Sabino Mullane"
Date:

Filipe Pina said:

> I really can't find any other solution for what I need (in short: make sure
> no transactions are left out due to serialization failures)

I think you may have been too quick to rule out advisory locks as a solution.
Yes, you will need wrappers around all other calls, but extraordinary problems
call for extraordinary solutions.

> I could place the locks from the adapter directly at the outer transaction
> level but I have the feeling that performance would be worse...

Well, performance has really got to take a back seat, given your other
requirements. ;) Locking certainly *could* work - and is arguably the best
solution, as that's what locks are for. Just have your Very Important Transaction
retry, and upon reaching that critical number, exclusively lock just the tables
being used, then try again. If you don't know which tables are being used,
I suggest storing that somewhere your class can find it, or moving away
from such a generic class.

There are other solutions (e.g. forcing conflicting processes to quit
and sleep a second), but the locking one seems the easiest.
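Something along these lines, say (a rough sketch from the client side, assuming psycopg2 and a per-transaction table list kept somewhere the class can find it; all names are illustrative):

import psycopg2
from psycopg2 import errorcodes

MAX_RETRY = 5
TABLES_USED = ["account", "account_entry"]   # hypothetical per-transaction list

def run_with_final_lock(conn, developer_function):
    # conn is assumed to be configured for SERIALIZABLE isolation elsewhere
    for attempt in range(1, MAX_RETRY + 1):
        try:
            with conn:                      # one transaction per attempt
                with conn.cursor() as cur:
                    if attempt == MAX_RETRY:
                        # final attempt: take exclusive locks up front so no
                        # concurrent writer can cause another 40001
                        for t in TABLES_USED:
                            cur.execute('LOCK TABLE "%s" IN ACCESS EXCLUSIVE MODE' % t)
                    developer_function(cur)
            return 'success'
        except psycopg2.Error as exc:
            if exc.pgcode != errorcodes.SERIALIZATION_FAILURE or attempt == MAX_RETRY:
                raise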

--
Greg Sabino Mullane greg@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201507050943
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8




Re: database-level lockdown

From:
Filipe Pina
Date:
It's not necessary to commit at all costs, it can fail, just not due to serialization..

And the transaction can be something as simple as updating a field or inserting a record (with foreign keys, which is one of the serialization checks).

On Sat, Jul 4, 2015 at 7:23, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 07/04/2015 10:49 AM, Filipe Pina wrote:
Thanks for the suggestion. I read that some people do use that strategy for maintenance sometimes but it's no feasible in this scenario. I would have to disallow new connections AND kill all existing connections (as there would be an existing connection pool), but this won't have the same impact as using LOCKs.. Terminating all sessions will break every other transaction (except for the one doing it). Locking database will put all the other on hold. As we're talking about quick/instant operations on hold will have impact on performance but won't cause anything to abort.. I really can't find any other solution for what I need (in short: make sure no transactions are left out due to serialization failures)
Which would seem to indicate you have painted yourself into a corner. The idea of locking an entire database to get one transaction to commit seems a little extreme to me. What is this transaction trying to do and why is it necessary that it commit at all costs?
On 03/07/2015, at 19:00, Melvin Davidson <melvin6925@gmail.com <mailto:melvin6925@gmail.com>> wrote:
--
Adrian Klaver adrian.klaver@aklaver.com

Re: database-level lockdown

From:
Filipe Pina
Date:
Yes, I've tried to come up with a guideline to enumerate the tables used in each process, but it's not simple because it's a Python application calling pgsql functions that use other functions, so it's tricky for a developer re-using existing functions to enumerate the tables used by those. Even if everything is well documented and can be re-used, it seems like a nasty task...

For now, I'm locking all tables to be able to close the gap, but I'm also wondering if I could do it in a pgsql function as I mentioned in the question:

FUNCTION A
-> FUNCTION B
----> lock TABLE
-> FUNCTION C
----> TABLE is not locked anymore because function B frees it as soon as it returns

Is there some way to have a function that locks some tables on the "outer" transaction instead of its own subtransaction?
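The only workaround I can see is driving the outer transaction from the adapter (a rough sketch, assuming psycopg2 and made-up function names); if the calls are wrapped in one explicit transaction, a LOCK TABLE taken inside FUNCTION B would only go away at the final COMMIT, not when B returns:

import psycopg2

conn = psycopg2.connect("dbname=your_database")   # hypothetical DSN
with conn:                                        # one explicit outer transaction
    with conn.cursor() as cur:
        cur.execute("SELECT function_b()")        # takes LOCK TABLE internally
        cur.execute("SELECT function_c()")        # the table is still locked here
# only now, at COMMIT, are the locks released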

On Mon, Jul 6, 2015 at 3:08, Filipe Pina <filipe.pina@impactzero.pt> wrote:


On Sun, Jul 5, 2015 at 2:50, Greg Sabino Mullane <greg@turnstep.com> wrote:
Filipe Pina said:
I really can't find any other solution for what I need (in short: make sure no transactions are left out due to serialization failures)
I think you may have been too quick to rule out advisory locks as a solution. Yes, you will need wrappers around all other calls, but extraordinary problems call for extraordinary solutions.
I could place the locks from the adapter directly at the outer transaction level but I have the feeling that performance would be worse...
Well, performance has really got to take a back seat, given your other requirements. ;) Locking certainly *could* work - and is arguably the best solution, as that's what locks are for. Just have your Very Important Transaction retry, and upon reaching that critical number, exclusively lock just the tables being used, then try again. If you don't know which tables are being used, I suggest storing that somewhere your class can find it, or moving away from such a generic class. There are other solutions (e.g. forcing conflicting processes to quit and sleep a second), but the locking one seems the easiest. -- Greg Sabino Mullane greg@turnstep.com End Point Corporation http://www.endpoint.com/ PGP Key: 0x14964AC8 201507050943 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8

Re: database-level lockdown

From:
Adrian Klaver
Date:
On 07/06/2015 07:10 AM, Filipe Pina wrote:
> It's not necessary to commit at all costs, it can fail, just not due to
> serialization..
>
> And the transaction can be something as simple as updating a field or
> inserting a record (with foreign keys which is one the serialization
> checks).

Not following, why throw serialization at a FK?

>
> On Sáb, Jul 4, 2015 at 7:23 , Adrian Klaver <adrian.klaver@aklaver.com>
> wrote:
>> On 07/04/2015 10:49 AM, Filipe Pina wrote:
>>
>>     Thanks for the suggestion. I read that some people do use that
>>     strategy for maintenance sometimes but it's no feasible in this
>>     scenario. I would have to disallow new connections AND kill all
>>     existing connections (as there would be an existing connection
>>     pool), but this won't have the same impact as using LOCKs..
>>     Terminating all sessions will break every other transaction
>>     (except for the one doing it). Locking database will put all the
>>     other on hold. As we're talking about quick/instant operations on
>>     hold will have impact on performance but won't cause anything to
>>     abort.. I really can't find any other solution for what I need (in
>>     short: make sure no transactions are left out due to serialization
>>     failures)
>>
>> Which would seem to indicate you have painted yourself into a corner.
>> The idea of locking an entire database to get one transaction to
>> commit seems a little extreme to me. What is this transaction trying
>> to do and why is it necessary that it commit at all costs?
>>
>>     On 03/07/2015, at 19:00, Melvin Davidson <melvin6925@gmail.com
>>     <mailto:melvin6925@gmail.com>> wrote:
>>
>> --
>> Adrian Klaver adrian.klaver@aklaver.com


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: database-level lockdown

From:
Adrian Klaver
Date:
On 07/06/2015 07:15 AM, Filipe Pina wrote:
> Yes, I've tried to come up with guideline to enumerate tables used in
> each process, but it's not simple because it's python application
> calling pgsql functions that use other functions, so it's tricky for a
> developer re-using existing functions to enumerate the tables used for
> those. Even if everything is well documented and can be re-used seems
> like a nasty task...

Still not sure what it is you are trying to accomplish.

Is it really necessary that every transaction be serialized?

Or to put it another way, why are you running in serializable by default?

Or yet another way, what is the problem you are trying to solve with
serialized transactions?



>
> For now, I'm locking all to be able to close the gap, but I'm also
> wondering if I could do it in a pgsql function as I mentioned in the
> question:
>
> FUNCTION A
> -> FUNCTION B
> ----> lock TABLE
> -> FUNCTION C
> ----> TABLE is not locked anymore because function B frees it as soon as
> it returns
>
> Is there someway to have a function that locks some tables on the
> "outter" transaction instead of its own subtransaction?

>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: database-level lockdown

From:
Filipe Pina
Date:
Exactly, that's the twist I've decided after some tests yesterday (the "lock all tables on last try" degraded performance in an obscene way): giving up on serializable by default.

I wanted to use serializable so developers wouldn't have to worry about properly using locks, but implementing this "serialization failure" handler is becoming a major headache...

Thanks guys!

On Tue, Jul 7, 2015 at 1:41, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 07/06/2015 07:15 AM, Filipe Pina wrote:
Yes, I've tried to come up with guideline to enumerate tables used in each process, but it's not simple because it's python application calling pgsql functions that use other functions, so it's tricky for a developer re-using existing functions to enumerate the tables used for those. Even if everything is well documented and can be re-used seems like a nasty task...
Still not sure what is you are trying to accomplish. Is it really necessary that every transaction be serialized? Or to put it another way, why are you running in serializable by default? Or yet another way, what is the problem you are trying to solve with serialized transactions?
For now, I'm locking all to be able to close the gap, but I'm also wondering if I could do it in a pgsql function as I mentioned in the question: FUNCTION A -> FUNCTION B ----> lock TABLE -> FUNCTION C ----> TABLE is not locked anymore because function B frees it as soon as it returns Is there someway to have a function that locks some tables on the "outter" transaction instead of its own subtransaction?
--
Adrian Klaver adrian.klaver@aklaver.com

Re: database-level lockdown

From:
Alban Hertroys
Date:
On 7 July 2015 at 12:55, Filipe Pina <filipe.pina@impactzero.pt> wrote:
> On Ter, Jul 7, 2015 at 1:41 , Adrian Klaver <adrian.klaver@aklaver.com>
> wrote:
>> Still not sure what is you are trying to accomplish. Is it really necessary
>> that every transaction be serialized? Or to put it another way, why are you
>> running in serializable by default? Or yet another way, what is the problem
>> you are trying to solve with serialized transactions?
>
> Exactly, that's the twist I've decided after some tests yesterday (the "lock
> all tables on last try" degraded performance in an obscene way): giving up
> on serializable by default.
>
> I wanted to use serializable so developers wouldn't have to worry about
> properly using locks, but implementing this "serialization failure" handler
> is becoming a major headache...

What Adrian was trying to get out of you is why you think you need those locks.

You're working with an RDBMS, it does the job of keeping data
integrity in a multi-user environment already. You can trust it to do
that well.
So unless you're doing something really special, you shouldn't need to
lock anything.
Among the special cases where you do need locks are things like
needing a gapless sequence (invoice numbers come to mind); things like
that.

So once again, what do you need those locks for?

P.S. Please don't top-post on this list. I had to edit your message to
fix the flow of the conversation.

--
If you can't see the forest for the trees,
Cut the trees and you'll see there is no forest.


Re: database-level lockdown

From:
Filipe Pina
Date:
On Tue, Jul 7, 2015 at 2:31, Alban Hertroys <haramrae@gmail.com> wrote:
On 7 July 2015 at 12:55, Filipe Pina <filipe.pina@impactzero.pt> wrote:
On Ter, Jul 7, 2015 at 1:41 , Adrian Klaver <adrian.klaver@aklaver.com> wrote:
Still not sure what is you are trying to accomplish. Is it really necessary that every transaction be serialized? Or to put it another way, why are you running in serializable by default? Or yet another way, what is the problem you are trying to solve with serialized transactions?
Exactly, that's the twist I've decided after some tests yesterday (the "lock all tables on last try" degraded performance in an obscene way): giving up on serializable by default. I wanted to use serializable so developers wouldn't have to worry about properly using locks, but implementing this "serialization failure" handler is becoming a major headache...
What Adrian was trying to get out of you is why you think you need those locks. You're working with an RDBMS, it does the job of keeping data integrity in a multi-user environment already. You can trust it do do that well. So unless you're doing something really special, you shouldn't need to lock anything. Among the special cases where you do need locks are things like needing a gapless sequence (invoice numbers come to mind); things like that. So once again, what do you need those locks for? P.S. Please don't top-post on this list. I had to edit your message to fix the flow of the conversation.
--
If you can't see the forest for the trees, Cut the trees and you'll see there is no forest.

It probably wasn't clear but the conversation did start with exposing my issue, I'll try to rephrase.

Serializable keeps data integrity of the database, that's true. But it does that by "refusing" transactions that would compromise it (with error 40001).

I need to make sure no transactions fail due to such errors, so I made a handler to retry transactions if they fail due to 40001.
But I can't leave it retrying forever, so I have this (configurable) limit of 5 retries.
5 retries might be enough in some cases, but it might not be enough in others.

So how do I make sure the last try would not fail with a serialization error?
I could only come up with the idea of using LOCK database (on that last try) but as it is not available in PG, I went for "lock all tables".

Does it make sense now?
I was probably having the wrong approach to the problem anyway; I've gone with "read committed" mode now and normal locking (SELECT .. FOR UPDATE on the required spots) and it works better..
I just come from a GTM background (and all the other developers in the project) and we're used to the TP RESTART concept where, in the final retry, GTM locks the database region(s) that were used by that transaction (not the whole DB, just the regions).
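In practice it ends up looking something like this (minimal sketch with psycopg2; the table, column and id are made up):

import psycopg2

conn = psycopg2.connect("dbname=your_database")   # hypothetical DSN
with conn:                     # READ COMMITTED is the default isolation level
    with conn.cursor() as cur:
        # row lock: concurrent writers block here instead of failing with 40001
        cur.execute("SELECT balance FROM account WHERE id = %s FOR UPDATE", (42,))
        (balance,) = cur.fetchone()
        cur.execute("UPDATE account SET balance = %s WHERE id = %s",
                    (balance + 100, 42))
# the row lock is released at COMMIT, when the outer with-block exits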

Thanks and sorry for the top-posting, not really used to mailing lists and proper posting styles..

Re: database-level lockdown

From:
Adrian Klaver
Date:
On 07/07/2015 06:44 AM, Filipe Pina wrote:
> On Ter, Jul 7, 2015 at 2:31 , Alban Hertroys <haramrae@gmail.com> wrote:
>> On 7 July 2015 at 12:55, Filipe Pina <filipe.pina@impactzero.pt> wrote:
>>
>>     On Ter, Jul 7, 2015 at 1:41 , Adrian Klaver
>>     <adrian.klaver@aklaver.com> wrote:
>>
>>         Still not sure what is you are trying to accomplish. Is it
>>         really necessary that every transaction be serialized? Or to
>>         put it another way, why are you running in serializable by
>>         default? Or yet another way, what is the problem you are
>>         trying to solve with serialized transactions?
>>
>>     Exactly, that's the twist I've decided after some tests yesterday
>>     (the "lock all tables on last try" degraded performance in an
>>     obscene way): giving up on serializable by default. I wanted to
>>     use serializable so developers wouldn't have to worry about
>>     properly using locks, but implementing this "serialization
>>     failure" handler is becoming a major headache...
>>
>> What Adrian was trying to get out of you is why you think you need
>> those locks. You're working with an RDBMS, it does the job of keeping
>> data integrity in a multi-user environment already. You can trust it
>> do do that well. So unless you're doing something really special, you
>> shouldn't need to lock anything. Among the special cases where you do
>> need locks are things like needing a gapless sequence (invoice numbers
>> come to mind); things like that. So once again, what do you need those
>> locks for? P.S. Please don't top-post on this list. I had to edit your
>> message to fix the flow of the conversation.
>> --
>> If you can't see the forest for the trees, Cut the trees and you'll
>> see there is no forest.
>
> It probably wasn't clear but the conversation did start with exposing my
> issue, I'll try to rephrase.
>
> Serializable keeps data integrity of the database, that's true. But it
> does that by "refusing" transactions that would compromise it (with
> error 40001).
>
> I need to make sure no transactions fail due to such errors, so I made
> an handler to retry transactions if they fail due to 40001.
> But I can't leave it retrying forever, so I have this (configurable)
> limit of 5 retries.
> 5 retries might be enough in some case, but it might not be enough in
> others.
>
> So how do I make sure the last try would not fail with serialization error?
> I could only come up with the idea of using LOCK database (on that last
> try) but as it is not available in PG, I went for "lock all tables".
>
> Does it make sense now?
> I was probably having the wrong approach to the problem anyway, I've
> went with "read committed" mode now and normal locking (SELECT .. FOR
> UPDATE on the required spots) and it works better..

You also mentioned that this is for a Django application, have you looked
at its new (since 1.6) transaction management?:

https://docs.djangoproject.com/en/1.8/topics/db/transactions/

It allows you fairly fine grained control over transactions, including
access to savepoints.


> I just come from a GTM background (and all the other developers in the
> project) and we're used to the TP RESTART concept where, in the final
> retry, GTM locks the database region(s) that were used by that
> transaction (not the whole DB, just the regions).
>
> Thanks and sorry for the top-posting, not really used to mailing lists
> and proper posting styles..


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: database-level lockdown

From:
Filipe Pina
Date:

On Tue, Jul 7, 2015 at 2:51, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 07/07/2015 06:44 AM, Filipe Pina wrote:
On Ter, Jul 7, 2015 at 2:31 , Alban Hertroys <haramrae@gmail.com> wrote:
On 7 July 2015 at 12:55, Filipe Pina <filipe.pina@impactzero.pt> wrote: On Ter, Jul 7, 2015 at 1:41 , Adrian Klaver <adrian.klaver@aklaver.com> wrote: Still not sure what is you are trying to accomplish. Is it really necessary that every transaction be serialized? Or to put it another way, why are you running in serializable by default? Or yet another way, what is the problem you are trying to solve with serialized transactions? Exactly, that's the twist I've decided after some tests yesterday (the "lock all tables on last try" degraded performance in an obscene way): giving up on serializable by default. I wanted to use serializable so developers wouldn't have to worry about properly using locks, but implementing this "serialization failure" handler is becoming a major headache... What Adrian was trying to get out of you is why you think you need those locks. You're working with an RDBMS, it does the job of keeping data integrity in a multi-user environment already. You can trust it do do that well. So unless you're doing something really special, you shouldn't need to lock anything. Among the special cases where you do need locks are things like needing a gapless sequence (invoice numbers come to mind); things like that. So once again, what do you need those locks for? P.S. Please don't top-post on this list. I had to edit your message to fix the flow of the conversation. -- If you can't see the forest for the trees, Cut the trees and you'll see there is no forest.
It probably wasn't clear but the conversation did start with exposing my issue, I'll try to rephrase. Serializable keeps data integrity of the database, that's true. But it does that by "refusing" transactions that would compromise it (with error 40001). I need to make sure no transactions fail due to such errors, so I made an handler to retry transactions if they fail due to 40001. But I can't leave it retrying forever, so I have this (configurable) limit of 5 retries. 5 retries might be enough in some case, but it might not be enough in others. So how do I make sure the last try would not fail with serialization error? I could only come up with the idea of using LOCK database (on that last try) but as it is not available in PG, I went for "lock all tables". Does it make sense now? I was probably having the wrong approach to the problem anyway, I've went with "read committed" mode now and normal locking (SELECT .. FOR UPDATE on the required spots) and it works better..
You also mentioned that this for a Django application, have you looked at its new(since 1.6) transaction management?: https://docs.djangoproject.com/en/1.8/topics/db/transactions/ It allows you fairly fine grained control over transactions, including access to savepoints.
I just come from a GTM background (and all the other developers in the project) and we're used to the TP RESTART concept where, in the final retry, GTM locks the database region(s) that were used by that transaction (not the whole DB, just the regions). Thanks and sorry for the top-posting, not really used to mailing lists and proper posting styles..
--
Adrian Klaver adrian.klaver@aklaver.com

Yes, I was using transaction.atomic in some Django-only operations, but the problem is that some Django operations call PG functions that call other PG functions, and the purpose was to have the Django code "agnostic" of what the functions do..

But we've switched to "read committed" plus proper "FOR UPDATE" statements in the few functions we have at the moment, and performance is actually quite a bit better (at the small extra cost of people being careful with locks).
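For the Django-only spots it boils down to something like this (sketch only; the model and field names are made up):

from django.db import transaction
# from myapp.models import Account   # hypothetical model

def credit_account(account_id, amount):
    with transaction.atomic():
        # SELECT ... FOR UPDATE: concurrent writers wait instead of
        # failing with a serialization error
        account = Account.objects.select_for_update().get(pk=account_id)
        account.balance += amount
        account.save()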