Обсуждение: Race condition while creating a new partition

Поиск
Список
Период
Сортировка

Race condition while creating a new partition

От
Andrei Zhidenkov
Дата:
We use this code in order to automatically create new partitions for a partitioned table (Postgres 10.6):

begin
  insert into <partitioned_table>;
exception when undefined_table then
  begin
    <create_unexistent_partition>
    -- A concurrent txn has created the new partition
    exception when others then end;
    -- Insert data into the new partition
    insert into <partitioned_table>;
  end;
end;

As far as I understand we should always have a new partition created either in current or in concurrent transaction but
todaywe faced the problem when we failed to insert data due to a partition nonexistence for a small period of time.
HaveI missed something? 

Thank you.

—

With best regards, Andrei Zhidenkov.


Re: Race condition while creating a new partition

От
Andrei Zhidenkov
Дата:
I think that I’ve got a deadlock (which is handled by `exception when others` statements). But the problem is it occurs
toofast. Is it possible to get a deadlock faster than deadlock_timeout? It’s set to 1s (default value) but it looks
likeI get it immidiately. Error message I’m getting after removing the second exception handling is the following: 

psycopg2.errors.DeadlockDetected: deadlock detected
DETAIL:  Process 33 waits for AccessExclusiveLock on relation 16453 of database 16384; blocked by process 37.
Process 37 waits for AccessExclusiveLock on relation 16453 of database 16384; blocked by process 33.
HINT:  See server log for query details.
CONTEXT:  SQL statement "
            CREATE TABLE IF NOT EXISTS prov_level_1.log_*__2019_12_16_10_25_46 PARTITION OF prov_level_1.log_*
                FOR VALUES FROM ('2019-12-16 10:25:46+00') TO ('2019-12-16 10:25:47+00');
        "

> On 15. Nov 2019, at 11:49, Andrei Zhidenkov <andrei.zhidenkov@n26.com> wrote:
>
> We use this code in order to automatically create new partitions for a partitioned table (Postgres 10.6):
>
> begin
>  insert into <partitioned_table>;
> exception when undefined_table then
>  begin
>    <create_unexistent_partition>
>    -- A concurrent txn has created the new partition
>    exception when others then end;
>    -- Insert data into the new partition
>    insert into <partitioned_table>;
>  end;
> end;
>
> As far as I understand we should always have a new partition created either in current or in concurrent transaction
buttoday we faced the problem when we failed to insert data due to a partition nonexistence for a small period of time.
HaveI missed something? 
>
> Thank you.
>
> —
>
> With best regards, Andrei Zhidenkov.




Re: Race condition while creating a new partition

От
Justin
Дата:
Hi Andrei, 

My gut reactions is Yes this is a deadlock caused by a race condition,  the error from psycopg2 tells us  that.  Question becomes what is causing these two process to collide, are both processes 33 and 37 python code,  As both are trying to access the same resource 16453  i would assume both sending the same command Create Table Partition.  Are these two connections from different computers or the same computer using multi threading??

What does Postgresql Log show what is going on??

On Mon, Dec 16, 2019 at 5:32 AM Andrei Zhidenkov <andrei.zhidenkov@n26.com> wrote:
I think that I’ve got a deadlock (which is handled by `exception when others` statements). But the problem is it occurs too fast. Is it possible to get a deadlock faster than deadlock_timeout? It’s set to 1s (default value) but it looks like I get it immidiately. Error message I’m getting after removing the second exception handling is the following:

psycopg2.errors.DeadlockDetected: deadlock detected
DETAIL:  Process 33 waits for AccessExclusiveLock on relation 16453 of database 16384; blocked by process 37.
Process 37 waits for AccessExclusiveLock on relation 16453 of database 16384; blocked by process 33.
HINT:  See server log for query details.
CONTEXT:  SQL statement "
            CREATE TABLE IF NOT EXISTS prov_level_1.log_*__2019_12_16_10_25_46 PARTITION OF prov_level_1.log_*
                FOR VALUES FROM ('2019-12-16 10:25:46+00') TO ('2019-12-16 10:25:47+00');
        "

> On 15. Nov 2019, at 11:49, Andrei Zhidenkov <andrei.zhidenkov@n26.com> wrote:
>
> We use this code in order to automatically create new partitions for a partitioned table (Postgres 10.6):
>
> begin
>  insert into <partitioned_table>;
> exception when undefined_table then
>  begin
>    <create_unexistent_partition>
>    -- A concurrent txn has created the new partition
>    exception when others then end;
>    -- Insert data into the new partition
>    insert into <partitioned_table>;
>  end;
> end;
>
> As far as I understand we should always have a new partition created either in current or in concurrent transaction but today we faced the problem when we failed to insert data due to a partition nonexistence for a small period of time. Have I missed something?
>
> Thank you.
>
> —
>
> With best regards, Andrei Zhidenkov.



Re: Race condition while creating a new partition

От
Andrei Zhidenkov
Дата:
Hi, Justin

I’ve managed to reproduce this deadlock (different threads) and it looks it happens while Postgres tries to insert data into unique index for pg_type table (it creates a new row-type for every new table and a new partition is a new table). 16453 is old of the parent's table for the partition. But I still can’t understand why do we have a deadlock here.

2019-12-16 11:41:17.895 UTC [79] LOG:  process 79 detected deadlock while waiting for ShareLock on transaction 599 after 1000.210 ms
2019-12-16 11:41:17.895 UTC [79] DETAIL:  Process holding the lock: 81. Wait queue: 84, 80, 82, 83, 87, 88, 85, 86.
2019-12-16 11:41:17.895 UTC [79] CONTEXT:  while inserting index tuple (10,35) in relation "pg_type_typname_nsp_index"

2019-12-16 11:41:17.895 UTC [79] STATEMENT:  INSERT INTO test (val) VAlUES ('thread 6')

2019-12-16 11:41:17.897 UTC [79] ERROR:  deadlock detected
2019-12-16 11:41:17.897 UTC [79] DETAIL:  Process 79 waits for ShareLock on transaction 599; blocked by process 81.
Process 81 waits for AccessExclusiveLock on relation 16453 of database 16384; blocked by process 79.
Process 79: INSERT INTO test (val) VAlUES ('thread 6')
Process 81: INSERT INTO test (val) VAlUES ('thread 4')

2019-12-16 11:41:17.897 UTC [79] CONTEXT:  while inserting index tuple (10,35) in relation "pg_type_typname_nsp_index"

2019-12-16 11:41:17.897 UTC [79] STATEMENT:  INSERT INTO test (val) VAlUES ('thread 6')
2019-12-16 11:41:17.982 UTC [81] LOG:  process 81 still waiting for AccessExclusiveLock on relation 16453 of database 16384 after 1036.385 ms
2019-12-16 11:41:17.982 UTC [81] DETAIL:  Process holding the lock: 86. Wait queue: 81.

2019-12-16 11:41:17.982 UTC [81] STATEMENT:  INSERT INTO test (val) VAlUES ('thread 4')
2019-12-16 11:41:17.991 UTC [81] LOG:  process 81 acquired AccessExclusiveLock on relation 16453 of database 16384 after 1044.976 ms

2019-12-16 11:41:17.991 UTC [81] STATEMENT:  INSERT INTO test (val) VAlUES ('thread 4')

On 16. Dec 2019, at 16:03, Justin <zzzzz.graf@gmail.com> wrote:

Hi Andrei, 

My gut reactions is Yes this is a deadlock caused by a race condition,  the error from psycopg2 tells us  that.  Question becomes what is causing these two process to collide, are both processes 33 and 37 python code,  As both are trying to access the same resource 16453  i would assume both sending the same command Create Table Partition.  Are these two connections from different computers or the same computer using multi threading??

What does Postgresql Log show what is going on??

On Mon, Dec 16, 2019 at 5:32 AM Andrei Zhidenkov <andrei.zhidenkov@n26.com> wrote:
I think that I’ve got a deadlock (which is handled by `exception when others` statements). But the problem is it occurs too fast. Is it possible to get a deadlock faster than deadlock_timeout? It’s set to 1s (default value) but it looks like I get it immidiately. Error message I’m getting after removing the second exception handling is the following:

psycopg2.errors.DeadlockDetected: deadlock detected
DETAIL:  Process 33 waits for AccessExclusiveLock on relation 16453 of database 16384; blocked by process 37.
Process 37 waits for AccessExclusiveLock on relation 16453 of database 16384; blocked by process 33.
HINT:  See server log for query details.
CONTEXT:  SQL statement "
            CREATE TABLE IF NOT EXISTS prov_level_1.log_*__2019_12_16_10_25_46 PARTITION OF prov_level_1.log_*
                FOR VALUES FROM ('2019-12-16 10:25:46+00') TO ('2019-12-16 10:25:47+00');
        "

> On 15. Nov 2019, at 11:49, Andrei Zhidenkov <andrei.zhidenkov@n26.com> wrote:
>
> We use this code in order to automatically create new partitions for a partitioned table (Postgres 10.6):
>
> begin
>  insert into <partitioned_table>;
> exception when undefined_table then
>  begin
>    <create_unexistent_partition>
>    -- A concurrent txn has created the new partition
>    exception when others then end;
>    -- Insert data into the new partition
>    insert into <partitioned_table>;
>  end;
> end;
>
> As far as I understand we should always have a new partition created either in current or in concurrent transaction but today we faced the problem when we failed to insert data due to a partition nonexistence for a small period of time. Have I missed something?
>
> Thank you.
>
> —
>
> With best regards, Andrei Zhidenkov.




Re: Race condition while creating a new partition

От
Justin
Дата:
Hi Andrei

General speaking any DDL (Create, Alter  Drop .etc) commands  issue exclusive locks automatically, so anything this transaction touches starts getting exclusive locks.  Assuming this is a multi-threading app these two threads are sending commands all but at the same time.  The Exclusive lock  has not been committed yet as its waiting for a previous insert to finish, and the Exclusive lock is blocking the insert  as its transaction started prior to the Insert. 

the wait queue is pretty long 84, 80, 82, 83, 87, 88, 85, 86.
The time stamps are only 2 ms apart,  lots of commands being sent in at nearly the same time...

Here are my  suggestions
Tweak  the Python app to make sure all the threads have committed their transactions place a lock on all the other threads, Then issue a DDL command, when the command is committed, then unlock the other threads.  That should kill off the race condition deadlock..

Post all the SQL code that is creating this exclusive lock,  we may be able re-write the SQL code to reduce the time and/or resources this Exclusive Lock is creating.     

Pyscopg2 default mode is to require issuing a Commit it does not automatically commit transaction.  If running in auto-commit OFF try  turning this  ON for DDL commands may fix the issue.  It may also be that DDL command takes far longer than 1 second to complete...


Re: Race condition while creating a new partition

От
Michael Lewis
Дата:
It looks like you are creating a partition for each minute of the day (when logs get inserted for a given minute at least). Would it be at all reasonable to have an hourly or daily job which creates the partitions ahead of when they are actually needed? If partitions went unused in the recent past, you could also drop those if appropriate to your workflow. I just wonder about doing the work just-in-time via 2+ connections vs doing it from a single process to avoid the chance of deadlock completely.

Re: Race condition while creating a new partition

От
Andrei Zhidenkov
Дата:
I’m creating a new partition for every second deliberately in order to faster reproduce a bug I have on the live
environment.In the live environment a new partitions are being created every one day. More to that, we create new
partitionsin advance and this procedure is only a backup mechanisum in case of something goes wrong. Now I think that
theonly way to avoid errors like this is to use explicit locking, I tried advisory locks and it worked well.  

> On 17. Dec 2019, at 00:06, Michael Lewis <mlewis@entrata.com> wrote:
>
> It looks like you are creating a partition for each minute of the day (when logs get inserted for a given minute at
least).Would it be at all reasonable to have an hourly or daily job which creates the partitions ahead of when they are
actuallyneeded? If partitions went unused in the recent past, you could also drop those if appropriate to your
workflow.I just wonder about doing the work just-in-time via 2+ connections vs doing it from a single process to avoid
thechance of deadlock completely.