Обсуждение: deadlock between "WITH agg_tmp AS ({sel_stmt}), upd AS ({upd_stmt}) {ins_stmt}" and pure UPDATE statements

Поиск
Список
Период
Сортировка
Hello.

I have two transactions (trans1 and trans2) updating tables T1 and T2 in
the same order, but in a different way.

trans1 creates temp table, copies data from a file and updates tables T1
and T2 from this temp table (using basic UPDATE form). It even commits
changes in between T1 and T2 updates to reduce locks.

trans2 creates two temp tables (T1_tmp and T2_tmp), copies data from
files and updates T1 and T2 using [WITH ... AS ... RETURNING] approach.
Unexciting rows should be inserted, existing updated (sum-ed with values
from temp tables). Both T1 and T2 must be updated in the same transaction.

There are no any FKs anywhere in these tables.



trans1:

DROP TABLE IF EXISTS trans1_T_tmp;

CREATE TABLE trans1_T_tmp (...);

COMMIT

COPY from FILE into trans1_T_tmp;

BEGIN
UPDATE T1
     SET ...
     FROM trans1_T_tmp
     WHERE ...
COMMIT

BEGIN
UPDATE T2
     SET ...
     FROM (SELECT ... FROM trans1_T_tmp)
     WHERE ...

DROP TABLE trans1_T_tmp;
COMMIT



trans2:

BEGIN

CREATE TABLE trans2_T1_tmp (...);
COPY from FILE into trans2_T1_tmp;
WITH agg_tmp AS (SELECT ... FROM trans2_T1_tmp GROUP BY ...), upd AS
(UPDATE T1 SET ... FROM agg_tmp s WHERE ... RETURNING ...) INSERT INTO
T1 (...) SELECT ... FROM agg_tmp s LEFT JOIN upd t ON (...) WHERE ...;
DROP TABLE trans2_T1_tmp;

CREATE TABLE trans2_T2_tmp (...);
COPY from FILE into trans2_T2_tmp;
WITH agg_tmp AS (SELECT ... FROM trans2_T2_tmp GROUP BY ...), upd AS
(UPDATE T2 SET ... FROM agg_tmp s WHERE ... RETURNING ...) INSERT INTO
T2 (...) SELECT ... FROM agg_tmp s LEFT JOIN upd t ON (...) WHERE ...;
DROP TABLE trans2_T2_tmp;

COMMIT

By an unknown [for me] reason trans1 and trans2 often produce deadlocks...

Could you explain what's the reason for that and how to avoid them?
And is there exist a better replacement for WITH ... AS ... RETURNING ?
Thanks!


On 07/02/2016 09:54 AM, trafdev wrote:
> Hello.
>
> I have two transactions (trans1 and trans2) updating tables T1 and T2 in
> the same order, but in a different way.
>
> trans1 creates temp table, copies data from a file and updates tables T1
> and T2 from this temp table (using basic UPDATE form). It even commits
> changes in between T1 and T2 updates to reduce locks.
>
> trans2 creates two temp tables (T1_tmp and T2_tmp), copies data from
> files and updates T1 and T2 using [WITH ... AS ... RETURNING] approach.
> Unexciting rows should be inserted, existing updated (sum-ed with values
> from temp tables). Both T1 and T2 must be updated in the same transaction.
>
> There are no any FKs anywhere in these tables.
>
>
>
> trans1:

session1:

This is actually one session with multiple transactions, at least if I
am following correctly.

Assuming you have a:

BEGIN;

here.

>
> DROP TABLE IF EXISTS trans1_T_tmp;
>
> CREATE TABLE trans1_T_tmp (...);
>
> COMMIT
>
> COPY from FILE into trans1_T_tmp;
>
> BEGIN
> UPDATE T1
>     SET ...
>     FROM trans1_T_tmp
>     WHERE ...
> COMMIT
>
> BEGIN
> UPDATE T2
>     SET ...
>     FROM (SELECT ... FROM trans1_T_tmp)
>     WHERE ...
>
> DROP TABLE trans1_T_tmp;
> COMMIT
>
>
>
> trans2:

session2:

>
> BEGIN
>
> CREATE TABLE trans2_T1_tmp (...);
> COPY from FILE into trans2_T1_tmp;
> WITH agg_tmp AS (SELECT ... FROM trans2_T1_tmp GROUP BY ...), upd AS
> (UPDATE T1 SET ... FROM agg_tmp s WHERE ... RETURNING ...) INSERT INTO
> T1 (...) SELECT ... FROM agg_tmp s LEFT JOIN upd t ON (...) WHERE ...;
> DROP TABLE trans2_T1_tmp;
>
> CREATE TABLE trans2_T2_tmp (...);
> COPY from FILE into trans2_T2_tmp;
> WITH agg_tmp AS (SELECT ... FROM trans2_T2_tmp GROUP BY ...), upd AS
> (UPDATE T2 SET ... FROM agg_tmp s WHERE ... RETURNING ...) INSERT INTO
> T2 (...) SELECT ... FROM agg_tmp s LEFT JOIN upd t ON (...) WHERE ...;
> DROP TABLE trans2_T2_tmp;
>
> COMMIT
>
> By an unknown [for me] reason trans1 and trans2 often produce deadlocks...

What do the logs show as the error message?

>
> Could you explain what's the reason for that and how to avoid them?
> And is there exist a better replacement for WITH ... AS ... RETURNING ?
> Thanks!
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Yes, you are right about sessions.

Here is the case from the server log:

"deadlock detected","Process 2588 waits for ShareLock on transaction
1939192; blocked by process 16399. Process 16399 waits for ShareLock on
transaction 1939195; blocked by process 2588.

Process 2588:
UPDATE T1
     SET ...
     FROM trans1_T_tmp
     WHERE ...

Process 16399: WITH agg_tmp AS (SELECT ... FROM trans2_T1_tmp GROUP BY
...), upd AS (UPDATE T1 SET ... FROM agg_tmp s WHERE ... RETURNING ...)
INSERT INTO T1 (...) SELECT ... FROM agg_tmp s LEFT JOIN upd t ON (...)
WHERE ...;","See server log for query details.",,,"while updating tuple
(388225,15) in relation ""T1""","                UPDATE T1
     SET ...
     FROM trans1_T_tmp
     WHERE ...
                 ",,,""


or another one:



"deadlock detected","Process 71490 waits for ShareLock on transaction
2001693; blocked by process 71221. Process 71221 waits for ShareLock on
transaction 2001689; blocked by process 71490.
Process 71490: WITH agg_tmp AS (SELECT ... FROM trans2_T1_tmp GROUP BY
...), upd AS (UPDATE T1 SET ... FROM agg_tmp s WHERE ... RETURNING ...)
INSERT INTO T1 (...) SELECT ... FROM agg_tmp s LEFT JOIN upd t ON (...)
WHERE ...;
Process 71221:                 UPDATE T1
     SET ...
     FROM trans1_T_tmp
     WHERE ...
                 ","See server log for query details.",,,"while updating
tuple (93716,27) in relation ""T1""","WITH agg_tmp AS (SELECT ... FROM
trans2_T1_tmp GROUP BY ...), upd AS (UPDATE T1 SET ... FROM agg_tmp s
WHERE ... RETURNING ...) INSERT INTO T1 (...) SELECT ... FROM agg_tmp s
LEFT JOIN upd t ON (...) WHERE ...;
Process 71221:                 UPDATE T1
     SET ...
     FROM trans1_T_tmp
     WHERE ...",,,""



On 07/02/16 11:14, Adrian Klaver wrote:
> On 07/02/2016 09:54 AM, trafdev wrote:
> > Hello.
> >
> > I have two transactions (trans1 and trans2) updating tables T1 and T2 in
> > the same order, but in a different way.
> >
> > trans1 creates temp table, copies data from a file and updates tables T1
> > and T2 from this temp table (using basic UPDATE form). It even commits
> > changes in between T1 and T2 updates to reduce locks.
> >
> > trans2 creates two temp tables (T1_tmp and T2_tmp), copies data from
> > files and updates T1 and T2 using [WITH ... AS ... RETURNING] approach.
> > Unexciting rows should be inserted, existing updated (sum-ed with values
> > from temp tables). Both T1 and T2 must be updated in the same
> > transaction.
> >
> > There are no any FKs anywhere in these tables.
> >
> >
> >
> > trans1:
>
> session1:
>
> This is actually one session with multiple transactions, at least if I
> am following correctly.
>
> Assuming you have a:
>
> BEGIN;
>
> here.
>
> >
> > DROP TABLE IF EXISTS trans1_T_tmp;
> >
> > CREATE TABLE trans1_T_tmp (...);
> >
> > COMMIT
> >
> > COPY from FILE into trans1_T_tmp;
> >
> > BEGIN
> > UPDATE T1
> >     SET ...
> >     FROM trans1_T_tmp
> >     WHERE ...
> > COMMIT
> >
> > BEGIN
> > UPDATE T2
> >     SET ...
> >     FROM (SELECT ... FROM trans1_T_tmp)
> >     WHERE ...
> >
> > DROP TABLE trans1_T_tmp;
> > COMMIT
> >
> >
> >
> > trans2:
>
> session2:
>
> >
> > BEGIN
> >
> > CREATE TABLE trans2_T1_tmp (...);
> > COPY from FILE into trans2_T1_tmp;
> > WITH agg_tmp AS (SELECT ... FROM trans2_T1_tmp GROUP BY ...), upd AS
> > (UPDATE T1 SET ... FROM agg_tmp s WHERE ... RETURNING ...) INSERT INTO
> > T1 (...) SELECT ... FROM agg_tmp s LEFT JOIN upd t ON (...) WHERE ...;
> > DROP TABLE trans2_T1_tmp;
> >
> > CREATE TABLE trans2_T2_tmp (...);
> > COPY from FILE into trans2_T2_tmp;
> > WITH agg_tmp AS (SELECT ... FROM trans2_T2_tmp GROUP BY ...), upd AS
> > (UPDATE T2 SET ... FROM agg_tmp s WHERE ... RETURNING ...) INSERT INTO
> > T2 (...) SELECT ... FROM agg_tmp s LEFT JOIN upd t ON (...) WHERE ...;
> > DROP TABLE trans2_T2_tmp;
> >
> > COMMIT
> >
> > By an unknown [for me] reason trans1 and trans2 often produce
> > deadlocks...
>
> What do the logs show as the error message?
>
> >
> > Could you explain what's the reason for that and how to avoid them?
> > And is there exist a better replacement for WITH ... AS ... RETURNING ?
> > Thanks!
> >
> >
>
>



On 07/02/2016 11:38 AM, trafdev wrote:
> Yes, you are right about sessions.
>
> Here is the case from the server log:
>
> "deadlock detected","Process 2588 waits for ShareLock on transaction
> 1939192; blocked by process 16399. Process 16399 waits for ShareLock on
> transaction 1939195; blocked by process 2588.
>
> Process 2588:
> UPDATE T1
>     SET ...
>     FROM trans1_T_tmp
>     WHERE ...
>
> Process 16399: WITH agg_tmp AS (SELECT ... FROM trans2_T1_tmp GROUP BY
> ...), upd AS (UPDATE T1 SET ... FROM agg_tmp s WHERE ... RETURNING ...)
> INSERT INTO T1 (...) SELECT ... FROM agg_tmp s LEFT JOIN upd t ON (...)
> WHERE ...;","See server log for query details.",,,"while updating tuple
> (388225,15) in relation ""T1""","                UPDATE T1
>     SET ...
>     FROM trans1_T_tmp
>     WHERE ...
>                 ",,,""

Best guess you are running into what is described here:

https://www.postgresql.org/docs/9.5/static/explicit-locking.html#LOCKING-DEADLOCKS

Both transactions are holding locks on rows in T1 that the other wants also.

I may be missing something, but I am not sure why it is necessary to run
both sessions concurrently? Could you not do session1 and once it
completes then session2?

>
>
> or another one:
>
>
>
> "deadlock detected","Process 71490 waits for ShareLock on transaction
> 2001693; blocked by process 71221. Process 71221 waits for ShareLock on
> transaction 2001689; blocked by process 71490.
> Process 71490: WITH agg_tmp AS (SELECT ... FROM trans2_T1_tmp GROUP BY
> ...), upd AS (UPDATE T1 SET ... FROM agg_tmp s WHERE ... RETURNING ...)
> INSERT INTO T1 (...) SELECT ... FROM agg_tmp s LEFT JOIN upd t ON (...)
> WHERE ...;
> Process 71221:                 UPDATE T1
>     SET ...
>     FROM trans1_T_tmp
>     WHERE ...
>                 ","See server log for query details.",,,"while updating
> tuple (93716,27) in relation ""T1""","WITH agg_tmp AS (SELECT ... FROM
> trans2_T1_tmp GROUP BY ...), upd AS (UPDATE T1 SET ... FROM agg_tmp s
> WHERE ... RETURNING ...) INSERT INTO T1 (...) SELECT ... FROM agg_tmp s
> LEFT JOIN upd t ON (...) WHERE ...;
> Process 71221:                 UPDATE T1
>     SET ...
>     FROM trans1_T_tmp
>     WHERE ...",,,""
>
>
>
> On 07/02/16 11:14, Adrian Klaver wrote:
>> On 07/02/2016 09:54 AM, trafdev wrote:
>> > Hello.
>> >
>> > I have two transactions (trans1 and trans2) updating tables T1 and
>> T2 in
>> > the same order, but in a different way.
>> >
>> > trans1 creates temp table, copies data from a file and updates
>> tables T1
>> > and T2 from this temp table (using basic UPDATE form). It even commits
>> > changes in between T1 and T2 updates to reduce locks.
>> >
>> > trans2 creates two temp tables (T1_tmp and T2_tmp), copies data from
>> > files and updates T1 and T2 using [WITH ... AS ... RETURNING] approach.
>> > Unexciting rows should be inserted, existing updated (sum-ed with
>> values
>> > from temp tables). Both T1 and T2 must be updated in the same
>> > transaction.
>> >
>> > There are no any FKs anywhere in these tables.
>> >
>> >
>> >
>> > trans1:
>>
>> session1:
>>
>> This is actually one session with multiple transactions, at least if I
>> am following correctly.
>>
>> Assuming you have a:
>>
>> BEGIN;
>>
>> here.
>>
>> >
>> > DROP TABLE IF EXISTS trans1_T_tmp;
>> >
>> > CREATE TABLE trans1_T_tmp (...);
>> >
>> > COMMIT
>> >
>> > COPY from FILE into trans1_T_tmp;
>> >
>> > BEGIN
>> > UPDATE T1
>> >     SET ...
>> >     FROM trans1_T_tmp
>> >     WHERE ...
>> > COMMIT
>> >
>> > BEGIN
>> > UPDATE T2
>> >     SET ...
>> >     FROM (SELECT ... FROM trans1_T_tmp)
>> >     WHERE ...
>> >
>> > DROP TABLE trans1_T_tmp;
>> > COMMIT
>> >
>> >
>> >
>> > trans2:
>>
>> session2:
>>
>> >
>> > BEGIN
>> >
>> > CREATE TABLE trans2_T1_tmp (...);
>> > COPY from FILE into trans2_T1_tmp;
>> > WITH agg_tmp AS (SELECT ... FROM trans2_T1_tmp GROUP BY ...), upd AS
>> > (UPDATE T1 SET ... FROM agg_tmp s WHERE ... RETURNING ...) INSERT INTO
>> > T1 (...) SELECT ... FROM agg_tmp s LEFT JOIN upd t ON (...) WHERE ...;
>> > DROP TABLE trans2_T1_tmp;
>> >
>> > CREATE TABLE trans2_T2_tmp (...);
>> > COPY from FILE into trans2_T2_tmp;
>> > WITH agg_tmp AS (SELECT ... FROM trans2_T2_tmp GROUP BY ...), upd AS
>> > (UPDATE T2 SET ... FROM agg_tmp s WHERE ... RETURNING ...) INSERT INTO
>> > T2 (...) SELECT ... FROM agg_tmp s LEFT JOIN upd t ON (...) WHERE ...;
>> > DROP TABLE trans2_T2_tmp;
>> >
>> > COMMIT
>> >
>> > By an unknown [for me] reason trans1 and trans2 often produce
>> > deadlocks...
>>
>> What do the logs show as the error message?
>>
>> >
>> > Could you explain what's the reason for that and how to avoid them?
>> > And is there exist a better replacement for WITH ... AS ... RETURNING ?
>> > Thanks!
>> >
>> >
>>
>>
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


> Best guess you are running into what is described here:
>
> https://www.postgresql.org/docs/9.5/static/explicit-locking.html#LOCKING-DEADLOCKS
>
>
> Both transactions are holding locks on rows in T1 that the other wants
> also.
>
> I may be missing something, but I am not sure why it is necessary to run
> both sessions concurrently? Could you not do session1 and once it
> completes then session2?

Sessions are running concurrently because of flexibility - they are two
different scheduled jobs launching at different times and performing
different set of operations.

Of course I can play with scheduling timings and make them not intersect
with each other (which I've done already btw), but that's only a temp
solution.

So how in PostgreSQL-world 2 or more transactions can update the same
table without deadlocking? I can't believe it's not possible, there must
be some sort of synchronization primitive. Does it support a "named
mutex" concept from a system-programming world? I bet there is something
more suitable.


I've also replaced "WITH agg_tmp AS ({sel_stmt}), upd AS ({upd_stmt})
{ins_stmt}" to "INSERT INTO .. ON CONFLICT DO UPDATE ...", but no
success - row level deadlocks still occur...
Is there a way to tell Postgres to update rows in a specified order?
Or maybe LOCK TABLE should be used?

> Sessions are running concurrently because of flexibility - they are two
> different scheduled jobs launching at different times and performing
> different set of operations.
>
> Of course I can play with scheduling timings and make them not intersect
> with each other (which I've done already btw), but that's only a temp
> solution.
>
> So how in PostgreSQL-world 2 or more transactions can update the same
> table without deadlocking? I can't believe it's not possible, there must
> be some sort of synchronization primitive. Does it support a "named
> mutex" concept from a system-programming world? I bet there is something
> more suitable.


On 07/02/2016 09:01 PM, trafdev wrote:
> I've also replaced "WITH agg_tmp AS ({sel_stmt}), upd AS ({upd_stmt})
> {ins_stmt}" to "INSERT INTO .. ON CONFLICT DO UPDATE ...", but no
> success - row level deadlocks still occur...
> Is there a way to tell Postgres to update rows in a specified order?
> Or maybe LOCK TABLE should be used?

My little voice says the below is the answer:

https://www.postgresql.org/docs/9.5/static/sql-select.html#SQL-FOR-UPDATE-SHARE

I just do not have enough coffee in me yet to apply it your situation
directly.

>
>> Sessions are running concurrently because of flexibility - they are two
>> different scheduled jobs launching at different times and performing
>> different set of operations.
>>
>> Of course I can play with scheduling timings and make them not intersect
>> with each other (which I've done already btw), but that's only a temp
>> solution.
>>
>> So how in PostgreSQL-world 2 or more transactions can update the same
>> table without deadlocking? I can't believe it's not possible, there must
>> be some sort of synchronization primitive. Does it support a "named
>> mutex" concept from a system-programming world? I bet there is something
>> more suitable.
>


--
Adrian Klaver
adrian.klaver@aklaver.com


On 07/02/2016 09:54 AM, trafdev wrote:
> Hello.
>
> I have two transactions (trans1 and trans2) updating tables T1 and T2 in
> the same order, but in a different way.
>
> trans1 creates temp table, copies data from a file and updates tables T1
> and T2 from this temp table (using basic UPDATE form). It even commits
> changes in between T1 and T2 updates to reduce locks.
>
> trans2 creates two temp tables (T1_tmp and T2_tmp), copies data from
> files and updates T1 and T2 using [WITH ... AS ... RETURNING] approach.
> Unexciting rows should be inserted, existing updated (sum-ed with values
> from temp tables). Both T1 and T2 must be updated in the same transaction.
>
> There are no any FKs anywhere in these tables.
>
>
>
> trans1:
>
> DROP TABLE IF EXISTS trans1_T_tmp;
>
> CREATE TABLE trans1_T_tmp (...);
>
> COMMIT
>
> COPY from FILE into trans1_T_tmp;
>
> BEGIN
> UPDATE T1
>     SET ...
>     FROM trans1_T_tmp
>     WHERE ...
> COMMIT
>
> BEGIN
> UPDATE T2
>     SET ...
>     FROM (SELECT ... FROM trans1_T_tmp)
>     WHERE ...
>
> DROP TABLE trans1_T_tmp;
> COMMIT
>
>
>
> trans2:
>
> BEGIN
>
> CREATE TABLE trans2_T1_tmp (...);
> COPY from FILE into trans2_T1_tmp;
> WITH agg_tmp AS (SELECT ... FROM trans2_T1_tmp GROUP BY ...), upd AS
> (UPDATE T1 SET ... FROM agg_tmp s WHERE ... RETURNING ...) INSERT INTO
> T1 (...) SELECT ... FROM agg_tmp s LEFT JOIN upd t ON (...) WHERE ...;
> DROP TABLE trans2_T1_tmp;
>
> CREATE TABLE trans2_T2_tmp (...);
> COPY from FILE into trans2_T2_tmp;
> WITH agg_tmp AS (SELECT ... FROM trans2_T2_tmp GROUP BY ...), upd AS
> (UPDATE T2 SET ... FROM agg_tmp s WHERE ... RETURNING ...) INSERT INTO
> T2 (...) SELECT ... FROM agg_tmp s LEFT JOIN upd t ON (...) WHERE ...;
> DROP TABLE trans2_T2_tmp;
>
> COMMIT
>
> By an unknown [for me] reason trans1 and trans2 often produce deadlocks...
>
> Could you explain what's the reason for that and how to avoid them?
> And is there exist a better replacement for WITH ... AS ... RETURNING ?

You did not mention what version of Postgres you are using, if it is
9.5+ you have the 'UPSERT' option available instead of using the WITH
construct:


https://www.postgresql.org/docs/9.5/static/sql-insert.html

ON CONFLICT Clause

https://www.depesz.com/2015/05/10/waiting-for-9-5-add-support-for-insert-on-conflict-do-nothingupdate/

If not you might to take a look at:

http://stackoverflow.com/questions/17267417/how-to-upsert-merge-insert-on-duplicate-update-in-postgres

Bulk upsert with lock



> Thanks!
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


I've already switched to 'UPSERT', it didn't resolved deadlock issue by
itself... Added LOCK TABLE ... IN EXCLUSIVE MODE; to one session, hope
it will help.

> You did not mention what version of Postgres you are using, if it is
> 9.5+ you have the 'UPSERT' option available instead of using the WITH
> construct:
>
>
> https://www.postgresql.org/docs/9.5/static/sql-insert.html
>
> ON CONFLICT Clause
>
> https://www.depesz.com/2015/05/10/waiting-for-9-5-add-support-for-insert-on-conflict-do-nothingupdate/
>
>
> If not you might to take a look at:
>
> http://stackoverflow.com/questions/17267417/how-to-upsert-merge-insert-on-duplicate-update-in-postgres
>
>
> Bulk upsert with lock
>
>