Re: BUG #8434: Why does dead lock occur many times ?

Поиск
Список
Период
Сортировка
От Alvaro Herrera
Тема Re: BUG #8434: Why does dead lock occur many times ?
Дата
Msg-id 20131004172217.GW5235@eldon.alvh.no-ip.org
обсуждение исходный текст
Ответ на BUG #8434: Why does dead lock occur many times ?  (katsumata.tomonari@po.ntts.co.jp)
Ответы Re: BUG #8434: Why does dead lock occur many times ?  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Список pgsql-bugs
I spent a lot of time looking at this issue.  Your problem scenario can
be reduced to the following test case:

CREATE TABLE t (a INT);
INSERT INTO t VALUES (1);

The three sessions need to execute the following commands, in this
sequence:

-- Session 1
/* 1 */ BEGIN;
SELECT * FROM t FOR UPDATE;

-- Session 2
/* 2 */ BEGIN;
UPDATE t SET a = 2;        -- blocks waiting on S1

-- Session 3
/* 3 */ BEGIN;
UPDATE t SET a = 3;        -- blocks
-- this is waiting on S1, but since S2 is also waiting on S1 and will acquire a
-- conflicting lock first, this session will only be released after S2 releases
-- its own lock.

Session 1
/* 4 */ UPDATE t SET a = 4;
COMMIT;                -- releases session 2
BEGIN;
SELECT * FROM t FOR UPDATE;    -- blocks waiting on S2 (*)

Session 2:
/* 5 */ COMMIT;                -- releases session 1
BEGIN;
UPDATE t SET a = 5;            -- blocks waiting on S3

Session 1:
/* 6 */ UPDATE t SET a = 6;        -- blocks waiting on S3

At this point, all sessions are blocked on their UPDATE commands, and
deadlock is reported.  (For debugging, it's useful to set a very large
deadlock_timeout and examine pg_locks, etc.)

The point at which things started to go wrong was where the session 1's
SELECT FOR UPDATE with the (*) was allowed to continue after session 2
commits.  In 9.2 that session remains blocked, and instead we release
session 3.

The problem is the handling of following of the update chain during a
lock command; the question is should sessions be blocked waiting for
locks on future versions of the row?  They currently don't block, which
is what leads the SELECT FOR UPDATE to continue; but if we make them
block, other things break.  I initially toyed with the following patch,
which removes the deadlock the above report and also your original test
case:

--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -4832,15 +4832,12 @@ l4:
         xmax = HeapTupleHeaderGetRawXmax(mytup.t_data);

         /*
-         * If this tuple is updated and the key has been modified (or
-         * deleted), what we do depends on the status of the updating
-         * transaction: if it's live, we sleep until it finishes; if it has
-         * committed, we have to fail (i.e. return HeapTupleUpdated); if it
-         * aborted, we ignore it. For updates that didn't touch the key, we
-         * can just plough ahead.
+         * If this tuple is updated, what we do depends on the status of the
+         * updating transaction: if it's live, we sleep until it finishes; if
+         * it has committed, we have to fail (i.e. return HeapTupleUpdated); if
+         * it aborted, we ignore it.
          */
-        if (!(old_infomask & HEAP_XMAX_INVALID) &&
-            (mytup.t_data->t_infomask2 & HEAP_KEYS_UPDATED))
+        if (!(old_infomask & HEAP_XMAX_INVALID))
         {
             TransactionId update_xid;


However, the differences in behavior this causes (visible by running the
isolation tests) don't look good to me.  It's quite possible that there
are other bugs elsewhere.

Sadly, I have to look at other things now and I won't have time to
research this further until after next week's minor releases; so
whatever we decide to do, if anything, will not be in 9.3.1.

Thanks for the report.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Christoph Berg
Дата:
Сообщение: Re: pg_upgrade 9.0->9.2 failure: Mismatch of relation OID in database
Следующее
От: kyrimis@alumni.princeton.edu
Дата:
Сообщение: BUG #8500: Upgrade to postgis 2.1 breaks existing databases