Re: BUG #13681: Serialization failures caused by new multixact code of 9.3 (back-patch request)

Поиск
Список
Период
Сортировка
От Alvaro Herrera
Тема Re: BUG #13681: Serialization failures caused by new multixact code of 9.3 (back-patch request)
Дата
Msg-id 20151218185335.GU2618@alvherre.pgsql
обсуждение исходный текст
Ответ на Re: BUG #13681: Serialization failures caused by new multixact code of 9.3 (back-patch request)  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Ответы Re: BUG #13681: Serialization failures caused by new multixact code of 9.3 (back-patch request)  (Kevin Grittner <kgrittn@gmail.com>)
Re: BUG #13681: Serialization failures caused by new multixact code of 9.3 (back-patch request)  (Olivier Dony <odo+pgbugs@odoo.com>)
Список pgsql-bugs
Alvaro Herrera wrote:

> diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
> index 559970f..aaf8e8e 100644
> --- a/src/backend/access/heap/heapam.c
> +++ b/src/backend/access/heap/heapam.c
> @@ -4005,7 +4005,7 @@ l3:
>          UnlockReleaseBuffer(*buffer);
>          elog(ERROR, "attempted to lock invisible tuple");
>      }
> -    else if (result == HeapTupleBeingUpdated)
> +    else if (result == HeapTupleBeingUpdated || result == HeapTupleUpdated)
>      {
>          TransactionId xwait;
>          uint16        infomask;
>
> I think heap_lock_rows had that shape (only consider BeingUpdated as a
> reason to check/wait) only because it was possible to lock a row that
> was being locked by someone else, but it wasn't possible to lock a row
> that had been updated by someone else -- which became possible in 9.3.
> So this patch is necessary, and not just to fix this one bug.
>
> I need to study the implications of this patch more closely, but I think
> in theory it is correct.

Hmm.  So one thing against this patch is that some results of existing
isolation tests change.  For instance, this one (lock-update-delete):

starting permutation: s2b s1l s2u s2_blocker3 s2c s2_unlock
pg_advisory_lock


step s2b: BEGIN;
step s1l: SELECT * FROM foo WHERE pg_advisory_xact_lock(0) IS NOT NULL AND key = 1 FOR KEY SHARE; <waiting ...>
step s2u: UPDATE foo SET value = 2 WHERE key = 1;
step s2_blocker3: UPDATE foo SET value = 2 WHERE key = 1;
step s2c: COMMIT;
step s2_unlock: SELECT pg_advisory_unlock(0);
pg_advisory_unlock

t
step s1l: <... completed>
key            value

1              2

The initial SELECT is blocked and returns the updated tuple; if I patch
as proposed above, the returned tuple is the *original* tuple (i.e. it
returns value=1 instead).  I think there's room for arguing that that is
the correct result; since the tuple lock acquired by FOR KEY SHARE does
not conflict with the UPDATE, what would be the reason to reject the
original tuple?  In fact, in the permutation that does the
advisory_unlock before the commit, that's exactly what happens (value=1
is returned).

Also, in the repeatable read isolation level, the expected file has this
permutation as failing with "could not serialize" -- but with the patch,
it no longer fails and instead it returns the value=1 tuple instead.
Kevin, what's your opinion on that change?  Is this case supposed to
raise an error or not?

I have a hard time convincing myself that it's acceptable to back-patch
such a change, in any case.  I observed no other regression failure, but
what did change does make me a bit uncomfortable.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

В списке pgsql-bugs по дате отправления:

Предыдущее
От: "Peter J. Holzer"
Дата:
Сообщение: Re: BUG #13817: Query planner strange choose while select/count small part of big table - complete
Следующее
От: Kevin Grittner
Дата:
Сообщение: Re: BUG #13681: Serialization failures caused by new multixact code of 9.3 (back-patch request)