Thread: Serialization errors despite KEY SHARE/NO KEY UPDATE

Serialization errors despite KEY SHARE/NO KEY UPDATE

From: Olivier Dony

Using 9.3.9 and 9.4.4 I'm experiencing serialization errors that I
have trouble explaining based on the documentation.

The concurrent transactions are framework-driven, and all of them run at
the REPEATABLE READ isolation level.

The error can be reproduced as shown below. T1 fails when acquiring the
second KEY SHARE lock for the same row, but does not fail if only one of
the T1 UPDATEs is executed, whether the first or the second.

Is there some hidden read/write dependency at play here?

My understanding of the KEY SHARE/NO KEY UPDATE locks introduced in 9.3
was that they would avoid side-effects/blocking between transactions
that are only linked via FK constraints, as long as the target PK was
not touched. Isn't it the case here?
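
In other words, my mental model was roughly this (a sketch of the lock
modes I expected, based on my reading of the docs):

-- T1's FK check only takes FOR KEY SHARE on the referenced users row:
--     SELECT 1 FROM ONLY users x WHERE id = 1 FOR KEY SHARE OF x;
-- T2's UPDATE changes no key column, so it should only take
-- FOR NO KEY UPDATE on that row:
--     UPDATE users SET date = now() WHERE id = 1;
-- FOR KEY SHARE and FOR NO KEY UPDATE do not conflict with each other,
-- so I expected neither blocking nor serialization failures.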

If not, is there a reliable way to make T2 fail instead of T1 in such
situations? I've tried adding an explicit
     "SELECT date FROM users WHERE id = 1 FOR UPDATE NOWAIT"
at the beginning of T2 but that doesn't block at all.

Thanks for the clarifications!


-- Setup tables
CREATE TABLE users ( id serial PRIMARY KEY,
                       name varchar,
                       date timestamp );
CREATE TABLE orders ( id serial PRIMARY KEY,
                        name varchar,
                        user_id int REFERENCES users (id) );
INSERT INTO users (id, name) VALUES (1, 'foo');
INSERT INTO orders (id, name) VALUES (1, 'order 1');


-- Run 2 concurrent transactions: T1 and T2
              T1                                T2
|-----------------------------|----------------------------------|
    BEGIN ISOLATION LEVEL
          REPEATABLE READ;

    UPDATE orders
    SET name = 'order of foo',
        user_id = 1
    WHERE id = 1;

                                      BEGIN ISOLATION LEVEL
                                            REPEATABLE READ;

                                      UPDATE users
                                      SET date = now()
                                      WHERE id = 1;

                                      COMMIT;

    UPDATE orders
    SET name = 'order of foo (2)',
        user_id = 1
    WHERE id = 1;

T1 fails with:
ERROR:  could not serialize access due to concurrent update
CONTEXT:  SQL statement "SELECT 1 FROM ONLY "public"."users" x WHERE "id"
OPERATOR(pg_catalog.=) $1 FOR KEY SHARE OF x"
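
One way to observe the locks involved, while T1 is still open (a
diagnostic sketch, assuming the pgrowlocks extension is installed):

CREATE EXTENSION IF NOT EXISTS pgrowlocks;
-- After T1's first UPDATE, the FK check should show up as a
-- "Key Share" lock on the referenced users row:
SELECT locked_row, multi, modes FROM pgrowlocks('users');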



--
Olivier


Re: Serialization errors despite KEY SHARE/NO KEY UPDATE

From: Jim Nasby

On 9/29/15 9:47 AM, Olivier Dony wrote:
> My understanding of the KEY SHARE/NO KEY UPDATE locks introduced in 9.3
> was that they would avoid side-effects/blocking between transactions
> that are only linked via FK constraints, as long as the target PK was
> not touched. Isn't it the case here?

Not quite. Any unique index that isn't partial and isn't a functional
index can satisfy a foreign key. That means that if you change a field
that is in ANY unique index, the update becomes a key update, taking
FOR UPDATE rather than FOR NO KEY UPDATE.
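
For instance (a hypothetical sketch, not the schema you posted; the
index name is made up):

CREATE UNIQUE INDEX users_date_uniq ON users (date);

-- With "date" now covered by a non-partial, non-functional unique index,
-- this update is treated as a key update: it takes the stronger
-- FOR UPDATE row lock instead of FOR NO KEY UPDATE, and so conflicts
-- with the FOR KEY SHARE locks taken by FK checks:
UPDATE users SET date = now() WHERE id = 1;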

> If not, is there a reliable way to make T2 fail instead of T1 in such
> situations? I've tried adding an explicit
>      "SELECT date FROM users WHERE id = 1 FOR UPDATE NOWAIT"
> at the beginning of T2 but that doesn't block at all.
>
> Thanks for the clarifications!
>
>
> -- Setup tables
> CREATE TABLE users ( id serial PRIMARY KEY,
>                        name varchar,
>                        date timestamp );
> CREATE TABLE orders ( id serial PRIMARY KEY,
>                         name varchar,
>                         user_id int REFERENCES users (id) );
> INSERT INTO users (id, name) VALUES (1, 'foo');
> INSERT INTO orders (id, name) VALUES (1, 'order 1');
>
>
> -- Run 2 concurrent transactions: T1 and T2
>               T1                                T2
> |-----------------------------|----------------------------------|
>     BEGIN ISOLATION LEVEL
>           REPEATABLE READ;
>
>     UPDATE orders
>     SET name = 'order of foo',
>         user_id = 1
>     WHERE id = 1;
>
>                                       BEGIN ISOLATION LEVEL
>                                             REPEATABLE READ;
>
>                                       UPDATE users
>                                       SET date = now()
>                                       WHERE id = 1;
>
>                                       COMMIT;
>
>     UPDATE orders
>     SET name = 'order of foo (2)',
>         user_id = 1
>     WHERE id = 1;
>
> T1 fails with:
> ERROR:  could not serialize access due to concurrent update
> CONTEXT:  SQL statement "SELECT 1 FROM ONLY "public"."users" x WHERE
> "id" OPERATOR(pg_catalog.=) $1 FOR KEY SHARE OF x"

This isn't a locking failure, it's a serialization failure. I'm not sure
why it's happening though... is there an index on date?
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


Re: Serialization errors despite KEY SHARE/NO KEY UPDATE

From: Olivier Dony

On 10/02/2015 12:28 AM, Jim Nasby wrote:
> On 9/29/15 9:47 AM, Olivier Dony wrote:
>> My understanding of the KEY SHARE/NO KEY UPDATE locks introduced in 9.3
>> was that they would avoid side-effects/blocking between transactions
>> that are only linked via FK constraints, as long as the target PK was
>> not touched. Isn't it the case here?
>
> Not quite. Any unique index that isn't partial and isn't a functional
> index can satisfy a foreign key. That means that if you change a field
> that is in ANY unique index, the update becomes a key update, taking
> FOR UPDATE rather than FOR NO KEY UPDATE.

Interesting, do you know if that is mentioned in the documentation somewhere?
(I couldn't find it)


>> -- Setup tables
>> CREATE TABLE users ( id serial PRIMARY KEY,
>>                         name varchar,
>>                         date timestamp );
>> CREATE TABLE orders ( id serial PRIMARY KEY,
>>                          name varchar,
>>                          user_id int REFERENCES users (id) );
>> INSERT INTO users (id, name) VALUES (1, 'foo');
>> INSERT INTO orders (id, name) VALUES (1, 'order 1');
>>
>>
>> -- Run 2 concurrent transactions: T1 and T2
>>                T1                                T2
>> |-----------------------------|----------------------------------|
>>      BEGIN ISOLATION LEVEL
>>            REPEATABLE READ;
>>
>>      UPDATE orders
>>      SET name = 'order of foo',
>>          user_id = 1
>>      WHERE id = 1;
>>
>>                                        BEGIN ISOLATION LEVEL
>>                                              REPEATABLE READ;
>>
>>                                        UPDATE users
>>                                        SET date = now()
>>                                        WHERE id = 1;
>>
>>                                        COMMIT;
>>
>>      UPDATE orders
>>      SET name = 'order of foo (2)',
>>          user_id = 1
>>      WHERE id = 1;
>>
>> T1 fails with:
>> ERROR:  could not serialize access due to concurrent update
>> CONTEXT:  SQL statement "SELECT 1 FROM ONLY "public"."users" x WHERE
>> "id" OPERATOR(pg_catalog.=) $1 FOR KEY SHARE OF x"
>
> This isn't a locking failure, it's a serialization failure. I'm not sure
> why it's happening though... is there an index on date?

I don't think so. I can reproduce the problem with the queries quoted above,
and the only index that seems to be present is the PK:

9=# \d users
                                 Table "public.users"
 Column |            Type             |                     Modifiers
--------+-----------------------------+----------------------------------------------------
 id     | integer                     | not null default nextval('users_id_seq'::regclass)
 name   | character varying           |
 date   | timestamp without time zone |
Indexes:
    "users_pkey" PRIMARY KEY, btree (id)
Referenced by:
    TABLE "orders" CONSTRAINT "orders_user_id_fkey" FOREIGN KEY (user_id) REFERENCES users(id)

9=# \d orders
                              Table "public.orders"
 Column  |       Type        |                      Modifiers
---------+-------------------+------------------------------------------------------
 id      | integer           | not null default nextval('orders_id_seq'::regclass)
 name    | character varying |
 user_id | integer           |
Indexes:
    "orders_pkey" PRIMARY KEY, btree (id)
Foreign-key constraints:
    "orders_user_id_fkey" FOREIGN KEY (user_id) REFERENCES users(id)


--
Olivier


Re: Serialization errors despite KEY SHARE/NO KEY UPDATE

From: Jim Nasby

On 10/2/15 11:44 AM, Olivier Dony wrote:
> On 10/02/2015 12:28 AM, Jim Nasby wrote:
>> On 9/29/15 9:47 AM, Olivier Dony wrote:
>>> My understanding of the KEY SHARE/NO KEY UPDATE locks introduced in 9.3
>>> was that they would avoid side-effects/blocking between transactions
>>> that are only linked via FK constraints, as long as the target PK was
>>> not touched. Isn't it the case here?
>>
>> Not quite. Any unique index that isn't partial and isn't a functional
>> index can satisfy a foreign key. That means that if you change a field
>> that is in ANY unique index, the update becomes a key update, taking
>> FOR UPDATE rather than FOR NO KEY UPDATE.
>
> Interesting, do you know if that is mentioned in the documentation
> somewhere? (I couldn't find it)

http://www.postgresql.org/docs/9.4/static/ddl-constraints.html#DDL-CONSTRAINTS-FK
does say this:

"A foreign key must reference columns that either are a primary key or
form a unique constraint."

So you can kind of infer it. It could probably be more explicitly
mentioned somewhere though. Care to suggest a doc change?
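
To spell out what "satisfy a foreign key" means there (hypothetical
tables, purely illustrative): an FK is perfectly happy referencing a
plain UNIQUE constraint instead of the PK, which is why the columns of
any non-partial, non-functional unique index count as key columns for
row-locking purposes:

CREATE TABLE accounts ( id serial PRIMARY KEY,
                        code text UNIQUE );
-- The FK targets the unique column rather than the PK:
CREATE TABLE invoices ( id serial PRIMARY KEY,
                        account_code text REFERENCES accounts (code) );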

>>> -- Setup tables
>>> CREATE TABLE users ( id serial PRIMARY KEY,
>>>                         name varchar,
>>>                         date timestamp );
>>> CREATE TABLE orders ( id serial PRIMARY KEY,
>>>                          name varchar,
>>>                          user_id int REFERENCES users (id) );
>>> INSERT INTO users (id, name) VALUES (1, 'foo');
>>> INSERT INTO orders (id, name) VALUES (1, 'order 1');
>>>
>>>
>>> -- Run 2 concurrent transactions: T1 and T2
>>>                T1                                T2
>>> |-----------------------------|----------------------------------|
>>>      BEGIN ISOLATION LEVEL
>>>            REPEATABLE READ;
>>>
>>>      UPDATE orders
>>>      SET name = 'order of foo',
>>>          user_id = 1
>>>      WHERE id = 1;
>>>
>>>                                        BEGIN ISOLATION LEVEL
>>>                                              REPEATABLE READ;
>>>
>>>                                        UPDATE users
>>>                                        SET date = now()
>>>                                        WHERE id = 1;
>>>
>>>                                        COMMIT;
>>>
>>>      UPDATE orders
>>>      SET name = 'order of foo (2)',
>>>          user_id = 1
>>>      WHERE id = 1;
>>>
>>> T1 fails with:
>>> ERROR:  could not serialize access due to concurrent update
>>> CONTEXT:  SQL statement "SELECT 1 FROM ONLY "public"."users" x WHERE
>>> "id" OPERATOR(pg_catalog.=) $1 FOR KEY SHARE OF x"
>>
>> This isn't a locking failure, it's a serialization failure. I'm not sure
>> why it's happening though... is there an index on date?
>
> I don't think so. I can reproduce the problem with the queries quoted
> above, and the only index that seems to be present is the PK (sorry for
> the wrapping):

I'm not sure. Perhaps Kevin Grittner (author of serializable patch) can
shed some light.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


Re: Serialization errors despite KEY SHARE/NO KEY UPDATE

From: Kevin Grittner

[Sorry for the delay in responding -- was on vacation.]


Jim Nasby <Jim.Nasby@BlueTreble.com> wrote:
> On 10/2/15 11:44 AM, Olivier Dony wrote:
>> On 10/02/2015 12:28 AM, Jim Nasby wrote:
>>> On 9/29/15 9:47 AM, Olivier Dony wrote:

>>>> -- Setup tables
>>>> CREATE TABLE users ( id serial PRIMARY KEY,
>>>>                      name varchar,
>>>>                      date timestamp );
>>>> CREATE TABLE orders ( id serial PRIMARY KEY,
>>>>                       name varchar,
>>>>                       user_id int REFERENCES users (id) );
>>>> INSERT INTO users (id, name) VALUES (1, 'foo');
>>>> INSERT INTO orders (id, name) VALUES (1, 'order 1');
>>>>
>>>>
>>>> -- Run 2 concurrent transactions: T1 and T2
>>>>                              T1                               T2
>>>> |-----------------------------|----------------------------------|
>>>>        BEGIN ISOLATION LEVEL
>>>>                   REPEATABLE READ;
>>>>
>>>>   UPDATE orders
>>>>        SET name = 'order of foo',
>>>>                user_id = 1
>>>>        WHERE id = 1;
>>>>
>>>>                                 BEGIN ISOLATION LEVEL
>>>>                                       REPEATABLE READ;
>>>>
>>>>                                 UPDATE users
>>>>                                 SET date = now()
>>>>                                 WHERE id = 1;
>>>>
>>>>                                 COMMIT;
>>>>
>>>>   UPDATE orders
>>>>   SET name = 'order of foo (2)',
>>>>       user_id = 1
>>>>   WHERE id = 1;
>>>>
>>>> T1 fails with:
>>>> ERROR:  could not serialize access due to concurrent update
>>>> CONTEXT:  SQL statement "SELECT 1 FROM ONLY "public"."users" x WHERE "id" OPERATOR(pg_catalog.=) $1 FOR KEY SHARE OF x"
>>>
>>> This isn't a locking failure, it's a serialization failure. I'm not sure
>>> why it's happening though... is there an index on date?
>>
>> I don't think so. I can reproduce the problem with the queries quoted
>> above, and the only index that seems to be present is the PK

> I'm not sure. Perhaps Kevin Grittner (author of serializable patch) can
> shed some light.

This has nothing to do with the serializable patch[1], but it is
related to the new multixact code in 9.3.  I thought I remembered
that this was a regression from 9.2 behavior which was fixed in a
patch after being reported, and sure enough tests of the tips of
branches showed that prior to 9.3 T2 blocked until T1 committed,
rather than generating a serialization failure, while on 9.5 and
master there was no blocking or error.

Using `git bisect` to see where this was fixed shows:

commit 05315498012530d44cd89a209242a243374e274d
Author: Alvaro Herrera <alvherre@alvh.no-ip.org>
Date:   Tue Jul 29 15:41:06 2014 -0400

    Avoid uselessly looking up old LOCK_ONLY multixacts

    Commit 0ac5ad5134f2 removed an optimization in multixact.c that skipped
    fetching members of MultiXactId that were older than our
    OldestVisibleMXactId value.  The reason this was removed is that it is
    possible for multixacts that contain updates to be older than that
    value.  However, if the caller is certain that the multi does not
    contain an update (because the infomask bits say so), it can pass this
    info down to GetMultiXactIdMembers, enabling it to use the old
    optimization.

    Pointed out by Andres Freund in 20131121200517.GM7240@alap2.anarazel.de

I don't know whether, after being in 9.5 for over a year it could
be considered for a back-patch to 9.3 and 9.4; I leave that to
Álvaro and others who have been repairing the multixact problems to
determine.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

[1]  The serializable patch has no effect if there are not
serializable transactions in use, and it doesn't generate errors
"due to concurrent update" in any new situations.

Re: Serialization errors despite KEY SHARE/NO KEY UPDATE

From: Olivier Dony

On 10/05/2015 11:17 PM, Kevin Grittner wrote:
> Jim Nasby <Jim.Nasby@BlueTreble.com> wrote:
>> On 10/2/15 11:44 AM, Olivier Dony wrote:
>>> On 10/02/2015 12:28 AM, Jim Nasby wrote:
>>>> On 9/29/15 9:47 AM, Olivier Dony wrote:
>
>>>>> -- Setup tables
>>>>> CREATE TABLE users ( id serial PRIMARY KEY,
>>>>>                       name varchar,
>>>>>                       date timestamp );
>>>>> CREATE TABLE orders ( id serial PRIMARY KEY,
>>>>>                        name varchar,
>>>>>                        user_id int REFERENCES users (id) );
>>>>> INSERT INTO users (id, name) VALUES (1, 'foo');
>>>>> INSERT INTO orders (id, name) VALUES (1, 'order 1');
>>>>>
>>>>>
>>>>> -- Run 2 concurrent transactions: T1 and T2
>>>>>                               T1                               T2
>>>>> |-----------------------------|----------------------------------|
>>>>>         BEGIN ISOLATION LEVEL
>>>>>                    REPEATABLE READ;
>>>>>
>>>>>    UPDATE orders
>>>>>         SET name = 'order of foo',
>>>>>                 user_id = 1
>>>>>         WHERE id = 1;
>>>>>
>>>>>                                  BEGIN ISOLATION LEVEL
>>>>>                                        REPEATABLE READ;
>>>>>
>>>>>                                  UPDATE users
>>>>>                                  SET date = now()
>>>>>                                  WHERE id = 1;
>>>>>
>>>>>                                  COMMIT;
>>>>>
>>>>>    UPDATE orders
>>>>>    SET name = 'order of foo (2)',
>>>>>        user_id = 1
>>>>>    WHERE id = 1;
>>>>>
>>>>> T1 fails with:
>>>>> ERROR:  could not serialize access due to concurrent update
>>>>> CONTEXT:  SQL statement "SELECT 1 FROM ONLY "public"."users" x WHERE "id" OPERATOR(pg_catalog.=) $1 FOR KEY SHARE OF x"
>>>>
>>>> This isn't a locking failure, it's a serialization failure. I'm not sure
>>>> why it's happening though... is there an index on date?
>>>
>>> I don't think so. I can reproduce the problem with the queries quoted
>>> above, and the only index that seems to be present is the PK
>
>> I'm not sure. Perhaps Kevin Grittner (author of serializable patch) can
>> shed some light.
>
> This has nothing to do with the serializable patch[1], but it is
> related to the new multixact code in 9.3.  I thought I remembered
> that this was a regression from 9.2 behavior which was fixed in a
> patch after being reported, and sure enough tests of the tips of
> branches showed that prior to 9.3 T2 blocked until T1 committed,
> rather than generating a serialization failure, while on 9.5 and
> master there was no blocking or error.
>
> Using `git bisect` to see where this was fixed shows:
>
> commit 05315498012530d44cd89a209242a243374e274d
> Author: Alvaro Herrera <alvherre@alvh.no-ip.org>
> Date:   Tue Jul 29 15:41:06 2014 -0400
>
>      Avoid uselessly looking up old LOCK_ONLY multixacts
>
>      Commit 0ac5ad5134f2 removed an optimization in multixact.c that skipped
>      fetching members of MultiXactId that were older than our
>      OldestVisibleMXactId value.  The reason this was removed is that it is
>      possible for multixacts that contain updates to be older than that
>      value.  However, if the caller is certain that the multi does not
>      contain an update (because the infomask bits say so), it can pass this
>      info down to GetMultiXactIdMembers, enabling it to use the old
>      optimization.
>
>      Pointed out by Andres Freund in 20131121200517.GM7240@alap2.anarazel.de

That's exactly it, thanks a lot for tracking down the root cause!
I understand (a little bit) better now.


> I don't know whether, after being in 9.5 for over a year it could
> be considered for a back-patch to 9.3 and 9.4; I leave that to
> Álvaro and others who have been repairing the multixact problems to
> determine.

From what I understand of the original discussion between Andres, Álvaro and
Tom, a back-patch seemed desirable because this issue has performance
consequences. I have to add that in our case it also causes spurious
serialization failures that were not present in 9.2 and aren't present in
9.5 either. The backwards-compatibility problem of the patch might, however,
now apply to both 9.3 and 9.4...

For what it's worth, our system uses many long-running transactions (background
jobs, batch data imports, etc.) that are frequently interrupted and rolled back
by micro-transactions coming from users who just happen to update minor data on
their records (such as their last login date).

Having to replay those long transactions is particularly expensive and they
stand a good chance of being interrupted again during the replay, etc.

We would happily skip the micro-transactions (as a perf workaround) if there
was a way to detect this situation, but we couldn't find a way to do that in
9.3. Before 9.3 we used SELECT FOR UPDATE NOWAIT to guard similar cases.
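
For reference, the guard looked roughly like this (a sketch; the actual
queries are framework-generated):

BEGIN;
-- Fail fast instead of queuing behind a long-running transaction:
SELECT id FROM users WHERE id = 1 FOR UPDATE NOWAIT;
UPDATE users SET date = now() WHERE id = 1;
COMMIT;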

If there is any way I could help to make the back-patch happen, please let me know!


--
Olivier



>
> [1]  The serializable patch has no effect if there are not
> serializable transactions in use, and it doesn't generate errors
> "due to concurrent update" in any new situations.
>


Re: Serialization errors despite KEY SHARE/NO KEY UPDATE

From: Jim Nasby

On 10/6/15 12:18 PM, Olivier Dony wrote:
>
> We would happily skip the micro-transactions (as a perf workaround) if
> there was a way to detect this situation, but we couldn't find a way to
> do that in 9.3. Before 9.3 we used SELECT FOR UPDATE NOWAIT to guard
> similar cases.
>
> If there is any way I could help to make the back-patch happen, please
> let me know!

I'd say you should probably open a bug about this to make sure it's
visible if you want it fixed. Or start a thread on -hackers.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


Re: Serialization errors despite KEY SHARE/NO KEY UPDATE

From: Olivier Dony

On 10/12/2015 03:59 PM, Jim Nasby wrote:
> On 10/6/15 12:18 PM, Olivier Dony wrote:
>>
>> We would happily skip the micro-transactions (as a perf workaround) if
>> there was a way to detect this situation, but we couldn't find a way to
>> do that in 9.3. Before 9.3 we used SELECT FOR UPDATE NOWAIT to guard
>> similar cases.
>>
>> If there is any way I could help to make the back-patch happen, please
>> let me know!
>
> I'd say you should probably open a bug about this to make sure it's
> visible if you want it fixed. Or start a thread on -hackers.

Good point, I've submitted bug #13681, thanks! :-)