Re: Question about RI checks

Поиск
Список
Период
Сортировка
От Florian Pflug
Тема Re: Question about RI checks
Дата
Msg-id AC19E024-578A-4E1A-87DC-BF068094785A@phlo.org
обсуждение исходный текст
Ответ на Re: Question about RI checks  (Kevin Grittner <kgrittn@ymail.com>)
Список pgsql-hackers
On Oct23, 2014, at 17:45 , Kevin Grittner <kgrittn@ymail.com> wrote:
> Every way I look at it, inside a REPEATABLE READ or SERIALIZABLE
> transaction a check for child rows when validating a parent DELETE
> should consider both rows which exist according to the transaction
> snapshot and according to a "current" snapshot.

I've pondered this further, and unfortunately it seems that this
isn't sufficient to guarantee true serializability :-(

Verifying that both snapshots contain exactly the same rows does not
prevent a child row from being inserted and immediately deleted again,
not if both actions happen *after* the parent-updating transaction
took its snapshot, but *before* it takes the crosscheck snapshot.

Let parent(id) and child(id, parent_id) again be two tables with a
FK constraint between them, let <child> be initially empty, and let
<parent> contain a single row (1).

Assume PD is a transaction which deletes all rows from <parent>,
CI a transaction which inserts the row (1, 1) into <child>, and
CD a transaction which deletes that row again.

Even with the extended cross-checking you propose, we'd still allow
the following concurrent schedule

 1. PD takes snapshot
 2. CI starts and completes
 3. CD starts and completes
 4. PD deletes from <parent> without complaining, since there were
    no conflicting rows at time (1), and none at time (4).

So far, all is well. But add two more tables, called <ci_before_pd>
and <pd_before_cd>, both initially containing one row. Let CI scan
<ci_before_pd>, let PD delete from <ci_before_pd> and scan <pd_before_cd>,
and let CD delete from <pd_before_cd>. In the concurrent schedule from
above, CI will see the row in <ci_before_pd>, and PD will delete it, and
PD will see the row in <pd_before_cd> that CD deletes. Note that SSI *will*
allow that schedule to occur without raising a serialization error
The only serial schedule which yields the same results for the various
queries pertaining <ci_before_pd> and <pd_before_cd> is

  CI -> PD -> CD,

i.e. PD has to run *between* CI and CD. But in that serial schedule,
PD *should* see a FK key violation, since CI has inserted a child which
CD hasn't yet deleted.

There is thus *no* serial schedule which yields the same results as the
concurrent schedule above for the queries pertaining <parent> and <child>,
*and* for the queries pertaining <ci_before_pd> and <pd_before_cd>, i.e
the concurrent schedule is *not* serializable. Yet even the extended cross-
check won't detect this.

Attached is an isolationtester spec file which implements this example,
and the corresponding out-file which shows that SSI permits the concurrent
schedule. Since SSI doesn't concern itself with RI enforcement queries,
it would also permit that schedule if we extended the cross-check, I think.

(I used REL9_4_STABLE as of today to try this, commit
 1cf54b00ba2100083390223a8244430643c1ec07)

best regards,
Florian Pflug

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: Function array_agg(array)
Следующее
От: Pavel Stehule
Дата:
Сообщение: Re: Function array_agg(array)