Обсуждение: deadlock with truncate and foreing keys

Поиск
Список
Период
Сортировка

deadlock with truncate and foreing keys

От
Alexey Nalbat
Дата:
Hello.

I've encountered deadlock with first transaction updating information
field in the primary table, and second trasaction reloading secondary
table using TRUNCATE and INSERT. Here is simple example:

create table t1 ( id integer primary key, name text );
create table t2 ( id integer references t1 );
insert into t1 values ( 1 );
insert into t2 values ( 1 );

Then two concurrent transactions start.

/* 1 */ begin;
/* 1 */ truncate t2;
        /* 2 */ begin;
        /* 2 */ update t1 set name='foo' where id=1;
/* 1 */ insert into t2 values ( 1 );

Here we have deadlock.

ERROR:  deadlock detected
DETAIL:  Process 21581 waits for ShareLock on transaction 464; blocked by process 21619.
Process 21619 waits for AccessShareLock on relation 16456 of database 16385; blocked by process 21581.
CONTEXT:  SQL statement "SELECT 1 FROM ONLY "public"."t1" x WHERE "id" OPERATOR(pg_catalog.=) $1 FOR SHARE OF x"

When substituting TRUNCATE with DELETE, second transaction is not
waiting for first one, and there is no deadlock error. But I prefer to use
TRUNCATE because it is much faster. (In fact I use COPY instead of
INSERT to speed up reloading of secondary table.)

Is it bug? And if so, will it be fixed?

Thank you. Sorry for bad english.

--
Alexey A. Nalbat

Price Express
http://www.price.ru/
http://www.tyndex.ru/

Re: deadlock with truncate and foreing keys

От
Tom Lane
Дата:
Alexey Nalbat <nalbat@price.ru> writes:
> create table t1 ( id integer primary key, name text );
> create table t2 ( id integer references t1 );
> insert into t1 values ( 1 );
> insert into t2 values ( 1 );

> Then two concurrent transactions start.

> /* 1 */ begin;
> /* 1 */ truncate t2;
>         /* 2 */ begin;
>         /* 2 */ update t1 set name='foo' where id=1;
> /* 1 */ insert into t2 values ( 1 );

> Here we have deadlock.

Hmm, this happens because RI_FKey_keyequal_upd_pk does

    fk_rel = heap_open(riinfo.fk_relid, AccessShareLock);

but right offhand I see no reason for it to do so --- it doesn't
*do* anything with fk_rel except close it again.  Likewise
RI_FKey_keyequal_upd_fk doesn't seem to really need to touch the
pk_rel.  Is there something I'm missing in that?  Maybe this is
a vestige of earlier coding that did need to touch both rels
to perform the keysequal check?

            regards, tom lane

Re: deadlock with truncate and foreing keys

От
Gregory Stark
Дата:
"Alexey Nalbat" <nalbat@price.ru> writes:

> Hello.
>
> I've encountered deadlock with first transaction updating information
> field in the primary table, and second trasaction reloading secondary
> table using TRUNCATE and INSERT. Here is simple example:
>
> create table t1 ( id integer primary key, name text );
> create table t2 ( id integer references t1 );
> insert into t1 values ( 1 );
> insert into t2 values ( 1 );
>
> Then two concurrent transactions start.
>
> /* 1 */ begin;
> /* 1 */ truncate t2;
>         /* 2 */ begin;
>         /* 2 */ update t1 set name='foo' where id=1;
> /* 1 */ insert into t2 values ( 1 );
>
> Here we have deadlock.
>
> ERROR:  deadlock detected
> DETAIL:  Process 21581 waits for ShareLock on transaction 464; blocked by process 21619.
> Process 21619 waits for AccessShareLock on relation 16456 of database 16385; blocked by process 21581.
> CONTEXT:  SQL statement "SELECT 1 FROM ONLY "public"."t1" x WHERE "id" OPERATOR(pg_catalog.=) $1 FOR SHARE OF x"
>
> When substituting TRUNCATE with DELETE, second transaction is not
> waiting for first one, and there is no deadlock error. But I prefer to use
> TRUNCATE because it is much faster. (In fact I use COPY instead of
> INSERT to speed up reloading of secondary table.)
>
> Is it bug? And if so, will it be fixed?

The reason TRUNCATE is so much faster is because it's quite different from
DELETE. That difference is why there's a deadlock here. So essentially it's
not a bug, it's an inherent feature of TRUNCATE.

I think what's going on here is that in transaction 1 the TRUNCATE locks t2
and then the INSERT tries to take a lock on a record in t1 to enforce the
referential integrity. In transaction 2 you've locked that record already to
do the update and are waiting on a lock on t2 to enforce the referential
integrity (actually on a specific record but it amounts to the same thing here
I think). So you have two transactions taking two locks in different orders.

I think a LOCK TABLE against t1 (and any other referencing tables) before
doing the truncate would effectively solve the problem by enforcing the lock
order being t1 then t2. Effectively it would mean waiting until any pending
updates to t1 are committed and then blocking any subsequent updates before
starting to rewrite t2. Assuming the truncates are rare compared to the
updates this is probably the best combination.

Alternatively putting a LOCK TABLE against t2 in transaction 2 might also
solve the problem at the expense of serializing all the updates.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning

Re: deadlock with truncate and foreing keys

От
Tom Lane
Дата:
Gregory Stark <stark@enterprisedb.com> writes:
> "Alexey Nalbat" <nalbat@price.ru> writes:
>> create table t1 ( id integer primary key, name text );
>> create table t2 ( id integer references t1 );
>> /* 1 */ truncate t2;
>> /* 2 */ update t1 set name='foo' where id=1;

> I think what's going on here is that in transaction 1 the TRUNCATE locks t2
> and then the INSERT tries to take a lock on a record in t1 to enforce the
> referential integrity.

I'm not sure whether the OP recognizes that this is a critical point,
but that UPDATE isn't changing t1.id and therefore there is no need
for it to look into t2 at all.  The current code in fact does not look
into t2, much less take any row-level lock there; but nonetheless it
transiently takes AccessShareLock on t2.  I'm thinking that might be
unnecessary ...

            regards, tom lane

Re: [HACKERS] deadlock with truncate and foreing keys

От
Stephan Szabo
Дата:
On Mon, 18 Feb 2008, Tom Lane wrote:

> Alexey Nalbat <nalbat@price.ru> writes:
> > create table t1 ( id integer primary key, name text );
> > create table t2 ( id integer references t1 );
> > insert into t1 values ( 1 );
> > insert into t2 values ( 1 );
>
> > Then two concurrent transactions start.
>
> > /* 1 */ begin;
> > /* 1 */ truncate t2;
> >         /* 2 */ begin;
> >         /* 2 */ update t1 set name='foo' where id=1;
> > /* 1 */ insert into t2 values ( 1 );
>
> > Here we have deadlock.
>
> Hmm, this happens because RI_FKey_keyequal_upd_pk does
>
>     fk_rel = heap_open(riinfo.fk_relid, AccessShareLock);
>
> but right offhand I see no reason for it to do so --- it doesn't
> *do* anything with fk_rel except close it again.  Likewise
> RI_FKey_keyequal_upd_fk doesn't seem to really need to touch the
> pk_rel.  Is there something I'm missing in that?  Maybe this is
> a vestige of earlier coding that did need to touch both rels
> to perform the keysequal check?

Probably something like that - maybe ri_BuildQueryKeyFull might have
needed it open. Actually, I'm wondering if the ri_BuildQueryKeyFull call
is also unnecessary now - I don't think we ever use the qkey that comes
out of it unless I'm missing some code.

Re: [HACKERS] deadlock with truncate and foreing keys

От
Tom Lane
Дата:
Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> On Mon, 18 Feb 2008, Tom Lane wrote:
>> but right offhand I see no reason for it to do so --- it doesn't
>> *do* anything with fk_rel except close it again.  Likewise
>> RI_FKey_keyequal_upd_fk doesn't seem to really need to touch the
>> pk_rel.  Is there something I'm missing in that?  Maybe this is
>> a vestige of earlier coding that did need to touch both rels
>> to perform the keysequal check?

> Probably something like that - maybe ri_BuildQueryKeyFull might have
> needed it open.

Yeah, looking back at 8.2 and earlier confirms this --- it used to need
access to the rel in order to interpret the trigger arguments
(specifically, convert column names to numbers).  In the 8.3 rewrite
that got rid of the trigger arguments, I removed the no-longer-needed
Relation arguments to ri_BuildQueryKeyFull, but didn't take the next
step of not opening the relation where it wasn't being used otherwise.

I was thinking there might be some arcane locking reason for transiently
locking the other table, but I can't imagine what it would be.  We have
a writer's lock on the table we are modifying, and that is sufficient to
ensure that the RI constraint isn't changing, so ...

> Actually, I'm wondering if the ri_BuildQueryKeyFull call
> is also unnecessary now - I don't think we ever use the qkey that comes
> out of it unless I'm missing some code.

Good point --- ri_KeysEqual only needs the RI_ConstraintInfo not the
qkey.  In 8.2 and before it used the qkey to get the info.

In short, this is easy to improve in HEAD and 8.3, but not so readily
fixable in prior releases.  I'll go make it so.

            regards, tom lane