Обсуждение: ALTER TABLE, find_composite_type_dependencies and locking

Поиск
Список
Период
Сортировка

ALTER TABLE, find_composite_type_dependencies and locking

От
"Florian G. Pflug"
Дата:
Hi

I'm currently investigating how much work it'd be to implement arrays of
domains since I have a client who might be interested in sponsoring that
work.

The comments around the code handling ALTER DOMAIN ADD CONSTRAINT are
pretty clear about the lack of proper locking in that code - altering a
domain while simultaneously add a column with that domain as a type
might result in inconsistencies between the data in that column and the
domain's constraints after both transactions committed.

I do, however, suspect that ALTER TABLE is plagued by similar problems.
Currently, during the rewrite phase of ALTER TABLE,
find_composite_type_dependencies is used to verify that the table's row
type (or any type directly or indirectly depending on that type) is not
used as a column's type anywhere in the database.

But since this code does not take any permanent locks on the visited
types, it seems that adding such a column concurrently is not prevented. If the original ALTER TABLE changed a column's
type,data inserted into
 
the newly added column before the original ALTER TABLE committed will
have a type different from what the catalog says after the original
ALTER TABLE commits. Or at least so I think - I haven't yet tested that
theory...

I am aware that since a commit fest is currently running, now might not
be the best time to bring up this topic. Since I feared forgetting this
all together, I decided to still post now, though. I figured people
still have to option to ignore this for now if they're busy with getting
those patches committed.

best regards,
Florian Pflug


Re: ALTER TABLE, find_composite_type_dependencies and locking (Confirmed)

От
"Florian G. Pflug"
Дата:
Florian G. Pflug wrote:
> I do, however, suspect that ALTER TABLE is plagued by similar 
> problems. Currently, during the rewrite phase of ALTER TABLE, 
> find_composite_type_dependencies is used to verify that the table's 
> row type (or any type directly or indirectly depending on that type) 
> is not used as a column's type anywhere in the database.
> 
> But since this code does not take any permanent locks on the visited
>  types, it seems that adding such a column concurrently is not 
> prevented. If the original ALTER TABLE changed a column's type, data 
> inserted into the newly added column before the original ALTER TABLE 
> committed will have a type different from what the catalog says after
>  the original ALTER TABLE commits. Or at least so I think - I haven't
>  yet tested that theory...

I was able to confirm that this is an actual bug in 8.5. I did, however,
need to use an array-of-composite type. With only nested composite types
it seems that CheckAttributeType() protects against the race, because it
follows the dependency chain and opens each type's relation in
AccessShareLock mode. This blocks once the traversal hits the type which
is being altered, hence forcing the table creation to wait for the
concurrent alter table to complete.

Create two types in session 1
session 1: create table t1 (t1_i int);
session 1: create type t2 as (t2_t1 t1);

Warm the type cache in session 2
(A simple select array[row(row(-1))::t2] would probably suffice)
session 2: create table bug (bug_t2s t2[]);
session 2: insert into bug (bug_t2s) values (array[row(row(-1))::t2]);
session 2: select bug.bug_t2s[1].t2_t1.t1_i from bug;
[select correctly returns one row containing -1]
session 2: drop table bug;

Alter type of t1_i in session 1
session 1: alter table t1 alter column t1_i type varchar;
[Pause session 1 using gdb *right* after the call to find_composite_type_dependencies in ATRewriteTable returned]

Create the bug table in session 2, and insert record
session 2: create table bug (bug_t2s t2[]);
session 2: insert into bug (bug_t2s) values (array[row(row(-1))::t2]);
session 2: select bug.bug_t2s[1].t2_t1.t1_i from bug;
[select correctly returns one row containing -1]

Complete the alter table in session 1
[Resume session 1 using gdb]
session 1: select bug.bug_t2s[1].t2_t1.t1_i from bug;
[select returns bogus string. On my 8.5 debug+cassert build, its a long chain of \x7F\x7F\x7F\x...]

Don't have any good idea how to fix this, yet. If CheckAttributeType()
really *does* offer sufficient protected in the non-array case,
extending that to the general case might work. But OTOH it might equally
well be that a more sophisticated race exists even in the non-array
case, and I simply didn't manage to trigger it...

best regards, Florian Pflug