Re: BUG #14124: ON UPDATE CASCADE failure on repeated foreign key column

Поиск
Список
Период
Сортировка
От Manuel Gómez
Тема Re: BUG #14124: ON UPDATE CASCADE failure on repeated foreign key column
Дата
Msg-id CAJWnFaMinZQOVQiZVuYJQJLp=gmLo6qThrN7otWM-cC6P1AHuA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: BUG #14124: ON UPDATE CASCADE failure on repeated foreign key column  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-bugs
On Mon, May 2, 2016 at 7:39 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Hmm.  Such a constraint is specifically forbidden by the SQL standard,
> see SQL:2011 11.8 <referential constraint definition> syntax rule 8:
>
>         Each referencing column shall identify a column of the referencing
>         table, and the same column shall not be identified more than once.

That’s a shame.  I suppose it’s a reasonable rule to simplify
semantics, especially given that it can be worked around by adding
another column guaranteed by a CHECK constraint to be identical,
although the performance overhead is regrettable, of course.

> Can you produce a convincing use-case for this structure?

We have a certain sort of object in our database that comes in two
flavors: a primitive sort of object, and a derived sort of object.
Derived objects are always associated through a derivation
relationship to some primitive object, and primitive objects are not
associated to any object through derivation.  Additionally, these
objects are attributed with a label that is independent of whether
they are primitive or derived, but a derived object's label must match
the label of the primitive object to which it is associated through
derivation.  Many other database objects must make references to
objects of either flavor, so it’s rather important that they all live
in a single table, or it would be complicated to associate other
database objects to these using foreign key constraints.  Our solution
is thus:

create type label_type as enum ( 'foo', 'bar', 'baz'
);

create table T ( id uuid primary key, prototype uuid not null, label label_type not null, unique (id, label,
prototype),foreign key (prototype, label, prototype)   references T (id, label, prototype)   on update cascade 
);

create function set_T_id() returns trigger language 'plpgsql' as $$ begin   if new.id is null then     new.id :=
uuid_generate_v4();  end if;   if new.prototype is null then     new.prototype := new.id;   end if;   return new; end; 
$$;

create trigger set_T_id_trigger before insert on T
for each row execute procedure set_T_id();

The prototype field serves as the association between an object and
the one from whom it has been derived.  Objects of the primitive
flavor are represented by rows for which id = prototype (in this
sense, being primitive means they are their own prototype); objects of
the derived flavor are represented by rows for which id <> prototype,
and for such derived objects, the associated primitive object is
represented by another row whose id is the value of the prototype
attribute.  The trigger allows us to easily insert either primitive or
derived objects: primitive objects are inserted by not specifying a
prototype identifier, and derived objects specify the identifier of
their prototype; both get their own identifiers generated
automatically if it was not specified.

This is still a somewhat simplified example as our actual T table has
several more columns, but in any case, many other tables have foreign
keys to T(id).  This matches up quite precisely with the semantics we
require for the concept represented by T.  However, the «on update
cascade» bit causes issues:

ERROR:  multiple assignments to same column "prototype"
CONTEXT:  SQL statement "UPDATE ONLY "T" SET "prototype" = $1, "label"
= $2, "prototype" = $3 WHERE $4 OPERATOR(pg_catalog.=) "prototype" AND
$5::pg_catalog.anyenum OPERATOR(pg_catalog.=)
"type"::pg_catalog.anyenum AND $6 OPERATOR(pg_catalog.=) "prototype""
SQL statement "update "T"     set "prototype" = "prototype id parameter"     where "id" = "T id variable""
PL/pgSQL function "create derived T"(uuid) line […] at SQL statement

Indeed, we have worked around this for now by removing «on update
cascade», and we could further avoid this by not performing this
update or by implementing the logic for this update in a custom
trigger.  This isn’t a show-stopper for us.  The only reason we
perform that update in the «create derived T» function is that we have
another similar «create primitive T» function, and creating a derived
T must first perform the same steps to create a primitive T (in terms
of other features of our schema not present in this simplified
version), so we simply call one function from the other, and at the
end, make the new T derived.  In any case, this is not really relevant
to the crux of the issue; it’s just context for that specific context
for the error and we can easily work around it.

> It would be difficult to support that, AFAICS, because the FK enforcement
> queries do not depend on things like whether the specific values involved
> are equal.  Nor would one generally expect that an FK constraint should be
> allowed to restrict what can be stored in the referenced table.

Not generally, but this can certainly happen with the combination of
FK and CHECK.  I don’t mean to imply this is a reasonable way to
design a schema, only that it’s possible. :)

> Moreover, it's not even real clear to me that there is a well defined
> notion of "equal" in such a situation.  If x and y are of different types,
> it's possible they would have different equality rules.  (This is somewhat
> constrained by the fact that z would have to have equality operators in
> common with both; but I do not think that's quite sufficient to remove the
> concern.)
>
> It could be argued that we should hew strictly to the letter of the SQL
> standard and forbid foreign-key constraints with duplicate referencing
> columns.  Or maybe allow it only if the FK action is not one where we'd
> run into this problem.  I'm not terribly excited about trying to support
> the case, though.

Indeed, I had not considered issues with missing equality operators,
and looking at the logic in ATAddForeignKeyConstraint and
RI_FKey_cascade_upd, I see that it’s a problem.  It could certainly be
supported by checking for appropriate operators or even cheaply
restricting this to all the types matching up exactly.

As this is all sadly nonstandard, I understand if there’s no intent to
change the current state of things.  It would certainly help to
clarify these things in the documentation, but that would require
deciding if and how to support such nonstandard foreign key
constraints.



В списке pgsql-bugs по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: BUG #14124: ON UPDATE CASCADE failure on repeated foreign key column
Следующее
От: Pavel Golub
Дата:
Сообщение: Re: BUG #14064: Sort order of bytea, etc. not defined