Re: [HACKERS] GSoC 2017: Foreign Key Arrays

Поиск
Список
Период
Сортировка
От Andreas Karlsson
Тема Re: [HACKERS] GSoC 2017: Foreign Key Arrays
Дата
Msg-id 76a8d3d8-a22e-3299-7c4e-6e115dbf3762@proxel.se
обсуждение исходный текст
Ответ на Re: [HACKERS] GSoC 2017: Foreign Key Arrays  (Mark Rofail <markm.rofail@gmail.com>)
Ответы Re: [HACKERS] GSoC 2017: Foreign Key Arrays  (Mark Rofail <markm.rofail@gmail.com>)
Список pgsql-hackers
Sorry for the very late review.

I like this feature and have needed it myself in the past, and the 
current syntax seems pretty good. One could argue for if the syntax 
could be generalized to support other things like json and hstore, but I 
do not think it would be fair to block this patch due to that.

== Limitations of the current design

1) Array element foreign keys can only be specified at the table level 
(not at columns): I think this limitation is fine. Other PostgreSQL 
specific features like exclusion contraints can also only be specified 
at the table level.

2) Lack of support for SET NULL and SET DEFAULT: these do not seem very 
useful for arrays.

3) Lack of support for specifiying multiple arrays in the foreign key: 
seems like a good thing to me since it is not obvious what such a thing 
even would do.

4) That you need to add a cast to the index if you have different types: 
due to there not being a int4[] <@ int2[] operator you need to add an 
index on (col::int4[]) to speed up deletes and updates. This one i 
annoying since EXPLAIN wont give you the query plans for the foreign key 
queries, but I do not think fixing this should be within the scope of 
the patch and that having a smaller interger in the referring table is rare.

5) The use of count(DISTINCT) requiring types to support btree equality: 
this has been discussed a lot up-thread and I think the current state is 
good enough.

== Functional review

I have played around some with it and things seem to work and the test 
suite passes, but I noticed a couple of strange behaviors.

1) MATCH FULL does not seem to care about NULLS in arrays. In the 
example below I expected both inserts into the referring table to fail.

CREATE TABLE t (x int, y int, PRIMARY KEY (x, y));
CREATE TABLE fk (x int, ys int[], FOREIGN KEY (x, EACH ELEMENT OF ys) 
REFERENCES t MATCH FULL);
INSERT INTO t VALUES (10, 1);
INSERT INTO fk VALUES (10, '{1,NULL}');
INSERT INTO fk VALUES (NULL, '{1}');

CREATE TABLE
CREATE TABLE
INSERT 0 1
INSERT 0 1
ERROR:  insert or update on table "fk" violates foreign key constraint 
"fk_x_fkey"
DETAIL:  MATCH FULL does not allow mixing of null and nonnull key values.

2) To me it was not obvious that ON DELETE CASCADE would delete the 
whole rows rather than delete the members from the array, and this kind 
of misunderstanding can lead to pretty bad surprises in production. I am 
leaning towards not supporting CASCADE.

== The @>> operator

A previous version of your patch added the "anyelement <<@ anyarray" 
operator to avoid having to build arrays, but that part was reverted due 
to a bug.

I am not expert on the gin code, but as far as I can tell it would be 
relatively simple to fix that bug. Just allocate an array of Datums of 
length one where you put the element you are searching for (or maybe a 
copy of it).

Potential issues with adding the operators:

1) Do we really want to add an operator just for array element foreign 
keys? I think this is not an issue since it seems like it should be 
useful in general. I know I have wanted it myself.

2) I am not sure, but the committers might prefer if adding the 
operators is done in a separate patch.

3) Bikeshedding about operator names. I personally think @>> is clear 
enough and as far as I know it is not used for anything else.

== Code review

The patch no longer applies to HEAD, but the conflicts are small.

I think we should be more consistent in the naming, both in code and in 
the documentation. Right now we have "array foreign keys", "element 
foreign keys", "ELEMENT foreign keys", etc.

+       /*
+        * If this is an array foreign key, we must look up the 
operators for
+        * the array element type, not the array type itself.
+        */
+       if (fkreftypes[i] != FKCONSTR_REF_PLAIN)

+           if (fkreftypes[i] != FKCONSTR_REF_PLAIN)
+           {
+               old_fktype = get_base_element_type(old_fktype);
+               /* this shouldn't happen ... */
+               if (!OidIsValid(old_fktype))
+                   elog(ERROR, "old foreign key column is not an array");
+           }

+       if (riinfo->fk_reftypes[i] != FKCONSTR_REF_PLAIN)
+       {
+           riinfo->has_array = true;
+           riinfo->ff_eq_oprs[i] = ARRAY_EQ_OP;
+       }

In the three diffs above it would be much cleaner to check for "== 
FKCONSTR_REF_EACH_ELEMENT" since that better conveys the intent and is 
safer for adding new types in the future.

+           /* We look through any domain here */
+           fktype = get_base_element_type(fktype);

What does the comment above mean?
        if (!(OidIsValid(pfeqop) && OidIsValid(ffeqop)))            ereport(ERROR,
(errcode(ERRCODE_DATATYPE_MISMATCH),
-                    errmsg("foreign key constraint \"%s\" "
-                           "cannot be implemented",
-                           fkconstraint->conname),
-                    errdetail("Key columns \"%s\" and \"%s\" "
-                              "are of incompatible types: %s and %s.",
+               errmsg("foreign key constraint \"%s\" cannot be 
implemented",
+                      fkconstraint->conname),
+                    errdetail("Key columns \"%s\" and \"%s\" are of 
incompatible types: %s and %s.",                               strVal(list_nth(fkconstraint->fk_attrs, i)),
                 strVal(list_nth(fkconstraint->pk_attrs, i)),
 
-                              format_type_be(fktype),
+                              format_type_be(fktypoid[i]),                               format_type_be(pktype))));

The above diff looks like pointless code churn which possibly introduces 
a bug in how it changed fktype to fktypoid[i].

I think the code in RI_Initial_Check() would be cleaner if you used 
"CROSS JOIN LATERAL unnest(col)" rather than having unnest() in the 
target list. This way you would not need to rename all columns and the 
code paths for the array case could look more like the code path for the 
normal case.

== Minor things in the code

+           {
+               pk_attrs = lappend(pk_attrs, arg);
+               fk_reftypes = lappend_int(fk_reftypes, FKCONSTR_REF_PLAIN);
+           }

This is incorrectly indented.

I suggest that you should only allocate countbuf in RI_FKey_check() if 
has_array is set.

I think the code would be more readable if both branches of 
ri_GenerateQual() used the same pattern for whitespace so the 
differences are easier to spot.

You can use DROP TABLE IF EXISTS to silence the "ERROR:  table 
"fktableforarray" does not exist" errors.

I am not sure that you need to test all combinations of ON UPDATE and ON 
DELETE. I think it should be enough to test one of each ON UPDATE and 
one of each ON DELETE should be enough.

+-- Create the foreign table

It is probably a bad idea to call the referencing table the foreign table.

+-- Repeat a similar test using INT4 keys coerced from INT2

This comment is repeated twice in the test suite.

== Documentation

Remove the ELEMENT REFERENCES form from 
doc/src/sgml/ref/create_table.sgml since we no longer support it.

-  FOREIGN KEY ( <replaceable 
class="parameter">column_name</replaceable> [, ... ] ) REFERENCES 
<replaceable class="parameter">reftable</replaceable> [ ( <replaceable 
class="parameter">refcolumn</replaceable> [, ... ] ) ]
+  FOREIGN KEY ( [ELEMENT] <replaceable 
class="parameter">column_name</replaceable> [, ... ] ) REFERENCES 
<replaceable class="parameter">reftable</replaceable> [ ( <replaceable 
class="parameter">refcolumn</replaceable> [, ... ] ) ]

Change this to "FOREIGN KEY ( [EACH ELEMENT OF] ...".

-   <term><literal>FOREIGN KEY ( <replaceable 
class="parameter">column_name</replaceable> [, ... ] )
+   <term><literal>FOREIGN KEY ( [ELEMENT] <replaceable 
class="parameter">column_name</replaceable> [, ... ] )

Change this to "... FOREIGN KEY ( [EACH ELEMENT OF] ...".

+        <literal>ELEMENT</literal> keyword and <replaceable

Change this to "... <literal>EACH ELEMENT OF</literal> ...". Maybe you 
should also fix other instances of ELEMENT in the same paragraph but 
these are less obvious.

You should remove the "ELEMENT REFERENCES" section of 
doc/src/sgml/ref/create_table.sgml, and move any still relevant bits 
elsewhere since we do not support this syntax.

The sql-createtable-foreign-key-arrays section need to be updated to 
remove references to "ELEMENT REFERENCES".

Your indentation in doc/src/sgml/catalogs.sgml is two spaces but the 
rest of the file looks like it uses 1 space indentation. Additionally 
you seem to have accidentally reindented something which was correctly 
indented.

Same with the idnentation in doc/src/sgml/ddl.sgml and 
doc/src/sgml/ref/create_table.sgml.

-   <varlistentry>
+  <varlistentry id="sql-createtable-foreign-key" xreflabel="FOREIGN KEY">

You have accidentally reindented this in doc/src/sgml/ref/create_table.sgml.

The paragraph in doc/src/sgml/ref/create_table.sgml which starts with 
"In case the column name" seems to actually be multiple paragraphs. Is 
that intentional or a mistake?

The documentation in doc/src/sgml/ddl.sgml mentions that "it must be 
written in table constraint form" for when you have multiple columns, 
but I feel this is just redundant and confusing since this is always 
true, both for array foreign keys and for when you have multiple columns.

The documentation in doc/src/sgml/ddl.sgml should mention that we only 
support one array reference per foreign key.

Maybe the documentation in doc/src/sgml/ddl.sgml should mention that we 
only support the table constraint form.

Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Thomas Munro
Дата:
Сообщение: Re: [HACKERS] Parallel Hash take II
Следующее
От: Andreas Karlsson
Дата:
Сообщение: Re: [HACKERS] git down