Fix initdb's unsafe not-null-marking rule

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Fix initdb's unsafe not-null-marking rule
Дата
Msg-id 204760.1595181800@sss.pgh.pa.us
обсуждение исходный текст
Список pgsql-hackers
Part of the blame for the pg_subscription.subslotname fiasco can be laid
at the feet of initdb's default rule for marking columns NOT NULL; that
rule is fairly arbitrary and does not guarantee to make safe choices.
I propose that we change it so that it *is* safe, ie it will only mark
fields NOT NULL if they'd certainly be safe to access as C struct fields.

Keeping the end results the same requires a few more manual applications
of BKI_FORCE_NOT_NULL than we had before.  But I think that that's fine,
because it reduces the amount of poorly-documented magic in this area.
I note in particular that bki.sgml was entirely failing to tell the full
truth.

(Note: this would allow reverting the manual BKI_FORCE_NULL label that
I just added to pg_subscription.subslotname, but I feel no great desire
to do that.)

I propose this only for HEAD, not the back branches.

            regards, tom lane

diff --git a/doc/src/sgml/bki.sgml b/doc/src/sgml/bki.sgml
index 6776c4a3c1..5b871721d1 100644
--- a/doc/src/sgml/bki.sgml
+++ b/doc/src/sgml/bki.sgml
@@ -119,7 +119,8 @@
    require all columns that should be non-nullable to be marked so
    in <structname>pg_attribute</structname>.  The bootstrap code will
    automatically mark catalog columns as <literal>NOT NULL</literal>
-   if they are fixed-width and are not preceded by any nullable column.
+   if they are fixed-width and are not preceded by any nullable or
+   variable-width column.
    Where this rule is inadequate, you can force correct marking by using
    <literal>BKI_FORCE_NOT_NULL</literal>
    and <literal>BKI_FORCE_NULL</literal> annotations as needed.  But note
diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c
index 5480a024e0..45b7efbe46 100644
--- a/src/backend/bootstrap/bootstrap.c
+++ b/src/backend/bootstrap/bootstrap.c
@@ -770,25 +770,18 @@ DefineAttr(char *name, char *type, int attnum, int nullness)

         /*
          * Mark as "not null" if type is fixed-width and prior columns are
-         * too.  This corresponds to case where column can be accessed
-         * directly via C struct declaration.
-         *
-         * oidvector and int2vector are also treated as not-nullable, even
-         * though they are no longer fixed-width.
+         * likewise fixed-width and not-null.  This corresponds to case where
+         * column can be accessed directly via C struct declaration.
          */
-#define MARKNOTNULL(att) \
-        ((att)->attlen > 0 || \
-         (att)->atttypid == OIDVECTOROID || \
-         (att)->atttypid == INT2VECTOROID)
-
-        if (MARKNOTNULL(attrtypes[attnum]))
+        if (attrtypes[attnum]->attlen > 0)
         {
             int            i;

             /* check earlier attributes */
             for (i = 0; i < attnum; i++)
             {
-                if (!attrtypes[i]->attnotnull)
+                if (attrtypes[i]->attlen <= 0 ||
+                    !attrtypes[i]->attnotnull)
                     break;
             }
             if (i == attnum)
diff --git a/src/backend/catalog/genbki.pl b/src/backend/catalog/genbki.pl
index b07537fbba..dc5f442397 100644
--- a/src/backend/catalog/genbki.pl
+++ b/src/backend/catalog/genbki.pl
@@ -713,8 +713,8 @@ sub gen_pg_attribute
         push @tables_needing_macros, $table_name;

         # Generate entries for user attributes.
-        my $attnum       = 0;
-        my $priornotnull = 1;
+        my $attnum          = 0;
+        my $priorfixedwidth = 1;
         foreach my $attr (@{ $table->{columns} })
         {
             $attnum++;
@@ -722,8 +722,12 @@ sub gen_pg_attribute
             $row{attnum}   = $attnum;
             $row{attrelid} = $table->{relation_oid};

-            morph_row_for_pgattr(\%row, $schema, $attr, $priornotnull);
-            $priornotnull &= ($row{attnotnull} eq 't');
+            morph_row_for_pgattr(\%row, $schema, $attr, $priorfixedwidth);
+
+            # Update $priorfixedwidth --- must match morph_row_for_pgattr
+            $priorfixedwidth &=
+              ($row{attnotnull} eq 't'
+                  && ($row{attlen} eq 'NAMEDATALEN' || $row{attlen} > 0));

             # If it's bootstrapped, put an entry in postgres.bki.
             print_bki_insert(\%row, $schema) if $table->{bootstrap};
@@ -765,13 +769,13 @@ sub gen_pg_attribute

 # Given $pgattr_schema (the pg_attribute schema for a catalog sufficient for
 # AddDefaultValues), $attr (the description of a catalog row), and
-# $priornotnull (whether all prior attributes in this catalog are not null),
+# $priorfixedwidth (all prior columns are fixed-width and not null),
 # modify the $row hashref for print_bki_insert.  This includes setting data
 # from the corresponding pg_type element and filling in any default values.
 # Any value not handled here must be supplied by caller.
 sub morph_row_for_pgattr
 {
-    my ($row, $pgattr_schema, $attr, $priornotnull) = @_;
+    my ($row, $pgattr_schema, $attr, $priorfixedwidth) = @_;
     my $attname = $attr->{name};
     my $atttype = $attr->{type};

@@ -801,19 +805,18 @@ sub morph_row_for_pgattr
     {
         $row->{attnotnull} = 'f';
     }
-    elsif ($priornotnull)
+    elsif ($priorfixedwidth)
     {

         # attnotnull will automatically be set if the type is
-        # fixed-width and prior columns are all NOT NULL ---
-        # compare DefineAttr in bootstrap.c. oidvector and
-        # int2vector are also treated as not-nullable.
+        # fixed-width and prior columns are likewise fixed-width
+        # and NOT NULL --- compare DefineAttr in bootstrap.c.
+        # At this point the width of type name is still symbolic,
+        # so we need a special test.
         $row->{attnotnull} =
-            $type->{typname} eq 'oidvector'  ? 't'
-          : $type->{typname} eq 'int2vector' ? 't'
-          : $type->{typlen} eq 'NAMEDATALEN' ? 't'
-          : $type->{typlen} > 0              ? 't'
-          :                                    'f';
+            $row->{attlen} eq 'NAMEDATALEN' ? 't'
+          : $row->{attlen} > 0              ? 't'
+          :                                   'f';
     }
     else
     {
diff --git a/src/include/catalog/genbki.h b/src/include/catalog/genbki.h
index 4a6c8636da..8cac7ec878 100644
--- a/src/include/catalog/genbki.h
+++ b/src/include/catalog/genbki.h
@@ -46,8 +46,8 @@
 /*
  * Variable-length catalog fields (except possibly the first not nullable one)
  * should not be visible in C structures, so they are made invisible by #ifdefs
- * of an undefined symbol.  See also MARKNOTNULL in bootstrap.c for how this is
- * handled.
+ * of an undefined symbol.  See also the BOOTCOL_NULL_AUTO code in bootstrap.c
+ * for how this is handled.
  */
 #undef CATALOG_VARLEN

diff --git a/src/include/catalog/pg_index.h b/src/include/catalog/pg_index.h
index d3d7ea77fb..4a642f336f 100644
--- a/src/include/catalog/pg_index.h
+++ b/src/include/catalog/pg_index.h
@@ -44,12 +44,14 @@ CATALOG(pg_index,2610,IndexRelationId) BKI_SCHEMA_MACRO
     bool        indisreplident; /* is this index the identity for replication? */

     /* variable-length fields start here, but we allow direct access to indkey */
-    int2vector    indkey;            /* column numbers of indexed cols, or 0 */
+    int2vector    indkey BKI_FORCE_NOT_NULL;    /* column numbers of indexed cols,
+                                             * or 0 */

 #ifdef CATALOG_VARLEN
-    oidvector    indcollation;    /* collation identifiers */
-    oidvector    indclass;        /* opclass identifiers */
-    int2vector    indoption;        /* per-column flags (AM-specific meanings) */
+    oidvector    indcollation BKI_FORCE_NOT_NULL;    /* collation identifiers */
+    oidvector    indclass BKI_FORCE_NOT_NULL;    /* opclass identifiers */
+    int2vector    indoption BKI_FORCE_NOT_NULL;    /* per-column flags
+                                                 * (AM-specific meanings) */
     pg_node_tree indexprs;        /* expression trees for index attributes that
                                  * are not simple column references; one for
                                  * each zero entry in indkey[] */
diff --git a/src/include/catalog/pg_partitioned_table.h b/src/include/catalog/pg_partitioned_table.h
index a73cd0d3a4..7ee0419373 100644
--- a/src/include/catalog/pg_partitioned_table.h
+++ b/src/include/catalog/pg_partitioned_table.h
@@ -41,13 +41,17 @@ CATALOG(pg_partitioned_table,3350,PartitionedRelationId)
      * field of a heap tuple can be reliably accessed using its C struct
      * offset, as previous fields are all non-nullable fixed-length fields.
      */
-    int2vector    partattrs;        /* each member of the array is the attribute
-                                 * number of a partition key column, or 0 if
-                                 * the column is actually an expression */
+    int2vector    partattrs BKI_FORCE_NOT_NULL;    /* each member of the array is
+                                                 * the attribute number of a
+                                                 * partition key column, or 0
+                                                 * if the column is actually
+                                                 * an expression */

 #ifdef CATALOG_VARLEN
-    oidvector    partclass;        /* operator class to compare keys */
-    oidvector    partcollation;    /* user-specified collation for keys */
+    oidvector    partclass BKI_FORCE_NOT_NULL;    /* operator class to compare
+                                                 * keys */
+    oidvector    partcollation BKI_FORCE_NOT_NULL;    /* user-specified
+                                                     * collation for keys */
     pg_node_tree partexprs;        /* list of expressions in the partition key;
                                  * one item for each zero entry in partattrs[] */
 #endif
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 65e8c9f054..b50fa25dbd 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -92,7 +92,7 @@ CATALOG(pg_proc,1255,ProcedureRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(81,Proce
      */

     /* parameter types (excludes OUT params) */
-    oidvector    proargtypes BKI_LOOKUP(pg_type);
+    oidvector    proargtypes BKI_LOOKUP(pg_type) BKI_FORCE_NOT_NULL;

 #ifdef CATALOG_VARLEN

diff --git a/src/include/catalog/pg_statistic_ext.h b/src/include/catalog/pg_statistic_ext.h
index a8cb16997a..8747903fc7 100644
--- a/src/include/catalog/pg_statistic_ext.h
+++ b/src/include/catalog/pg_statistic_ext.h
@@ -47,7 +47,7 @@ CATALOG(pg_statistic_ext,3381,StatisticExtRelationId)
      * variable-length fields start here, but we allow direct access to
      * stxkeys
      */
-    int2vector    stxkeys;        /* array of column keys */
+    int2vector    stxkeys BKI_FORCE_NOT_NULL; /* array of column keys */

 #ifdef CATALOG_VARLEN
     char        stxkind[1] BKI_FORCE_NOT_NULL;    /* statistics kinds requested
diff --git a/src/include/catalog/pg_trigger.h b/src/include/catalog/pg_trigger.h
index 9612b9bdd6..fa5761b784 100644
--- a/src/include/catalog/pg_trigger.h
+++ b/src/include/catalog/pg_trigger.h
@@ -54,7 +54,8 @@ CATALOG(pg_trigger,2620,TriggerRelationId)
      * Variable-length fields start here, but we allow direct access to
      * tgattr. Note: tgattr and tgargs must not be null.
      */
-    int2vector    tgattr;            /* column numbers, if trigger is on columns */
+    int2vector    tgattr BKI_FORCE_NOT_NULL;    /* column numbers, if trigger is
+                                             * on columns */

 #ifdef CATALOG_VARLEN
     bytea        tgargs BKI_FORCE_NOT_NULL;    /* first\000second\000tgnargs\000 */

В списке pgsql-hackers по дате отправления:

Предыдущее
От: "David G. Johnston"
Дата:
Сообщение: Re: Default setting for enable_hashagg_disk
Следующее
От: Tom Lane
Дата:
Сообщение: Re: pg_subscription.subslotname is wrongly marked NOT NULL