Обсуждение: Make PG's "NOT NULL"s and attnotnull ("is_nullable") conform to SQL-2011

Поиск
Список
Период
Сортировка

Make PG's "NOT NULL"s and attnotnull ("is_nullable") conform to SQL-2011

От
Vitaly Burovoy
Дата:
Hello, Hackers!

TODO list has an entry "Move NOT NULL constraint information to
pg_constraint" with four links and without two with the newest
work[1][2].

I rebased the patch from [2] (in attachment). At least it applies
cleanly on top of c477e84fe2471cb675234fce75cd6bb4bc2cf481 and does
not generate a core dump during "make check". There are no tests for
it and it fails "make check" (by difference) which leads inability to
run "make check-world".


But before starting working on it I had a look at the SQL-2011
standard (ISO/IEC 9075-2)[3] and found that:

1. A name for a "NOT NULL" constraint <NNC> can be given by a table
definition (subcl. 11.4, "Format"->"column constraint definition").
2. The standard splits NNC and CHECK constraints (subcl. 11.4,
"Format"-> "column constraint")
3. A _descriptor_ of a column must include a "nullability
characteristic" (subcl. 11.4 GR 4.d and 4.L).
4. At the same time the _descriptor_ of the column must include an
_indication_ whether the column is defined as "NOT NULL" or not (near
subcl. 4.13 Note 41) and a name of the constraint.
5. "Nullability characteristic" is set only by nondeferrable
constraints (subcl. 4.13) but there can be several constraints which
have influence on it (e.g. PK + CHECK but without NNC => is_nullable).
6. If an SQL-implementation can deduce CHECK constraint and/or DOMAIN
constraint to "column IS NULL" can never be TRUE it can claim support
for the feature "T101. Enhanced nullability determination" (subcl.
4.13, Note 38). See also (subcl. 6.35 SR 4.b)
7. NNC can be "deferrable" via "constraint characteristics" which are
set in addition to "column constraint" (i.e. the column can be "NOT
NULL" _and_ not "is_nullable").
8. NNC is an _equivalent_ to a table CHECK constraint (subcl. 11.4 SR 17.a).
9. There is no way to set NNC for a table except "... ALTER COLUMN ...
SET NOT NULL" clause (subcl. 11.6).
10. "... ALTER COLUMN ... SET NOT NULL" clause doesn't allow to
specify name of the constraint (subcl. 11.15).
11. There is no way to specify more than one NNC per column via "SET
NOT NULL" (subcl. 11.15 GR 1)
12. At the same time in (subcl. 4.13) mentioned there can be "at least
one NNC" (may be via inheritance?).
13. "... ALTER COLUMN ... SET NOT NULL" _must_ add a table CHECK
constraint (subcl. 11.15 GR 1.d).
14. "DROP NOT NULL" clause must drop cascading all NNC, but leave
other constraints that may affect "nullability characteristic" (subcl.
11.16 GR 1.*).
15. PK can have NULL values if its "constraint characteristics" is "deferrable".
16. There is no mention of "NOT NULL" constraints in the (ISO/IEC
9075-11) at all.

===
Shortcuts:
subcl: Subclause
GR: General Rule
SR: Syntax Rule


Conclusion:
I. NNC implies CHECK constraints (p.13) but it is not a constraint
itself (p.4, p.9, p.13, p.16; opposite to p.2)
II. CHECK constraint does not imply NNC but it has influence on
"attnotnull" (deep check of CHECK constraints allows to move the
feature "T101" to the supporting features list).
III. "pg_attribute" table should have an "attnotnullid oid" as an
indicator of "NOT NULL" (p.4) and points to a CHECK constraint; It is
in addition to a "Nullability characteristic" "attnotnull" (p.3).
IV. "pg_constraint" should have a column "connotnullkey int2[]" as a
"list of the nullable columns" which references to
"pg_attribute.attnum" for fast checking whether a column is still
nullable after deleting/updating constraints or not. Array is
necessary for cases like "CHECK ((col1 IS NOT NULL) AND (col2 IS NOT
NULL))" and for nondeferrable PKs.
V. Inherited tables inherit CHECK constraints (from p.I), but that
constraints are not written to the "attnotnullid" (from p.II) even if
they have NULL values.
VI. "pg_constraint" _can_ have a column "connotnullpure BOOLEAN" to
skip CHECK constraints which define "NOT NULL" only (for one or
several columns) because a row has already checked for NULLs via
"attnotnull" just before ExecRelCheck is executed.
VII. "connotnullkey" is NULL for deferrable and "NOT VALID" constraints.
VIII. "connotnullkey" is recalculated after "VALIDATE CONSTRAINT" is done.
IX. "attnotnull" is recalculated if a constraint with nonempty
"connotnullkey" is inserted, deleted or "connotnullkey" is changed.
X. Pure CHECK constraint doesn't do full scan if the appropriate
table's column(s) has(ve) "attnotnull" as "TRUE".
XI. pg_dump shows "NOT NULL" iff "attnotnullid IS NOT NULL" and skip
CHECK statement with oid matched with attnotnullid.

What do you think about that design?

P.S.:
Since the SQL standard defines that "col NOT NULL" as an equivalent to
"CHECK (col IS NOT NULL)" (p.8) what to do with that behavior:

postgres=# create type t as (x int);
CREATE TYPE
postgres=# SELECT v, v IS NOT NULL AS should_be_in_table FROM
(VALUES('(1)'::t),('()'),(NULL)) AS x(v);
  v  | should_be_in_table
-----+--------------------
 (1) | t
 ()  | f
     | f
(3 rows)

"attnotnull" in such case is stricter, like "CHECK (col IS DISTINCT FROM NULL)".

Should such values (with NULL in each attribute of a composite type)
violate NOT NULL constraints?

===
[1]http://www.postgresql.org/message-id/flat/1343682669-sup-2532@alvh.no-ip.org
[2]http://www.postgresql.org/message-id/20160109030002.GA671800@alvherre.pgsql
[3]http://www.wiscorp.com/sql20nn.zip

Вложения

Re: Make PG's "NOT NULL"s and attnotnull ("is_nullable") conform to SQL-2011

От
Vitaly Burovoy
Дата:
On 2/7/16, Vitaly Burovoy <vitaly.burovoy@gmail.com> wrote:
> Hello, Hackers!
>
> TODO list has an entry "Move NOT NULL constraint information to
> pg_constraint" with four links and without two with the newest
> work[1][2].
>
> I rebased the patch from [2] (in attachment). At least it applies
> cleanly on top of c477e84fe2471cb675234fce75cd6bb4bc2cf481 and does
> not generate a core dump during "make check". There are no tests for
> it and it fails "make check" (by difference) which leads inability to
> run "make check-world".

It seems the file I attached has more than necessary changes.

Please, find a correct patch attached.

> ===
> [1]http://www.postgresql.org/message-id/flat/1343682669-sup-2532@alvh.no-ip.org
> [2]http://www.postgresql.org/message-id/20160109030002.GA671800@alvherre.pgsql

--
Best regards,
Vitaly Burovoy

Вложения

Re: Make PG's "NOT NULL"s and attnotnull ("is_nullable") conform to SQL-2011

От
Alvaro Herrera
Дата:
Vitaly Burovoy wrote:

Hi,

> But before starting working on it I had a look at the SQL-2011
> standard (ISO/IEC 9075-2)[3] and found that:
> 
> 1. A name for a "NOT NULL" constraint <NNC> can be given by a table
> definition (subcl. 11.4, "Format"->"column constraint definition").
> 2. The standard splits NNC and CHECK constraints (subcl. 11.4,
> "Format"-> "column constraint")

Point 2 is where things differ from what I remember; my (possibly
flawed) understanding was that there's no difference between those
things.  Many (maybe all) of the things from this point on are probably
fallout from that one change.

> III. "pg_attribute" table should have an "attnotnullid oid" as an
> indicator of "NOT NULL" (p.4) and points to a CHECK constraint; It is
> in addition to a "Nullability characteristic" "attnotnull" (p.3).
> IV. "pg_constraint" should have a column "connotnullkey int2[]" as a
> "list of the nullable columns" which references to
> "pg_attribute.attnum" for fast checking whether a column is still
> nullable after deleting/updating constraints or not. Array is
> necessary for cases like "CHECK ((col1 IS NOT NULL) AND (col2 IS NOT
> NULL))" and for nondeferrable PKs.

I think these points warrant some more consideration.  I don't like the
idea that pg_attribute and pg_constraint are both getting considerably
bloated to support this.

> P.S.:
> Since the SQL standard defines that "col NOT NULL" as an equivalent to
> "CHECK (col IS NOT NULL)" (p.8) what to do with that behavior:
> 
> postgres=# create type t as (x int);
> CREATE TYPE
> postgres=# SELECT v, v IS NOT NULL AS should_be_in_table FROM
> (VALUES('(1)'::t),('()'),(NULL)) AS x(v);
>   v  | should_be_in_table
> -----+--------------------
>  (1) | t
>  ()  | f
>      | f
> (3 rows)
> 
> "attnotnull" in such case is stricter, like "CHECK (col IS DISTINCT FROM NULL)".
> 
> Should such values (with NULL in each attribute of a composite type)
> violate NOT NULL constraints?

I wonder if the standard has a concept of null composite values.  If
not, then there is no difference between IS NOT NULL and IS DISTINCT
FROM NULL, which explains why they define NNC in terms of the former.


I think your email was too hard to read because of excessive density,
which would explain the complete lack of response.  I haven't had the
chance to work on this topic again, but I encourage you to, if you have
the resources.  (TBH I haven't had the chance to study your proposed
design in detail, either).

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Make PG's "NOT NULL"s and attnotnull ("is_nullable") conform to SQL-2011

От
Vitaly Burovoy
Дата:
I'm sorry for the late answer.

On 4/27/16, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Vitaly Burovoy wrote:
>
> Hi,
>
>> But before starting working on it I had a look at the SQL-2011
>> standard (ISO/IEC 9075-2)[3] and found that:
>>
>> 1. A name for a "NOT NULL" constraint <NNC> can be given by a table
>> definition (subcl. 11.4, "Format"->"column constraint definition").
>> 2. The standard splits NNC and CHECK constraints (subcl. 11.4,
>> "Format"-> "column constraint")
>
> Point 2 is where things differ from what I remember; my (possibly
> flawed) understanding was that there's no difference between those
> things.  Many (maybe all) of the things from this point on are probably
> fallout from that one change.

It is just mentioning that CHECK constraints have influence on
nullability characteristic, but it differs from NNC.
NNC creates CHECK constraint, but not vice versa. You can create
several CHECK "col IS NOT NULL" constraints, but only one NNC (several
ones by inheritance only?). And DROP NOT NULL should drop only those
CHECK that is linked with NNC (and inherited), but no more (full
explanation is in my initial letter).

>> III. "pg_attribute" table should have an "attnotnullid oid" as an
>> indicator of "NOT NULL" (p.4) and points to a CHECK constraint; It is
>> in addition to a "Nullability characteristic" "attnotnull" (p.3).
>> IV. "pg_constraint" should have a column "connotnullkey int2[]" as a
>> "list of the nullable columns" which references to
>> "pg_attribute.attnum" for fast checking whether a column is still
>> nullable after deleting/updating constraints or not. Array is
>> necessary for cases like "CHECK ((col1 IS NOT NULL) AND (col2 IS NOT
>> NULL))" and for nondeferrable PKs.
>
> I think these points warrant some more consideration. I don't like the
> idea that pg_attribute and pg_constraint are both getting considerably
> bloated to support this.

Ok, I'm ready for a discussion.

Two additional columns are necessary: one for pointing to an
underlying CHECK constraint (or boolean column whether current CHECK
is NNC or not) and second for fast computation of "attnotnull" (which
means "nullable characteristic") and ability to skip check if
"attnotnull" is set but not triggered (I think it'll improve
performance for inherited tables).

I think placing the first column (attnotnullid) to pg_attribute is
better because you can't have more than one value in it.

The second is obviously should be placed in pg_constraint.

>> P.S.:
>> Since the SQL standard defines that "col NOT NULL" as an equivalent to
>> "CHECK (col IS NOT NULL)" (p.8) what to do with that behavior:
>>
>> postgres=# create type t as (x int);
>> CREATE TYPE
>> postgres=# SELECT v, v IS NOT NULL AS should_be_in_table FROM
>> (VALUES('(1)'::t),('()'),(NULL)) AS x(v);
>>   v  | should_be_in_table
>> -----+--------------------
>>  (1) | t
>>  ()  | f
>>      | f
>> (3 rows)
>>
>> "attnotnull" in such case is stricter, like "CHECK (col IS DISTINCT FROM
>> NULL)".
>>
>> Should such values (with NULL in each attribute of a composite type)
>> violate NOT NULL constraints?
>
> I wonder if the standard has a concept of null composite values.  If
> not, then there is no difference between IS NOT NULL and IS DISTINCT
> FROM NULL, which explains why they define NNC in terms of the former.

Yes, it has. The PG's composite type is "Row types" (subcl.4.8) in the standard.

The standard also differentiates IS [NOT] NULL and IS [NOT] DISTINCT FROM:

>>> Subcl. 8.8 <null predicate>:
>>> ...
>>> 1) Let R be the <row value predicand> and let V be the value of R.
>>> 2) Case:
>>>  a) If V is the null value, then “R IS NULL” is True and
>>>       the value of “R IS NOT NULL” is False.
>>>  b) Otherwise:
>>>   i) The value of “R IS NULL” is
>>>    Case:
>>>    1) If the value of every field of V is the null value, then True.
>>>    2) Otherwise, False.
>>> ...
>>>
>>> Subcl. 8.15 <distinct predicate>
>>> ...
>>> 1) Let V1 be the value of <row value predicand 3> and let V2 be the value of <row value predicand 4>.
>>> ...
>>>  b) If V1 is the null value and V2 is not the null value, or if V1 is not the null value and V2 is the null
>>> value, then the result is True.
>>> ...

In subcl.8.8 "each column" is mentioned, in 8.15 if one of value is
the null value and the other is not then nothing more is checked and
True is returned.

> I think your email was too hard to read because of excessive density,
> which would explain the complete lack of response.

Hmm. I decided it was "silently approved". =)

> I haven't had the chance to work on this topic again, but I encourage you to,
> if you have the resources.

Thank you, I think I'll find a time for it no earlier than the summer.

> (TBH I haven't had the chance to study your proposed design in detail, either).

I hope somebody find a time to study it before someone sends a proposal.

--
Best regards,
Vitaly Burovoy



Re: Make PG's "NOT NULL"s and attnotnull ("is_nullable") conform to SQL-2011

От
Tom Lane
Дата:
Vitaly Burovoy <vitaly.burovoy@gmail.com> writes:
> On 4/27/16, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>> Point 2 is where things differ from what I remember; my (possibly
>> flawed) understanding was that there's no difference between those
>> things.  Many (maybe all) of the things from this point on are probably
>> fallout from that one change.

> It is just mentioning that CHECK constraints have influence on
> nullability characteristic, but it differs from NNC.
> NNC creates CHECK constraint, but not vice versa. You can create
> several CHECK "col IS NOT NULL" constraints, but only one NNC (several
> ones by inheritance only?). And DROP NOT NULL should drop only those
> CHECK that is linked with NNC (and inherited), but no more (full
> explanation is in my initial letter).

This seems to me to be a most curious reading of the standard.
SQL:2011 11.4 <column definition> syntax rule 17a says
 If a <column constraint definition> is specified that contains the <column constraint> NOT NULL, then it is equivalent
tothe following <table constraint definition>:
 
    CND CHECK ( C IS NOT NULL ) CA

As a rule, when the SQL spec says "equivalent", they do not mean "it's
sort of like this", they mean the effects are indistinguishable.  In
particular, I see nothing whatsoever saying that you're not allowed to
write more than one per column.

So I don't like the proposal to add an attnotnullid column to
pg_attribute.  What we'd talked about earlier was converting attnotnull
into, effectively, a hint flag saying that there's at least one NOT NULL
constraint attached to the column.  That still seems like a good approach
to me.  When we're actually ready to throw an error for a null value,
we could root through the table's constraint list for a not-null
constraint name to report.  It doesn't matter which one we select, because
constraint application order has never been promised to be deterministic;
and a few extra cycles at that point don't seem like a big problem to me.
        regards, tom lane



Re: Make PG's "NOT NULL"s and attnotnull ("is_nullable") conform to SQL-2011

От
"David G. Johnston"
Дата:
Quick flyby here...

On Tuesday, May 3, 2016, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Vitaly Burovoy <vitaly.burovoy@gmail.com> writes:
> On 4/27/16, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>> Point 2 is where things differ from what I remember; my (possibly
>> flawed) understanding was that there's no difference between those
>> things.  Many (maybe all) of the things from this point on are probably
>> fallout from that one change.

> It is just mentioning that CHECK constraints have influence on
> nullability characteristic, but it differs from NNC.
> NNC creates CHECK constraint, but not vice versa. You can create
> several CHECK "col IS NOT NULL" constraints, but only one NNC (several
> ones by inheritance only?). And DROP NOT NULL should drop only those
> CHECK that is linked with NNC (and inherited), but no more (full
> explanation is in my initial letter).

Either it's one, or it's not...


This seems to me to be a most curious reading of the standard.
SQL:2011 11.4 <column definition> syntax rule 17a says

         If a <column constraint definition> is specified that contains
         the <column constraint> NOT NULL, then it is equivalent to the
         following <table constraint definition>:

                CND CHECK ( C IS NOT NULL ) CA

As a rule, when the SQL spec says "equivalent", they do not mean "it's
sort of like this", they mean the effects are indistinguishable.  In
particular, I see nothing whatsoever saying that you're not allowed to
write more than one per column.

Does it define how DROP NOT NULL is supposed to behave?

I agree that the behavior of a column NNC is identical to a similar constraint defined on the table: but if drop not null doesn't impact table constraints then the concept of perfect equality is already lost.


So I don't like the proposal to add an attnotnullid column to
pg_attribute.  What we'd talked about earlier was converting attnotnull
into, effectively, a hint flag saying that there's at least one NOT NULL
constraint attached to the column.


Have we considered making it a table constraint and giving it a name?  We already handle that case without difficulty.

Not looking for a detailed explanation.

David J.

Re: Make PG's "NOT NULL"s and attnotnull ("is_nullable") conform to SQL-2011

От
"David G. Johnston"
Дата:
On Monday, February 8, 2016, Vitaly Burovoy <vitaly.burovoy@gmail.com> wrote:

12. At the same time in (subcl. 4.13) mentioned there can be "at least
one NNC" (may be via inheritance?).


This is a bit hard to reason about given that our implementation of inheritance is non-standard.

Are we close to the standard semantics with regard to this particular dynamic?

David J. 

Re: Make PG's "NOT NULL"s and attnotnull ("is_nullable") conform to SQL-2011

От
Vitaly Burovoy
Дата:
On 5/3/16, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Vitaly Burovoy <vitaly.burovoy@gmail.com> writes:
>> On 4/27/16, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>>> Point 2 is where things differ from what I remember; my (possibly
>>> flawed) understanding was that there's no difference between those
>>> things.  Many (maybe all) of the things from this point on are probably
>>> fallout from that one change.
>
>> It is just mentioning that CHECK constraints have influence on
>> nullability characteristic, but it differs from NNC.
>> NNC creates CHECK constraint, but not vice versa. You can create
>> several CHECK "col IS NOT NULL" constraints, but only one NNC (several
>> ones by inheritance only?). And DROP NOT NULL should drop only those
>> CHECK that is linked with NNC (and inherited), but no more (full
>> explanation is in my initial letter).
>
> This seems to me to be a most curious reading of the standard.
> SQL:2011 11.4 <column definition> syntax rule 17a says
>
>      If a <column constraint definition> is specified that contains
>      the <column constraint> NOT NULL, then it is equivalent to the
>      following <table constraint definition>:
>
>         CND CHECK ( C IS NOT NULL ) CA
>
> As a rule, when the SQL spec says "equivalent", they do not mean "it's
> sort of like this", they mean the effects are indistinguishable.  In
> particular, I see nothing whatsoever saying that you're not allowed to
> write more than one per column.

1. SQL:2011 4.13 <Columns, fields, and attributes>:
 — If C is a column of a base table, then an indication of whether it is defined as NOT NULL and, if so, the constraint
nameof the associated table constraint definition.     NOTE 41 — This indication and the associated constraint name
existfor     definitional purposes only and are not exposed through the COLUMNS view     in the Information Schema. 

There is only "constraint name", not "constraint names".

2. SQL:2011 11.15  <set column not null clause> General Rule 1:
... If the column descriptor of C does not contain an indication that
C is defined as NOT NULL, then:

And there is no rule 2. I.e. if the column is already set as NOT NULL
you can't specify it as NOT NULL again.

3. SQL:2011 11.15  <set column not null clause> General Rule 1.d:
 The following <alter table statement> is executed without further
Access Rule checking: ALTER TABLE TN ADD CONSTRAINT IDCN CHECK ( CN IS NOT NULL )


> So I don't like the proposal to add an attnotnullid column to
> pg_attribute.

Why and where to place it?

> What we'd talked about earlier was converting attnotnull
> into, effectively, a hint flag saying that there's at least one NOT NULL
> constraint attached to the column.  That still seems like a good approach
> to me.

Ok. But not only NOT NULL constraint, but also non-deferrable PK,
CHECK, domains, may be the strictest FK.

> When we're actually ready to throw an error for a null value,
> we could root through the table's constraint list for a not-null
> constraint name to report.

attnotnullid is not for reporting, it is for DROP NOT NULL and
recreating "CREATE TABLE" statements via pg_dump.

>  It doesn't matter which one we select, because
> constraint application order has never been promised to be deterministic;
> and a few extra cycles at that point don't seem like a big problem to me.
>
>             regards, tom lane

--
Best regards,
Vitaly Burovoy



Re: Make PG's "NOT NULL"s and attnotnull ("is_nullable") conform to SQL-2011

От
Tom Lane
Дата:
"David G. Johnston" <david.g.johnston@gmail.com> writes:
> This is a bit hard to reason about given that our implementation of
> inheritance is non-standard.

Yeah, that's a fairly key point.  We've solved those problems with
respect to inherited CHECK constraints, and it seems like what we
ought to do with NOT NULL is make it work the same as CHECK, rather
than invent some new concepts.
        regards, tom lane