Обсуждение: BUG #14235: inconsistencies with IS NULL / IS NOT NULL

Поиск
Список
Период
Сортировка

BUG #14235: inconsistencies with IS NULL / IS NOT NULL

От
andrew@tao11.riddles.org.uk
Дата:
VGhlIGZvbGxvd2luZyBidWcgaGFzIGJlZW4gbG9nZ2VkIG9uIHRoZSB3ZWJz
aXRlOgoKQnVnIHJlZmVyZW5jZTogICAgICAxNDIzNQpMb2dnZWQgYnk6ICAg
ICAgICAgIEFuZHJldyBHaWVydGgKRW1haWwgYWRkcmVzczogICAgICBhbmRy
ZXdAdGFvMTEucmlkZGxlcy5vcmcudWsKUG9zdGdyZVNRTCB2ZXJzaW9uOiA5
LjZiZXRhMgpPcGVyYXRpbmcgc3lzdGVtOiAgIGFueQpEZXNjcmlwdGlvbjog
ICAgICAgIAoKV2hpbGUgZGlzY3Vzc2luZyB0aGUgaXNzdWVzIHJlc3VsdGlu
ZyBmcm9tIHRoZSBzcGVjJ3MgZGVmaW5pdGlvbiBvZiBJUyBOVUxMCmFuZCBJ
UyBOT1QgTlVMTCBvbiBjb21wb3NpdGUgdmFsdWVzLCB3ZSBkaXNjb3ZlcmVk
IHRoZSBmb2xsb3dpbmcKaW5jb25zaXN0ZW50IGNhc2VzLCBhbGwgb2Ygd2hp
Y2ggYXBwZWFyIHRvIGJlIHRoZSBmYXVsdCBvZgpldmFsX2NvbnN0X2V4cHJl
c3Npb25zLg0KDQpJbiBzaG9ydCwgdGhlIHRyYW5zZm9ybWF0aW9ucyBhcHBs
aWVkIHRvIE51bGxUZXN0IG5vZGVzIGNoYW5nZSB0aGUKc2VtYW50aWNzLCB3
aGljaCBjYW4gdGhlbiBtYWtlIHRoZSBmaW5hbCByZXN1bHQgZGVwZW5kIG9u
LCBmb3IgZXhhbXBsZSwKd2hldGhlciBmdW5jdGlvbiBjYWxscyB3ZXJlIGlu
bGluZWQgb3Igbm90Lg0KDQoNCmNyZWF0ZSB0eXBlIGN0MSBhcyAoYSBpbnRl
Z2VyLCBiIGludGVnZXIpOw0KY3JlYXRlIHR5cGUgY3QyIGFzIChhIGludGVn
ZXIsIGIgY3QxKTsNCmNyZWF0ZSB0eXBlIGN0MyBhcyAoYSBpbnRlZ2VyLCBi
IGN0Mik7DQoNCmNyZWF0ZSBmdW5jdGlvbiBtYWtlX2N0MShpbnRlZ2VyLGlu
dGVnZXIpIHJldHVybnMgY3QxDQogIGxhbmd1YWdlIHNxbCBpbW11dGFibGUN
CiAgYXMgJGYkIHNlbGVjdCByb3coJDEsJDIqJDIpOjpjdDE7ICRmJDsNCg0K
Y3JlYXRlIGZ1bmN0aW9uIG1ha2VfY3QyKGludGVnZXIsaW50ZWdlcikgcmV0
dXJucyBjdDINCiAgbGFuZ3VhZ2Ugc3FsIGltbXV0YWJsZQ0KICBhcyAkZiQg
c2VsZWN0IHJvdygkMSxyb3coJDIsJDIpOjpjdDEpOjpjdDI7ICRmJDsNCg0K
Y3JlYXRlIGZ1bmN0aW9uIG1ha2VfY3QzKGludGVnZXIsaW50ZWdlcikgcmV0
dXJucyBjdDMNCiAgbGFuZ3VhZ2Ugc3FsIGltbXV0YWJsZQ0KICBhcyAkZiQg
c2VsZWN0IHJvdygkMSxyb3coJDIscm93KCQyLCQyKTo6Y3QxKTo6Y3QyKTo6
Y3QzOyAkZiQ7DQoNCmNyZWF0ZSBmdW5jdGlvbiBzdGFibGVfbnVsbCgpIHJl
dHVybnMgaW50ZWdlcg0KICBsYW5ndWFnZSBwbHBnc3FsIHN0YWJsZSBjb3N0
IDENCiAgYXMgJGYkIGJlZ2luIHJldHVybiBudWxsOyBlbmQ7ICRmJDsNCg0K
Y3JlYXRlIGZ1bmN0aW9uIHZvbGF0aWxlX251bGwoKSByZXR1cm5zIGludGVn
ZXINCiAgbGFuZ3VhZ2UgcGxwZ3NxbCB2b2xhdGlsZSBjb3N0IDEwMDANCiAg
YXMgJGYkIGJlZ2luIHJldHVybiBudWxsOyBlbmQ7ICRmJDsNCg0Kc2VsZWN0
IHJvdyhudWxsLG51bGwpOjpjdDEgaXMgbnVsbCBhcyBjdDEsDQogICAgICAg
cm93KG51bGwscm93KG51bGwsbnVsbCk6OmN0MSk6OmN0MiBpcyBudWxsIGFz
IGN0MiwNCiAgICAgICByb3cobnVsbCxyb3cobnVsbCxyb3cobnVsbCxudWxs
KTo6Y3QxKTo6Y3QyKTo6Y3QzIGlzIG51bGwgYXMgY3QzOw0KDQpzZWxlY3Qg
cm93KDEsbnVsbCk6OmN0MSBpcyBub3QgbnVsbCBhcyByb3dfY29ucywNCiAg
ICAgICAnKDEsKSc6OmN0MSBpcyBub3QgbnVsbCBhcyByb3dfbGl0ZXJhbCwN
CiAgICAgICBtYWtlX2N0MSgxLG51bGwpIGlzIG5vdCBudWxsIGFzIGlubGlu
ZWRfY29uc3QsDQogICAgICAgbWFrZV9jdDEoMSxzdGFibGVfbnVsbCgpKSBp
cyBub3QgbnVsbCBhcyBpbmxpbmVkX2Z1bmMsDQogICAgICAgbWFrZV9jdDEo
MSx2b2xhdGlsZV9udWxsKCkpIGlzIG5vdCBudWxsIGFzIGNhbGxlZF9mdW5j
Ow0KDQpzZWxlY3Qgcm93KDEscm93KG51bGwsbnVsbCk6OmN0MSk6OmN0MiBp
cyBub3QgbnVsbCBhcyByb3dfY29ucywNCiAgICAgICAnKDEsIigsKSIpJzo6
Y3QyIGlzIG5vdCBudWxsIGFzIHJvd19saXRlcmFsLA0KICAgICAgIG1ha2Vf
Y3QyKDEsbnVsbCkgaXMgbm90IG51bGwgYXMgaW5saW5lZF9jb25zdCwNCiAg
ICAgICBtYWtlX2N0MigxLHN0YWJsZV9udWxsKCkpIGlzIG5vdCBudWxsIGFz
IGlubGluZWRfZnVuYywNCiAgICAgICBtYWtlX2N0MigxLHZvbGF0aWxlX251
bGwoKSkgaXMgbm90IG51bGwgYXMgY2FsbGVkX2Z1bmM7DQoNCnNlbGVjdCBy
b3coMSxyb3cobnVsbCxyb3cobnVsbCxudWxsKTo6Y3QxKTo6Y3QyKTo6Y3Qz
IGlzIG5vdCBudWxsIGFzCnJvd19jb25zLA0KICAgICAgICcoMSwiKCwiIigs
KSIiKSIpJzo6Y3QzIGlzIG5vdCBudWxsIGFzIHJvd19saXRlcmFsLA0KICAg
ICAgIG1ha2VfY3QzKDEsbnVsbCkgaXMgbm90IG51bGwgYXMgaW5saW5lZF9j
b25zdCwNCiAgICAgICBtYWtlX2N0MygxLHN0YWJsZV9udWxsKCkpIGlzIG5v
dCBudWxsIGFzIGlubGluZWRfZnVuYywNCiAgICAgICBtYWtlX2N0MygxLHZv
bGF0aWxlX251bGwoKSkgaXMgbm90IG51bGwgYXMgY2FsbGVkX2Z1bmM7DQoN
CnNlbGVjdCByb3cobnVsbCxudWxsKTo6Y3QxIGlzIG51bGwgYXMgcm93X2Nv
bnMsDQogICAgICAgJygsKSc6OmN0MSBpcyBudWxsIGFzIHJvd19saXRlcmFs
LA0KICAgICAgIG1ha2VfY3QxKG51bGwsbnVsbCkgaXMgbnVsbCBhcyBpbmxp
bmVkX2NvbnN0LA0KICAgICAgIG1ha2VfY3QxKG51bGwsc3RhYmxlX251bGwo
KSkgaXMgbnVsbCBhcyBpbmxpbmVkX2Z1bmMsDQogICAgICAgbWFrZV9jdDEo
bnVsbCx2b2xhdGlsZV9udWxsKCkpIGlzIG51bGwgYXMgY2FsbGVkX2Z1bmM7
DQoNCnNlbGVjdCByb3cobnVsbCxyb3cobnVsbCxudWxsKTo6Y3QxKTo6Y3Qy
IGlzIG51bGwgYXMgcm93X2NvbnMsDQogICAgICAgJygsIigsKSIpJzo6Y3Qy
IGlzIG51bGwgYXMgcm93X2xpdGVyYWwsDQogICAgICAgbWFrZV9jdDIobnVs
bCxudWxsKSBpcyBudWxsIGFzIGlubGluZWRfY29uc3QsDQogICAgICAgbWFr
ZV9jdDIobnVsbCxzdGFibGVfbnVsbCgpKSBpcyBudWxsIGFzIGlubGluZWRf
ZnVuYywNCiAgICAgICBtYWtlX2N0MihudWxsLHZvbGF0aWxlX251bGwoKSkg
aXMgbnVsbCBhcyBjYWxsZWRfZnVuYzsNCg0Kc2VsZWN0IHJvdyhudWxsLHJv
dyhudWxsLHJvdyhudWxsLG51bGwpOjpjdDEpOjpjdDIpOjpjdDMgaXMgbnVs
bCBhcwpyb3dfY29ucywNCiAgICAgICAnKCwiKCwiIigsKSIiKSIpJzo6Y3Qz
IGlzIG51bGwgYXMgcm93X2xpdGVyYWwsDQogICAgICAgbWFrZV9jdDMobnVs
bCxudWxsKSBpcyBudWxsIGFzIGlubGluZWRfY29uc3QsDQogICAgICAgbWFr
ZV9jdDMobnVsbCxzdGFibGVfbnVsbCgpKSBpcyBudWxsIGFzIGlubGluZWRf
ZnVuYywNCiAgICAgICBtYWtlX2N0MyhudWxsLHZvbGF0aWxlX251bGwoKSkg
aXMgbnVsbCBhcyBjYWxsZWRfZnVuYzsNCg0KDQoKCg==

Re: BUG #14235: inconsistencies with IS NULL / IS NOT NULL

От
Andrew Gierth
Дата:
>>>>> "andrew" == andrew  <andrew@tao11.riddles.org.uk> writes:

 andrew> While discussing the issues resulting from the spec's
 andrew> definition of IS NULL and IS NOT NULL on composite values, we
 andrew> discovered the following inconsistent cases, all of which
 andrew> appear to be the fault of eval_const_expressions.

 andrew> In short, the transformations applied to NullTest nodes change
 andrew> the semantics, which can then make the final result depend on,
 andrew> for example, whether function calls were inlined or not.

And here's my analysis of what seems to be going on:

The executor, when doing IS [NOT] NULL on a composite value, looks at
each column to see if it is the null value. It does NOT recurse into
nested composite values, and my reading of the spec suggests that this
is correct.

But eval_const_expressions thinks it has license to change

  ROW(a,b) IS NULL

into

  (a IS NULL) AND (b IS NULL)

which has the effect of recursing one level into a nested composite
value.

It seems possible that this could be fixed by simply setting
argisrow=false for all the null tests generated in such cases.
Specifically I've tried this, which seems to fix all the inconsistent
cases:

--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -3308,7 +3308,13 @@ eval_const_expressions_mutator(Node *node,
                                                newntest = makeNode(NullTest);
                                                newntest->arg = (Expr *) relem;
                                                newntest->nulltesttype = ntest->nulltesttype;
-                                               newntest->argisrow = type_is_rowtype(exprType(relem));
+                                               /*
+                                                * We must unconditionally set argisrow false here.
+                                                * Otherwise, we'd be treating a nested composite
+                                                * structure as though it were at top level, which
+                                                * would change the semantics of the test.
+                                                */
+                                               newntest->argisrow = false;
                                                newntest->location = ntest->location;
                                                newargs = lappend(newargs, newntest);
                                        }


--
Andrew (irc:RhodiumToad)

Re: BUG #14235: inconsistencies with IS NULL / IS NOT NULL

От
Andrew Gierth
Дата:
Here's that change as a proper patch with the original testcase added as
a regression test.

--
Andrew (irc:RhodiumToad)


Вложения

Re: BUG #14235: inconsistencies with IS NULL / IS NOT NULL

От
Andrew Gierth
Дата:
>>>>> "Andrew" == Andrew Gierth <andrew@tao11.riddles.org.uk> writes:

 Andrew> It seems possible that this could be fixed by simply setting
 Andrew> argisrow=false for all the null tests generated in such cases.

This turns out to be more somplicated than I thought. A lot of places
look at argisrow to distinguish "simple" null tests from the
standard-required logic for composite values. In some of the cases I've
looked at, fixing the bug actually looks like it improves things (for
example, in (row(1,a) IS NOT NULL), a can be deduced to be not null in
the "not the null value" sense); but some places like
match_clause_to_indexcol I'm less sure of.

--
Andrew (irc:RhodiumToad)

Re: BUG #14235: inconsistencies with IS NULL / IS NOT NULL

От
Tom Lane
Дата:
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> And here's my analysis of what seems to be going on:

> The executor, when doing IS [NOT] NULL on a composite value, looks at
> each column to see if it is the null value. It does NOT recurse into
> nested composite values, and my reading of the spec suggests that this
> is correct.

Hmm.  Of course the $64 question is whether that really is correct, or
sensible.

I went to look at the spec, and discovered that SQL:2011 actually has
wording that is different from SQL99, which I think is what we relied
on last time we considered this issue.  Specifically, in 2011,
section 8.8 <null predicate> quoth:

<null predicate> ::= <row value predicand> <null predicate part 2>
<null predicate part 2> ::= IS [ NOT ] NULL

(Oddly, SQL does not seem to allow IS [NOT] NULL on non-composite values,
which is just silly.)

1) Let R be the <row value predicand> and let V be the value of R.

2) Case:
  a) If V is the null value, then “R IS NULL” is True and the value of
     “R IS NOT NULL” is False.
  b) Otherwise:
    i) The value of “R IS NULL” is
       Case:
       1) If the value of every field of V is the null value, then True.
       2) Otherwise, False.
    ii) The value of “R IS NOT NULL” is
       Case:
       1) If the value of no field of V is the null value, then True.
       2) Otherwise, False.

NOTE 267 — For all R, “R IS NOT NULL” has the same result as “NOT R IS
NULL” if and only if R is of degree 1.

Rule (2a) was not there in SQL99.  But look at what this is doing: it
is admitting straight out that a null composite value is not the same
as a composite value all of whose fields are null.  It is only asserting
that a <null predicate> will not distinguish them.  The implication is
that it's just fine if, say, COALESCE() doesn't act that way.  Previously,
we thought this part of the spec was supposed to define what "V is null"
means everywhere else in the spec if V is composite; but now it seems
clear that "V is null" is a primitive test that is not the same as the
<null predicate> construct.

> It seems possible that this could be fixed by simply setting
> argisrow=false for all the null tests generated in such cases.

I concur that this is an appropriate fix if we believe that
ExecEvalNullTest's behavior is correct.

            regards, tom lane

Re: BUG #14235: inconsistencies with IS NULL / IS NOT NULL

От
Andrew Gierth
Дата:
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

 >> The executor, when doing IS [NOT] NULL on a composite value, looks
 >> at each column to see if it is the null value. It does NOT recurse
 >> into nested composite values, and my reading of the spec suggests
 >> that this is correct.

 Tom> Hmm.  Of course the $64 question is whether that really is
 Tom> correct, or sensible.

 Tom> I went to look at the spec, and discovered that SQL:2011 actually
 Tom> has wording that is different from SQL99, which I think is what we
 Tom> relied on last time we considered this issue.

I've been mainly using the 2008 spec, but it seems to have identical
wording to the 2011 one.

My reading is as follows, based on these parts of the spec:

  [Framework] 4.4.2 The null value
    Every data type includes a special value, called the null value,
    sometimes denoted by the keyword NULL. This value differs from other
    values in the following respects: [...]

  [Foundation] 8.8 <null predicate>  (as you quoted)

It seems to me that where the spec refers to "the null value", this is
exactly what we do with the "isnull" flag, and that <null predicate> is
a separate thing.

Where the spec refers to the "fields" of a row, it at least strongly
implies that these are never taken recursively:

  [Framework] 4.4.5.2 Row types
    A row type is a sequence of one or more (field name, data type)
    pairs, known as fields. A value of a row type consists of one value
    for each of its fields.

Also in 6.2 <field definition> the "degree" of a row type is defined in
such a way that a row type with one declared field, whose type is also a
row type, has a degree of 1. So the comment in 8.8 about degree-1 rows
only makes sense if 8.8 isn't supposed to recurse.

Plus, if 8.8 had been intended to be recursive, it could easily have
been written that way.

Most of the references to nullity in the spec talk specifically about
"the null value" rather than referencing IS [NOT] NULL. In particular,
constructs like count(x) refer to "eliminating the null value", which in
turn implies that count(x) is not the same thing as count(*) filter
(where x is not null) if x is a row type. Also, IS DISTINCT FROM only
refers to "the null value", so (x IS DISTINCT FROM NULL) is not the same
as either (x IS NOT NULL) or (NOT (x IS NULL)). Likewise the rules for
<routine invocation> only talk about "the null value" in deciding
whether a routine declared RETURNS NULL ON NULL INPUT (i.e. STRICT) is
actually called.

The places that explicitly use IS [NOT] NULL are:

  - some definitions about whether values are known-not-null, which is
    ok because (x IS NOT NULL) clearly implies that x is not the null
    value

  - some rules for multiset operations where the operands must be
    multisets, not rows

  - NOT NULL constraints on columns are transformed into CHECK (col IS
    NOT NULL), which isn't what we do: our not-null constraint currently
    only excludes the null value, though this has been discussed in some
    past threads

  - COALESCE is implemented as CASE WHEN x IS NOT NULL THEN ... which
    again is not what we do. The logic of the spec's version of this is
    truly bizarre, and I can't imagine any possible use for it; having
    COALESCE(row(1,null),row(2,null)) return (2,null) as the spec
    demands seems to be all kinds of wrong.

 Tom> <null predicate> ::= <row value predicand> <null predicate part 2>
 Tom> <null predicate part 2> ::= IS [ NOT ] NULL

 Tom> (Oddly, SQL does not seem to allow IS [NOT] NULL on non-composite
 Tom> values, which is just silly.)

<row value predicand> can be a <row value constructor predicand> which
can be a <common value expression> or a <boolean predicand>, either of
which can be a <value expression primary>, and in both of these cases
the syntax rules specify that they are to be replaced by ROW(X).  So
non-composite values are allowed and are indistinguishable from
composite values of degree 1.

(Note that <row value predicand> can also be a <row value special case>
which can be a <nonparenthesized value expression primary>, but that is
required to be of a row type. The spec here seems to rely on constraints
about the declared types to disambiguate between the case of a <row
value special case> and a <common value expression> which is a
<nonparenthesized value expression primary>.)

 Tom> Rule (2a) was not there in SQL99.  But look at what this is doing:
 Tom> it is admitting straight out that a null composite value is not
 Tom> the same as a composite value all of whose fields are null.  It is
 Tom> only asserting that a <null predicate> will not distinguish them.

Right, that's my reading too.

 Tom> The implication is that it's just fine if, say, COALESCE() doesn't
 Tom> act that way.

Not COALESCE, because COALESCE is defined as a syntactic transform to a
CASE expression that uses IS NOT NULL.

But I really don't think we should follow the spec on that one.

 >> It seems possible that this could be fixed by simply setting
 >> argisrow=false for all the null tests generated in such cases.

 Tom> I concur that this is an appropriate fix if we believe that
 Tom> ExecEvalNullTest's behavior is correct.

I believe it is, as explained above.

That said, there may also be merit in arguing that the spec is just
broken on this point. I think back when IS NULL was originally changed
to conform to the spec, it might have been better to just say "no,
that's silly" and stick with the old behavior.

We wouldn't strictly speaking be violating the spec if we did that,
because we don't claim to support feature T051 "Row types", and so we
can just say that our row types are a nonstandard extension without the
standard IS NULL behavior. We do in fact violate the spec here in many
ways; for example we treat types from CREATE TYPE foo AS (fields) as
being row types, but in fact the spec says these are user-defined
structured types and not row types.

But whether it would be a good idea to actually revert to the old
way... that's another question. Does anyone actually _want_ the spec
behavior of IS [NOT] NULL?

--
Andrew (irc:RhodiumToad)

Re: BUG #14235: inconsistencies with IS NULL / IS NOT NULL

От
Tom Lane
Дата:
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> That said, there may also be merit in arguing that the spec is just
> broken on this point. I think back when IS NULL was originally changed
> to conform to the spec, it might have been better to just say "no,
> that's silly" and stick with the old behavior.

> We wouldn't strictly speaking be violating the spec if we did that,
> because we don't claim to support feature T051 "Row types", and so we
> can just say that our row types are a nonstandard extension without the
> standard IS NULL behavior. We do in fact violate the spec here in many
> ways; for example we treat types from CREATE TYPE foo AS (fields) as
> being row types, but in fact the spec says these are user-defined
> structured types and not row types.

> But whether it would be a good idea to actually revert to the old
> way... that's another question. Does anyone actually _want_ the spec
> behavior of IS [NOT] NULL?

Good question.  The existing behavior seems to be traceable as far back as
68d9fbeb5511d846ce3a6f66b8955d3ca55a4b76, which implemented the specific
syntax ROW( ... ) IS [NOT] NULL by dint of breaking the ROW construct into
its field expressions and applying scalar NullTest to each one.  I can't
find anything in the mail archives discussing it specifically, and it was
only a very minor element of that patch.  I strongly suspect that Lockhart
just did what the spec said without thinking about it much; there is
certainly no trace of user requests for this behavior.

Later on we improved it to be less inconsistent with other ways of
specifying a rowtype argument, but again I don't see much indication of
user requests driving that, see here:
https://www.postgresql.org/message-id/flat/451BDB28.4050201%40sigaev.ru

But the other side of the coin is that the behavior does date back ten
years or more.  Between backwards compatibility worries and the quite
clear language of the spec, it's a bit hard to summon the conviction to
say we ought to change it, much as I'd like to.

            regards, tom lane

Re: BUG #14235: inconsistencies with IS NULL / IS NOT NULL

От
Tom Lane
Дата:
I wrote:
> Good question.  The existing behavior seems to be traceable as far back as
> 68d9fbeb5511d846ce3a6f66b8955d3ca55a4b76, which implemented the specific
> syntax ROW( ... ) IS [NOT] NULL by dint of breaking the ROW construct into
> its field expressions and applying scalar NullTest to each one.

BTW, thinking about that a bit more, I'm pretty sure that back then we did
not have any robust notion of composite-type datums at all, and thus that
there simply would not have been any other way to implement this syntax.
It would not have worked to treat ROW(...) as a construct that delivered
a single Datum to a standard NullTest expression node.  So it was pretty
convenient for Lockhart that the spec was written the way it was.

Also note the observation in the 2006 thread that the N-scalar-NullTests
implementation didn't get the semantics quite right anyway, which makes it
even less likely that anybody was depending on the spec semantics between
2002 and 2006.

But that still leaves us with ten years of history in which we *were*
conforming to the spec, modulo the very narrow corner case mentioned
in this thread.

            regards, tom lane

Re: BUG #14235: inconsistencies with IS NULL / IS NOT NULL

От
Andrew Gierth
Дата:
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

 Tom> But that still leaves us with ten years of history in which we
 Tom> *were* conforming to the spec, modulo the very narrow corner case
 Tom> mentioned in this thread.

Yeah, but the main visible effect of that has been a stream of "you have
to use NOT (x IS NULL) rather than (x IS NOT NULL)" responses to people
having trouble with this.

Is there a single reported case where anyone has actually needed the
spec's version of (x IS NOT NULL) for a composite type?

--
Andrew (irc:RhodiumToad)

Re: BUG #14235: inconsistencies with IS NULL / IS NOT NULL

От
Tom Lane
Дата:
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> Yeah, but the main visible effect of that has been a stream of "you have
> to use NOT (x IS NULL) rather than (x IS NOT NULL)" responses to people
> having trouble with this.

I don't offhand recall any such complaints on pgsql-bugs.  Maybe there
have been some on IRC.

> Is there a single reported case where anyone has actually needed the
> spec's version of (x IS NOT NULL) for a composite type?

By definition, we get no "reports" for a case where something works
as someone expects.  So you're demanding proof that cannot exist.

            regards, tom lane

Re: BUG #14235: inconsistencies with IS NULL / IS NOT NULL

От
"David G. Johnston"
Дата:
On Fri, Jul 22, 2016 at 8:22 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> > Yeah, but the main visible effect of that has been a stream of "you hav=
e
> > to use NOT (x IS NULL) rather than (x IS NOT NULL)" responses to people
> > having trouble with this.
>
> I don't offhand recall any such complaints on pgsql-bugs.  Maybe there
> have been some on IRC.
>
> > Is there a single reported case where anyone has actually needed the
> > spec's version of (x IS NOT NULL) for a composite type?
>
> By definition, we get no "reports" for a case where something works
> as someone expects.  So you're demanding proof that cannot exist.
>

Yeah, I'd say we lump this into the same bucket as "unintentional
correlated subqueries" and "forgot to add a where clause to your delete
statement".

In short, don't use "IS NOT NULL".  But, sorry, we cannot actually prohibit
it=E2=80=8B without upsetting lots of people.

David J.
=E2=80=8B

Re: BUG #14235: inconsistencies with IS NULL / IS NOT NULL

От
Andrew Gierth
Дата:
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

 >> Yeah, but the main visible effect of that has been a stream of "you
 >> have to use NOT (x IS NULL) rather than (x IS NOT NULL)" responses
 >> to people having trouble with this.

 Tom> I don't offhand recall any such complaints on pgsql-bugs.  Maybe
 Tom> there have been some on IRC.

Indeed so.

As an aside, while looking back through logs I found this, which was
reported as bug #7808 in 2013 and is still unfixed:

postgres=# select * from unnest(array[row('a',1)::c1,null::c1]) u;
ERROR:  function returning set of rows cannot return null value

(it came up in the context of a "how to strip nulls from an array"
question; array(select u from unnest(a) u where u is not null) obviously
works only for scalars, and you need not (u is null) instead.)

 >> Is there a single reported case where anyone has actually needed the
 >> spec's version of (x IS NOT NULL) for a composite type?

 Tom> By definition, we get no "reports" for a case where something
 Tom> works as someone expects.  So you're demanding proof that cannot
 Tom> exist.

I meant "reported" in a more general sense that bug reports, obviously.

--
Andrew (irc:RhodiumToad)

Re: BUG #14235: inconsistencies with IS NULL / IS NOT NULL

От
Tom Lane
Дата:
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> "Andrew" == Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
>  Andrew> It seems possible that this could be fixed by simply setting
>  Andrew> argisrow=false for all the null tests generated in such cases.

> This turns out to be more somplicated than I thought. A lot of places
> look at argisrow to distinguish "simple" null tests from the
> standard-required logic for composite values. In some of the cases I've
> looked at, fixing the bug actually looks like it improves things (for
> example, in (row(1,a) IS NOT NULL), a can be deduced to be not null in
> the "not the null value" sense); but some places like
> match_clause_to_indexcol I'm less sure of.

I looked around to see what other consequences there might be.  I think
that match_clause_to_indexcol is fine, because it's mighty unlikely that
any index type would implement argisrow semantics for IS[NOT]NULL.
However, I found a couple of things that seem like bugs:

1. get_relation_constraints() converts attnotnull column markings into
"col IS NOT NULL" constraints, ie it sets argisrow true for a composite
column.  Perhaps ideally that would be an accurate representation of the
constraint semantics, but today it is not; a correct representation would
be a "simple not null" constraint.  We are overpromising what the
constraint implies.

2. ruleutils.c will print a NullTest node as "IS [NOT] NULL" regardless
of its argisrow setting.  For the case with a rowtype argument and not
argisrow, that is an incorrect depiction of the semantics.  Since such a
case can't arise in parser output, only in a plan, this doesn't break
view dumping, but it could lead to misleading EXPLAIN output.

3. postgres_fdw's deparse.c is likewise naive about NullTest, and here
that *can* lead to an indisputable bug if a const-folded condition is
being sent to the remote side.  It'd be better to issue IS [NOT] DISTINCT
FROM NULL in such cases.

Problem #1 is a pre-existing error, but problems #2/#3 are newly
introduced by this patch, since up to now argisrow has merely been a
pre-cached indication of the argument type, not an independent variable.

So one solution approach is to say "okay, argisrow is now independent,
deal with it".  In that case #1 could be fixed trivially by setting
argisrow = false in the generated NullTest, and I think we would have
to modify ruleutils.c and deparse.c to print scalar NullTest on a
composite value as IS [NOT] DISTINCT FROM NULL.

The other general answer is to revert to the previous belief that argisrow
is merely a helpful cache.  In that case, ruleutils is fine, but both
eval_const_expressions and get_relation_constraints would need to be fixed
to emit DistinctExprs in the cases where they now want to emit an argisrow
= false NullTest for a composite input.

The first approach seems to carry a rather larger risk of breaking
third-party code, since it removes an invariant that used to exist.
(The fact that we have to touch postgres_fdw is certainly a red flag
that other FDWs might be affected.)  On the other hand, the planner
generally has a lot more intelligence about NullTest than about
DistinctExpr, and so I'm worried that substituting the latter for the
former might result in some nasty performance regressions for specific
queries.

I'm leaning slightly to the first approach, but could probably be
talked out of it if anyone sees additional arguments.  Comments?

            regards, tom lane