Обсуждение: BUG #19412: Wrong query result with not null constraint

Поиск
Список
Период
Сортировка

BUG #19412: Wrong query result with not null constraint

От
PG Bug reporting form
Дата:
The following bug has been logged on the website:

Bug reference:      19412
Logged by:          Sergey Shinderuk
Email address:      s.shinderuk@postgrespro.ru
PostgreSQL version: 17.8
Operating system:   Ubuntu 24.04
Description:

Using this script (repro.sql):

drop table if exists a, x, y;

create table a (id int, x_id int, y_id int);
insert into a values (1, 1, 1), (1, 2, 2), (1, 3, 3);
create table x (id int, nm text, constraint pk_x_id primary key (id));
insert into x values (1, 'x1'), (2, 'x2'), (3, 'x3');
create table y (id int, nm text, constraint pk_y_id primary key (id));
insert into y values (1, 'y1'), (3, 'y3'), (4, 'y4');

select a.id, z.id
from a
join x on x.id = a.x_id
left join y on y.id = a.y_id
join lateral(select x.id
             union all
             select y.id) z on z.id is not null;

alter table y drop constraint pk_y_id;
alter table y alter column id drop not null;

select a.id, z.id
from a
join x on x.id = a.x_id
left join y on y.id = a.y_id
join lateral(select x.id
             union all
             select y.id) z on z.id is not null;

on PostgreSQL 17.8 I get:

postgres=# \i repro.sql
DROP TABLE
CREATE TABLE
INSERT 0 3
CREATE TABLE
INSERT 0 3
CREATE TABLE
INSERT 0 3
 id | id
----+----
  1 |  1
  1 |  1
  1 |  2
  1 |
  1 |  3
  1 |  3
(6 rows)

ALTER TABLE
ALTER TABLE
 id | id
----+----
  1 |  1
  1 |  1
  1 |  2
  1 |  3
  1 |  3
(5 rows)

Something seems to be wrong with IS NOT NULL optimization. v18 and master
show the same, v16 is fine.





Re: BUG #19412: Wrong query result with not null constraint

От
Artem Fadeev
Дата:
On 2/17/26 14:19, PG Bug reporting form wrote:
> The following bug has been logged on the website:
> 
> Bug reference:      19412
> Logged by:          Sergey Shinderuk
> Email address:      s.shinderuk@postgrespro.ru
> PostgreSQL version: 17.8
> Operating system:   Ubuntu 24.04
> Description:

git bisect shows the bug was introduced by commit 3af7040985b [0]

I suppose presence of UNION ALL subquery is important for the reproduction
because the commit uses RangeTblEntry->inh, which has the following remark
in the comment in src/include/nodes/parsenodes.h
        * inh is true for relation references that should be expanded to 
include
        * inheritance children, if the rel has any.  In the parser, this 
will
        * only be true for RTE_RELATION entries.  The planner also uses this
        * field to mark RTE_SUBQUERY entries that contain UNION ALL 
queries that
        * it has flattened into pulled-up subqueries (creating a 
structure much
        * like the effects of inheritance).


[0] 
https://github.com/postgres/postgres/commit/3af7040985b6df504a72cd307aad5d69ac5f5384

Regards,
Artem Fadeev.
https://postgrespro.com




Re: BUG #19412: Wrong query result with not null constraint

От
David Rowley
Дата:
On Wed, 18 Feb 2026 at 00:31, PG Bug reporting form
<noreply@postgresql.org> wrote:
> create table a (id int, x_id int, y_id int);
> insert into a values (1, 1, 1), (1, 2, 2), (1, 3, 3);
> create table x (id int, nm text, constraint pk_x_id primary key (id));
> insert into x values (1, 'x1'), (2, 'x2'), (3, 'x3');
> create table y (id int, nm text, constraint pk_y_id primary key (id));
> insert into y values (1, 'y1'), (3, 'y3'), (4, 'y4');
>
> select a.id, z.id
> from a
> join x on x.id = a.x_id
> left join y on y.id = a.y_id
> join lateral(select x.id
>              union all
>              select y.id) z on z.id is not null;

Thanks for the reproducer.

I'd say that y.id Var in the lateral join should be getting marked as
nullable by the left join, but it's not being marked as nullable by
anything.

Tom, do you have any thoughts on the empty varnullingrels here?

David



Re: BUG #19412: Wrong query result with not null constraint

От
Richard Guo
Дата:
On Wed, Feb 18, 2026 at 7:54 AM David Rowley <dgrowleyml@gmail.com> wrote:
> On Wed, 18 Feb 2026 at 00:31, PG Bug reporting form
> <noreply@postgresql.org> wrote:
> > create table a (id int, x_id int, y_id int);
> > insert into a values (1, 1, 1), (1, 2, 2), (1, 3, 3);
> > create table x (id int, nm text, constraint pk_x_id primary key (id));
> > insert into x values (1, 'x1'), (2, 'x2'), (3, 'x3');
> > create table y (id int, nm text, constraint pk_y_id primary key (id));
> > insert into y values (1, 'y1'), (3, 'y3'), (4, 'y4');
> >
> > select a.id, z.id
> > from a
> > join x on x.id = a.x_id
> > left join y on y.id = a.y_id
> > join lateral(select x.id
> >              union all
> >              select y.id) z on z.id is not null;

> Thanks for the reproducer.
>
> I'd say that y.id Var in the lateral join should be getting marked as
> nullable by the left join, but it's not being marked as nullable by
> anything.

Exactly.  I think this is because when adjust_appendrel_attrs_mutator
propagates the nullingrel bits from the parent rel's Var into the
translated Var, it loses the translated Var's original bits.  Instead
of overwriting the translated Var's nullingrels, I think we should
merge them.

--- a/src/backend/optimizer/util/appendinfo.c
+++ b/src/backend/optimizer/util/appendinfo.c
@@ -291,8 +291,11 @@ adjust_appendrel_attrs_mutator(Node *node,
                 var->varattno, get_rel_name(appinfo->parent_reloid));
        if (IsA(newnode, Var))
        {
-           ((Var *) newnode)->varreturningtype = var->varreturningtype;
-           ((Var *) newnode)->varnullingrels = var->varnullingrels;
+           Var    *newvar = (Var *) newnode;
+
+           newvar->varreturningtype = var->varreturningtype;
+           newvar->varnullingrels = bms_add_members(newvar->varnullingrels,
+                                                    var->varnullingrels);
        }

- Richard



Re: BUG #19412: Wrong query result with not null constraint

От
Richard Guo
Дата:
On Wed, Feb 18, 2026 at 10:51 AM Richard Guo <guofenglinux@gmail.com> wrote:
> Exactly.  I think this is because when adjust_appendrel_attrs_mutator
> propagates the nullingrel bits from the parent rel's Var into the
> translated Var, it loses the translated Var's original bits.  Instead
> of overwriting the translated Var's nullingrels, I think we should
> merge them.
>
> --- a/src/backend/optimizer/util/appendinfo.c
> +++ b/src/backend/optimizer/util/appendinfo.c
> @@ -291,8 +291,11 @@ adjust_appendrel_attrs_mutator(Node *node,
>                  var->varattno, get_rel_name(appinfo->parent_reloid));
>         if (IsA(newnode, Var))
>         {
> -           ((Var *) newnode)->varreturningtype = var->varreturningtype;
> -           ((Var *) newnode)->varnullingrels = var->varnullingrels;
> +           Var    *newvar = (Var *) newnode;
> +
> +           newvar->varreturningtype = var->varreturningtype;
> +           newvar->varnullingrels = bms_add_members(newvar->varnullingrels,
> +                                                    var->varnullingrels);
>         }

Here is a more readable version of the patch.

- Richard

Вложения

Re: BUG #19412: Wrong query result with not null constraint

От
Sergey Shinderuk
Дата:
On 2/18/26 12:50, Richard Guo wrote:
> On Wed, Feb 18, 2026 at 10:51 AM Richard Guo <guofenglinux@gmail.com> wrote:
>> Exactly.  I think this is because when adjust_appendrel_attrs_mutator
>> propagates the nullingrel bits from the parent rel's Var into the
>> translated Var, it loses the translated Var's original bits.  Instead
>> of overwriting the translated Var's nullingrels, I think we should
>> merge them.
>>
>> --- a/src/backend/optimizer/util/appendinfo.c
>> +++ b/src/backend/optimizer/util/appendinfo.c
>> @@ -291,8 +291,11 @@ adjust_appendrel_attrs_mutator(Node *node,
>>                   var->varattno, get_rel_name(appinfo->parent_reloid));
>>          if (IsA(newnode, Var))
>>          {
>> -           ((Var *) newnode)->varreturningtype = var->varreturningtype;
>> -           ((Var *) newnode)->varnullingrels = var->varnullingrels;
>> +           Var    *newvar = (Var *) newnode;
>> +
>> +           newvar->varreturningtype = var->varreturningtype;
>> +           newvar->varnullingrels = bms_add_members(newvar->varnullingrels,
>> +                                                    var->varnullingrels);
>>          }
> 
> Here is a more readable version of the patch.
> 

Thank you!

I'm not familiar with the code, just curios. There is a long comment 
above saying "You might think we need to adjust var->varnullingrels, but 
that shouldn't need any changes." Doesn't it need an update?

-- 
Sergey Shinderuk        https://postgrespro.com/



Re: BUG #19412: Wrong query result with not null constraint

От
Richard Guo
Дата:
On Wed, Feb 18, 2026 at 9:03 PM Sergey Shinderuk
<s.shinderuk@postgrespro.ru> wrote:
> I'm not familiar with the code, just curios. There is a long comment
> above saying "You might think we need to adjust var->varnullingrels, but
> that shouldn't need any changes." Doesn't it need an update?

No, I don't think we need to update it.  That comment explains why
varnullingrels do not require translation (since they are outer join
relids, not baserel relids).  It's unrelated to what this patch does,
which is about propagating varnullingrels into the translated Var.

- Richard



Re: BUG #19412: Wrong query result with not null constraint

От
Richard Guo
Дата:
On Wed, Feb 18, 2026 at 6:50 PM Richard Guo <guofenglinux@gmail.com> wrote:
> Here is a more readable version of the patch.

Regarding back-patching, I plan to back-patch this all through to v16.
Although the reported case does not seem to cause problems in v16, the
underlying logic error persists.  Leaving incorrect varnullingrels in
the tree seems like a trap for the unwary.

Any thoughts?

- Richard



Re: BUG #19412: Wrong query result with not null constraint

От
Tom Lane
Дата:
Richard Guo <guofenglinux@gmail.com> writes:
> On Wed, Feb 18, 2026 at 9:03 PM Sergey Shinderuk
> <s.shinderuk@postgrespro.ru> wrote:
>> I'm not familiar with the code, just curios. There is a long comment
>> above saying "You might think we need to adjust var->varnullingrels, but
>> that shouldn't need any changes." Doesn't it need an update?

> No, I don't think we need to update it.  That comment explains why
> varnullingrels do not require translation (since they are outer join
> relids, not baserel relids).  It's unrelated to what this patch does,
> which is about propagating varnullingrels into the translated Var.

I agree with this fix: I think the code is like it is simply because
it didn't occur to me that the child Vars could have any nullingrel
bits yet.  However, I don't agree that that comment needs no updates.
I suggest something like

-         * Below, we just propagate var->varnullingrels into the translated
-         * Var.
+         * Below, we just merge var->varnullingrels into the translated
+         * Var.  (We must merge not just copy: the child Var could have
+         * some nullingrel bits set already, and we mustn't drop those.)

Also, I think I'd then drop the comment you added adjacent to the
actual update; it seems redundant if the earlier comment says this.

I agree with back-patching to v16.  This particular example doesn't
misbehave in versions that don't have the drop-allegedly-redundant-
NOT-NULL-tests logic, but the varnullingrels are certainly wrong
all the way back, so possibly there are other examples that do
misbehave in v16.

            regards, tom lane



Re: BUG #19412: Wrong query result with not null constraint

От
Richard Guo
Дата:
On Fri, Feb 20, 2026 at 3:25 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I agree with this fix: I think the code is like it is simply because
> it didn't occur to me that the child Vars could have any nullingrel
> bits yet.  However, I don't agree that that comment needs no updates.
> I suggest something like
>
> -         * Below, we just propagate var->varnullingrels into the translated
> -         * Var.
> +         * Below, we just merge var->varnullingrels into the translated
> +         * Var.  (We must merge not just copy: the child Var could have
> +         * some nullingrel bits set already, and we mustn't drop those.)
>
> Also, I think I'd then drop the comment you added adjacent to the
> actual update; it seems redundant if the earlier comment says this.

Thanks for taking a look.  I've updated the comments as suggested,

> I agree with back-patching to v16.  This particular example doesn't
> misbehave in versions that don't have the drop-allegedly-redundant-
> NOT-NULL-tests logic, but the varnullingrels are certainly wrong
> all the way back, so possibly there are other examples that do
> misbehave in v16.

... and then pushed and back-patched to v16.

Thank you, Sergey, for the report and the excellent self-contained
repro query.  This is a great catch.  (I'm curious how you found this
bug -- was it from a production query or a fuzzing tool?)

- Richard



Re: BUG #19412: Wrong query result with not null constraint

От
Sergey Shinderuk
Дата:
On 2/20/26 12:57, Richard Guo wrote:
> ... and then pushed and back-patched to v16.

Thank you!

> Thank you, Sergey, for the report and the excellent self-contained
> repro query.  This is a great catch.  (I'm curious how you found this
> bug -- was it from a production query or a fuzzing tool?)

The bug was encountered by our customer upgrading to v17 and my 
colleague passed this repro to me.

-- 
Sergey Shinderuk        https://postgrespro.com/