Обсуждение: BUG #18260: Unexpected error: "negative bitmapset member not allowed" triggered by multiple JOIN

Поиск
Список
Период
Сортировка

BUG #18260: Unexpected error: "negative bitmapset member not allowed" triggered by multiple JOIN

От
PG Bug reporting form
Дата:
The following bug has been logged on the website:

Bug reference:      18260
Logged by:          Zuming Jiang
Email address:      zuming.jiang@inf.ethz.ch
PostgreSQL version: 16.1
Operating system:   Ubuntu 20.04
Description:

My fuzzer finds a bug in Postgres 17devel, which triggers an unexpected
error "ERROR:  negative bitmapset member not allowed".

--- Set up database ---
create table t0 (vkey int4, c1 numeric);
create table t1 (vkey int4, c9 text, primary key(vkey));
create view t3 as
select
    ref_3.c1 as c_0,
    ref_3.c1 as c_2,
    null::int4 as c_4,
    right(ref_2.c9, ref_2.vkey) as c_6
  from
    ((select ref_0.vkey as c_2 from t1 as ref_0) as subq_0
      left outer join (t1 as ref_2
            right outer join t0 as ref_3
            on (ref_2.vkey = ref_3.vkey))
      on (subq_0.c_2 = ref_3.vkey));

The fuzzer generates a test case:

--- Test case ---
with cte_1 as (select
    ref_1.c_0 as c_2,
    ref_1.c_4 as c_3,
    case when (nullif(ref_1.c_2, ref_1.c_0) <> (
          select
              ref_53.c_2 as c_0
            from
              (t1 as ref_52 left outer join t3 as ref_53 on true)
            where (ref_53.c_6) ~~ (ref_53.c_6)
            order by c_0 limit 1)
          ) then null else null end
         as c_4
  from
    t3 as ref_1)
select 1
  from
    cte_1 as ref_33
  where (ref_33.c_2 > (
        select
            ref_200.c_2 as c_0
          from
            t3 as ref_200
          window w0 as (partition by ref_33.c_3 order by ref_33.c_4 desc)
          order by c_0 limit 1));

--- Expected behavior ---
The test case should not trigger any error.

--- Actual behavior ---
The test case trigger an error: 

ERROR:  negative bitmapset member not allowed

--- Postgres version ---
Github commit: 0eac3c798c2d223d6557a5440d7534317dbd4fa0
Version: PostgreSQL 17devel on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu
9.4.0-1ubuntu1~20.04.1) 9.4.0, 64-bit

--- Platform information ---
Platform: Ubuntu 20.04
Kernel: Linux 5.4.0-147-generic


Re: BUG #18260: Unexpected error: "negative bitmapset member not allowed" triggered by multiple JOIN

От
Zu-Ming Jiang
Дата:
When I removed "window w0 as (partition by ref_33.c_3 order by 
ref_33.c_4 desc)" in the test case query, the query did not trigger 
errors any more.

Specifically, this query did not trigger errors:

with cte_1 as (select
     ref_1.c_0 as c_2,
     ref_1.c_4 as c_3,
     case when (nullif(ref_1.c_2, ref_1.c_0) <> (
           select
               ref_53.c_2 as c_0
             from
               (t1 as ref_52 left outer join t3 as ref_53 on true)
             where (ref_53.c_6) ~~ (ref_53.c_6)
             order by c_0 limit 1)
           ) then null else null end
          as c_4
   from
     t3 as ref_1)
select 1
   from
     cte_1 as ref_33
   where (ref_33.c_2 > (
         select
             ref_200.c_2 as c_0
           from
             t3 as ref_200
           order by c_0 limit 1));



Re: BUG #18260: Unexpected error: "negative bitmapset member not allowed" triggered by multiple JOIN

От
Richard Guo
Дата:

On Wed, Dec 27, 2023 at 6:18 PM PG Bug reporting form <noreply@postgresql.org> wrote:
My fuzzer finds a bug in Postgres 17devel, which triggers an unexpected
error "ERROR:  negative bitmapset member not allowed".

Thank you for the report.  This issue is caused by the way SJE removes
PHVs.  I have run out of time today, but I will look into it tomorrow.

Thanks
Richard

Re: BUG #18260: Unexpected error: "negative bitmapset member not allowed" triggered by multiple JOIN

От
Richard Guo
Дата:

On Wed, Dec 27, 2023 at 8:15 PM Richard Guo <guofenglinux@gmail.com> wrote:
On Wed, Dec 27, 2023 at 6:18 PM PG Bug reporting form <noreply@postgresql.org> wrote:
My fuzzer finds a bug in Postgres 17devel, which triggers an unexpected
error "ERROR:  negative bitmapset member not allowed".

Thank you for the report.  This issue is caused by the way SJE removes
PHVs.  I have run out of time today, but I will look into it tomorrow.

I've looked into it further.  When removing a useless join, we'd remove
PHVs that are not used at join partner rels or above the join.  A PHV
that references the join's relid in ph_eval_at is logically "above" the
join and thus should not be removed.  We added a check in 9a2dbc614 for
that:

    !bms_is_member(ojrelid, phinfo->ph_eval_at)

During that time, join removal was only performed for left joins, so it
was not possible for 'ojrelid' to be negative.  However, with the
introduction of the SJE feature, inner joins can also be removed, and
'ojrelid' is set to -1 in the inner join case.  That's how we see this
error.

A straightforward way to fix this error is to skip checking ojrelid for
inner joins:

--- a/src/backend/optimizer/plan/analyzejoins.c
+++ b/src/backend/optimizer/plan/analyzejoins.c
@@ -456,7 +456,7 @@ remove_rel_from_query(PlannerInfo *root, RelOptInfo *rel,
        Assert(sjinfo == NULL || !bms_is_member(relid, phinfo->ph_lateral));
        if (bms_is_subset(phinfo->ph_needed, joinrelids) &&
            bms_is_member(relid, phinfo->ph_eval_at) &&
-           !bms_is_member(ojrelid, phinfo->ph_eval_at))
+           (sjinfo == NULL || !bms_is_member(ojrelid, phinfo->ph_eval_at)))

Alternatively, we can modify bms_is_member() to return false for
negative numbers instead of emitting an error, as suggested by the
comment there.

--- a/src/backend/nodes/bitmapset.c
+++ b/src/backend/nodes/bitmapset.c
@@ -477,9 +477,9 @@ bms_is_member(int x, const Bitmapset *a)
    int         wordnum,
                bitnum;

-   /* XXX better to just return false for x<0 ? */
+   /* negative number cannot be a member of the bitmapset */
    if (x < 0)
-       elog(ERROR, "negative bitmapset member not allowed");
+       return false;

I prefer the second option, but I'm open to other thoughts.

Attached is a patch for the second option.

FWIW, here is a simplified repro for this error.

create table t (a int primary key, b int);

explain (verbose, costs off)
select 1 from t t1 left join
    (lateral (select 1 as x, * from t t2) s1 inner join
        (select * from t t3) s2 on s1.a = s2.a)
    on true
where s1.x = 1;
ERROR:  negative bitmapset member not allowed

Thanks
Richard
Вложения

Re: BUG #18260: Unexpected error: "negative bitmapset member not allowed" triggered by multiple JOIN

От
Andrei Lepikhov
Дата:
On 28/12/2023 10:03, Richard Guo wrote:
> During that time, join removal was only performed for left joins, so it
> was not possible for 'ojrelid' to be negative.  However, with the
> introduction of the SJE feature, inner joins can also be removed, and
> 'ojrelid' is set to -1 in the inner join case.  That's how we see this
> error.
Agreed, I have come to the same conclusion.

>          if (bms_is_subset(phinfo->ph_needed, joinrelids) &&
>              bms_is_member(relid, phinfo->ph_eval_at) &&
> -           !bms_is_member(ojrelid, phinfo->ph_eval_at))
> +           (sjinfo == NULL || !bms_is_member(ojrelid, phinfo->ph_eval_at)))
It would be better to check "ojrelid >= 0" as a sign. But why do you 
think SJE could need to delete PlaceHolderVar at all?
> 
> Alternatively, we can modify bms_is_member() to return false for
> negative numbers instead of emitting an error, as suggested by the
> comment there.
> -   /* XXX better to just return false for x<0 ? */
> +   /* negative number cannot be a member of the bitmapset */
>      if (x < 0)
> -       elog(ERROR, "negative bitmapset member not allowed");
> +       return false;
> 
> I prefer the second option, but I'm open to other thoughts.
I don't like this option. It could mask some blunders somewhere like a 
current one.

> FWIW, here is a simplified repro for this error.
> 
> create table t (a int primary key, b int);
> 
> explain (verbose, costs off)
> select 1 from t t1 left join
>      (lateral (select 1 as x, * from t t2) s1 inner join
>          (select * from t t3) s2 on s1.a = s2.a)
>      on true
> where s1.x = 1;
> ERROR:  negative bitmapset member not allowed
It can be simplified even more (without LATERAL):
explain (verbose, costs off)
select 1 from t t1 left join
      (select 1 as x, * from t t2) s1 inner join
         (select * from t t3) using (a)
     on true
where s1.x = 1;

-- 
regards,
Andrei Lepikhov
Postgres Professional





On Thu, Dec 28, 2023 at 12:30 PM Andrei Lepikhov <a.lepikhov@postgrespro.ru> wrote:
On 28/12/2023 10:03, Richard Guo wrote:
>          if (bms_is_subset(phinfo->ph_needed, joinrelids) &&
>              bms_is_member(relid, phinfo->ph_eval_at) &&
> -           !bms_is_member(ojrelid, phinfo->ph_eval_at))
> +           (sjinfo == NULL || !bms_is_member(ojrelid, phinfo->ph_eval_at)))

> Alternatively, we can modify bms_is_member() to return false for
> negative numbers instead of emitting an error, as suggested by the
> comment there.
> -   /* XXX better to just return false for x<0 ? */
> +   /* negative number cannot be a member of the bitmapset */
>      if (x < 0)
> -       elog(ERROR, "negative bitmapset member not allowed");
> +       return false;
>
> I prefer the second option, but I'm open to other thoughts.
I don't like this option. It could mask some blunders somewhere like a
current one.

After a second thought, I agree that the first option is better.  It
maintains consistency with the Assert a few lines ahead, and also
minimizes the potential impact of code changes compared to the second
option.
 
> FWIW, here is a simplified repro for this error.
>
> create table t (a int primary key, b int);
>
> explain (verbose, costs off)
> select 1 from t t1 left join
>      (lateral (select 1 as x, * from t t2) s1 inner join
>          (select * from t t3) s2 on s1.a = s2.a)
>      on true
> where s1.x = 1;
> ERROR:  negative bitmapset member not allowed
It can be simplified even more (without LATERAL):

Indeed.  I realized this shortly after v1 patch was sent out, and was
thinking that it wasn't worth the effort to update the patch with a new
version just for this issue.

Thanks
Richard
Вложения
On 11/1/2024 16:08, Richard Guo wrote:
> 
> On Thu, Dec 28, 2023 at 12:30 PM Andrei Lepikhov 
> <a.lepikhov@postgrespro.ru <mailto:a.lepikhov@postgrespro.ru>> wrote:
> 
>     On 28/12/2023 10:03, Richard Guo wrote:
>      >          if (bms_is_subset(phinfo->ph_needed, joinrelids) &&
>      >              bms_is_member(relid, phinfo->ph_eval_at) &&
>      > -           !bms_is_member(ojrelid, phinfo->ph_eval_at))
>      > +           (sjinfo == NULL || !bms_is_member(ojrelid,
>     phinfo->ph_eval_at)))
> 
>      > Alternatively, we can modify bms_is_member() to return false for
>      > negative numbers instead of emitting an error, as suggested by the
>      > comment there.
>      > -   /* XXX better to just return false for x<0 ? */
>      > +   /* negative number cannot be a member of the bitmapset */
>      >      if (x < 0)
>      > -       elog(ERROR, "negative bitmapset member not allowed");
>      > +       return false;
>      >
>      > I prefer the second option, but I'm open to other thoughts.
>     I don't like this option. It could mask some blunders somewhere like a
>     current one.
> 
> 
> After a second thought, I agree that the first option is better.  It
> maintains consistency with the Assert a few lines ahead, and also
> minimizes the potential impact of code changes compared to the second
> option.
> 
>      > FWIW, here is a simplified repro for this error.
>      >
>      > create table t (a int primary key, b int);
>      >
>      > explain (verbose, costs off)
>      > select 1 from t t1 left join
>      >      (lateral (select 1 as x, * from t t2) s1 inner join
>      >          (select * from t t3) s2 on s1.a = s2.a)
>      >      on true
>      > where s1.x = 1;
>      > ERROR:  negative bitmapset member not allowed
>     It can be simplified even more (without LATERAL):
> 
> 
> Indeed.  I realized this shortly after v1 patch was sent out, and was
> thinking that it wasn't worth the effort to update the patch with a new
> version just for this issue.
I like it, thanks for your efforts.

-- 
regards,
Andrei Lepikhov
Postgres Professional




On Mon, Jan 15, 2024 at 2:20 AM Andrei Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
> On 11/1/2024 16:08, Richard Guo wrote:
> >
> > On Thu, Dec 28, 2023 at 12:30 PM Andrei Lepikhov
> > <a.lepikhov@postgrespro.ru <mailto:a.lepikhov@postgrespro.ru>> wrote:
> >
> >     On 28/12/2023 10:03, Richard Guo wrote:
> >      >          if (bms_is_subset(phinfo->ph_needed, joinrelids) &&
> >      >              bms_is_member(relid, phinfo->ph_eval_at) &&
> >      > -           !bms_is_member(ojrelid, phinfo->ph_eval_at))
> >      > +           (sjinfo == NULL || !bms_is_member(ojrelid,
> >     phinfo->ph_eval_at)))
> >
> >      > Alternatively, we can modify bms_is_member() to return false for
> >      > negative numbers instead of emitting an error, as suggested by the
> >      > comment there.
> >      > -   /* XXX better to just return false for x<0 ? */
> >      > +   /* negative number cannot be a member of the bitmapset */
> >      >      if (x < 0)
> >      > -       elog(ERROR, "negative bitmapset member not allowed");
> >      > +       return false;
> >      >
> >      > I prefer the second option, but I'm open to other thoughts.
> >     I don't like this option. It could mask some blunders somewhere like a
> >     current one.
> >
> >
> > After a second thought, I agree that the first option is better.  It
> > maintains consistency with the Assert a few lines ahead, and also
> > minimizes the potential impact of code changes compared to the second
> > option.
> >
> >      > FWIW, here is a simplified repro for this error.
> >      >
> >      > create table t (a int primary key, b int);
> >      >
> >      > explain (verbose, costs off)
> >      > select 1 from t t1 left join
> >      >      (lateral (select 1 as x, * from t t2) s1 inner join
> >      >          (select * from t t3) s2 on s1.a = s2.a)
> >      >      on true
> >      > where s1.x = 1;
> >      > ERROR:  negative bitmapset member not allowed
> >     It can be simplified even more (without LATERAL):
> >
> >
> > Indeed.  I realized this shortly after v1 patch was sent out, and was
> > thinking that it wasn't worth the effort to update the patch with a new
> > version just for this issue.
> I like it, thanks for your efforts.

+1,
Thank you, Richard, Andrei.  I'm going to push this tomorrow.

------
Regards,
Alexander Korotkov