Обсуждение: BUG #17618: unnecessary filter column <> text even after adding index

Поиск

Список

Период

Сортировка

BUG #17618: unnecessary filter column <> text even after adding index

От

PG Bug reporting form

Дата:

19 сентября 2022 г., 17:28:23

The following bug has been logged on the website:

Bug reference:      17618
Logged by:          Sindy Senorita
Email address:      sindysenorita@gmail.com
PostgreSQL version: 13.7
Operating system:   Ubuntu
Description:

Hi, I'm not sure if this is a bug or feature, but definitely not what I've
expected

So I have a table with "status" column which can contains 'valid',
'invalid', 'pending', 'unknown'.
A very simple table

CREATE TABLE public.test (
    id varchar NOT NULL,
    status varchar NOT NULL,
    CONSTRAINT test__pkey PRIMARY KEY (id)
)
CREATE INDEX pending_test_4 ON public.test USING btree ((((status)::text <>
'invalid'::text)));

notice that I've created an index to guide statuses that is not 'invalid
my query is:
SELECT * FROM test WHERE status != 'invalid'

When I run explain analyze on that with SET enable_seqscan = off, I got
QUERY PLAN
                                           |

------------------------------------------------------------------------------------------------------------------------+
Bitmap Heap Scan on test  (cost=4.62..8.37 rows=120 width=160) (actual
time=0.088..0.134 rows=117 loops=1)              |
  Filter: ((status)::text <> 'invalid'::text)
                                           |
  Heap Blocks: exact=3
                                           |
  ->  Bitmap Index Scan on pending_test_4  (cost=0.00..4.59 rows=60 width=0)
(actual time=0.073..0.073 rows=117 loops=1)|
        Index Cond: (((status)::text <> 'invalid'::text) = true)
                                           |
Planning Time: 0.222 ms
                                           |
Execution Time: 0.172 ms
                                           |

The plan has used the index condition just right, but it still perform
aditional bitmap heap scan just to filter for a clause that exactly match
the index. And worse, it double the query cost
My questions are:
1. Is this a bug? or intended feature by design? If it is by design, I'd be
very happy to learn the rationale behind it.
2. Is there any way to skip/avoid the additional bitmap scan?
3. Could there be a better solution for my query. Suppose that the variants
of the status is unknown so query SELECT .. WHERE STATUS IN (all status
beside 'invalid') is not possible

Many thanks!
Sindy

Re: BUG #17618: unnecessary filter column <> text even after adding index

От

Tom Lane

Дата:

19 сентября 2022 г., 18:24:12

PG Bug reporting form <noreply@postgresql.org> writes:
> When I run explain analyze on that with SET enable_seqscan = off, I got
> QUERY PLAN
>                                            |
>
------------------------------------------------------------------------------------------------------------------------+
> Bitmap Heap Scan on test  (cost=4.62..8.37 rows=120 width=160) (actual
> time=0.088..0.134 rows=117 loops=1)              |
>   Filter: ((status)::text <> 'invalid'::text)
>                                            |
>   Heap Blocks: exact=3
>                                            |
>   ->  Bitmap Index Scan on pending_test_4  (cost=0.00..4.59 rows=60 width=0)
> (actual time=0.073..0.073 rows=117 loops=1)|
>         Index Cond: (((status)::text <> 'invalid'::text) = true)
>                                            |
> Planning Time: 0.222 ms
>                                            |
> Execution Time: 0.172 ms
>                                            |

This is exactly what is expected; it's not a bug.

> The plan has used the index condition just right, but it still perform
> aditional bitmap heap scan just to filter for a clause that exactly match
> the index. And worse, it double the query cost

The filter condition is required because the bitmap produced by the index
can be lossy, ie it might identify more rows than actually satisfy the
condition.  BitmapHeapNext will only actually apply the condition if
the index reports that that happened, so in practice for this sort of
query the filter condition probably never gets rechecked.

The "doubled cost" has nothing whatever to do with the filter condition;
most of that is concerned with the number of disk pages touched.  It
might help you to read

https://www.postgresql.org/docs/current/using-explain.html

            regards, tom lane

Re: BUG #17618: unnecessary filter column <> text even after adding index

От

Sindy Senorita

Дата:

19 сентября 2022 г., 18:45:56

I see, quick google search takes me to BitmapHeapNext implementation here https://doxygen.postgresql.org/nodeBitmapHeapscan_8c_source.html#l00072. I hope this is what you mean

Noted. Thanks for the explanation

Cheers

On Mon, Sep 19, 2022 at 10:24 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

PG Bug reporting form <noreply@postgresql.org> writes:
> When I run explain analyze on that with SET enable_seqscan = off, I got
> QUERY PLAN
> |
> ------------------------------------------------------------------------------------------------------------------------+
> Bitmap Heap Scan on test (cost=4.62..8.37 rows=120 width=160) (actual
> time=0.088..0.134 rows=117 loops=1) |
> Filter: ((status)::text <> 'invalid'::text)
> |
> Heap Blocks: exact=3
> |
> -> Bitmap Index Scan on pending_test_4 (cost=0.00..4.59 rows=60 width=0)
> (actual time=0.073..0.073 rows=117 loops=1)|
> Index Cond: (((status)::text <> 'invalid'::text) = true)
> |
> Planning Time: 0.222 ms
> |
> Execution Time: 0.172 ms
> |

This is exactly what is expected; it's not a bug.

> The plan has used the index condition just right, but it still perform
> aditional bitmap heap scan just to filter for a clause that exactly match
> the index. And worse, it double the query cost

The filter condition is required because the bitmap produced by the index
can be lossy, ie it might identify more rows than actually satisfy the
condition. BitmapHeapNext will only actually apply the condition if
the index reports that that happened, so in practice for this sort of
query the filter condition probably never gets rechecked.

The "doubled cost" has nothing whatever to do with the filter condition;
most of that is concerned with the number of disk pages touched. It
might help you to read

https://www.postgresql.org/docs/current/using-explain.html

regards, tom lane

Re: BUG #17618: unnecessary filter column <> text even after adding index

От

"David G. Johnston"

Дата:

19 сентября 2022 г., 18:53:29

On Mon, Sep 19, 2022 at 8:15 AM PG Bug reporting form <noreply@postgresql.org> wrote:

CREATE TABLE public.test (
id varchar NOT NULL,
status varchar NOT NULL,
CONSTRAINT test__pkey PRIMARY KEY (id)
)

CREATE INDEX pending_test_4 ON public.test USING btree ((((status)::text <>
'invalid'::text)));

notice that I've created an index to guide statuses that is not 'invalid
my query is:
SELECT * FROM test WHERE status != 'invalid'

Your index contains none of the fields in the original table so the system can never answer your inquiry using only the index.

You may find this to be informative:

https://www.postgresql.org/docs/devel/indexes-index-only-scans.html

Usually on a "status" field doing a few partial indexes gets you the best result. The more statuses you need to be concerned about the more likely just scanning the table is going to win out in performance. But if you do only care about a few the smaller index size will be of benefit to keep them in memory. A covering index may be of use as well though for rapidly changing statuses tuple visibility is going to be a challenge. In short, you seem to be providing a non-real situation and asking for advice that is situational in nature.

David J.

Re: BUG #17618: unnecessary filter column <> text even after adding index

От

Jeff Janes

Дата:

20 сентября 2022 г., 20:12:03

On Mon, Sep 19, 2022 at 11:24 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

> The plan has used the index condition just right, but it still perform
> aditional bitmap heap scan just to filter for a clause that exactly match
> the index. And worse, it double the query cost

The filter condition is required because the bitmap produced by the index
can be lossy, ie it might identify more rows than actually satisfy the
condition. BitmapHeapNext will only actually apply the condition if
the index reports that that happened, so in practice for this sort of
query the filter condition probably never gets rechecked.

You are describing a Recheck, but attributing its properties to a Filter. I think that the filter condition always gets checked. It does so if I create a tattler function which raises a notice every time it is called and then build an index on a Boolean expression over that function (but of course that inevitably does change the code paths a bit).

I don't know about being a bug, but it is at least a mild mal-feature that boolean index columns/expressions can't be dealt with better, and have to be handed in a filter rather than in a recheck.

Cheers,

Jeff

Re: BUG #17618: unnecessary filter column <> text even after adding index

От

Tom Lane

Дата:

20 сентября 2022 г., 22:14:34

Jeff Janes <jeff.janes@gmail.com> writes:
> You are describing a Recheck, but attributing its properties to a Filter.

You're right of course, momentary brain fade on my part.

> I don't know about being a bug, but it is at least a mild mal-feature that
> boolean index columns/expressions can't be dealt with better, and have to
> be handed in a filter rather than in a recheck.

Yeah ... looking at create_bitmap_scan_plan, I see that it does this:

    /*
     * When dealing with special operators, we will at this point have
     * duplicate clauses in qpqual and bitmapqualorig.  We may as well drop
     * 'em from bitmapqualorig, since there's no point in making the tests
     * twice.
     */
    bitmapqualorig = list_difference_ptr(bitmapqualorig, qpqual);

I wonder if that isn't backwards, ie we should prefer to put duplicates
in bitmapqualorig (the recheck condition) instead of qpqual (the filter).
If my head is screwed on correctly today, that should allow us to skip
checking the condition much of the time, and the skip would be safe
if the index is correctly asserting that no recheck is needed.

            regards, tom lane

Re: BUG #17618: unnecessary filter column <> text even after adding index

От

Tom Lane

Дата:

21 сентября 2022 г., 02:55:59

I wrote:
> I wonder if that isn't backwards, ie we should prefer to put duplicates
> in bitmapqualorig (the recheck condition) instead of qpqual (the filter).
> If my head is screwed on correctly today, that should allow us to skip
> checking the condition much of the time, and the skip would be safe
> if the index is correctly asserting that no recheck is needed.

Flipping the removal around has the effect I expected on the plan shape,
but some of the regression test queries now give the wrong answer, so
there's something faulty about that analysis.

            regards, tom lane

Re: BUG #17618: unnecessary filter column <> text even after adding index

От

Richard Guo

Дата:

23 сентября 2022 г., 16:29:39

On Wed, Sep 21, 2022 at 7:56 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Flipping the removal around has the effect I expected on the plan shape,
but some of the regression test queries now give the wrong answer, so
there's something faulty about that analysis.

I think we may have a minor mistake when constructing the qpqual list in
create_bitmap_scan_plan. The qpqual list is supposed to be scan_clauses
minus indexquals. So we check each scan clause to see if it is redundant
with any indexqual, by using equal, checking EC or using
predicate_implied_by.

Note that the indexqual here may not be the form that has been going
through constant folding. Such as in this case with a boolean index, the
indexqual would be converted to 'indexkey expression = TRUE' by
match_boolean_index_clause. And that may make us fail to tell the scan
clause is redundant.

The comment of predicate_implied_by() says

* The top-level List structure of each list corresponds to an AND list.
* We assume that eval_const_expressions() has been applied and so there
* are no un-flattened ANDs or ORs (e.g., no AND immediately within an AND,
* including AND just below the top-level List structure).

So I think we need to run eval_const_expressions on indexquals before we
check for duplicate clauses, something like attached.

Thanks
Richard

Вложения

v1-0001-constant-folding-for-indexquals-in-bitmap-scan.patch

Re: BUG #17618: unnecessary filter column <> text even after adding index

От

Tom Lane

Дата:

23 сентября 2022 г., 17:10:09

Richard Guo <guofenglinux@gmail.com> writes:
> So I think we need to run eval_const_expressions on indexquals before we
> check for duplicate clauses, something like attached.

[ squint... ]  Surely that was done long before we ever get here?

            regards, tom lane

Re: BUG #17618: unnecessary filter column <> text even after adding index

От

Richard Guo

Дата:

24 сентября 2022 г., 02:54:02

On Fri, Sep 23, 2022 at 10:10 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Richard Guo <guofenglinux@gmail.com> writes:
> So I think we need to run eval_const_expressions on indexquals before we
> check for duplicate clauses, something like attached.

[ squint... ] Surely that was done long before we ever get here?

We should have already done that long before. It seems afterwards we may
do additional transformation on indexquals. In this case with a boolean
index, I can see we convert the indexqual to form 'indexkey = TRUE' in
match_boolean_index_clause.

Thanks
Richard

Re: BUG #17618: unnecessary filter column <> text even after adding index

От

Tom Lane

Дата:

24 сентября 2022 г., 03:04:51

Richard Guo <guofenglinux@gmail.com> writes:
> On Fri, Sep 23, 2022 at 10:10 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Richard Guo <guofenglinux@gmail.com> writes:
>>> So I think we need to run eval_const_expressions on indexquals before we
>>> check for duplicate clauses, something like attached.

>> [ squint... ]  Surely that was done long before we ever get here?

> We should have already done that long before. It seems afterwards we may
> do additional transformation on indexquals. In this case with a boolean
> index, I can see we convert the indexqual to form 'indexkey = TRUE' in
> match_boolean_index_clause.

Of course, but what about that transformation would introduce something
that eval_const_expressions could simplify?  (Actually, now that I think
about it, I think eval_const_expressions would break it completely because
it'd re-canonicalize the expression as just 'indexkey', exactly what we
don't want here.)  In any case, if there's something between the
eval_const_expressions pass and createplan.c that introduces simplifiable
expressions, I think it's on that something's head to re-simplify; we
don't want to do something so expensive in a main code path if it's
usually going to be a complete waste.

            regards, tom lane

Re: BUG #17618: unnecessary filter column <> text even after adding index

От

Richard Guo

Дата:

24 сентября 2022 г., 03:06:06

On Fri, Sep 23, 2022 at 9:29 PM Richard Guo <guofenglinux@gmail.com> wrote:

So I think we need to run eval_const_expressions on indexquals before we
check for duplicate clauses, something like attached.

BTW, (revise to the v1 patch), if this is the right way to go, we should
do that before the foreach loop, so that we need to run
eval_const_expressions on indexquals only once rather than for each scan
clause.

Thanks
Richard

Re: BUG #17618: unnecessary filter column <> text even after adding index

От

Richard Guo

Дата:

24 сентября 2022 г., 03:41:48

On Sat, Sep 24, 2022 at 8:04 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Richard Guo <guofenglinux@gmail.com> writes:
> We should have already done that long before. It seems afterwards we may
> do additional transformation on indexquals. In this case with a boolean
> index, I can see we convert the indexqual to form 'indexkey = TRUE' in
> match_boolean_index_clause.

Of course, but what about that transformation would introduce something
that eval_const_expressions could simplify? (Actually, now that I think
about it, I think eval_const_expressions would break it completely because
it'd re-canonicalize the expression as just 'indexkey', exactly what we
don't want here.) In any case, if there's something between the
eval_const_expressions pass and createplan.c that introduces simplifiable
expressions, I think it's on that something's head to re-simplify; we
don't want to do something so expensive in a main code path if it's
usually going to be a complete waste.

Yeah, I agree that running eval_const_expressions here is expensive.
Maybe we can just do the reverse transformation in
create_bitmap_scan_plan against what we do for boolean index in
match_boolean_index_clause?

I think it's necessary to re-simplify the indexquals here, otherwise we
may fail to compare scan_clauses to indexquals correctly.

Thanks
Richard

Re: BUG #17618: unnecessary filter column <> text even after adding index

От

Richard Guo

Дата:

26 сентября 2022 г., 14:42:05

On Sat, Sep 24, 2022 at 8:41 AM Richard Guo <guofenglinux@gmail.com> wrote:

On Sat, Sep 24, 2022 at 8:04 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Richard Guo <guofenglinux@gmail.com> writes:
> We should have already done that long before. It seems afterwards we may
> do additional transformation on indexquals. In this case with a boolean
> index, I can see we convert the indexqual to form 'indexkey = TRUE' in
> match_boolean_index_clause.

Of course, but what about that transformation would introduce something
that eval_const_expressions could simplify? (Actually, now that I think
about it, I think eval_const_expressions would break it completely because
it'd re-canonicalize the expression as just 'indexkey', exactly what we
don't want here.) In any case, if there's something between the
eval_const_expressions pass and createplan.c that introduces simplifiable
expressions, I think it's on that something's head to re-simplify; we
don't want to do something so expensive in a main code path if it's
usually going to be a complete waste.

Yeah, I agree that running eval_const_expressions here is expensive.
Maybe we can just do the reverse transformation in
create_bitmap_scan_plan against what we do for boolean index in
match_boolean_index_clause?

Following this idea, I come up with v2 patch. Is this the right
direction to go?

Thanks
Richard

Вложения

v2-0001-constant-folding-for-indexquals-in-bitmap-scan.patch

Re: BUG #17618: unnecessary filter column <> text even after adding index

От

Richard Guo

Дата:

02 ноября 2022 г., 10:46:30

On Mon, Sep 26, 2022 at 7:42 PM Richard Guo <guofenglinux@gmail.com> wrote:

On Sat, Sep 24, 2022 at 8:41 AM Richard Guo <guofenglinux@gmail.com> wrote:
Yeah, I agree that running eval_const_expressions here is expensive.
Maybe we can just do the reverse transformation in
create_bitmap_scan_plan against what we do for boolean index in
match_boolean_index_clause?

Following this idea, I come up with v2 patch. Is this the right
direction to go?

Update with v3 patch, nothing changes except fixes a test failure
spotted by cfbot.

Thanks
Richard

Вложения

v3-0001-constant-folding-for-indexquals-in-bitmap-scan.patch

Re: BUG #17618: unnecessary filter column <> text even after adding index

От

Tom Lane

Дата:

05 ноября 2022 г., 20:07:44

Richard Guo <guofenglinux@gmail.com> writes:
> Update with v3 patch, nothing changes except fixes a test failure
> spotted by cfbot.

I think this is pretty close to usable, except that I don't believe
reusing simplify_boolean_equality this way is a great idea.
It does more than we need (surely the LHS-is-Const case cannot occur here)
and it has assumptions that I'm not sure hold --- particularly the bit
about !constisnull.  I'd be inclined to just copy-and-paste the three or
four lines we need.

            regards, tom lane

Re: BUG #17618: unnecessary filter column <> text even after adding index

От

Tom Lane

Дата:

05 ноября 2022 г., 20:23:20

I wrote:
>> I wonder if that isn't backwards, ie we should prefer to put duplicates
>> in bitmapqualorig (the recheck condition) instead of qpqual (the filter).
>> If my head is screwed on correctly today, that should allow us to skip
>> checking the condition much of the time, and the skip would be safe
>> if the index is correctly asserting that no recheck is needed.

> Flipping the removal around has the effect I expected on the plan shape,
> but some of the regression test queries now give the wrong answer, so
> there's something faulty about that analysis.

BTW, after looking more closely I see my mistake.  An example of the
sort of plan that fails with that change is

  Sort
    Sort Key: proname
    ->  Bitmap Heap Scan on pg_proc
-         Filter: (proname ~~ 'RI\_FKey%del'::text)
+         Recheck Cond: (proname ~~ 'RI\_FKey%del'::text)
          ->  Bitmap Index Scan on pg_proc_proname_args_nsp_index
                Index Cond: ((proname >= 'RI_FKey'::text) AND (proname < 'RI_FKez'::text))
 (6 rows)

The difficulty here is pretty obvious: the original clause is stricter
than the index conditions generated from it.  So even if the index
enforces the index conditions exactly, we still need to check the
original clause, and so it can't be relegated to the recheck field.
To improve this, we'd need to track which elements of bitmapqualorig
correspond exactly to index conditions, which we don't do ATM.

            regards, tom lane

Re: BUG #17618: unnecessary filter column <> text even after adding index

От

Richard Guo

Дата:

07 ноября 2022 г., 12:01:19

On Sun, Nov 6, 2022 at 1:07 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Richard Guo <guofenglinux@gmail.com> writes:
> Update with v3 patch, nothing changes except fixes a test failure
> spotted by cfbot.

I think this is pretty close to usable, except that I don't believe
reusing simplify_boolean_equality this way is a great idea.
It does more than we need (surely the LHS-is-Const case cannot occur here)
and it has assumptions that I'm not sure hold --- particularly the bit
about !constisnull. I'd be inclined to just copy-and-paste the three or
four lines we need.

Thanks for the suggestion. Yes, simplify_boolean_equality is doing more
than we need. I think here we just intend to handle indexquals of form
"indexkey = true/false", which seems can only come out from function
match_boolean_index_clause. From what this function does, we are sure
the constant input can be only on right (as you pointed out), and the
operator can only be BooleanEqualOperator. Also it seems the assumption
about !constisnull holds, as match_boolean_index_clause would not make a
clause with a constant-NULL input.

I've updated the patch according to the suggestions as in v4. Thanks
for reviewing this patch!

Thanks
Richard

Вложения

v4-0001-constant-folding-for-indexquals-in-bitmap-scan.patch

Re: BUG #17618: unnecessary filter column <> text even after adding index

От

Tom Lane

Дата:

08 ноября 2022 г., 00:06:33

Richard Guo <guofenglinux@gmail.com> writes:
> I've updated the patch according to the suggestions as in v4.  Thanks
> for reviewing this patch!

I was about ready to commit this when I re-read your initial comment
and realized that there's a second way to fix it.  We can improve
predtest.c so that it understands that "x = true" implies "x" and
so on, whereupon the existing logic in create_bitmap_scan_plan
handles the case correctly.  This is pretty nearly the same code as
in your v4, except that it's in a considerably less hot code path, plus
there's at least some chance that it could be useful for other purposes.
So I think I like this way better.  Thoughts?

            regards, tom lane

diff --git a/contrib/btree_gin/expected/bool.out b/contrib/btree_gin/expected/bool.out
index 207a3f2328..b379622baf 100644
--- a/contrib/btree_gin/expected/bool.out
+++ b/contrib/btree_gin/expected/bool.out
@@ -92,7 +92,7 @@ EXPLAIN (COSTS OFF) SELECT * FROM test_bool WHERE i=true ORDER BY i;
  Sort
    Sort Key: i
    ->  Bitmap Heap Scan on test_bool
-         Filter: i
+         Recheck Cond: i
          ->  Bitmap Index Scan on idx_bool
                Index Cond: (i = true)
 (6 rows)
@@ -119,3 +119,15 @@ EXPLAIN (COSTS OFF) SELECT * FROM test_bool WHERE i>true ORDER BY i;
                Index Cond: (i > true)
 (6 rows)

+-- probably sufficient to check just this one:
+EXPLAIN (COSTS OFF) SELECT * FROM test_bool WHERE i=false ORDER BY i;
+                QUERY PLAN
+-------------------------------------------
+ Sort
+   Sort Key: i
+   ->  Bitmap Heap Scan on test_bool
+         Recheck Cond: (NOT i)
+         ->  Bitmap Index Scan on idx_bool
+               Index Cond: (i = false)
+(6 rows)
+
diff --git a/contrib/btree_gin/sql/bool.sql b/contrib/btree_gin/sql/bool.sql
index dad2ff32b8..08f2986e8c 100644
--- a/contrib/btree_gin/sql/bool.sql
+++ b/contrib/btree_gin/sql/bool.sql
@@ -25,3 +25,5 @@ EXPLAIN (COSTS OFF) SELECT * FROM test_bool WHERE i<=true ORDER BY i;
 EXPLAIN (COSTS OFF) SELECT * FROM test_bool WHERE i=true ORDER BY i;
 EXPLAIN (COSTS OFF) SELECT * FROM test_bool WHERE i>=true ORDER BY i;
 EXPLAIN (COSTS OFF) SELECT * FROM test_bool WHERE i>true ORDER BY i;
+-- probably sufficient to check just this one:
+EXPLAIN (COSTS OFF) SELECT * FROM test_bool WHERE i=false ORDER BY i;
diff --git a/src/backend/optimizer/util/predtest.c b/src/backend/optimizer/util/predtest.c
index 182d5b1523..90701dab08 100644
--- a/src/backend/optimizer/util/predtest.c
+++ b/src/backend/optimizer/util/predtest.c
@@ -15,6 +15,7 @@
  */
 #include "postgres.h"

+#include "catalog/pg_operator.h"
 #include "catalog/pg_proc.h"
 #include "catalog/pg_type.h"
 #include "executor/executor.h"
@@ -1093,7 +1094,7 @@ arrayexpr_cleanup_fn(PredIterInfo info)
  *
  * We return true if able to prove the implication, false if not.
  *
- * We have three strategies for determining whether one simple clause
+ * We have several strategies for determining whether one simple clause
  * implies another:
  *
  * A simple and general way is to see if they are equal(); this works for any
@@ -1101,6 +1102,12 @@ arrayexpr_cleanup_fn(PredIterInfo info)
  * there is an implied assumption that the functions in the expression are
  * immutable --- but this was checked for the predicate by the caller.)
  *
+ * Another way that always works is that "x = TRUE" implies "x" (and vice
+ * versa), likewise for "x = FALSE" and "NOT x".  These can be worth
+ * checking because, while we preferentially simplify boolean comparisons
+ * down to "x" and "NOT x", the other form has to be dealt with anyway
+ * in the context of index conditions.
+ *
  * If the predicate is of the form "foo IS NOT NULL", and we are considering
  * strong implication, we can conclude that the predicate is implied if the
  * clause is strict for "foo", i.e., it must yield false or NULL when "foo"
@@ -1124,6 +1131,43 @@ predicate_implied_by_simple_clause(Expr *predicate, Node *clause,
     if (equal((Node *) predicate, clause))
         return true;

+    /* Next see if clause is boolean equality to a constant */
+    if (is_opclause(clause) &&
+        ((OpExpr *) clause)->opno == BooleanEqualOperator)
+    {
+        OpExpr       *op = (OpExpr *) clause;
+        Node       *rightop;
+
+        Assert(list_length(op->args) == 2);
+        rightop = lsecond(op->args);
+        /* We might never see a null Const here, but better check anyway */
+        if (rightop && IsA(rightop, Const) &&
+            !((Const *) rightop)->constisnull)
+        {
+            Node       *leftop = linitial(op->args);
+
+            if (DatumGetBool(((Const *) rightop)->constvalue))
+            {
+                /* X = true implies X */
+                if (equal(predicate, leftop))
+                    return true;
+            }
+            else
+            {
+                /* X = false implies NOT X */
+                if (is_notclause(predicate) &&
+                    equal(get_notclausearg(predicate), leftop))
+                    return true;
+            }
+        }
+    }
+
+    /*
+     * We could likewise check whether the predicate is boolean equality to a
+     * constant; but there are no known use-cases for that at the moment,
+     * assuming that the predicate has been through const-folding.
+     */
+
     /* Next try the IS NOT NULL case */
     if (!weak &&
         predicate && IsA(predicate, NullTest))

Re: BUG #17618: unnecessary filter column <> text even after adding index

От

Richard Guo

Дата:

08 ноября 2022 г., 04:59:34

On Tue, Nov 8, 2022 at 5:06 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Richard Guo <guofenglinux@gmail.com> writes:
> I've updated the patch according to the suggestions as in v4. Thanks
> for reviewing this patch!

I was about ready to commit this when I re-read your initial comment
and realized that there's a second way to fix it. We can improve
predtest.c so that it understands that "x = true" implies "x" and
so on, whereupon the existing logic in create_bitmap_scan_plan
handles the case correctly. This is pretty nearly the same code as
in your v4, except that it's in a considerably less hot code path, plus
there's at least some chance that it could be useful for other purposes.
So I think I like this way better. Thoughts?

It works for me. predtest.c is a more common place so that there may be
other cases that can benefit from this change. Thanks for the new
patch!

Thanks
Richard

Re: BUG #17618: unnecessary filter column <> text even after adding index

От

Tom Lane

Дата:

08 ноября 2022 г., 18:37:13

Richard Guo <guofenglinux@gmail.com> writes:
> It works for me.  predtest.c is a more common place so that there may be
> other cases that can benefit from this change.  Thanks for the new
> patch!

Pushed then.

            regards, tom lane

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: BUG #17618: unnecessary filter column <> text even after adding index

Вложения

Вложения

Вложения

Вложения