Обсуждение: WHERE column = X AND column = Y will always be zero matching rows

Поиск
Список
Период
Сортировка

WHERE column = X AND column = Y will always be zero matching rows

От
"[Quipsy] Markus Karg"
Дата:

Hello PostgreSQL Developers,

 

I am using PostgreSQL 15.3 (Debian 15.3-1.pgdg120+1) on aarch64-unknown-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit.

 

(In the following X and Y are literals; X <> Y)

 

I noticed is that…

 

EXPLAIN SELECT column FROM Table WHERE column = X AND column = Y

 

…says that PostgreSQL actually wants to perform a Scan!

 

I wonder why squandering any resources into a Scan here, as it is pretty obvious that the result is guaranteed to be always, under any conditions, and will always be: zero matching rows – at least in a universe where a single value cannot be X AND Y at the same time.

 

This can be seen as a chance to add an optimization („col = X AND col = Y always is zero matching rows), or in case such an optimization already exists in PostgreSQL, as a bug.  

 

BTW the same happens for column IS NULL AND column = Y.

 

Regards,

-Markus

 

Re: WHERE column = X AND column = Y will always be zero matching rows

От
"David G. Johnston"
Дата:
On Thursday, August 3, 2023, [Quipsy] Markus Karg <karg@quipsy.de> wrote:

EXPLAIN SELECT column FROM Table WHERE column = X AND column = Y

 

…says that PostgreSQL actually wants to perform a Scan!

 

I wonder why squandering any resources into a Scan here


Mostly because no one wants to volunteer to write code to deal with poorly written queries, nor, more technically, expend compute resources looking for situations that should not happen.

David J.

Re: WHERE column = X AND column = Y will always be zero matching rows

От
Laurenz Albe
Дата:
On Thu, 2023-08-03 at 14:20 +0000, [Quipsy] Markus Karg wrote:
> I am using PostgreSQL 15.3 (Debian 15.3-1.pgdg120+1) on aarch64-unknown-linux-gnu,
> compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit.
>  
> (In the following X and Y are literals; X <> Y)
>  
> I noticed is that…
>  
> EXPLAIN SELECT column FROM Table WHERE column = X AND column = Y
>  
> …says that PostgreSQL actually wants to perform a Scan!

This is not a bug.

> I wonder why squandering any resources into a Scan here, as it is pretty obvious that the
> result is guaranteed to be always, under any conditions, and will always be: zero matching
> rows – at least in a universe where a single value cannot be X AND Y at the same time.

Such an optimization, while certainly possible, would not be free, and each
query with more than one WHERE condition would have to pay the price.
On the other hand, only ill-written queries would benefit.

Since PostgreSQL only caches execution plans in special cases, anything that
slows down the optimizer should improve enough statements that it is a net win.
I don't see that here (but that is of course a matter of opinion).

Yours,
Laurenz Albe



AW: WHERE column = X AND column = Y will always be zero matching rows

От
"[Quipsy] Markus Karg"
Дата:

David,

 

thank you for your opinion, so it is a chance for future optimization.

 

Whether or not this „should not happen“ is a different discussion. 😉

 

-Markus

 

Von: David G. Johnston <david.g.johnston@gmail.com>
Gesendet: Donnerstag, 3. August 2023 16:48
An: [Quipsy] Markus Karg <karg@quipsy.de>
Cc: pgsql-bugs@lists.postgresql.org
Betreff: Re: WHERE column = X AND column = Y will always be zero matching rows

 

On Thursday, August 3, 2023, [Quipsy] Markus Karg <karg@quipsy.de> wrote:

EXPLAIN SELECT column FROM Table WHERE column = X AND column = Y

 

…says that PostgreSQL actually wants to perform a Scan!

 

I wonder why squandering any resources into a Scan here

 

Mostly because no one wants to volunteer to write code to deal with poorly written queries, nor, more technically, expend compute resources looking for situations that should not happen.

 

David J.

 

Re: WHERE column = X AND column = Y will always be zero matching rows

От
"David G. Johnston"
Дата:


On Thu, Aug 3, 2023, 07:53 [Quipsy] Markus Karg <karg@quipsy.de> wrote:

David,

 

thank you for your opinion, so it is a chance for future optimization.

 

Whether or not this „should not happen“ is a different discussion. 😉


Given that this has come up a number of times over the decades with the same response the practical reality of an optimization are basically zero.  You want to patch your own fork then sure, it is technically possible.

David J.

Re: WHERE column = X AND column = Y will always be zero matching rows

От
Jaime Casanova
Дата:
On Thu, Aug 3, 2023 at 9:21 AM [Quipsy] Markus Karg <karg@quipsy.de> wrote:
>
> Hello PostgreSQL Developers,
>
> I am using PostgreSQL 15.3 (Debian 15.3-1.pgdg120+1) on aarch64-unknown-linux-gnu, compiled by gcc (Debian 12.2.0-14)
12.2.0,64-bit. 
>
> (In the following X and Y are literals; X <> Y)
> I noticed is that…
>
> EXPLAIN SELECT column FROM Table WHERE column = X AND column = Y
>
> …says that PostgreSQL actually wants to perform a Scan!
>

Actually no... you're original complain, never execute the scan
(because of the filter is recognized as false)
the one, with "col is null and col = Y" on the other side do execute the scan

postgres=# create table t1 (id int);
CREATE TABLE
postgres=# insert into t1 select generate_series(1, 100000);
INSERT 0 100000

postgres=# explain select * from t1 where id =5 and id =6;
                        QUERY PLAN
-----------------------------------------------------------
 Result  (cost=0.00..1855.06 rows=1 width=4)
   One-Time Filter: false
   ->  Seq Scan on t1  (cost=0.00..1855.06 rows=1 width=4)
         Filter: (id = 5)
(4 filas)

postgres=# explain analyze select * from t1 where id =5 and id =6;
                                      QUERY PLAN
---------------------------------------------------------------------------------------
 Result  (cost=0.00..1693.00 rows=1 width=4) (actual time=0.005..0.007
rows=0 loops=1)
   One-Time Filter: false
   ->  Seq Scan on t1  (cost=0.00..1693.00 rows=1 width=4) (never executed)
         Filter: (id = 5)
 Planning Time: 0.200 ms
 Execution Time: 0.056 ms
(6 filas)

--
Jaime Casanova
SYSTEMGUARDS



Re: WHERE column = X AND column = Y will always be zero matching rows

От
Tom Lane
Дата:
"[Quipsy] Markus Karg" <karg@quipsy.de> writes:
> (In the following X and Y are literals; X <> Y)
> I noticed is that...
> EXPLAIN SELECT column FROM Table WHERE column = X AND column = Y
> ...says that PostgreSQL actually wants to perform a Scan!

You should learn to read plans before complaining about them.

What you actually get is something like

regression=# explain analyze select * from tenk1 where hundred = 42 and hundred = 99;
                                             QUERY PLAN
-----------------------------------------------------------------------------------------------------
 Result  (cost=5.04..224.95 rows=1 width=244) (actual time=0.001..0.002 rows=0 loops=1)
   One-Time Filter: false
   ->  Bitmap Heap Scan on tenk1  (cost=5.04..224.95 rows=1 width=244) (never executed)
         Recheck Cond: (hundred = 42)
         ->  Bitmap Index Scan on tenk1_hundred  (cost=0.00..5.04 rows=100 width=0) (never executed)
               Index Cond: (hundred = 42)
 Planning Time: 0.148 ms
 Execution Time: 0.046 ms
(8 rows)

See the "never executed" markings?  The one-time filter prevents
anything below the Result from actually doing anything.

From memory, the reason it's like this is that it's the most
straightforward extension of cases involving pseudoconstant
conditions, that is parameters not known at plan time.
If you had something like "WHERE var = param1 AND var = const1"
then you might get something involving a "param1 = const1"
one-time filter, and below that a plan to retrieve rows with
"var = const1", which would only be run when the run-time-supplied
parameter happens to match the constant.

David's statement that we don't want to write code for such cases
isn't completely true, but it is true that we don't want to expend
code on producing a pretty-looking plan for them.

> BTW the same happens for column IS NULL AND column = Y.

That indeed is a case we haven't bothered with, and probably
never will.  The multiple-constants-equated-to-the-same-thing
situation is dealt with by the equivalence class machinery,
which has many other purposes but happens to be able to detect
that situation cheaply.  But IS NULL is not an equivalence
condition (as we define those, anyway).  Likewise, we won't
reduce something like "var > C1 AND var < C1" to constant-false;
there's just not enough win probability to justify spending
planner cycles looking for such cases.

            regards, tom lane



AW: WHERE column = X AND column = Y will always be zero matching rows

От
"[Quipsy] Markus Karg"
Дата:
Tom,

thanks a lot for your kind explanation, this was very interesting to read!

In fact, I have missed the "never executed" marks.

BTW, I did not "complain". I just noticed something and was unclear if this is a bug or not, and I clearly told so in
theopening of my posting. It was definitively not my intention to bother or annoy anybody, actually. Sorry to have
troubledso many in this mailing list. 

-Markus


-----Ursprüngliche Nachricht-----
Von: Tom Lane <tgl@sss.pgh.pa.us>
Gesendet: Donnerstag, 3. August 2023 19:59
An: [Quipsy] Markus Karg <karg@quipsy.de>
Cc: pgsql-bugs@lists.postgresql.org
Betreff: Re: WHERE column = X AND column = Y will always be zero matching rows

"[Quipsy] Markus Karg" <karg@quipsy.de> writes:
> (In the following X and Y are literals; X <> Y) I noticed is that...
> EXPLAIN SELECT column FROM Table WHERE column = X AND column = Y
> ...says that PostgreSQL actually wants to perform a Scan!

You should learn to read plans before complaining about them.

What you actually get is something like

regression=# explain analyze select * from tenk1 where hundred = 42 and hundred = 99;
                                             QUERY PLAN
-----------------------------------------------------------------------------------------------------
 Result  (cost=5.04..224.95 rows=1 width=244) (actual time=0.001..0.002 rows=0 loops=1)
   One-Time Filter: false
   ->  Bitmap Heap Scan on tenk1  (cost=5.04..224.95 rows=1 width=244) (never executed)
         Recheck Cond: (hundred = 42)
         ->  Bitmap Index Scan on tenk1_hundred  (cost=0.00..5.04 rows=100 width=0) (never executed)
               Index Cond: (hundred = 42)  Planning Time: 0.148 ms  Execution Time: 0.046 ms
(8 rows)

See the "never executed" markings?  The one-time filter prevents anything below the Result from actually doing
anything.

From memory, the reason it's like this is that it's the most straightforward extension of cases involving
pseudoconstantconditions, that is parameters not known at plan time. 
If you had something like "WHERE var = param1 AND var = const1"
then you might get something involving a "param1 = const1"
one-time filter, and below that a plan to retrieve rows with "var = const1", which would only be run when the
run-time-suppliedparameter happens to match the constant. 

David's statement that we don't want to write code for such cases isn't completely true, but it is true that we don't
wantto expend code on producing a pretty-looking plan for them. 

> BTW the same happens for column IS NULL AND column = Y.

That indeed is a case we haven't bothered with, and probably never will.  The
multiple-constants-equated-to-the-same-thing
situation is dealt with by the equivalence class machinery, which has many other purposes but happens to be able to
detectthat situation cheaply.  But IS NULL is not an equivalence condition (as we define those, anyway).  Likewise, we
won'treduce something like "var > C1 AND var < C1" to constant-false; there's just not enough win probability to
justifyspending planner cycles looking for such cases. 

            regards, tom lane