Обсуждение: BUG #5255: COUNT(*) returns wrong result with LEFT JOIN

Поиск
Список
Период
Сортировка

BUG #5255: COUNT(*) returns wrong result with LEFT JOIN

От
"Matteo Beccati"
Дата:
The following bug has been logged online:

Bug reference:      5255
Logged by:          Matteo Beccati
Email address:      php@beccati.com
PostgreSQL version: 8.5alpha3
Operating system:   NetBSD 5.0.1
Description:        COUNT(*)  returns wrong result with LEFT JOIN
Details:

Discovered this while fixing the php test suite to deal with 8.5 changes.

With the following data set a SELECT * query returns 1 rows, while SELECT
COUNT(*) returns 2.

CREATE TABLE a (id int PRIMARY KEY);
CREATE TABLE b (id int PRIMARY KEY, a_id int);
INSERT INTO a VALUES (0), (1);
INSERT INTO b VALUES (0, 0), (1, NULL);

test=# SELECT * FROM b LEFT JOIN a ON (b.a_id = a.id) WHERE (a.id IS NULL OR
a.id > 0);
 id | a_id | id
----+------+----
  1 |      |
(1 row)

test=# SELECT COUNT(*) FROM b LEFT JOIN a ON (b.a_id = a.id) WHERE (a.id IS
NULL OR a.id > 0);
 count
-------
     2
(1 row)

Re: BUG #5255: COUNT(*) returns wrong result with LEFT JOIN

От
Tom Lane
Дата:
"Matteo Beccati" <php@beccati.com> writes:
> With the following data set a SELECT * query returns 1 rows, while SELECT
> COUNT(*) returns 2.

Hm, looks like the join-elimination patch is firing mistakenly.  It's not
so much the count(*) that does it as the lack of any select-list
references to a:

regression=# explain SELECT * FROM b LEFT JOIN a ON (b.a_id = a.id) WHERE (a.id IS NULL OR a.id > 0);
                           QUERY PLAN
-----------------------------------------------------------------
 Hash Left Join  (cost=64.00..132.85 rows=720 width=12)
   Hash Cond: (b.a_id = a.id)
   Filter: ((a.id IS NULL) OR (a.id > 0))
   ->  Seq Scan on b  (cost=0.00..31.40 rows=2140 width=8)
   ->  Hash  (cost=34.00..34.00 rows=2400 width=4)
         ->  Seq Scan on a  (cost=0.00..34.00 rows=2400 width=4)
(6 rows)

regression=# explain SELECT b.* FROM b LEFT JOIN a ON (b.a_id = a.id) WHERE (a.id IS NULL OR a.id > 0);
                     QUERY PLAN
-----------------------------------------------------
 Seq Scan on b  (cost=0.00..31.40 rows=2140 width=8)
(1 row)

I guess we missed something about when it's safe to do this optimization.

            regards, tom lane

Re: BUG #5255: COUNT(*) returns wrong result with LEFT JOIN

От
Tom Lane
Дата:
I wrote:
> I guess we missed something about when it's safe to do this optimization.

I've applied the attached patch to fix this.

            regards, tom lane

Index: joinpath.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/optimizer/path/joinpath.c,v
retrieving revision 1.126
diff -c -r1.126 joinpath.c
*** joinpath.c    19 Sep 2009 17:48:09 -0000    1.126
--- joinpath.c    25 Dec 2009 17:02:40 -0000
***************
*** 228,233 ****
--- 228,238 ----
       * We can't remove the join if any inner-rel attributes are used above
       * the join.
       *
+      * Note that this test only detects use of inner-rel attributes in
+      * higher join conditions and the target list.  There might be such
+      * attributes in pushed-down conditions at this join, too.  We check
+      * that case below.
+      *
       * As a micro-optimization, it seems better to start with max_attr and
       * count down rather than starting with min_attr and counting up, on the
       * theory that the system attributes are somewhat less likely to be wanted
***************
*** 253,265 ****
          RestrictInfo *restrictinfo = (RestrictInfo *) lfirst(l);

          /*
!          * We are always considering an outer join here, so ignore pushed-down
!          * clauses.  Also ignore anything that doesn't have a mergejoinable
!          * operator.
           */
          if (restrictinfo->is_pushed_down)
!             continue;

          if (!restrictinfo->can_join ||
              restrictinfo->mergeopfamilies == NIL)
              continue;            /* not mergejoinable */
--- 258,273 ----
          RestrictInfo *restrictinfo = (RestrictInfo *) lfirst(l);

          /*
!          * If we find a pushed-down clause, it must have come from above the
!          * outer join and it must contain references to the inner rel.  (If
!          * it had only outer-rel variables, it'd have been pushed down into
!          * the outer rel.)  Therefore, we can conclude that join removal
!          * is unsafe without any examination of the clause contents.
           */
          if (restrictinfo->is_pushed_down)
!             return false;

+         /* Ignore if it's not a mergejoinable clause */
          if (!restrictinfo->can_join ||
              restrictinfo->mergeopfamilies == NIL)
              continue;            /* not mergejoinable */

Re: BUG #5255: COUNT(*) returns wrong result with LEFT JOIN

От
Matteo Beccati
Дата:
Il 25/12/2009 18:13, Tom Lane ha scritto:
> I wrote:
>> I guess we missed something about when it's safe to do this optimization.
>
> I've applied the attached patch to fix this.

Thanks. Everything's working fine now!


Merry Xmas
--
Matteo Beccati

Development & Consulting - http://www.beccati.com/