Обсуждение: BUG #4974: Equivalent of "= ANY" and "BETWEEN" not observed by planner.

Поиск
Список
Период
Сортировка

BUG #4974: Equivalent of "= ANY" and "BETWEEN" not observed by planner.

От
"Ian Turner"
Дата:
The following bug has been logged online:

Bug reference:      4974
Logged by:          Ian Turner
Email address:      ian.turner@deshaw.com
PostgreSQL version: 8.3
Operating system:   Ubuntu 8.10
Description:        Equivalent of "= ANY" and "BETWEEN" not observed by
planner.
Details:

Consider the following table with a few thousand rows:
CREATE TABLE example (pk INTEGER PRIMARY KEY);

The following queries are equivalent, because there are no integers between
5 and 6 and because the BETWEEN operator contemplates a closed range.
SELECT * FROM example WHERE pk IN (5,6);
SELECT * FROM example WHERE pk BETWEEN 5 AND 6;

Yet the two queries generate very different plans:
sysdb=# explain select * from example where pk between 5 and 6;
                                   QUERY PLAN
----------------------------------------------------------------------------
-----
 Index Scan using example_pkey on example  (cost=0.00..8.27 rows=1 width=71)
  Index Cond: ((uid >= 5) AND (uid <= 6))
(2 rows)

ysdb=# explain select * from example where pk IN (5, 6);
                                 QUERY PLAN
----------------------------------------------------------------------------
-
 Bitmap Heap Scan on example  (cost=8.52..14.88 rows=2 width=71)
   Recheck Cond: (pk = ANY ('{5,6}'::integer[]))
   ->  Bitmap Index Scan on example_pkey  (cost=0.00..8.52 rows=2 width=0)
         Index Cond: (pk = ANY ('{5,6}'::integer[]))
(4 rows)

The bug is that the planner should be able to consider the use of a vanilla
index scan for = ANY operators when the values are consecutive for the value
type in question. Probably the easiest way is to detect this case and
rewrite it as using <= / >= operators.

More generally, it might be desirable to use the index scan even when values
are not consecutive (but are very close). This last idea is a lot more
complex, however, as it depends on the distribution of values in the table.

Re: BUG #4974: Equivalent of "= ANY" and "BETWEEN" not observed by planner.

От
Tom Lane
Дата:
"Ian Turner" <ian.turner@deshaw.com> writes:
> The following queries are equivalent, because there are no integers between
> 5 and 6 and because the BETWEEN operator contemplates a closed range.
> SELECT * FROM example WHERE pk IN (5,6);
> SELECT * FROM example WHERE pk BETWEEN 5 AND 6;

The planner intentionally does not do very many inferences that are as
datatype-dependent as this one would be.  It doesn't fit into the system
design.  For the most part the possible gain is not large anyway.

            regards, tom lane

Re: BUG #4974: Equivalent of "= ANY" and "BETWEEN" not observed by planner.

От
Greg Stark
Дата:
On Wed, Aug 12, 2009 at 3:15 AM, Tom Lane<tgl@sss.pgh.pa.us> wrote:
> "Ian Turner" <ian.turner@deshaw.com> writes:
>> The following queries are equivalent, because there are no integers betw=
een
>> 5 and 6 and because the BETWEEN operator contemplates a closed range.
>> SELECT * FROM example WHERE pk IN (5,6);
>> SELECT * FROM example WHERE pk BETWEEN 5 AND 6;
>
> The planner intentionally does not do very many inferences that are as
> datatype-dependent as this one would be. =A0It doesn't fit into the system
> design. =A0For the most part the possible gain is not large anyway.


Hm, we could do it in a data-type independent way which would work
even for non-integral values by imposing a recheck condition. That
would work for any data type with a btree ordering.

I think you're right that the potential gain isn't very big. A series
of equality values versus a range scan which uses the same index pages
is going to be about the same i/o. The only saving is the repeated
descent of the tree to find the leaf pages. Which should hopefully all
be in cache anyways.

I suppose it might help if you were then doing a merge join or order
by on the same key since the range scan will be in order but the
bitmap index scan would have to be sorted. But if there were enough
tuples for the sort to actually matter surely we would want the bitmap
index scan anyways.


--=20
greg
http://mit.edu/~gsstark/resume.pdf