Обсуждение: Index use difference betweer LIKE, LIKE ANY?

Поиск
Список
Период
Сортировка

Index use difference betweer LIKE, LIKE ANY?

От
"Sam Wong"
Дата:
I found that "LIKE", "= ANY (...)", "LIKE .. OR LIKE .." against a text
field used the index correctly, but not "LIKE ANY (...)". Would that be a
bug?

----
Here is my table and index:
CREATE TABLE shipment_lookup
(
  shipment_id text NOT NULL,
  lookup text NOT NULL
);
CREATE INDEX shipment_lookup_prefix
  ONshipment_lookup
  USING btree
  (upper(lookup));
----
The table have 10 million rows.

The following statements use the index as expected:
select * from shipment_lookup where (UPPER(lookup) = 'SD1102228482' or
UPPER(lookup) ='ABCDEFGHIJK')
select * from shipment_lookup where (UPPER(lookup) = ANY
(ARRAY['SD1102228482','ABCDEFGHIJK']))
select * from shipment_lookup where (UPPER(lookup) LIKE 'SD1102228482%' or
UPPER(lookup) LIKE 'ABCDEFGHIJK%')

The following statement results in a full table scan (but this is what I
really want to do):
select * from shipment_lookup where (UPPER(lookup) LIKE
ANY(ARRAY['SD1102228482%', 'ABCDEFGHIJK%']))

I could rewrite the LIKE ANY(ARRAY[...]) as an LIKE .. OR .. LIKE .., but I
wonder what makes the difference?

Thanks,
Sam

----
Version Info:
Postgresql: "PostgreSQL 8.4.5, compiled by Visual C++ build 1400, 32-bit" on
Windows 2003


Re: Index use difference betweer LIKE, LIKE ANY?

От
Josh Berkus
Дата:
On 2/25/11 5:31 AM, Sam Wong wrote:
> I found that "LIKE", "= ANY (...)", "LIKE .. OR LIKE .." against a text
> field used the index correctly, but not "LIKE ANY (...)". Would that be a
> bug?

No, it would be a TODO.  This is a known limitation; it needs some
clever code to make it work, and nobody's written it.

--
                                  -- Josh Berkus
                                     PostgreSQL Experts Inc.
                                     http://www.pgexperts.com

Re: Index use difference betweer LIKE, LIKE ANY?

От
Chetan Suttraway
Дата:


On Sun, Feb 27, 2011 at 2:43 AM, Josh Berkus <josh@agliodbs.com> wrote:
On 2/25/11 5:31 AM, Sam Wong wrote:
> I found that "LIKE", "= ANY (...)", "LIKE .. OR LIKE .." against a text
> field used the index correctly, but not "LIKE ANY (...)". Would that be a
> bug?

No, it would be a TODO.  This is a known limitation; it needs some
clever code to make it work, and nobody's written it.


came up with attached patch without thinking too much.
With this patch, the explain output for the same query is as below:

postgres=# explain select * from shipment_lookup where (UPPER(lookup) LIKE                  
ANY(ARRAY['SD1102228482%', 'ABCDEFGHIJK%']))
;e
                                           QUERY PLAN                                           
-------------------------------------------------------------------------------------------------
 Seq Scan on shipment_lookup  (cost=0.00..254057.36 rows=2000 width=14)
   Filter: ((upper(lookup) ~~ 'SD1102228482%'::text) OR (upper(lookup) ~~ 'ABCDEFGHIJK%'::text))
(2 rows)

postgres-#

The thing to be noted here is that  the where clause "<pred> LIKE ANY ARRAY[..]"
has been converted into
(<pred> LIKE first_array_element) or (<pred> LIKE second_array_element) or ....

Please pass on your inputs.

Regards,
Chetan

--
Chetan Sutrave
Senior Software Engineer
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
Phone: +91.20.30589523

Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb

This e-mail message (and any attachment) is intended for the use of the individual or entity to whom it is addressed. This message contains information from EnterpriseDB Corporation that may be privileged, confidential, or exempt from disclosure under applicable law. If you are not the intended recipient or authorized to receive this for the intended recipient, any use, dissemination, distribution, retention, archiving, or copying of this communication is strictly prohibited. If you have received this e-mail in error, please notify the sender immediately by reply e-mail and delete this message.

Вложения

Re: Index use difference betweer LIKE, LIKE ANY?

От
Robert Haas
Дата:
On Tue, Mar 15, 2011 at 8:30 AM, Chetan Suttraway
<chetan.suttraway@enterprisedb.com> wrote:
> On Sun, Feb 27, 2011 at 2:43 AM, Josh Berkus <josh@agliodbs.com> wrote:
>>
>> On 2/25/11 5:31 AM, Sam Wong wrote:
>> > I found that "LIKE", "= ANY (...)", "LIKE .. OR LIKE .." against a text
>> > field used the index correctly, but not "LIKE ANY (...)". Would that be
>> > a
>> > bug?
>>
>> No, it would be a TODO.  This is a known limitation; it needs some
>> clever code to make it work, and nobody's written it.
>>
>
> came up with attached patch without thinking too much.
> With this patch, the explain output for the same query is as below:
>
> postgres=# explain select * from shipment_lookup where (UPPER(lookup)
> LIKE
> ANY(ARRAY['SD1102228482%', 'ABCDEFGHIJK%']))
> ;e
>                                            QUERY
> PLAN
> -------------------------------------------------------------------------------------------------
>  Seq Scan on shipment_lookup  (cost=0.00..254057.36 rows=2000 width=14)
>    Filter: ((upper(lookup) ~~ 'SD1102228482%'::text) OR (upper(lookup) ~~
> 'ABCDEFGHIJK%'::text))
> (2 rows)
>
> postgres-#
>
> The thing to be noted here is that  the where clause "<pred> LIKE ANY
> ARRAY[..]"
> has been converted into
> (<pred> LIKE first_array_element) or (<pred> LIKE second_array_element) or
> ....
>
> Please pass on your inputs.

Please add your patch here:

https://commitfest.postgresql.org/action/commitfest_view/open

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Index use difference betweer LIKE, LIKE ANY?

От
Heikki Linnakangas
Дата:
On 15.03.2011 14:30, Chetan Suttraway wrote:
> On Sun, Feb 27, 2011 at 2:43 AM, Josh Berkus<josh@agliodbs.com>  wrote:
>
>> On 2/25/11 5:31 AM, Sam Wong wrote:
>>> I found that "LIKE", "= ANY (...)", "LIKE .. OR LIKE .." against a text
>>> field used the index correctly, but not "LIKE ANY (...)". Would that be a
>>> bug?
>>
>> No, it would be a TODO.  This is a known limitation; it needs some
>> clever code to make it work, and nobody's written it.
>>
>>
> came up with attached patch without thinking too much.
> With this patch, the explain output for the same query is as below:
>
> postgres=# explain select * from shipment_lookup where (UPPER(lookup)
> LIKE
> ANY(ARRAY['SD1102228482%', 'ABCDEFGHIJK%']))
> ;e
>                                             QUERY
> PLAN
> -------------------------------------------------------------------------------------------------
>   Seq Scan on shipment_lookup  (cost=0.00..254057.36 rows=2000 width=14)
>    * Filter: ((upper(lookup) ~~ 'SD1102228482%'::text) OR (upper(lookup) ~~
> 'ABCDEFGHIJK%'::text))*
> (2 rows)
>
> postgres-#
>
> The thing to be noted here is that  the where clause "<pred>  LIKE ANY
> ARRAY[..]"
> has been converted into
> (<pred>  LIKE first_array_element) or (<pred>  LIKE second_array_element) or
> ....
>
> Please pass on your inputs.

This suffers from the same multiple-evaluation issue that was recently
discovered in BETWEEN and IN expressions
(http://archives.postgresql.org/message-id/4D95B605.2020709@enterprisedb.com).
This transformation would also need to be done in the planner, after
checking that the left-hand expression is not volatile.

Also, even when safe, it's not clear that the transformation is always a
win. The left-hand expression could be expensive, in which case having
to evaluate it multiple times could hurt performance. Maybe yo

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: Index use difference betweer LIKE, LIKE ANY?

От
Heikki Linnakangas
Дата:
On 06.06.2011 12:43, Heikki Linnakangas wrote:
> Also, even when safe, it's not clear that the transformation is always a
> win. The left-hand expression could be expensive, in which case having
> to evaluate it multiple times could hurt performance. Maybe yo

Sorry, hit "send" too early.

Maybe you could put in some heuristic to only do the transformation when
the left-hand expression is cheap, or maybe use something like the
CaseTestExpr to avoid multiple evaluation and still use the OR form.
Also, if the array is very large, opening it into the OR form could
increase plan time substantially, so we'd probably only want to do it if
there's any Vars involved, and thus any chance of matching an index.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com