Обсуждение: LIKE without wildcard different from =

Поиск
Список
Период
Сортировка

LIKE without wildcard different from =

От
"Kevin Grittner"
Дата:
We have been using the C locale for everything at our site, but
there is occasionally talk of supporting characters outside the
ASCII7 set. In playing around with indexing, to see what the impact
of that would be, I stumbled across something which was mildly
surprising.

In the C locale, if you want to search for an exact value which
doesn't contain wildcard characters, it doesn't matter whether you
use the 'LIKE' operator or the '=' operator.  With LATIN1 encoding,
it made three orders of magnitude difference, both in the estimated
cost and the actual run time.  I'm not entirely clear on whether it
would be *incorrect* for PostgreSQL to automatically turn the second
query below into the first, or just too expensive an optimization to
check for compared to how often it might help.

"SccaParty_SearchName" btree ("searchName" varchar_pattern_ops)

explain analyze select "searchName" from "SccaParty"
  where "searchName" like 'SMITH,JOHNBRACEYJR';

 Index Scan using "SccaParty_SearchName" on "SccaParty"
     (cost=0.00..2.94 rows=22 width=18)
     (actual time=0.046..0.051 rows=2 loops=1)
   Index Cond: (("searchName")::text ~=~ 'SMITH,JOHNBRACEYJR'::text)
   Filter: (("searchName")::text ~~ 'SMITH,JOHNBRACEYJR'::text)
 Total runtime: 0.083 ms

explain analyze select "searchName" from "SccaParty"
  where "searchName" = 'SMITH,JOHNBRACEYJR';

 Seq Scan on "SccaParty"
     (cost=0.00..3014.49 rows=22 width=18)
     (actual time=2.395..54.228 rows=2 loops=1)
   Filter: (("searchName")::text = 'SMITH,JOHNBRACEYJR'::text)
 Total runtime: 54.274 ms

I don't have a problem, and am not suggesting any action; just
trying to understand this.

-Kevin

Re: LIKE without wildcard different from =

От
Tom Lane
Дата:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> We have been using the C locale for everything at our site, but
> there is occasionally talk of supporting characters outside the
> ASCII7 set. In playing around with indexing, to see what the impact
> of that would be, I stumbled across something which was mildly
> surprising.

> In the C locale, if you want to search for an exact value which
> doesn't contain wildcard characters, it doesn't matter whether you
> use the 'LIKE' operator or the '=' operator.  With LATIN1 encoding,
> it made three orders of magnitude difference, both in the estimated
> cost and the actual run time.

What PG version are you testing?  8.4 and up should know that an
exact-match pattern can be optimized regardless of the lc_collate
setting.

            regards, tom lane

Re: LIKE without wildcard different from =

От
"Kevin Grittner"
Дата:
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
>> We have been using the C locale for everything at our site, but
>> there is occasionally talk of supporting characters outside the
>> ASCII7 set. In playing around with indexing, to see what the
>> impact of that would be, I stumbled across something which was
>> mildly surprising.
>
>> In the C locale, if you want to search for an exact value which
>> doesn't contain wildcard characters, it doesn't matter whether
>> you use the 'LIKE' operator or the '=' operator.  With LATIN1
>> encoding, it made three orders of magnitude difference, both in
>> the estimated cost and the actual run time.
>
> What PG version are you testing?  8.4 and up should know that an
> exact-match pattern can be optimized regardless of the lc_collate
> setting.

For reasons not worth getting into, I had an 8.3.8 database sitting
around in this locale, so I was testing things there.  I'll take the
time to copy into an 8.4.4 database for further testing, and maybe
9.0 beta, too.  That'll take hours, though, so I can't immediately
test it.

To be clear, though, the problem isn't that it didn't turn a LIKE
with no wildcard characters into an equality test, it's that it
would have been three orders of magnitude faster (because of an
available index with an opclass specification) if it had treated an
equality test as a LIKE.

-Kevin

Re: LIKE without wildcard different from =

От
Tom Lane
Дата:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> To be clear, though, the problem isn't that it didn't turn a LIKE
> with no wildcard characters into an equality test, it's that it
> would have been three orders of magnitude faster (because of an
> available index with an opclass specification) if it had treated an
> equality test as a LIKE.

Ah.  Well, the real fix for that is also in 8.4: we got rid of the
separate ~=~ operator, so a text_pattern_ops index is now usable
for plain =.

            regards, tom lane

Re: LIKE without wildcard different from =

От
"Kevin Grittner"
Дата:
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Ah.  Well, the real fix for that is also in 8.4: we got rid of the
> separate ~=~ operator, so a text_pattern_ops index is now usable
> for plain =.

Nice!

Thanks,

-Kevin