Dennis Gearon wrote:
> John Sidney-Woollett wrote:
>
>> For what it's worth, we have a unicode 7.4.1 database which gives us
>> the sorting and searching behaviour that we expect (with the
>> exception of the upper and lower functions). We access the data via
>> jdbc so we don't have to deal with encoding issues per se as the
>> driver does any translation for us.
>>
>> Currently we don't use any LIKE statements, but if we did, and wanted
>> them optimized then we'd use the appropriate OP Class when defining
>> the index. We also don't use any REGEX expressions. And we'll shortly
>> be experimenting with tsearch2...
>>
>>         List of databases
>>      Name      |  Owner   | Encoding
>> ---------------+----------+----------
>>  test          | postgres | UNICODE
>>
>> Setting the psql client encoding to Latin1 and inserting the
>> following data...
>>
>> # select * from johntest;
>>  id | value
>> ----+-------
>>   1 | test
>>   2 | tést
>>   3 | tèst
>>   4 | taste
>>   5 | TEST
>>   6 | TÉST
>>   7 | TÈST
>>   8 | TASTE
>> (8 rows)
>>
>> [snip]
>>
>> using a LIKE operation also works as expected (again no index on
>> value field)
>>
>> # select * from johntest where value like 't%';
>>  id | value
>> ----+-------
>>   1 | test
>>   2 | tést
>>   3 | tèst
>>   4 | taste
>> (4 rows)
>>
> LIKE works, but it can't use an index, and so would have horrible
> performance vs. the situation where it CAN use an index. I believe
> this is how Postgres is working now.
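If you define the index with one of the pattern OPCLASSes (for example
text_pattern_ops), then LIKE operations anchored at the start of the
string should be able to use the index, I believe.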
See http://www.postgresql.org/docs/7.4/static/indexes-opclass.html
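Something along these lines is what I have in mind. This is an untested
sketch: it assumes the value column is of type text (for varchar you
would use varchar_pattern_ops instead), and the index name is just one
I made up for illustration.

```sql
-- Hypothetical index name; text_pattern_ops is one of the pattern
-- operator classes described on the 7.4 docs page above.
CREATE INDEX johntest_value_like_idx
    ON johntest (value text_pattern_ops);

-- Check whether the planner picks the index for a left-anchored pattern.
EXPLAIN SELECT * FROM johntest WHERE value LIKE 't%';
```

On a table as small as johntest the planner may well still choose a
sequential scan, so the EXPLAIN output is only meaningful with a
realistic number of rows. A pattern with a leading wildcard such as
'%t' would not be able to use this index in any case.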
John Sidney-Woollett