Re: like/ilike improvements

Поиск
Список
Период
Сортировка
От Andrew Dunstan
Тема Re: like/ilike improvements
Дата
Msg-id 46F16CE1.7040409@dunslane.net
обсуждение исходный текст
Ответ на Re: like/ilike improvements  ("Guillaume Smet" <guillaume.smet@gmail.com>)
Ответы Re: like/ilike improvements
Re: like/ilike improvements
Список pgsql-hackers

Guillaume Smet wrote:
> Andrew, All,
>
>   
>> On 5/22/07, Andrew Dunstan <andrew@dunslane.net> wrote:
>>     
>>> But before I commit this I'd appreciate seeing some more testing, both
>>> for correctness and performance.
>>>       
>
> I finally found some time to test this patch on our data. As our
> production database is still using 8.1, I made my tests with 8.1.10
> and 8.3devel. As I had very weird results, I tested also 8.2.5.
>
> The patch seems to work as expected in my locale. I didn't notice
> problems during the tests I made except for the performance problem I
> describe below.
>
> The box is a recent dual core box using CentOS 5. It's a test box
> installed specifically to test PostgreSQL 8.3. Every version is
> compiled with the same compiler. Locale is fr_FR.UTF-8 and database is
> UTF-8 too.
> The table used to make the tests fits entirely in RAM.
>
> I tested a simple ILIKE query on our data with 8.3devel and it was far
> slower than with 8.1.10 (2 times slower). It was obviously not the
> expected result as it should have been faster considering your work.
> So I decided to test also with 8.2.5 and it seems a performance
> regression was introduced in 8.2 (and not in 8.3 which is in fact a
> bit faster than 8.2).
>
> I saw this item in 8.2 release notes:
> Allow ILIKE to work for multi-byte encodings (Tom)
> Internally, ILIKE now calls lower() and then uses LIKE.
> Locale-specific regular expression patterns still do not work in these
> encodings.
>
> Could it be responsible of such a slow down?
>
> I attached the results of my tests. If anyone needs more information,
> I'll be glad to provide them.
>
>   
>

Ugh.

It's at least good to see that the LIKE case has some useful speedup in 
8.3.

Can you run the same set of tests in a single byte encoding like latin1?

We might have to look at doing on-demand lowering, but in a case like 
yours it looks like we'd still end up lowering almost every character 
anyway, so I'm not quite sure what to do. Note that the 8.2 change was a 
bug fix, so we can't just revert it. Maybe we need to look closely at 
the efficiency of lower().

cheers

andrew




В списке pgsql-hackers по дате отправления:

Предыдущее
От: Peter Eisentraut
Дата:
Сообщение: Re: Timezones change - never ending story
Следующее
От: "Pedro Belmino"
Дата:
Сообщение: Debugger