Re: like/ilike improvements
От | Andrew Dunstan |
---|---|
Тема | Re: like/ilike improvements |
Дата | |
Msg-id | 46F16CE1.7040409@dunslane.net обсуждение исходный текст |
Ответ на | Re: like/ilike improvements ("Guillaume Smet" <guillaume.smet@gmail.com>) |
Ответы |
Re: like/ilike improvements
("Guillaume Smet" <guillaume.smet@gmail.com>)
Re: like/ilike improvements ("Guillaume Smet" <guillaume.smet@gmail.com>) |
Список | pgsql-hackers |
Guillaume Smet wrote: > Andrew, All, > > >> On 5/22/07, Andrew Dunstan <andrew@dunslane.net> wrote: >> >>> But before I commit this I'd appreciate seeing some more testing, both >>> for correctness and performance. >>> > > I finally found some time to test this patch on our data. As our > production database is still using 8.1, I made my tests with 8.1.10 > and 8.3devel. As I had very weird results, I tested also 8.2.5. > > The patch seems to work as expected in my locale. I didn't notice > problems during the tests I made except for the performance problem I > describe below. > > The box is a recent dual core box using CentOS 5. It's a test box > installed specifically to test PostgreSQL 8.3. Every version is > compiled with the same compiler. Locale is fr_FR.UTF-8 and database is > UTF-8 too. > The table used to make the tests fits entirely in RAM. > > I tested a simple ILIKE query on our data with 8.3devel and it was far > slower than with 8.1.10 (2 times slower). It was obviously not the > expected result as it should have been faster considering your work. > So I decided to test also with 8.2.5 and it seems a performance > regression was introduced in 8.2 (and not in 8.3 which is in fact a > bit faster than 8.2). > > I saw this item in 8.2 release notes: > Allow ILIKE to work for multi-byte encodings (Tom) > Internally, ILIKE now calls lower() and then uses LIKE. > Locale-specific regular expression patterns still do not work in these > encodings. > > Could it be responsible of such a slow down? > > I attached the results of my tests. If anyone needs more information, > I'll be glad to provide them. > > > Ugh. It's at least good to see that the LIKE case has some useful speedup in 8.3. Can you run the same set of tests in a single byte encoding like latin1? We might have to look at doing on-demand lowering, but in a case like yours it looks like we'd still end up lowering almost every character anyway, so I'm not quite sure what to do. Note that the 8.2 change was a bug fix, so we can't just revert it. Maybe we need to look closely at the efficiency of lower(). cheers andrew
В списке pgsql-hackers по дате отправления: