Re: multibyte charater set in levenshtein function

From	Robert Haas
Subject	Re: multibyte charater set in levenshtein function
Date	July 29, 2010, 20:16:28
Msg-id	AANLkTim+eW5RJNDhGb9Fcj2TwiVnich-k1Tvf=zzYj0=@mail.gmail.com
In reply to	Re: multibyte charater set in levenshtein function (Robert Haas <robertmhaas@gmail.com>)
List	pgsql-hackers

On Wed, Jul 21, 2010 at 5:59 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Jul 21, 2010 at 2:47 PM, Alexander Korotkov
> <aekorotkov@gmail.com> wrote:
>> On Wed, Jul 21, 2010 at 10:25 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>>
>>> *scratches head*  Aren't you just moving the same call to a different
>>> place?
>>
>> So, where you can find this different place? :) In this patch
>> null-terminated strings are not used at all.
>
> I can't.  You win.  :-)
>
> Actually, I wonder if there's enough performance improvement there
> that we might think about extracting that part of the patch and apply
> it separately.  Then we could continue trying to figure out what to do
> with the rest.  Sometimes it's simpler to deal with one change at a
> time.

I tested this today and the answer was a resounding yes.  I ran
sum(levenshtein(t, 'foo')) over a dictionary file with about 2 million
words and got a speedup of around 15% just by eliminating the
text_to_cstring() calls.  So I've committed that part of this patch.
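
For readers who want to see the idea in isolation, here is a minimal
sketch of a byte-wise Levenshtein distance that works on
(pointer, length) pairs, so its inputs never need to be copied into
NUL-terminated strings. It is not the committed fuzzystrmatch code:
the name levenshtein_buf and its signature are made up for
illustration, and it assumes unit costs for insert, delete and
substitute. In PostgreSQL the pointer and length would presumably
come from VARDATA_ANY() and VARSIZE_ANY_EXHDR() on the text datum.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not the fuzzystrmatch implementation): byte-wise
 * Levenshtein distance between two length-delimited buffers.  Neither
 * buffer needs a trailing NUL, so the caller does not have to copy it.
 * Returns -1 if memory allocation fails.
 */
int
levenshtein_buf(const char *s, size_t m, const char *t, size_t n)
{
    int    *buf;
    int    *prev;
    int    *curr;
    int    *tmp;
    size_t  i;
    size_t  j;
    int     result;

    /* Two rows of the dynamic-programming matrix, n + 1 cells each. */
    buf = malloc(2 * (n + 1) * sizeof(int));
    if (buf == NULL)
        return -1;
    prev = buf;
    curr = buf + (n + 1);

    /* Distance from the empty prefix of s to each prefix of t. */
    for (j = 0; j <= n; j++)
        prev[j] = (int) j;

    for (i = 1; i <= m; i++)
    {
        curr[0] = (int) i;
        for (j = 1; j <= n; j++)
        {
            int del = prev[j] + 1;
            int ins = curr[j - 1] + 1;
            int sub = prev[j - 1] + (s[i - 1] != t[j - 1]);
            int best = (del < ins) ? del : ins;

            curr[j] = (best < sub) ? best : sub;
        }
        /* The row just filled becomes the previous row. */
        tmp = prev;
        prev = curr;
        curr = tmp;
    }

    /* After the final swap, prev holds the last completed row. */
    result = prev[n];
    free(buf);
    return result;
}
```

Because the lengths are passed explicitly, a call such as
levenshtein_buf(p, 3, "foo", 3) reads only the first three bytes at p,
whatever follows them in memory.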

I'll try to look at the rest of the patch when I get a chance, but I'm
wondering if it might make sense to split it into two patches -
specifically, one patch to handle multi-byte characters correctly, and
then a second patch for the less-than-or-equal-to functions.  I think
that might simplify reviewing a bit.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
