Re: trgm regex index peculiarity

From: Tom Lane
Subject: Re: trgm regex index peculiarity
Date:
Msg-id: 29409.1396745529@sss.pgh.pa.us
In reply to: Re: trgm regex index peculiarity  (Alexander Korotkov <aekorotkov@gmail.com>)
List: pgsql-hackers
Alexander Korotkov <aekorotkov@gmail.com> writes:
> Next revision of patch is attached. Changes are so:
> 1) Notion "penalty" is used instead of "size".
> 2) We try to reduce total penalty to WISH_TRGM_PENALTY, but restriction is
> MAX_TRGM_COUNT total trigrams count.
> 3) Penalties are assigned to particular color trigram classes. I.e.
> separate penalties for __a, _aa, _a_, aa_. It's based on analysis of
> trigram frequencies in Oscar Wilde writings. We can end up with different
> numbers, but I don't think they will be dramatically different.

Committed with cosmetic improvements (adjusting the comments mostly).

The new whitespace penalties look reasonably sane to me.  I wonder though
if WISH_TRGM_PENALTY is too small --- it seems like this code will tend to
select many fewer trigrams than the old code did.  What testing did you do
that led you to select the specific value of 16?
        regards, tom lane
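
For readers following the thread, here is a rough sketch of the trade-off under discussion: a soft target on total penalty (WISH_TRGM_PENALTY) combined with a hard cap on the trigram count (MAX_TRGM_COUNT). This is not the committed pg_trgm code. Only the value 16 comes from the message above. The MAX_TRGM_COUNT value, the per-class penalties, the `removable` flag, and the greedy most-expensive-first order are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

WISH_TRGM_PENALTY = 16.0  # value mentioned in the message
MAX_TRGM_COUNT = 256      # assumed value, for illustration only


@dataclass
class ColorTrigram:
    name: str               # whitespace class label such as "__a" or "aa_"
    count: int              # simple trigrams this color trigram expands to
    penalty: float          # hypothetical per-trigram penalty for its class
    removable: bool = True  # hypothetical: False if dropping it would break the match


def cost(t: ColorTrigram) -> float:
    """Total penalty contributed by one color trigram."""
    return t.count * t.penalty


def select_trigrams(candidates: List[ColorTrigram]) -> Optional[List[ColorTrigram]]:
    """Drop the most expensive removable trigrams until both limits are met.

    WISH_TRGM_PENALTY is a soft goal: if it cannot be reached, whatever
    remains is still returned. MAX_TRGM_COUNT is a hard limit: if it is
    still exceeded at the end, None is returned.
    """
    keep = [True] * len(candidates)
    total_count = sum(t.count for t in candidates)
    total_penalty = sum(cost(t) for t in candidates)

    # Visit candidates from the highest total cost to the lowest.
    order = sorted(range(len(candidates)),
                   key=lambda i: cost(candidates[i]), reverse=True)
    for i in order:
        if total_count <= MAX_TRGM_COUNT and total_penalty <= WISH_TRGM_PENALTY:
            break
        t = candidates[i]
        if not t.removable:
            continue
        keep[i] = False
        total_count -= t.count
        total_penalty -= cost(t)

    if total_count > MAX_TRGM_COUNT:
        return None
    return [t for t, k in zip(candidates, keep) if k]


example = [
    ColorTrigram("aaa", count=1, penalty=1.0),
    ColorTrigram("__a", count=4, penalty=5.0),
    ColorTrigram("aa_", count=2, penalty=3.0),
]
selected = select_trigrams(example)
```

With these made-up numbers the initial total penalty is 27, so the sketch drops the "__a" entry (cost 20) and keeps the other two. A smaller WISH_TRGM_PENALTY makes such drops happen more often, which is the effect the question about the value 16 is getting at.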


