Re: BUG #15014: pg_trgm regexp with wchar not good?

From Tom Lane
Subject Re: BUG #15014: pg_trgm regexp with wchar not good?
Date
Msg-id 18067.1516288553@sss.pgh.pa.us
In reply to BUG #15014: pg_trgm regexp with wchar not good?  (PG Bug reporting form <noreply@postgresql.org>)
List pgsql-bugs
PG Bug reporting form <noreply@postgresql.org> writes:
> when i use pg_trgm's gin index, with wchar search, it's not good for regexp,
> but good for like express. 

pg_trgm is going to ignore characters that it doesn't think are letters or
digits.  Don't know if the characters you are working with are considered
letters in en_US locale, but if they aren't, that would likely result in
no usable trigrams in this string.  Another issue is that "trigrams" are
three *bytes* not three characters, so the useful information per trigram
is a lot lower when working with many-byte characters; that could also
lead to an index search being much less selective than you'd hope.

You might learn something by looking at the result of show_trgm() for
these strings, but I'm thinking there's no bug here, just design
limitations of the trigram approach.

            regards, tom lane
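
The two limitations described in the message can be illustrated with a small
Python sketch. This is only a rough approximation written for illustration,
not pg_trgm's actual code: the function name approx_trigrams, the
is_word_char parameter, and the ascii_only classifier are all hypothetical,
and Python's str.isalnum() is not the same test as the database's
locale-dependent character classification. The sketch assumes the commonly
documented behaviour that each word is lower-cased, padded with two leading
spaces and one trailing space, and cut into three-character windows.

```python
from typing import Callable, Set


def approx_trigrams(text: str,
                    is_word_char: Callable[[str], bool] = str.isalnum) -> Set[str]:
    """Approximate trigram extraction (illustrative only, not pg_trgm itself).

    Characters rejected by is_word_char are treated as word separators,
    which mimics how characters not considered letters or digits are ignored.
    """
    # Replace every non-word character with a space and lower-case the rest.
    cleaned = "".join(ch.lower() if is_word_char(ch) else " " for ch in text)
    trigrams: Set[str] = set()
    for word in cleaned.split():
        # Assumed padding: two spaces before the word, one after.
        padded = "  " + word + " "
        for i in range(len(padded) - 2):
            trigrams.add(padded[i:i + 3])
    return trigrams


def ascii_only(ch: str) -> bool:
    """Hypothetical stand-in for a locale that recognises only ASCII letters/digits."""
    return ch.isascii() and ch.isalnum()


if __name__ == "__main__":
    # Plain ASCII word: four trigrams.
    print(sorted(approx_trigrams("cat")))  # ['  c', ' ca', 'at ', 'cat']

    # If the classifier rejects the characters, nothing is left to index.
    print(approx_trigrams("日本語", ascii_only))  # set()

    # Three ASCII characters fit in three bytes; three CJK characters do not.
    print(len("cat".encode("utf-8")), len("日本語".encode("utf-8")))  # 3 9
```

When the classifier rejects every character, the set of trigrams is empty,
which corresponds to the "no usable trigrams" case above. The byte counts at
the end show why a fixed three-byte trigram carries less information for
multi-byte text. For real data, show_trgm() on the server remains the
authoritative check.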

