Tatsuo Ishii <ishii@postgresql.org> writes:
>> It's not a problem, it's just pilot error, or possibly inadequate
>> documentation. pg_trgm uses the locale's definition of "alpha",
>> "digit", etc. In C locale only basic ASCII letters and digits will be
>> recognized as word constituents.
> That means there is no chance to make pg_trgm work with multibyte + C
> locale? If so, I will leave pg_trgm as it is and provide private
> patches for those who need the functionality.
Exactly what do you consider to be the missing functionality?
You need a notion of word vs non-word character from somewhere,
and the locale setting is the standard place to get that. The
core text search functionality behaves the same way.
regards, tom lane
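
To illustrate the point above about word vs non-word characters, here is a
minimal sketch in Python. It is a simplified model, not pg_trgm's actual
code. The function names (is_word_char_c_locale, is_word_char_unicode,
split_words, trigrams) are hypothetical, and Python's str.isalnum() merely
stands in for a locale that classifies non-ASCII letters. The only detail
taken from pg_trgm is its documented padding of each word with two leading
spaces and one trailing space.

```python
def is_word_char_c_locale(ch: str) -> bool:
    """Model of the C locale: only ASCII letters and digits are word characters."""
    return ch.isascii() and ch.isalnum()


def is_word_char_unicode(ch: str) -> bool:
    """Stand-in (assumption) for a locale that also classifies non-ASCII letters."""
    return ch.isalnum()


def split_words(text, is_word_char):
    """Split text into lowercased runs of word characters, per the given predicate."""
    words, current = [], []
    for ch in text.lower():
        if is_word_char(ch):
            current.append(ch)
        elif current:
            words.append("".join(current))
            current = []
    if current:
        words.append("".join(current))
    return words


def trigrams(text, is_word_char):
    """Return the set of trigrams, padding each word with two spaces before, one after."""
    result = set()
    for word in split_words(text, is_word_char):
        padded = "  " + word + " "
        for i in range(len(padded) - 2):
            result.add(padded[i:i + 3])
    return result


if __name__ == "__main__":
    for sample in ("cat", "日本語"):
        print(repr(sample), "C-locale model:", sorted(trigrams(sample, is_word_char_c_locale)))
        print(repr(sample), "Unicode model: ", sorted(trigrams(sample, is_word_char_unicode)))
```

Under the C-locale model the multibyte sample contains no word characters
at all, so no trigrams are extracted from it. Swapping the predicate is
the only change needed to get trigrams from the same input, which is the
role the locale setting plays in the discussion above.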