Re: Support regular expressions with nondeterministic collations
From | Tom Lane |
---|---|
Subject | Re: Support regular expressions with nondeterministic collations |
Date | |
Msg-id | 2808617.1734551724@sss.pgh.pa.us |
In reply to | Re: Support regular expressions with nondeterministic collations (Jeff Davis <pgsql@j-davis.com>) |
Responses | Re: Support regular expressions with nondeterministic collations |
List | pgsql-hackers |

Jeff Davis <pgsql@j-davis.com> writes:
> On Mon, 2024-12-16 at 17:16 -0500, Tom Lane wrote:
>> The existing logic in the regex engine for case-insensitive matching
>> is to convert every letter to a bracket expression containing all
>> its case variants. For example, "a" becomes "[aA]" and "[xY1]"
>> becomes "[xXyY1]". This fails on "ß", so a better way would be
>> nice...

> We have a couple options:

> * create more complex regexes like "(ß|[sS][sS])"
> * case fold the pattern first, and then lazily case fold the string as
>   we match against it

> The former sounds faster but the latter sounds simpler.

Yeah, the latter sounds really slow. It would not actually be too hard
I think to build the right regex, if we had the information available
as to what all the case-variants are. The problem at the moment is
that the existing code assumes that pg_wc_tolower and pg_wc_toupper
together give us all the case variants, and that API can't cope with
multi-glyph expansions.

regards, tom lane
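
To make the two behaviours discussed above concrete, here is a rough Python sketch, not PostgreSQL code. It assumes Python's `str.lower()` and `str.upper()` as stand-ins for `pg_wc_tolower` and `pg_wc_toupper`, and the helper names `single_char_variants`, `bracket_expand`, and `expansion_aware` are hypothetical. The first helper pair mimics the per-letter bracket rewrite and ignores any mapping that yields more than one character; the last one adds the multi-character upper-case expansion as an alternation, in the spirit of "(ß|[sS][sS])".

```python
import re


def single_char_variants(ch):
    """Case variants reachable through one-to-one lower/upper mappings only."""
    variants = {ch}
    for mapped in (ch.lower(), ch.upper()):
        if len(mapped) == 1:  # drop multi-character expansions such as "SS"
            variants.add(mapped)
    return variants


def bracket_expand(literal):
    """Rewrite each character of a literal string as a bracket expression."""
    out = []
    for ch in literal:
        variants = sorted(single_char_variants(ch))
        if len(variants) == 1:
            out.append(re.escape(ch))
        else:
            out.append("[" + "".join(re.escape(v) for v in variants) + "]")
    return "".join(out)


def expansion_aware(literal):
    """Like bracket_expand, but also accept a multi-character upper-case form."""
    out = []
    for ch in literal:
        single = bracket_expand(ch)
        upper = ch.upper()
        if len(upper) > 1:
            # Offer the expansion (e.g. "SS") as an alternative to the letter.
            out.append("(?:" + single + "|" + bracket_expand(upper) + ")")
        else:
            out.append(single)
    return "".join(out)


naive = bracket_expand("straße")
aware = expansion_aware("straße")
print(naive)
print(aware)
print(bool(re.fullmatch(naive, "STRASSE")))
print(bool(re.fullmatch(aware, "STRASSE")))
```

The sketch only covers expansions that start from the pattern side: a pattern containing "ss" would still not match "ß" in the subject string, and it relies on whatever case mappings Python's Unicode tables provide. A real fix would need a source that lists all case variants, which is the missing piece identified in the message.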