Re: Comparing strings with non-ASCII characters
| От | Tom Lane |
|---|---|
| Тема | Re: Comparing strings with non-ASCII characters |
| Дата | |
| Msg-id | 21525.1037199502@sss.pgh.pa.us обсуждение |
| Ответ на | Comparing strings with non-ASCII characters (Artur Rataj <arataj@iitis.gliwice.pl>) |
| Ответы |
Re: Comparing strings with non-ASCII characters
|
| Список | pgsql-general |
Artur Rataj <arataj@iitis.gliwice.pl> writes:
> Anyway, why the index optimizations are disabled for the ~ operator
> and not for the < or > operators in the locale?
The < and > operators are compatible with the index ordering, by
definition. The trouble with ~ is that it is not doing ordering,
but pattern matching. You can exploit an index to narrow down
the candidates for a left-anchored pattern only when the index
ordering is strict lexicographic ... and in all too many non-C
locales, it ain't. Digraphs, multipass sorting rules, and things
like that are killers.
We have tried and failed (repeatedly) to find a way to use non-C
indexes for LIKE and ~. If you'd like to try again, you're more
than welcome, but I suggest you read the PG list archives to learn
the reasons why previous attempts didn't work.
The long-run answer to this will probably be to allow individual columns
to be declared as being of particular locales; then you could assign C
locale to a column you need to do pattern matching on. This seems to
require building our own locale library :-( ... so it's not going to
happen quickly.
BTW, you might think about using full-text search (contrib/tsearch
or contrib/fulltextindex) instead of pattern matching.
regards, tom lane
В списке pgsql-general по дате отправления: