Re: Unicode FFFF Special Codepoint should always collate high.

From: Thomas Munro
Subject: Re: Unicode FFFF Special Codepoint should always collate high.
Msg-id: CA+hUKGJF0C0_rch1emO7id0QxDNa2s0pPtQCVjuTPM3mqLs+hg@mail.gmail.com
In reply to: Unicode FFFF Special Codepoint should always collate high.  (Telford Tendys <psql@lnx-bsp.net>)
Replies: Re: Unicode FFFF Special Codepoint should always collate high.
List: pgsql-bugs
On Wed, Jun 23, 2021 at 3:00 PM Telford Tendys <psql@lnx-bsp.net> wrote:
> Thank you for taking a look at it, you seem to have confirmed that
> this is coming from the system itself. Yes, my purpose is to do
> prefix searching on strings by specifying a range and taking advantage
> of a B-Tree index, exactly as described in the quote above.

Just in case you didn't know, PostgreSQL knows how to convert a prefix
search for 'aar%' into a range search with a related trick in some
limited circumstances:

tmunro=> create table t (name text);
CREATE TABLE
tmunro=> insert into t values ('aardvark'), ('buffalo'), ('cat');
INSERT 0 3
tmunro=> create index on t(name text_pattern_ops);
CREATE INDEX
tmunro=> analyze t;
ANALYZE
tmunro=> explain select * from t where name like 'aar%';
                               QUERY PLAN
-------------------------------------------------------------------------
 Index Only Scan using t_name_idx on t  (cost=0.13..8.15 rows=1 width=7)
   Index Cond: ((name ~>=~ 'aar'::text) AND (name ~<~ 'aas'::text))
   Filter: (name ~~ 'aar%'::text)
(3 rows)
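
To spell out what the Index Cond above is doing: the upper bound 'aas'
is just the prefix 'aar' with its last character bumped by one.  Here
is a rough sketch of that idea in Python.  It is not PostgreSQL's
actual code, the function name prefix_to_range is made up for
illustration, and it assumes plain code-point comparison (which is
roughly what text_pattern_ops gives you), not a linguistic collation:

```python
from typing import Tuple


def prefix_to_range(prefix: str) -> Tuple[str, str]:
    """Return (lower, upper) such that lower <= s < upper holds exactly
    for the strings s that start with prefix, assuming strings are
    compared code point by code point (hypothetical helper)."""
    if not prefix:
        raise ValueError("an empty prefix matches every string")
    last = ord(prefix[-1])
    if last == 0x10FFFF:
        # No larger code point exists; a real implementation would need
        # a fallback here.  This sketch simply gives up.
        raise ValueError("cannot increment the last character")
    # Bump the final character by one code point: 'aar' -> 'aas'.
    return prefix, prefix[:-1] + chr(last + 1)


lower, upper = prefix_to_range("aar")
names = ["aardvark", "buffalo", "cat"]
# Python compares str values by code point, so this mimics the range scan.
matches = [n for n in names if lower <= n < upper]
```

With the three rows from the table above, only 'aardvark' falls inside
the half-open range.  Under a linguistic collation the "bump the last
character" step is not safe in general, which is presumably part of why
the trick only applies in limited circumstances.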

>     https://bugzilla.redhat.com/show_bug.cgi?id=1975045

Seems a little light on references and justifications for the
expectation.  ICU's results could be useful in the discussion.  It's
interesting that more recent Unicode versions removed the prohibition
on using these code points at all.  Apparently some earlier version
said that a string containing them wasn't Unicode, according to my
quick read of https://en.wikipedia.org/wiki/Specials_(Unicode_block),
but I didn't have time to look into any source documents.


