Windows UTF-8, non-ICU collation trouble

Поиск

Список

Период

Сортировка

От	Noah Misch
Тема	Windows UTF-8, non-ICU collation trouble
Дата	6 декабря 2019 г. 09:34:01
Msg-id	20191206063401.GB1629883@rfd.leadboat.com обсуждение исходный текст
Ответы	Re: Windows UTF-8, non-ICU collation trouble (Thomas Munro <thomas.munro@gmail.com>)
Список	pgsql-hackers

Дерево обсуждения

We use system UTF-16 collation to implement UTF-8 collation on Windows.  The
PostgreSQL security team received a report, from Timothy Kuun, that this
collation does not uphold the "symmetric law" and "transitive law" that we
require for btree operator classes.  The attached test program demonstrates
this.  http://www.delphigroups.info/2/62/478610.html quotes reports of that
problem going back eighteen years.  Most code points are unaffected.  Indexing
an affected code point using such a collation can cause btree index scans to not
find a row they should find and can make a UNIQUE or PRIMARY KEY constraint
admit a duplicate.  The security team determined that this doesn't qualify as a
security vulnerability, but it's still a bug.

All I can think to do is issue a warning whenever a CREATE DATABASE or CREATE
COLLATION combines UTF8 encoding with a locale having this problem.  In a
greenfield, I would forbid affected combinations of encoding and locale.  That
is too harsh, considering the few code points affected and the difficulty of
changing the collation of existing databases.  For CREATE DATABASE, all except
LOCALE=C would trigger the warning.  For CREATE COLLATION, ICU locales would
also not trigger the warning.  Hence, the chief workaround is to use LOCALE=C at
the database level and ICU collations for indexes and operator invocations.
(The ability to use an ICU collation at the database level would improve the
user experience here.)  Better ideas?

Вложения

locale-sort.c

В списке pgsql-hackers по дате отправления:

Предыдущее

От: Tatsuro Yamada
Дата: 06 декабря 2019 г., 09:23:58
Сообщение: Re: progress report for ANALYZE

Следующее

От: Rushabh Lathia
Дата: 06 декабря 2019 г., 09:35:19
Сообщение: Re: backup manifests

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Windows UTF-8, non-ICU collation trouble

Вложения

Предыдущее

Следующее