pgsql: Add SQL functions for Unicode normalization

From: Peter Eisentraut
Subject: pgsql: Add SQL functions for Unicode normalization
Date:
Msg-id: E1jJtwj-0000LS-M9@gemulon.postgresql.org
List: pgsql-committers
Add SQL functions for Unicode normalization

This adds SQL expressions NORMALIZE() and IS NORMALIZED to convert and
check Unicode normal forms, per SQL standard.
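
To make the two operations concrete, here is a minimal sketch of what "convert" and "check" mean for Unicode normal forms. It uses Python's standard unicodedata module purely as an illustration and is not PostgreSQL's implementation; the helper name is_normalized is an assumption made for this example.

```python
import unicodedata

def is_normalized(s, form="NFC"):
    """Slow-path check: a string is in a normal form if normalizing it is a no-op.

    Hypothetical helper for illustration only; not PostgreSQL code.
    """
    return unicodedata.normalize(form, s) == s

decomposed = "a\u0308"  # 'a' followed by U+0308 COMBINING DIAERESIS
composed = "\u00e4"     # precomposed U+00E4 LATIN SMALL LETTER A WITH DIAERESIS

# Converting: NFC composes the pair, NFD decomposes the single code point.
nfc = unicodedata.normalize("NFC", decomposed)
nfd = unicodedata.normalize("NFD", composed)

# Checking: the decomposed string is in NFD but not in NFC.
decomposed_is_nfc = is_normalized(decomposed, "NFC")
decomposed_is_nfd = is_normalized(decomposed, "NFD")
```

The check shown here always performs a full normalization, which is the cost the lookup table described in the next paragraph is meant to avoid.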

To support fast IS NORMALIZED tests, we pull in a new data file
DerivedNormalizationProps.txt from Unicode and build a lookup table
from that, using techniques similar to ones already used for other
Unicode data.  make update-unicode will keep it up to date.  We only
build and use these tables for the NFC and NFKC forms, because they
are too big for NFD and NFKD and the improvement is not significant
enough there.
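
As a rough sketch of how such a lookup table can speed up the check, the following follows the quick-check idea from Unicode Standard Annex #15: each code point is classified as yes, no, or maybe, and full normalization is needed only when a "maybe" is seen. This is an assumption about the general technique, not a transcription of the committed C code. The two-entry table NFC_QC_SAMPLE and the function names are hypothetical; the real table is generated from DerivedNormalizationProps.txt and covers every relevant code point.

```python
import unicodedata

# Hypothetical miniature table for illustration. Code points not listed are
# treated as "YES", which is only valid for a complete, generated table.
NFC_QC_SAMPLE = {
    0x0308: "MAYBE",  # COMBINING DIAERESIS may compose with a preceding base
    0x212B: "NO",     # ANGSTROM SIGN never appears in NFC text
}

def quick_check_nfc(s):
    """Return "YES", "NO", or "MAYBE" without normalizing the string."""
    last_ccc = 0
    result = "YES"
    for ch in s:
        ccc = unicodedata.combining(ch)  # canonical combining class
        # Combining marks out of canonical order can never be normalized.
        if ccc != 0 and last_ccc > ccc:
            return "NO"
        qc = NFC_QC_SAMPLE.get(ord(ch), "YES")
        if qc == "NO":
            return "NO"
        if qc == "MAYBE":
            result = "MAYBE"
        last_ccc = ccc
    return result

def is_nfc(s):
    """Use the quick check first; fall back to full normalization on MAYBE."""
    qc = quick_check_nfc(s)
    if qc == "MAYBE":
        return unicodedata.normalize("NFC", s) == s
    return qc == "YES"
```

With a complete table, most ordinary text returns "YES" in a single pass, and only strings containing "maybe" code points pay for a full normalization.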

Reviewed-by: Daniel Verite <daniel@manitou-mail.org>
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Discussion: https://www.postgresql.org/message-id/flat/c1909f27-c269-2ed9-12f8-3ab72c8caf7a@2ndquadrant.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/2991ac5fc9b3904ca4582be6d323497d7c3d17c9

Modified Files
--------------
doc/src/sgml/charset.sgml                          |   10 +
doc/src/sgml/func.sgml                             |   48 +
src/backend/catalog/sql_features.txt               |    2 +-
src/backend/catalog/system_views.sql               |   15 +
src/backend/parser/gram.y                          |   41 +-
src/backend/utils/adt/varlena.c                    |  150 +
src/common/unicode/.gitignore                      |    1 +
src/common/unicode/Makefile                        |    9 +-
.../unicode/generate-unicode_normprops_table.pl    |   86 +
src/common/unicode_norm.c                          |  110 +
src/include/catalog/catversion.h                   |    2 +-
src/include/catalog/pg_proc.dat                    |    8 +
src/include/common/unicode_norm.h                  |   10 +
src/include/common/unicode_normprops_table.h       | 6154 ++++++++++++++++++++
src/include/parser/kwlist.h                        |    6 +
src/test/regress/expected/unicode.out              |   81 +
src/test/regress/expected/unicode_1.out            |    3 +
src/test/regress/parallel_schedule                 |    2 +-
src/test/regress/serial_schedule                   |    1 +
src/test/regress/sql/unicode.sql                   |   32 +
20 files changed, 6764 insertions(+), 7 deletions(-)