[PATCH] Expand character set for ltree labels

From Garen Torikian
Subject [PATCH] Expand character set for ltree labels
Date
Msg-id CAGXsc+-mNg9Gc0rp-ER0sv+zkZSZp2wE9-LX6XcoWSLVz22tZA@mail.gmail.com
Replies Re: [PATCH] Expand character set for ltree labels  (Nathan Bossart <nathandbossart@gmail.com>)
Re: [PATCH] Expand character set for ltree labels  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Dear hackers,

I am submitting a patch that expands the set of characters allowed in ltree labels.

The current format is restricted to alphanumeric characters plus the underscore (_). Unfortunately, this set is insufficient for non-English labels. Rather than work out how to extend the set beyond ASCII, I have instead opted to support a widely used mechanism for storing non-ASCII text in encoded form: Punycode (https://en.wikipedia.org/wiki/Punycode).

The punycode range of characters is the exact same set as the existing ltree range, with the addition of a hyphen (-). Within this system, any human language can be encoded using just A-Za-z0-9-. 
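
As a rough illustration of the encoding step (not part of the patch), the sketch below uses Python's built-in punycode codec to turn a non-ASCII label into the character set described above. The helper names to_ltree_label and from_ltree_label and the regular expression are assumptions made for this example; the pattern simply mirrors A-Za-z0-9, _ and -.

```python
import re

# Character set described above: existing ltree characters plus the hyphen.
# This pattern is an assumption for illustration, not taken from the patch.
LABEL_RE = re.compile(r"[A-Za-z0-9_-]+")


def to_ltree_label(text: str) -> str:
    """Punycode-encode text (hypothetical helper) and check the result's characters."""
    encoded = text.encode("punycode").decode("ascii")
    if not LABEL_RE.fullmatch(encoded):
        # Punycode passes ASCII through unchanged, so spaces or punctuation
        # in the input would still be present here.
        raise ValueError(f"{encoded!r} contains characters outside the label set")
    return encoded


def from_ltree_label(label: str) -> str:
    """Reverse the encoding to recover the original text."""
    return label.encode("ascii").decode("punycode")


label = to_ltree_label("münchen")
print(label)                    # mnchen-3ya
print(from_ltree_label(label))  # münchen
```

Punycode leaves ASCII input characters as they are, so the check matters: a label containing a space or other punctuation would still need separate handling.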

On top of this, I added support for two more characters, # and ;, which are used in HTML entities. Note that & and % already have special significance in the existing ltree logic, so users would have to encode such items as #20; rather than %20. This seems a fair compromise.
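
One possible reading of the #20; convention, sketched under assumptions: each character outside the allowed set is replaced by # followed by its hexadecimal code point and ;. The helper names escape_label and unescape_label, the uppercase hex format, and the choice to escape # and ; themselves are assumptions of this sketch; the patch only permits the characters and does not perform any encoding itself.

```python
import re

# Characters that pass through unchanged. '#' and ';' are deliberately left out
# so that they are escaped too, which keeps the mapping reversible (an assumption).
SAFE_RE = re.compile(r"[A-Za-z0-9_-]")
ESCAPE_RE = re.compile(r"#([0-9A-Fa-f]+);")


def escape_label(text: str) -> str:
    """Replace every character outside the safe set with #<hex code point>;."""
    return "".join(
        ch if SAFE_RE.fullmatch(ch) else f"#{ord(ch):02X};" for ch in text
    )


def unescape_label(label: str) -> str:
    """Turn each #<hex>; sequence back into the character it stands for."""
    return ESCAPE_RE.sub(lambda m: chr(int(m.group(1), 16)), label)


print(escape_label("R&D 100%"))              # R#26;D#20;100#25;
print(unescape_label("R#26;D#20;100#25;"))   # R&D 100%
```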

Since the encoding could make a regular slug even longer, I have also doubled the character limit, from 256 to 512.
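
To make the length concern concrete, this small sketch compares a label's length before and after Punycode encoding, using Python's built-in codec. The two constants come from the figures in this message; the helper name encoded_length and treating each limit as an inclusive maximum are assumptions.

```python
OLD_LIMIT = 256  # previous label limit, as stated above
NEW_LIMIT = 512  # limit proposed in this patch


def encoded_length(text: str) -> int:
    """Length of the Punycode form of text (hypothetical helper)."""
    return len(text.encode("punycode"))


for sample in ("münchen", "пример" * 30):
    n = encoded_length(sample)
    # Columns: original length, encoded length, fits old limit, fits new limit.
    print(len(sample), n, n <= OLD_LIMIT, n <= NEW_LIMIT)
```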

Please let me know if I can provide more information or make any changes.

Very sincerely,
Garen