Re: proposal: unescape_text function

From Chapman Flack
Subject Re: proposal: unescape_text function
Msg-id 5FC7AAF5.7010209@anastigmatix.net
In response to Re: proposal: unescape_text function  (Pavel Stehule <pavel.stehule@gmail.com>)
Responses Re: proposal: unescape_text function  (Chapman Flack <chap@anastigmatix.net>)
List pgsql-hackers
On 12/02/20 05:37, Pavel Stehule wrote:
> 2. there can be optional parameter "prefix" with default "\". But with "\u"
> it can be compatible with Java or Python.

Java's Unicode escape form is one of those early ones that lack
a six-digit form, and where any character outside of the Basic Multilingual
Plane has to be represented by two four-digit escapes in a row, encoding
the two surrogates that would make up the character's representation
in UTF-16.
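
As an illustration of the surrogate arithmetic involved, here is a minimal Python sketch. The function names (to_surrogate_pair, from_surrogate_pair, java_escape) are made up for this example and are not part of any proposal in this thread; the arithmetic itself is the standard UTF-16 surrogate encoding.

```python
from typing import Tuple


def to_surrogate_pair(code_point: int) -> Tuple[int, int]:
    """Split a non-BMP code point into its UTF-16 (high, low) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not outside the BMP")
    offset = code_point - 0x10000           # 20-bit value
    high = 0xD800 + (offset >> 10)          # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)         # bottom 10 bits
    return high, low


def from_surrogate_pair(high: int, low: int) -> int:
    """Recombine a (high, low) surrogate pair into one code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def java_escape(ch: str) -> str:
    """Render one character as four-digit escapes (hypothetical helper)."""
    cp = ord(ch)
    if cp < 0x10000:
        return "\\u{:04X}".format(cp)
    high, low = to_surrogate_pair(cp)
    return "\\u{:04X}\\u{:04X}".format(high, low)
```

With these definitions, U+1F600 comes out as the two escapes \uD83D\uDE00, while a BMP character such as U+00E9 needs only \u00E9.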

Obviously that's an existing form that's out there, so it's not a bad
thing to have some kind of support for it, but it's not a great
representation to encourage people to use.

Python, by contrast, has both \uxxxx and \Uxxxxxxxx, where you would use
the latter to represent a non-BMP character directly. So the Java and
Python schemes should be considered distinct.
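
A quick Python 3 check makes the difference concrete. The variable names are only for illustration; the point is that Python does not fold two \uxxxx surrogate escapes into one character the way a Java-style reader would.

```python
# One eight-digit escape: a single non-BMP character.
direct = "\U0001F600"

# Two four-digit escapes: Python keeps them as two separate surrogate
# code points and does not combine them.
pair = "\ud83d\ude00"

assert len(direct) == 1
assert len(pair) == 2
assert direct != pair

# Recombining takes an explicit round trip through UTF-16; the
# "surrogatepass" error handler lets the lone surrogates be encoded.
recombined = pair.encode("utf-16", "surrogatepass").decode("utf-16")
assert recombined == direct
```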

In Perl, there is a useful extension to regexp substitution where
you specify the replacement not as a string or even a string with &
and \1 \2 ... magic, but as essentially a lambda that is passed the
match and returns a computed replacement. That makes conversions of
the sort discussed here generally trivial to implement. Would it be
worth considering adding something of general utility like that, so
that there could be a small library of pure SQL functions (or a wiki
page or GitHub gist) covering a bunch of the two dozen representations
on that page linked above?
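
To show how little code such a conversion takes once the replacement can be a function, here is a sketch in Python, whose re.sub accepts a callable as the replacement. It is only an analogy for the idea, not a proposed PostgreSQL interface; the name unescape_java_style is invented for the example, and it deliberately ignores escaped backslashes and leaves lone surrogates as they are.

```python
import re

# Either a high+low surrogate pair written as two escapes, or a single
# four-digit escape. The pair alternative comes first so it wins.
_ESCAPE = re.compile(
    r"\\u([dD][89abAB][0-9a-fA-F]{2})\\u([dD][c-fC-F][0-9a-fA-F]{2})"
    r"|\\u([0-9a-fA-F]{4})"
)


def _replace(match):
    """Compute the replacement text for one match."""
    if match.group(1) is not None:
        high = int(match.group(1), 16)
        low = int(match.group(2), 16)
        # Combine the surrogate pair into a single code point.
        return chr(0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00))
    return chr(int(match.group(3), 16))


def unescape_java_style(text):
    """Decode four-digit escapes, pairing surrogates (simplified sketch)."""
    return _ESCAPE.sub(_replace, text)
```

For example, unescape_java_style(r"caf\u00E9 \uD83D\uDE00") gives "café" followed by a space and the single character U+1F600. Supporting another of the representations would mostly mean swapping the pattern and the small callback.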

Regards,
-Chap


