Re: [rfc] unicode escapes for extended strings

From Sam Mason
Subject Re: [rfc] unicode escapes for extended strings
Date
Msg-id 20090416184309.GQ12225@frubble.xen.chris-lamb.co.uk
In response to [rfc] unicode escapes for extended strings  (Marko Kreen <markokr@gmail.com>)
Responses Re: [rfc] unicode escapes for extended strings  (Andrew Dunstan <andrew@dunslane.net>)
Re: [rfc] unicode escapes for extended strings  (Marko Kreen <markokr@gmail.com>)
List pgsql-hackers
On Thu, Apr 16, 2009 at 08:48:58PM +0300, Marko Kreen wrote:
> Seems I'm bad at communicating in english,

I hope you're not saying this because of my misunderstandings!

> so here is C variant of
> my proposal to bring \u escaping into extended strings.  Reasons:
> 
> - More people are familiar with \u escaping, as it's standard
>   in Java/C#/Python, probably more..
> - U& strings will not work when stdstr=off.
> 
> Syntax:
> 
>   \uXXXX      - 16-bit value
>   \UXXXXXXXX  - 32-bit value
> 
> Additionally, both \u and \U can be used to specify UTF-16 surrogate
> pairs to encode characters with value > 0xFFFF.  This is exact behaviour
> used by Java/C#/Python.  (except that Java does not have \U)

Are you sure that this handling of surrogates is correct?  The best
answer I've managed to find on the Unicode Consortium's site is:
 http://unicode.org/faq/utf_bom.html#utf16-7

it says:
 They are invalid in interchange, but may be freely used internal to an implementation.

I think this means they consider the surrogate handling you noted above,
as done in other languages, to be an error.
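
To make the question concrete, here is a rough Python sketch of the
behaviour under discussion. It is not the proposed C patch:
decode_escapes is a hypothetical helper, and rejecting unpaired
surrogates is an assumption made for illustration, not something the
proposal states. It joins a high/low surrogate pair written as two \u
escapes into a single code point, so that no surrogate survives into
the decoded string.

```python
import re

# Hypothetical helper, not taken from the proposed patch.
# Matches \uXXXX (4 hex digits) or \UXXXXXXXX (8 hex digits).
_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")


def decode_escapes(text):
    """Decode \\u and \\U escapes, joining UTF-16 surrogate pairs."""
    out = []
    pos = 0
    pending_high = None  # high surrogate waiting for its low half
    for m in _ESCAPE.finditer(text):
        literal = text[pos:m.start()]
        if literal and pending_high is not None:
            raise ValueError("high surrogate not followed by low surrogate")
        out.append(literal)
        pos = m.end()
        cp = int(m.group(1) or m.group(2), 16)
        if pending_high is not None:
            if not 0xDC00 <= cp <= 0xDFFF:
                raise ValueError("high surrogate not followed by low surrogate")
            # Combine the pair into one code point above 0xFFFF.
            cp = 0x10000 + ((pending_high - 0xD800) << 10) + (cp - 0xDC00)
            pending_high = None
        elif 0xD800 <= cp <= 0xDBFF:
            pending_high = cp  # wait for the low half
            continue
        elif 0xDC00 <= cp <= 0xDFFF:
            raise ValueError("low surrogate without preceding high surrogate")
        if cp > 0x10FFFF:
            raise ValueError("code point out of Unicode range")
        out.append(chr(cp))
    if pending_high is not None:
        raise ValueError("unpaired high surrogate at end of input")
    out.append(text[pos:])
    return "".join(out)
```

With this sketch, decode_escapes(r"\uD83D\uDE00") and
decode_escapes(r"\U0001F600") give the same one-character string, while
a lone r"\uD83D" raises ValueError. Python's own UTF-8 encoder takes a
similar line on output: "\ud83d".encode("utf-8") fails with
UnicodeEncodeError.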

--  Sam  http://samason.me.uk/

