Re: jsonb, unicode escapes and escaped backslashes

From: Tom Lane
Subject: Re: jsonb, unicode escapes and escaped backslashes
Msg-id: 22710.1422384021@sss.pgh.pa.us
In reply to: Re: jsonb, unicode escapes and escaped backslashes  (Andrew Dunstan <andrew@dunslane.net>)
Responses: Re: jsonb, unicode escapes and escaped backslashes  (Andrew Dunstan <andrew@dunslane.net>)
Re: jsonb, unicode escapes and escaped backslashes  (Merlin Moncure <mmoncure@gmail.com>)
List: pgsql-hackers

Andrew Dunstan <andrew@dunslane.net> writes:
> On 01/27/2015 12:23 PM, Tom Lane wrote:
>> I think coding anything is premature until we decide how we're going to
>> deal with the fundamental ambiguity.

> The input \\uabcd will be stored correctly as \uabcd, but this will in 
> turn be rendered as \uabcd, whereas it should be rendered as \\uabcd. 
> That's what the patch fixes.
> There are two problems here and this addresses one of them. The other 
> problem is the ambiguity regarding \\u0000 and \u0000.

It's the same problem really, and until we have an answer about
what to do with \u0000, I think any patch is half-baked and possibly
counterproductive.
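
To make the ambiguity concrete, here is a minimal sketch in Python. It is a toy model, not PostgreSQL code: `store_toy` is a hypothetical function, and it assumes that the current representation keeps \u0000 as the six-character escape text while an escaped backslash is reduced to a single backslash. Under that assumption the two inputs collapse to the same stored text.

```python
def store_toy(body: str) -> str:
    """Toy de-escaper for the characters between the quotes of a JSON string.

    Assumption (not taken from PostgreSQL source): \\u0000 is kept as the
    literal escape text, every other \\uXXXX becomes the real character,
    and an escaped backslash becomes one backslash.
    """
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        nxt = body[i + 1:i + 2]
        if ch != "\\":
            out.append(ch)
            i += 1
        elif nxt == "\\":
            out.append("\\")            # escaped backslash -> one backslash
            i += 2
        elif nxt == "u":
            hex4 = body[i + 2:i + 6]
            if hex4 == "0000":
                out.append("\\u0000")   # U+0000 is kept as escape text
            else:
                out.append(chr(int(hex4, 16)))
            i += 6
        else:
            raise ValueError("escape not handled in this sketch")
    return "".join(out)


# Two different JSON inputs, one stored value: the reader of the stored
# text can no longer tell which one was meant.
nul_escape = store_toy(r"\u0000")
escaped_backslash = store_toy(r"\\u0000")
print(nul_escape == escaped_backslash)
```

Once the stored forms are identical, no output-side fix can recover which input was given, which is why the rendering problem and the \u0000 problem are hard to separate.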

In particular, I would like to suggest that the current representation of
\u0000 is fundamentally broken and that we have to change it, not try to
band-aid around it.  This will mean an on-disk incompatibility for jsonb
data containing U+0000, but hopefully there is very little of that out
there yet.  If we can get a fix into 9.4.1, I think it's reasonable to
consider such solutions.

The most obvious way to store such data unambiguously is to just go ahead
and store U+0000 as a NUL byte (\000).  The only problem with that is that
then such a string cannot be considered to be a valid value of type TEXT,
which would mean that we'd need to throw an error if we were asked to
convert a JSON field containing such a character to text.  I don't
particularly have a problem with that, except possibly for the time cost
of checking for \000 before allowing a conversion to occur.  While a
memchr() check might be cheap enough, we could also consider inventing a
new JEntry type code for string-containing-null, so that there's a
distinction in the type system between strings that are coercible to text
and those that are not.
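
The two options can be sketched as follows, again in Python as a toy model rather than actual jsonb code. All names here (`StrKind`, `StoredString`, `store`, `to_text_scan`, `to_text_tagged`) are hypothetical; `StrKind` merely stands in for the idea of a separate JEntry type code, and the byte scan stands in for a memchr() check.

```python
from dataclasses import dataclass
from enum import Enum


class StrKind(Enum):
    """Hypothetical stand-in for a JEntry type code."""
    TEXT_SAFE = 1       # coercible to text
    CONTAINS_NUL = 2    # holds U+0000, not coercible to text


@dataclass(frozen=True)
class StoredString:
    data: bytes         # UTF-8 bytes, U+0000 stored as a real NUL byte
    kind: StrKind


def store(value: str) -> StoredString:
    """Store the string unambiguously and classify it once, at input time."""
    data = value.encode("utf-8")
    kind = StrKind.CONTAINS_NUL if b"\x00" in data else StrKind.TEXT_SAFE
    return StoredString(data, kind)


def to_text_scan(s: StoredString) -> str:
    """Option 1: scan for NUL on every conversion (the memchr()-style check)."""
    if b"\x00" in s.data:
        raise ValueError("string contains U+0000 and cannot be converted to text")
    return s.data.decode("utf-8")


def to_text_tagged(s: StoredString) -> str:
    """Option 2: consult the type code, so conversion needs no scan."""
    if s.kind is StrKind.CONTAINS_NUL:
        raise ValueError("string contains U+0000 and cannot be converted to text")
    return s.data.decode("utf-8")
```

The scan costs time proportional to the string length on each conversion; the tag moves that cost to input time at the price of a new type code that every consumer of the format has to understand.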

If we went down a path like that, the currently proposed patch would be
quite useless.
        regards, tom lane


