Re: jsonb, unicode escapes and escaped backslashes

From Tom Lane
Subject Re: jsonb, unicode escapes and escaped backslashes
Date
Msg-id 30821.1422570074@sss.pgh.pa.us
In response to Re: jsonb, unicode escapes and escaped backslashes  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Jan 29, 2015 at 4:33 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
>> I'm coming down more and more on the side of Tom's suggestion just to ban
>> \u0000 in jsonb.

> I have yet to understand what we fix by banning \u0000.  How is 0000
> different from any other four-digit hexadecimal number that's not a
> valid character in the current encoding?  What does banning that one
> particular value do?

As Andrew pointed out upthread, it avoids having to answer the question of
what to return for

select (jsonb '["foo\u0000bar"]')->>0;

or any other construct which is supposed to return an *unescaped* text
representation of some JSON string value.

Right now you get
   ?column?
--------------
 foo\u0000bar
(1 row)

Which is wrong IMO, first because it violates the premise that the output
should be unescaped, and second because this output cannot be
distinguished from the (correct) output of

regression=# select (jsonb '["foo\\u0000bar"]')->>0;
   ?column?
--------------
 foo\u0000bar
(1 row)

There is no way to deliver an output that is not confusable with some
other value's correct output, other than by emitting a genuine \0 byte
which unfortunately we cannot support in a TEXT result.
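
The collision can be reproduced outside PostgreSQL. What follows is a minimal Python sketch, not PostgreSQL's implementation: it uses Python's standard json module as the decoder, and the function names unescaped_text_lossy and unescaped_text_strict are hypothetical stand-ins for the two behaviours discussed here (printing U+0000 as the six characters \u0000 because the result type cannot hold a NUL, versus refusing the value outright).

```python
import json

# Two different JSON documents (raw strings, so backslashes are literal).
WITH_NUL = r'["foo\u0000bar"]'         # element contains the character U+0000
WITH_BACKSLASH = r'["foo\\u0000bar"]'  # element contains a backslash, then "u0000"


def unescaped_text_lossy(doc: str, index: int = 0) -> str:
    # Hypothetical stand-in for an "unescaped text" accessor whose result
    # type cannot contain NUL: it re-escapes U+0000 as backslash-u-0000.
    value = json.loads(doc)[index]
    return value.replace("\x00", "\\u0000")


def unescaped_text_strict(doc: str, index: int = 0) -> str:
    # Hypothetical alternative: reject the value instead of emitting
    # something that is indistinguishable from another value's output.
    value = json.loads(doc)[index]
    if "\x00" in value:
        raise ValueError("U+0000 cannot be represented in a text result")
    return value


# The decoded values are genuinely different strings...
decoded_nul = json.loads(WITH_NUL)[0]
decoded_backslash = json.loads(WITH_BACKSLASH)[0]

# ...but the lossy rendering maps both to the same text.
lossy_nul = unescaped_text_lossy(WITH_NUL)
lossy_backslash = unescaped_text_lossy(WITH_BACKSLASH)
```

Under these assumptions the two decoded values differ while the two lossy renderings are identical, which is the ambiguity described above; the strict variant avoids it only by refusing the first input.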

Potential solutions for this have been mooted upthread, but none
of them look like they're something we can do in the very short run.
So the proposal is to ban \u0000 until such time as we can do something
sane with it.

> In any case, whatever we do about that issue, the idea that the text
> -> json string transformation can *change the input string into some
> other string* seems like an independent problem.

No, it's exactly the same problem, because the reason for that breakage
is an ill-advised attempt to make it safe to include \u0000 in JSONB.
        regards, tom lane


