jsonb, unicode escapes and escaped backslashes

From Andrew Dunstan
Subject jsonb, unicode escapes and escaped backslashes
Msg-id 54C03B86.80604@dunslane.net
Replies Re: jsonb, unicode escapes and escaped backslashes  (Noah Misch <noah@leadboat.com>)
List pgsql-hackers
The following case has just been brought to my attention (look at the 
differing number of backslashes):
   andrew=# select jsonb '"\\u0000"';
     jsonb
   ----------
    "\u0000"
   (1 row)

   andrew=# select jsonb '"\u0000"';
     jsonb
   ----------
    "\u0000"
   (1 row)

   andrew=# select json '"\u0000"';
      json
   ----------
    "\u0000"
   (1 row)

   andrew=# select json '"\\u0000"';
      json
   -----------
    "\\u0000"
   (1 row)

The problem is that jsonb uses the parsed, unescaped value of the 
string, while json does not. When the string parser sees the input with 
two backslashes, it outputs a single backslash, then encounters 
the remaining characters and emits them as is, resulting in a token of 
'\u0000'. When it encounters the input with one backslash, it recognizes 
a unicode escape, and because it's for U+0000 emits '\u0000'. All other 
unicode escapes are resolved, so the only ambiguity on input concerns 
this case.
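
To make the collision concrete, here is a toy Python model of the de-escaping step as described above. It is only a sketch, not the actual lexer code: the function name `de_escape` and the table `SIMPLE_ESCAPES` are made up for illustration, and surrogate pairs are not handled.

```python
# Toy model of the de-escaping described above. This is NOT the real
# PostgreSQL lexer; de_escape and SIMPLE_ESCAPES are hypothetical names.
SIMPLE_ESCAPES = {
    '"': '"', "\\": "\\", "/": "/",
    "b": "\b", "f": "\f", "n": "\n", "r": "\r", "t": "\t",
}


def de_escape(body: str) -> str:
    """De-escape the body of a JSON string literal (without the quotes)."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        if i + 1 >= len(body):
            raise ValueError("dangling backslash")
        nxt = body[i + 1]
        if nxt == "u":
            hex4 = body[i + 2:i + 6]
            if len(hex4) != 4:
                raise ValueError("truncated unicode escape")
            code = int(hex4, 16)
            if code == 0:
                # U+0000 is not resolved; the escape text is kept as is.
                out.append("\\u0000")
            else:
                out.append(chr(code))
            i += 6
        elif nxt in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[nxt])
            i += 2
        else:
            raise ValueError("invalid escape")
    return "".join(out)


# Both inputs from the report collapse to the same six-character token.
assert de_escape(r"\\u0000") == de_escape(r"\u0000") == "\\u0000"
```

Once the two inputs share a token, nothing downstream can tell which one the user typed.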

Things get worse, though. On output, '\uabcd' for any four hex digits is 
recognized as a unicode escape, and thus the backslash is not escaped, 
so that we get:
   andrew=# select jsonb '"\\uabcd"';
     jsonb
   ----------
    "\uabcd"
   (1 row)
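
The output side can be modelled the same way. The sketch below assumes an escaping routine that lets anything shaped like '\uXXXX' through untouched, which is the behaviour described above; `escape_lenient` is a hypothetical name, not the real function.

```python
import re

# A backslash, 'u', and exactly four hex digits.
_UNICODE_ESCAPE = re.compile(r"\\u[0-9a-fA-F]{4}")


def escape_lenient(token: str) -> str:
    """Toy output escaper: text that looks like \\uXXXX is passed through."""
    out = ['"']
    i = 0
    while i < len(token):
        if _UNICODE_ESCAPE.match(token, i):
            # Treated as an existing escape, so the backslash is NOT doubled.
            out.append(token[i:i + 6])
            i += 6
        elif token[i] == "\\":
            out.append("\\\\")
            i += 1
        elif token[i] == '"':
            out.append('\\"')
            i += 1
        else:
            out.append(token[i])
            i += 1
    out.append('"')
    return "".join(out)


# The stored token for '"\\uabcd"' is the six characters \uabcd.
print(escape_lenient("\\uabcd"))  # prints "\uabcd"
```

The printed text now reads as an escape for U+ABCD, not as a backslash followed by 'uabcd', so the value changes on the next parse.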


We could probably fix this fairly easily for non-U+0000 cases by having 
jsonb_to_cstring use a different escape_json routine.
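
As a rough idea of what such a routine would do differently, here is a Python sketch that always doubles backslashes. `escape_strict` is a hypothetical name, and this is not a proposal for the actual C code.

```python
import json


def escape_strict(token: str) -> str:
    """Toy output escaper that never treats \\uXXXX text as an escape."""
    out = ['"']
    for ch in token:
        if ch == "\\":
            out.append("\\\\")            # every backslash is doubled
        elif ch == '"':
            out.append('\\"')
        elif ord(ch) < 0x20:
            out.append(f"\\u{ord(ch):04x}")  # control characters
        else:
            out.append(ch)
    out.append('"')
    return "".join(out)


token = "\\uabcd"                 # six characters: \ u a b c d
text = escape_strict(token)
print(text)                       # prints "\\uabcd"
assert json.loads(text) == token  # survives a round trip through a parser
```

This only helps where the stored token is unambiguous. If '\u0000' is stored as the same six characters as an escaped backslash followed by 'u0000', a routine like this would print both as "\\u0000".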

But it's a mess, sadly, and I'm not sure what a good fix for the U+0000 
case would look like. Maybe we should detect such input and emit a 
warning of ambiguity? It's likely to be rare enough, but clearly not as 
rare as we'd like, since this is a report from the field.

cheers

andrew


