Re: jsonb, unicode escapes and escaped backslashes

From: Tom Lane
Subject: Re: jsonb, unicode escapes and escaped backslashes
Msg-id: 3722.1422602921@sss.pgh.pa.us
In response to: Re: jsonb, unicode escapes and escaped backslashes  (Peter Geoghegan <pg@heroku.com>)
Responses: Re: jsonb, unicode escapes and escaped backslashes  (Peter Geoghegan <pg@heroku.com>)
List: pgsql-hackers
Peter Geoghegan <pg@heroku.com> writes:
> On Thu, Jan 29, 2015 at 10:20 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I made the \u0000 error be errcode(ERRCODE_INVALID_TEXT_REPRESENTATION)
>> and errmsg("invalid input syntax for type json"), by analogy to what's
>> thrown for non-ASCII Unicode escapes in non-UTF8 encoding.  I'm not
>> terribly happy with that, though.  ISTM that for both cases, this is
>> not "invalid syntax" at all, but an implementation restriction that
>> forces us to reject perfectly valid syntax.  So I think we ought to
>> use a different ERRCODE and text message, though I'm not entirely
>> sure what it should be instead.  ERRCODE_FEATURE_NOT_SUPPORTED is
>> one possibility.

> I personally prefer what you have here.

> The point of JSONB is that we take a position on certain aspects like
> this. We're bridging a pointedly loosey goosey interchange format,
> JSON, with native PostgreSQL types. For example, we take a firm
> position on encoding. The JSON type is a bit more permissive, to about
> the extent that that's possible. The whole point is that we're
> interpreting JSON data in a way that's consistent with *Postgres*
> conventions. You'd have to interpret the data according to *some*
> convention in order to do something non-trivial with it in any case,
> and users usually want that.

I quite agree with you, actually, in terms of that perspective.  But my
point remains: "\u0000" is not invalid JSON syntax, and neither is
"\u1234".  If we choose to throw an error because we can't interpret or
process that according to our conventions, fine, but we should call it
something other than "invalid syntax".
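
To illustrate the syntax point only: a minimal sketch using Python's standard json module (not PostgreSQL's parser) shows that a generic JSON parser accepts both escapes as well-formed, while a truncated escape is a genuine syntax error. The helper name is_valid_json is made up for this example.

```python
import json


def is_valid_json(text: str) -> bool:
    """Return True if `text` parses as JSON under Python's json module."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# Raw strings keep the backslashes, so the parser sees the JSON escapes.
samples = [
    r'"\u0000"',  # well-formed escape for U+0000
    r'"\u1234"',  # well-formed escape for a non-ASCII character
    r'"\u12"',    # truncated escape: an actual syntax error
]

for sample in samples:
    print(sample, is_valid_json(sample))
```

The first two samples parse and the third does not, which is the distinction being drawn here: rejecting the first two is a restriction of the implementation, not a syntax failure.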

ERRCODE_UNTRANSLATABLE_CHARACTER or ERRCODE_CHARACTER_NOT_IN_REPERTOIRE
seem more apropos from here.  And I still think there's a case to be
made for ERRCODE_FEATURE_NOT_SUPPORTED, because it's at least possible
that we'd relax this restriction in future (e.g., allow Unicode characters
that can be converted to the database's encoding).
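
A rough sketch of what such a relaxed check might look like, written in Python rather than in the server's C code. Everything here is an assumption made for illustration: the class EscapeRejected, the function check_unicode_escape, the use of Python codec names as stand-ins for database encodings, and the choice of untranslatable_character as the reported code (it is only one of the candidates named above). The SQLSTATE values are the ones listed in the PostgreSQL error-codes appendix.

```python
# SQLSTATE values for the error codes mentioned in this thread.
SQLSTATE = {
    "invalid_text_representation": "22P02",
    "untranslatable_character": "22P05",
    "character_not_in_repertoire": "22021",
    "feature_not_supported": "0A000",
}


class EscapeRejected(ValueError):
    """Hypothetical error for a valid JSON escape that cannot be stored."""

    def __init__(self, sqlstate: str, message: str) -> None:
        super().__init__(message)
        self.sqlstate = sqlstate


def check_unicode_escape(code_point: int, db_encoding: str) -> str:
    """Return the character for a \\uXXXX escape if db_encoding can hold it.

    Surrogate pairs are not handled; this only covers single escapes.
    """
    if code_point == 0:
        # U+0000 is rejected regardless of the target encoding.
        raise EscapeRejected(
            SQLSTATE["untranslatable_character"],
            "\\u0000 cannot be converted to text",
        )
    ch = chr(code_point)
    try:
        ch.encode(db_encoding)
    except UnicodeEncodeError:
        # Valid JSON, but not representable in the target encoding.
        raise EscapeRejected(
            SQLSTATE["untranslatable_character"],
            f"\\u{code_point:04X} has no equivalent in {db_encoding}",
        ) from None
    return ch


if __name__ == "__main__":
    print(check_unicode_escape(0x00E9, "latin-1"))  # representable in Latin-1
    try:
        check_unicode_escape(0x1234, "latin-1")
    except EscapeRejected as exc:
        print(exc.sqlstate, exc)
```

The point of the sketch is only that the rejection is reported under a code other than invalid_text_representation, and that the set of accepted characters depends on the target encoding rather than on JSON syntax.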
        regards, tom lane


