Re: [PATCH] json_lex_string: don't overread on bad UTF8

From Jacob Champion
Subject Re: [PATCH] json_lex_string: don't overread on bad UTF8
Date
Msg-id CAOYmi+=yCFok+UNRHDJna5dSasqa9cMHviBZ6pYmtt1Yn_RfRg@mail.gmail.com
In response to Re: [PATCH] json_lex_string: don't overread on bad UTF8  (Michael Paquier <michael@paquier.xyz>)
Responses Re: [PATCH] json_lex_string: don't overread on bad UTF8
List pgsql-hackers
On Mon, May 6, 2024 at 8:43 PM Michael Paquier <michael@paquier.xyz> wrote:
> On Fri, May 03, 2024 at 07:05:38AM -0700, Jacob Champion wrote:
> > We could port something like that to src/common. IMO that'd be more
> > suited for an actual conversion routine, though, as opposed to a
> > parser that for the most part assumes you didn't lie about the input
> > encoding and is just trying not to crash if you're wrong. Most of the
> > time, the parser just copies bytes between delimiters around and it's
> > up to the caller to handle encodings... the exceptions to that are the
> > \uXXXX escapes and the error handling.
>
> Hmm.  That would still leave the backpatch issue at hand, which is
> kind of confusing to leave as it is.  Would it be complicated to
> truncate the entire byte sequence in the error message and just give
> up because we cannot do better if the input byte sequence is
> incomplete?

Maybe I've misunderstood, but isn't that what's being done in v2?
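
For readers following along, the "truncate and give up" idea discussed above can be sketched roughly as follows. This is only an illustration, not the code from the v2 patch: the helper names utf8_expected_len() and clamped_seq_len() are hypothetical, and the sketch assumes the lexer knows where its input buffer ends.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: how many bytes a UTF-8 sequence claims to
 * occupy, judged from its lead byte alone.  Invalid lead bytes are
 * treated as single bytes.
 */
static size_t
utf8_expected_len(unsigned char lead)
{
	if ((lead & 0x80) == 0x00)
		return 1;
	if ((lead & 0xE0) == 0xC0)
		return 2;
	if ((lead & 0xF0) == 0xE0)
		return 3;
	if ((lead & 0xF8) == 0xF0)
		return 4;
	return 1;
}

/*
 * Hypothetical helper: number of bytes starting at s that may be
 * copied into an error message without reading past end.  If the
 * lead byte promises more bytes than remain, the sequence is
 * truncated to what is actually there.
 */
static size_t
clamped_seq_len(const char *s, const char *end)
{
	size_t		want;
	size_t		have;

	assert(s < end);			/* caller must pass at least one byte */

	want = utf8_expected_len((unsigned char) *s);
	have = (size_t) (end - s);

	return (want < have) ? want : have;
}
```

With a buffer that ends in the two bytes 0xE3 0x81, the lead byte asks for three bytes but clamped_seq_len() reports two, so an error message built from it never touches memory past the end of the input.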

> > Maybe I'm missing
> > code somewhere, but I don't see a conversion routine from
> > json_errdetail() to the actual client/locale encoding. (And the parser
> > does not support multibyte input_encodings that contain ASCII in trail
> > bytes.)
>
> Referring to json_lex_string() that does UTF-8 -> ASCII -> give-up in
> its conversion for FRONTEND, I guess?  Yep.  This limitation looks
> like a problem, especially if plugging that to libpq.

Okay. How we deal with that will likely guide the "optimal" fix to
error reporting, I think...
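
To make the trail-byte limitation mentioned above concrete, here is a small sketch. It uses Shift_JIS as the example encoding, which is my choice and not named in this thread; find_byte() is a hypothetical stand-in for a byte-oriented scan, not a function from the JSON parser.

```c
#include <stddef.h>

/*
 * Hypothetical byte-oriented scan: return the index of the first
 * occurrence of target in buf, or -1 if it is not present.  It knows
 * nothing about multibyte characters, which is the point of the
 * example.
 */
static long
find_byte(const unsigned char *buf, size_t len, unsigned char target)
{
	size_t		i;

	for (i = 0; i < len; i++)
	{
		if (buf[i] == target)
			return (long) i;
	}
	return -1;
}
```

In Shift_JIS, the two-byte sequence 0x95 0x5C is a single character, but its trail byte 0x5C is also the ASCII backslash. Scanning the bytes 0x95 0x5C for '\\' with find_byte() reports index 1, so a scanner of this kind would see an escape in the middle of a character.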

Thanks,
--Jacob


