Re: SQL/JSON path: collation for comparisons, minor typos in docs

From	Alexander Korotkov
Subject	Re: SQL/JSON path: collation for comparisons, minor typos in docs
Date	August 8, 2019, 03:27:38
Message-ID	CAPpHfdtqpf5rd_E7tT2PPc3ixkr-GZY3AXSwGED5QEjkLpLH-w@mail.gmail.com
In response to	Re: SQL/JSON path: collation for comparisons, minor typos in docs (Alexander Korotkov <a.korotkov@postgrespro.ru>)
Responses	Re: SQL/JSON path: collation for comparisons, minor typos in docs (Markus Winand <markus.winand@winand.at>)
List	pgsql-hackers

On Thu, Aug 8, 2019 at 3:05 AM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> On Thu, Aug 8, 2019 at 12:55 AM Alexander Korotkov
> <a.korotkov@postgrespro.ru> wrote:
> > On Wed, Aug 7, 2019 at 4:11 PM Alexander Korotkov
> > <a.korotkov@postgrespro.ru> wrote:
> > > On Wed, Aug 7, 2019 at 2:25 PM Markus Winand <markus.winand@winand.at> wrote:
> > > > I was playing around with JSON path quite a bit and might have
> > > > found one case where the current implementation doesn’t follow
> > > > the standard.
> > > >
> > > > The functionality in question are the comparison operators
> > > > except ==. They use the database default collation rather than
> > > > the standard-mandated "Unicode codepoint collation" (SQL-2:2016
> > > > 9.39 General Rule 12 c iii 2 D, last sentence in first
> > > > paragraph).
> > >
> > > Thank you for pointing this out!  Nikita is about to write a patch
> > > fixing that.
> >
> > Please, see the attached patch.
> >
> > Our idea is to not sacrifice "==" operator performance for standard
> > conformance.  So, "==" remains a per-byte comparison.  For consistency,
> > in the other operators we compare code points first, then do a per-byte
> > comparison.  In some edge cases, when the same Unicode code points have
> > different binary representations in the database encoding, this
> > behavior diverges from the standard.  In the future we can implement
> > strict standard conformance by normalizing input JSON strings.
>
> The previous version of the patch has a buggy implementation of
> compareStrings().  A revised version is attached.

Nikita pointed out to me that for UTF-8 strings, the result of a
per-byte comparison matches the result of a code point comparison.
That allows simplifying the patch a lot.
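
The UTF-8 property mentioned above can be checked with a small sketch.
This is an illustration only and is not taken from the attached patch;
the helper names (codepoint_key, utf8_key, orders_agree) and the sample
strings are assumptions made for the example.  It compares every pair of
sample strings both ways and, for contrast, shows that UTF-16 byte order
does not share the property.

```python
import itertools


def codepoint_key(s):
    """Sort key: the string as a list of Unicode code points."""
    return [ord(ch) for ch in s]


def utf8_key(s):
    """Sort key: the string's UTF-8 bytes, compared byte by byte."""
    return s.encode("utf-8")


def sign(a, b):
    """Three-way comparison result: -1, 0, or 1."""
    return (a > b) - (a < b)


def orders_agree(strings):
    """True if both keys order every pair of strings the same way."""
    for x, y in itertools.combinations(strings, 2):
        by_codepoint = sign(codepoint_key(x), codepoint_key(y))
        by_bytes = sign(utf8_key(x), utf8_key(y))
        if by_codepoint != by_bytes:
            return False
    return True


# Hypothetical sample covering 1-, 2-, 3-, and 4-byte UTF-8 sequences.
samples = ["", "a", "ab", "z", "\u00e9", "a\u00e9", "\u20ac",
           "\uffff", "\U00010000", "\U0001f600"]

print(orders_agree(samples))

# Contrast: in UTF-16, U+FFFF is one code unit (FF FF) while U+10000 is
# a surrogate pair starting with D8 00, so byte order is reversed there.
print("\uffff".encode("utf-16-be") > "\U00010000".encode("utf-16-be"))
```

A check over a handful of strings is not a proof, but it matches the
design of UTF-8, where lead bytes grow with the code point value and
longer sequences always start with a larger lead byte.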

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachments

0001-Use-Unicode-codepoint-collation-in-jsonpath-4.patch
