Обсуждение: Bug in comparison of empty jsonb arrays to scalars

Поиск
Список
Период
Сортировка

Bug in comparison of empty jsonb arrays to scalars

От
Nikita Glukhov
Дата:
Hi hackers.

While working on jsonbstatistics, I found the following bug:
an empty jsonb array is considered to be lesser than any scalar,
but it is expected that objects > arrays > scalars.

# select '[]'::jsonb < 'null'::jsonb;
  ?column?
----------
  t
(1 row)

Attached patch contains:
  1. bug fix (added the missing "else" in compareJsonbContainers())
  2. regression test
  3. pg_upgrade: invalidation of btree indexes on jsonb columns and
REINDEX-script generation

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Вложения

Re: Bug in comparison of empty jsonb arrays to scalars

От
Michael Paquier
Дата:
On Wed, Nov 9, 2016 at 8:31 AM, Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
> Hi hackers.
>
> While working on jsonbstatistics, I found the following bug:
> an empty jsonb array is considered to be lesser than any scalar,
> but it is expected that objects > arrays > scalars.

Sources? Does the JSON spec contain any information regarding
comparison operators? I don't think so, so that would be up to the
implementation to decide that, no?

Btw I would agree with you that's quite unintuitive, but that's not
wrong either to keep the current comparison algorithm because that's
harmless for btree. We could have more regression tests to make the
current behavior clear though. Thoughts from others are welcome.
-- 
Michael



Re: Bug in comparison of empty jsonb arrays to scalars

От
Robert Haas
Дата:
On Tue, Nov 8, 2016 at 9:49 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Wed, Nov 9, 2016 at 8:31 AM, Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
>> Hi hackers.
>>
>> While working on jsonbstatistics, I found the following bug:
>> an empty jsonb array is considered to be lesser than any scalar,
>> but it is expected that objects > arrays > scalars.
>
> Sources?

How about "our documentation"?

https://www.postgresql.org/docs/current/static/datatype-json.html

Look at the last page.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Bug in comparison of empty jsonb arrays to scalars

От
Michael Paquier
Дата:
On Thu, Nov 10, 2016 at 3:27 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Nov 8, 2016 at 9:49 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> On Wed, Nov 9, 2016 at 8:31 AM, Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:
>>> Hi hackers.
>>>
>>> While working on jsonbstatistics, I found the following bug:
>>> an empty jsonb array is considered to be lesser than any scalar,
>>> but it is expected that objects > arrays > scalars.
>>
>> Sources?
>
> How about "our documentation"?
>
> https://www.postgresql.org/docs/current/static/datatype-json.html

Indeed, I missed that. So that's broken...
-- 
Michael



Re: Bug in comparison of empty jsonb arrays to scalars

От
Tom Lane
Дата:
Michael Paquier <michael.paquier@gmail.com> writes:
> On Thu, Nov 10, 2016 at 3:27 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> https://www.postgresql.org/docs/current/static/datatype-json.html

> Indeed, I missed that. So that's broken...

Given that nobody actually cares what that sort order is, I think that
having to jump through hoops in pg_upgrade in order to fix it is not a
great tradeoff.  I suggest changing the documentation to match the code.
        regards, tom lane



Re: Bug in comparison of empty jsonb arrays to scalars

От
Michael Paquier
Дата:
On Thu, Nov 10, 2016 at 7:37 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Michael Paquier <michael.paquier@gmail.com> writes:
>> On Thu, Nov 10, 2016 at 3:27 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> https://www.postgresql.org/docs/current/static/datatype-json.html
>
>> Indeed, I missed that. So that's broken...
>
> Given that nobody actually cares what that sort order is, I think that
> having to jump through hoops in pg_upgrade in order to fix it is not a
> great tradeoff.  I suggest changing the documentation to match the code.

Yes, definitely.
=# create table json_data (a jsonb);
CREATE TABLE
=# INSERT INTO json_data values ('{}'::jsonb), ('[]'::jsonb),
('null'::jsonb), ('true'::jsonb), ('1'::jsonb), ('""'::jsonb);
INSERT 0 6
=# SELECT * FROM json_data ORDER BY 1 DESC;
  a
------
 {}
 true
 1
 ""
 null
 []
(6 rows)
So that's object > boolean > integer > string > NULL > array.

And attached is a patch.
--
Michael

Вложения

Re: Bug in comparison of empty jsonb arrays to scalars

От
Ali Akbar
Дата:
2016-11-10 13:54 GMT+07:00 Michael Paquier <michael.paquier@gmail.com>:
On Thu, Nov 10, 2016 at 7:37 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Given that nobody actually cares what that sort order is, I think that
> having to jump through hoops in pg_upgrade in order to fix it is not a
> great tradeoff.  I suggest changing the documentation to match the code.

Don't you in this case think we should match sort order in javascript?
 
Yes, definitely.
=# create table json_data (a jsonb);
CREATE TABLE
=# INSERT INTO json_data values ('{}'::jsonb), ('[]'::jsonb),
('null'::jsonb), ('true'::jsonb), ('1'::jsonb), ('""'::jsonb);
INSERT 0 6
=# SELECT * FROM json_data ORDER BY 1 DESC;
  a
------
 {}
 true
 1
 ""
 null
 []
(6 rows)
So that's object > boolean > integer > string > NULL > array.

> a = [{}, [], null, true, 1, '""']
[ {}, [], null, true, 1, '""' ]
> a.sort()
[ [], '""', 1, {}, null, true ]
> a.reverse()
[ true, null, {}, 1, '""', [] ]

So in this case it's boolean > NULL > Object > integer > string > array
(tried in Chromium 53, Firefox 49 and Node v6.9.1)

When I tried to search for the ECMA Standard for this behavior, i found this: http://blog.rodneyrehm.de/archives/14-Sorting-Were-Doing-It-Wrong.html. There are problems about automatic conversion in javascript, like this:

> a = [{}, [], null, true, 1, 'someotherstring']
[ {}, [], null, true, 1, 'someotherstring' ]
> a.sort().reverse()
[ true, 'someotherstring', null, {}, 1, [] ]


versus this:

> a = [{}, [], null, true, 1, 'SomeOtherString']
[ {}, [], null, true, 1, 'SomeOtherString' ]
> a.sort().reverse()
[ true, null, {}, 'SomeOtherString', 1, [] ]


and this:

> a = [{}, [], null, true, 1, '2']
[ {}, [], null, true, 1, '2' ]
> a.sort().reverse()
[ true, null, {}, '2', 1, [] ]


So we can't replicate javascript sort order without emulating those.

Regards,
Ali Akbar

Re: Bug in comparison of empty jsonb arrays to scalars

От
Nikita Glukhov
Дата:
On 10.11.2016 09:54, Michael Paquier wrote:

> Yes, definitely.
> =# create table json_data (a jsonb);
> CREATE TABLE
> =# INSERT INTO json_data values ('{}'::jsonb), ('[]'::jsonb),
> ('null'::jsonb), ('true'::jsonb), ('1'::jsonb), ('""'::jsonb);
> INSERT 0 6
> =# SELECT * FROM json_data ORDER BY 1 DESC;
>    a
> ------
>   {}
>   true
>   1
>   ""
>   null
>   []
> (6 rows)
> So that's object > boolean > integer > string > NULL > array.
>
> And attached is a patch.

Perhaps I did not explain it clearly enough, but only *empty top-level* 
arrays are out of the correct order.
See complete example:

=# SELECT * FROM (VALUES    ('null'::jsonb), ('0'), ('""'), ('true'), ('[]'), ('{}'),    ('[null]'), ('[0]'), ('[""]'),
('[true]'),('[[]]'), ('[{}]'),    ('{"a": null}'), ('{"a": 0}'), ('{"a": ""}'), ('{"a": true}'), 
 
('{"a": []}'), ('{"a": {}}')
) valsORDER BY 1;   column1
------------- [] null "" 0 true [null] [""] [0] [true] [[]] [{}] {} {"a": null} {"a": ""} {"a": 0} {"a": true} {"a":
[]}{"a": {}}
 
(18 rows)


-- 
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: Bug in comparison of empty jsonb arrays to scalars

От
Tom Lane
Дата:
Michael Paquier <michael.paquier@gmail.com> writes:
> On Thu, Nov 10, 2016 at 7:37 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Given that nobody actually cares what that sort order is, I think that
>> having to jump through hoops in pg_upgrade in order to fix it is not a
>> great tradeoff.  I suggest changing the documentation to match the code.

> Yes, definitely.
> So that's object > boolean > integer > string > NULL > array.

No, because the issue is that empty and nonempty arrays sort differently.

regression=# create table json_data (a jsonb);
CREATE TABLE
regression=#  INSERT INTO json_data values ('{}'::jsonb), ('[]'::jsonb),
regression-# ('null'::jsonb), ('true'::jsonb), ('1'::jsonb), ('""'::jsonb),
regression-# ('[42]'::jsonb),('[[43]]'::jsonb);
INSERT 0 8
regression=# SELECT * FROM json_data ORDER BY 1 DESC;  a    
--------{}[[43]][42]true1""null[]
(8 rows)

> And attached is a patch.

If we go with the fix-the-docs approach, we'll need to have two entries
for empty and nonempty arrays.  It's definitely ugly.  Still, my judgement
is that it's not worth the pain of changing the behavior.  It was never
intended that this sort order be anything but an implementation detail.

(I guess another approach is to not document the order at all ...)
        regards, tom lane