Обсуждение: Bug in comparison of empty jsonb arrays to scalars
Hi hackers. While working on jsonbstatistics, I found the following bug: an empty jsonb array is considered to be lesser than any scalar, but it is expected that objects > arrays > scalars. # select '[]'::jsonb < 'null'::jsonb; ?column? ---------- t (1 row) Attached patch contains: 1. bug fix (added the missing "else" in compareJsonbContainers()) 2. regression test 3. pg_upgrade: invalidation of btree indexes on jsonb columns and REINDEX-script generation -- Nikita Glukhov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Вложения
On Wed, Nov 9, 2016 at 8:31 AM, Nikita Glukhov <n.gluhov@postgrespro.ru> wrote: > Hi hackers. > > While working on jsonbstatistics, I found the following bug: > an empty jsonb array is considered to be lesser than any scalar, > but it is expected that objects > arrays > scalars. Sources? Does the JSON spec contain any information regarding comparison operators? I don't think so, so that would be up to the implementation to decide that, no? Btw I would agree with you that's quite unintuitive, but that's not wrong either to keep the current comparison algorithm because that's harmless for btree. We could have more regression tests to make the current behavior clear though. Thoughts from others are welcome. -- Michael
On Tue, Nov 8, 2016 at 9:49 PM, Michael Paquier <michael.paquier@gmail.com> wrote: > On Wed, Nov 9, 2016 at 8:31 AM, Nikita Glukhov <n.gluhov@postgrespro.ru> wrote: >> Hi hackers. >> >> While working on jsonbstatistics, I found the following bug: >> an empty jsonb array is considered to be lesser than any scalar, >> but it is expected that objects > arrays > scalars. > > Sources? How about "our documentation"? https://www.postgresql.org/docs/current/static/datatype-json.html Look at the last page. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Nov 10, 2016 at 3:27 AM, Robert Haas <robertmhaas@gmail.com> wrote: > On Tue, Nov 8, 2016 at 9:49 PM, Michael Paquier > <michael.paquier@gmail.com> wrote: >> On Wed, Nov 9, 2016 at 8:31 AM, Nikita Glukhov <n.gluhov@postgrespro.ru> wrote: >>> Hi hackers. >>> >>> While working on jsonbstatistics, I found the following bug: >>> an empty jsonb array is considered to be lesser than any scalar, >>> but it is expected that objects > arrays > scalars. >> >> Sources? > > How about "our documentation"? > > https://www.postgresql.org/docs/current/static/datatype-json.html Indeed, I missed that. So that's broken... -- Michael
Michael Paquier <michael.paquier@gmail.com> writes: > On Thu, Nov 10, 2016 at 3:27 AM, Robert Haas <robertmhaas@gmail.com> wrote: >> https://www.postgresql.org/docs/current/static/datatype-json.html > Indeed, I missed that. So that's broken... Given that nobody actually cares what that sort order is, I think that having to jump through hoops in pg_upgrade in order to fix it is not a great tradeoff. I suggest changing the documentation to match the code. regards, tom lane
On Thu, Nov 10, 2016 at 7:37 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Michael Paquier <michael.paquier@gmail.com> writes: >> On Thu, Nov 10, 2016 at 3:27 AM, Robert Haas <robertmhaas@gmail.com> wrote: >>> https://www.postgresql.org/docs/current/static/datatype-json.html > >> Indeed, I missed that. So that's broken... > > Given that nobody actually cares what that sort order is, I think that > having to jump through hoops in pg_upgrade in order to fix it is not a > great tradeoff. I suggest changing the documentation to match the code. Yes, definitely. =# create table json_data (a jsonb); CREATE TABLE =# INSERT INTO json_data values ('{}'::jsonb), ('[]'::jsonb), ('null'::jsonb), ('true'::jsonb), ('1'::jsonb), ('""'::jsonb); INSERT 0 6 =# SELECT * FROM json_data ORDER BY 1 DESC; a ------ {} true 1 "" null [] (6 rows) So that's object > boolean > integer > string > NULL > array. And attached is a patch. -- Michael
Вложения
2016-11-10 13:54 GMT+07:00 Michael Paquier <michael.paquier@gmail.com>:
On Thu, Nov 10, 2016 at 7:37 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Given that nobody actually cares what that sort order is, I think that
> having to jump through hoops in pg_upgrade in order to fix it is not a
> great tradeoff. I suggest changing the documentation to match the code.
Don't you in this case think we should match sort order in javascript?
Yes, definitely.
=# create table json_data (a jsonb);
CREATE TABLE
=# INSERT INTO json_data values ('{}'::jsonb), ('[]'::jsonb),
('null'::jsonb), ('true'::jsonb), ('1'::jsonb), ('""'::jsonb);
INSERT 0 6
=# SELECT * FROM json_data ORDER BY 1 DESC;
a
------
{}
true
1
""
null
[]
(6 rows)
So that's object > boolean > integer > string > NULL > array.
> a = [{}, [], null, true, 1, '""']
[ {}, [], null, true, 1, '""' ]
> a.sort()
[ [], '""', 1, {}, null, true ]
> a.reverse()
[ true, null, {}, 1, '""', [] ]
So in this case it's boolean > NULL > Object > integer > string > array
(tried in Chromium 53, Firefox 49 and Node v6.9.1)
When I tried to search for the ECMA Standard for this behavior, i found this: http://blog.rodneyrehm.de/archives/14-Sorting-Were-Doing-It-Wrong.html. There are problems about automatic conversion in javascript, like this:
> a = [{}, [], null, true, 1, 'someotherstring']
[ {}, [], null, true, 1, 'someotherstring' ]
> a.sort().reverse()
[ true, 'someotherstring', null, {}, 1, [] ]
versus this:
> a = [{}, [], null, true, 1, 'SomeOtherString']
[ {}, [], null, true, 1, 'SomeOtherString' ]
> a.sort().reverse()
[ true, null, {}, 'SomeOtherString', 1, [] ]
and this:
> a = [{}, [], null, true, 1, '2']
[ {}, [], null, true, 1, '2' ]
> a.sort().reverse()
[ true, null, {}, '2', 1, [] ]
So we can't replicate javascript sort order without emulating those.
Regards,
Ali Akbar
On 10.11.2016 09:54, Michael Paquier wrote: > Yes, definitely. > =# create table json_data (a jsonb); > CREATE TABLE > =# INSERT INTO json_data values ('{}'::jsonb), ('[]'::jsonb), > ('null'::jsonb), ('true'::jsonb), ('1'::jsonb), ('""'::jsonb); > INSERT 0 6 > =# SELECT * FROM json_data ORDER BY 1 DESC; > a > ------ > {} > true > 1 > "" > null > [] > (6 rows) > So that's object > boolean > integer > string > NULL > array. > > And attached is a patch. Perhaps I did not explain it clearly enough, but only *empty top-level* arrays are out of the correct order. See complete example: =# SELECT * FROM (VALUES ('null'::jsonb), ('0'), ('""'), ('true'), ('[]'), ('{}'), ('[null]'), ('[0]'), ('[""]'), ('[true]'),('[[]]'), ('[{}]'), ('{"a": null}'), ('{"a": 0}'), ('{"a": ""}'), ('{"a": true}'), ('{"a": []}'), ('{"a": {}}') ) valsORDER BY 1; column1 ------------- [] null "" 0 true [null] [""] [0] [true] [[]] [{}] {} {"a": null} {"a": ""} {"a": 0} {"a": true} {"a": []}{"a": {}} (18 rows) -- Nikita Glukhov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Michael Paquier <michael.paquier@gmail.com> writes: > On Thu, Nov 10, 2016 at 7:37 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Given that nobody actually cares what that sort order is, I think that >> having to jump through hoops in pg_upgrade in order to fix it is not a >> great tradeoff. I suggest changing the documentation to match the code. > Yes, definitely. > So that's object > boolean > integer > string > NULL > array. No, because the issue is that empty and nonempty arrays sort differently. regression=# create table json_data (a jsonb); CREATE TABLE regression=# INSERT INTO json_data values ('{}'::jsonb), ('[]'::jsonb), regression-# ('null'::jsonb), ('true'::jsonb), ('1'::jsonb), ('""'::jsonb), regression-# ('[42]'::jsonb),('[[43]]'::jsonb); INSERT 0 8 regression=# SELECT * FROM json_data ORDER BY 1 DESC; a --------{}[[43]][42]true1""null[] (8 rows) > And attached is a patch. If we go with the fix-the-docs approach, we'll need to have two entries for empty and nonempty arrays. It's definitely ugly. Still, my judgement is that it's not worth the pain of changing the behavior. It was never intended that this sort order be anything but an implementation detail. (I guess another approach is to not document the order at all ...) regards, tom lane