Обсуждение: SQL/JSON: FOR ORDINALITY bug

Поиск
Список
Период
Сортировка

SQL/JSON: FOR ORDINALITY bug

От
Erik Rijkers
Дата:
Hi

I've copied some statements from the .pdf called:
"TECHNICAL REPORT ISO/IEC TR 19075-6   First edition 2017-03
Part SQL Notation support 6: (JSON) for JavaScript Object"
(not available anymore although there should be a similar replacement file)

In that pdf I found the data and statement (called 'table 15' in the 
.pdf) as in the attached bash file.  But the result is different: as 
implemented by 15devel, the column rowseq is always 1.  It seems to me 
that that is wrong; it should count 1, 2, 3 as indeed the example-result 
column in that pdf shows.

What do you think?

Erik Rijkers
Вложения

Re: SQL/JSON: FOR ORDINALITY bug

От
Andrew Dunstan
Дата:
On 2022-05-03 Tu 11:19, Erik Rijkers wrote:
> Hi
>
> I've copied some statements from the .pdf called:
> "TECHNICAL REPORT ISO/IEC TR 19075-6   First edition 2017-03
> Part SQL Notation support 6: (JSON) for JavaScript Object"
> (not available anymore although there should be a similar replacement
> file)
>
> In that pdf I found the data and statement (called 'table 15' in the
> .pdf) as in the attached bash file.  But the result is different: as
> implemented by 15devel, the column rowseq is always 1.  It seems to me
> that that is wrong; it should count 1, 2, 3 as indeed the
> example-result column in that pdf shows.
>
> What do you think?
>
>

Possibly. 


Here's what the standard says in section 7.11 in I think the relevant
bit of mindbogglingly impenetrable prose:


General Rules
1)
If a <table primary> simply contains a <JSON table primitive> JTP, then:
a) If the value of the <JSON context item> simply contained in the <JSON
API common syntax> is the null value, then the result of <JSON table
primitive> is an empty table and no further General Rules of this
Subclause are applied.
b) Let JACS be the <JSON API common syntax> simply contained in JTP.
c) Let JTEB be the <JSON table error behavior> simply contained in JTP.
d) The General Rules of Subclause 10.14, “<JSON API common syntax>”, are
applied with JACS as JSON API COMMON SYNTAX; let ROWST be the STATUS and
let ROWSEQ be the SQL/JSON SEQUENCE returned from the application of
those General Rules.
460
Foundation (SQL/Foundation)
e) Case:
i) If ROWST is an exception condition, then
Case:
1) If JTEB is ERROR, then the exception condition ROWST is raised.
2) Otherwise, the result of JTP is an empty table.
ii) Otherwise, let NI be the number of SQL/JSON items in ROWSEQ, let Ij,
1 (one) ≤ j ≤ NI, be those SQL/JSON items in order, let NCD be the
number of <JSON table primitive column definition>s contained in JTP,
and let JTCDi, 1 (one) ≤ i ≤ NCD, be those <JSON table primitive column
definition>s.
For all j, 1 (one) ≤ j ≤ NI, and for all i, 1 (one) ≤ i ≤ NCD, the value
of the i-th column of the j-th row in the result of JTP is determined as
follows:
Case:
1) If JTCDi is a <JSON table ordinality column definition>, then the
value of the i-th column
of the j-th row is j.


Maybe some language lawyer can turn that into comprehensible English.

This should probably be an open item for release 15, but I don't really
know what the precise behaviour should be, so it's hard to modify it.

If we can't get it right maybe we should disable the "WITH ORDINALITY"
clause, although that would be a pity.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: SQL/JSON: FOR ORDINALITY bug

От
"David G. Johnston"
Дата:
On Tue, May 3, 2022 at 5:27 PM Andrew Dunstan <andrew@dunslane.net> wrote:

On 2022-05-03 Tu 11:19, Erik Rijkers wrote:
> Hi
>
> I've copied some statements from the .pdf called:
> "TECHNICAL REPORT ISO/IEC TR 19075-6   First edition 2017-03
> Part SQL Notation support 6: (JSON) for JavaScript Object"
> (not available anymore although there should be a similar replacement
> file)
>
> In that pdf I found the data and statement (called 'table 15' in the
> .pdf) as in the attached bash file.  But the result is different: as
> implemented by 15devel, the column rowseq is always 1.  It seems to me
> that that is wrong; it should count 1, 2, 3 as indeed the
> example-result column in that pdf shows.
>
> What do you think?
>
>

Possibly. 


I don't see how rowseq can be anything but 1.  Each invocation of json_table is given a single jsonb record via the lateral reference to bookclub.jcol.  It produces one result, having a rowseq 1.  It does this for all three outer lateral reference tuples and thus produces three output rows each with one match numbered rowseq 1.

David J.

Re: SQL/JSON: FOR ORDINALITY bug

От
Andrew Dunstan
Дата:
On 2022-05-03 Tu 20:39, David G. Johnston wrote:
> On Tue, May 3, 2022 at 5:27 PM Andrew Dunstan <andrew@dunslane.net> wrote:
>
>
>     On 2022-05-03 Tu 11:19, Erik Rijkers wrote:
>     > Hi
>     >
>     > I've copied some statements from the .pdf called:
>     > "TECHNICAL REPORT ISO/IEC TR 19075-6   First edition 2017-03
>     > Part SQL Notation support 6: (JSON) for JavaScript Object"
>     > (not available anymore although there should be a similar
>     replacement
>     > file)
>     >
>     > In that pdf I found the data and statement (called 'table 15' in the
>     > .pdf) as in the attached bash file.  But the result is different: as
>     > implemented by 15devel, the column rowseq is always 1.  It seems
>     to me
>     > that that is wrong; it should count 1, 2, 3 as indeed the
>     > example-result column in that pdf shows.
>     >
>     > What do you think?
>     >
>     >
>
>     Possibly. 
>
>
> I don't see how rowseq can be anything but 1.  Each invocation of
> json_table is given a single jsonb record via the lateral reference to
> bookclub.jcol.  It produces one result, having a rowseq 1.  It does
> this for all three outer lateral reference tuples and thus produces
> three output rows each with one match numbered rowseq 1.
>

I imagine we could overcome that by stashing the sequence counter
somewhere it would survive across calls. The question really is what is
the right thing to do? I'm also a bit worried about how correct is
ordinal numbering with nested paths, e.g. (from the regression tests):


select
    jt.*
from
    jsonb_table_test jtt,
    json_table (
        jtt.js,'strict $[*]' as p
        columns (
            n for ordinality,
            a int path 'lax $.a' default -1 on empty,
            nested path 'strict $.b[*]' as pb columns ( b int path '$' ),
            nested path 'strict $.c[*]' as pc columns ( c int path '$' )
        )
    ) jt;
 n | a  | b | c  
---+----+---+----
 1 |  1 |   |   
 2 |  2 | 1 |   
 2 |  2 | 2 |   
 2 |  2 | 3 |   
 2 |  2 |   | 10
 2 |  2 |   |   
 2 |  2 |   | 20
 3 |  3 | 1 |   
 3 |  3 | 2 |   
 4 | -1 | 1 |   
 4 | -1 | 2 |   


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: SQL/JSON: FOR ORDINALITY bug

От
Erik Rijkers
Дата:
Op 04-05-2022 om 13:55 schreef Andrew Dunstan:
> 
> On 2022-05-03 Tu 20:39, David G. Johnston wrote:
>> On Tue, May 3, 2022 at 5:27 PM Andrew Dunstan <andrew@dunslane.net> wrote:
>>
>>
>>      On 2022-05-03 Tu 11:19, Erik Rijkers wrote:
>>      > Hi
>>      >
>>      > I've copied some statements from the .pdf called:
>>      > "TECHNICAL REPORT ISO/IEC TR 19075-6   First edition 2017-03
>>      > Part SQL Notation support 6: (JSON) for JavaScript Object"
>>      > (not available anymore although there should be a similar
>>      replacement
>>      > file)
>>      >
>>      > In that pdf I found the data and statement (called 'table 15' in the
>>      > .pdf) as in the attached bash file.  But the result is different: as
>>      > implemented by 15devel, the column rowseq is always 1.  It seems
>>      to me
>>      > that that is wrong; it should count 1, 2, 3 as indeed the
>>      > example-result column in that pdf shows.
>>      >
>>      > What do you think?
>>      >
>>      >
>>
>>      Possibly.
>>
>>
>> I don't see how rowseq can be anything but 1.  Each invocation of


After some further experimentation, I now think you must be right, David.

Also, looking at the DB2 docs:
   https://www.ibm.com/docs/en/i/7.2?topic=data-using-json-table
     (see especially under 'Handling nested information')

There, I gathered some example data + statements where one is the case 
at hand.  I also made them runnable under postgres (attached).

I thought that was an instructive example, with those 'outer_ordinality' 
and 'inner_ordinality' columns.

Erik


>> json_table is given a single jsonb record via the lateral reference to
>> bookclub.jcol.  It produces one result, having a rowseq 1.  It does
>> this for all three outer lateral reference tuples and thus produces
>> three output rows each with one match numbered rowseq 1.
>>
> 
> I imagine we could overcome that by stashing the sequence counter
> somewhere it would survive across calls. The question really is what is
> the right thing to do? I'm also a bit worried about how correct is
> ordinal numbering with nested paths, e.g. (from the regression tests):
> 
> 
> select
>      jt.*
> from
>      jsonb_table_test jtt,
>      json_table (
>          jtt.js,'strict $[*]' as p
>          columns (
>              n for ordinality,
>              a int path 'lax $.a' default -1 on empty,
>              nested path 'strict $.b[*]' as pb columns ( b int path '$' ),
>              nested path 'strict $.c[*]' as pc columns ( c int path '$' )
>          )
>      ) jt;
>   n | a  | b | c
> ---+----+---+----
>   1 |  1 |   |
>   2 |  2 | 1 |
>   2 |  2 | 2 |
>   2 |  2 | 3 |
>   2 |  2 |   | 10
>   2 |  2 |   |
>   2 |  2 |   | 20
>   3 |  3 | 1 |
>   3 |  3 | 2 |
>   4 | -1 | 1 |
>   4 | -1 | 2 |
> 
> 
> cheers
> 
> 
> andrew
> 
> --
> Andrew Dunstan
> EDB: https://www.enterprisedb.com
>
Вложения

Re: SQL/JSON: FOR ORDINALITY bug

От
Andrew Dunstan
Дата:
On 2022-05-04 We 10:39, Erik Rijkers wrote:
> Op 04-05-2022 om 13:55 schreef Andrew Dunstan:
>>
>> On 2022-05-03 Tu 20:39, David G. Johnston wrote:
>>> On Tue, May 3, 2022 at 5:27 PM Andrew Dunstan <andrew@dunslane.net>
>>> wrote:
>>>
>>>
>>>      On 2022-05-03 Tu 11:19, Erik Rijkers wrote:
>>>      > Hi
>>>      >
>>>      > I've copied some statements from the .pdf called:
>>>      > "TECHNICAL REPORT ISO/IEC TR 19075-6   First edition 2017-03
>>>      > Part SQL Notation support 6: (JSON) for JavaScript Object"
>>>      > (not available anymore although there should be a similar
>>>      replacement
>>>      > file)
>>>      >
>>>      > In that pdf I found the data and statement (called 'table 15'
>>> in the
>>>      > .pdf) as in the attached bash file.  But the result is
>>> different: as
>>>      > implemented by 15devel, the column rowseq is always 1.  It seems
>>>      to me
>>>      > that that is wrong; it should count 1, 2, 3 as indeed the
>>>      > example-result column in that pdf shows.
>>>      >
>>>      > What do you think?
>>>      >
>>>      >
>>>
>>>      Possibly.
>>>
>>>
>>> I don't see how rowseq can be anything but 1.  Each invocation of
>
>
> After some further experimentation, I now think you must be right, David.
>
> Also, looking at the DB2 docs:
>   https://www.ibm.com/docs/en/i/7.2?topic=data-using-json-table
>     (see especially under 'Handling nested information')
>
> There, I gathered some example data + statements where one is the case
> at hand.  I also made them runnable under postgres (attached).
>
> I thought that was an instructive example, with those
> 'outer_ordinality' and 'inner_ordinality' columns.
>
>

Yeah, I just reviewed the latest version of that page (7.5) and the
example seems fairly plain that we are doing the right thing, or if not
we're in pretty good company, so I guess this is probably a false alarm.
Looks like ordinality is for the number of the element produced by the
path expression. So a path of 'lax $' should just produce ordinality of
1 in each case, while a path of 'lax $[*]' will produce increasing
ordinality for each element of the root array.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: SQL/JSON: FOR ORDINALITY bug

От
Erik Rijkers
Дата:
Op 04-05-2022 om 21:12 schreef Andrew Dunstan:
> 
>>>>
>>>> I don't see how rowseq can be anything but 1.  Each invocation of
>>
>>
>> After some further experimentation, I now think you must be right, David.
>>
>> Also, looking at the DB2 docs:
>>    https://www.ibm.com/docs/en/i/7.2?topic=data-using-json-table
>>      (see especially under 'Handling nested information')
>>
>> There, I gathered some example data + statements where one is the case
>> at hand.  I also made them runnable under postgres (attached).
>>
>> I thought that was an instructive example, with those
>> 'outer_ordinality' and 'inner_ordinality' columns.
>>
>>
> 
> Yeah, I just reviewed the latest version of that page (7.5) and the
> example seems fairly plain that we are doing the right thing, or if not
> we're in pretty good company, so I guess this is probably a false alarm.
> Looks like ordinality is for the number of the element produced by the
> path expression. So a path of 'lax $' should just produce ordinality of
> 1 in each case, while a path of 'lax $[*]' will produce increasing
> ordinality for each element of the root array.

Agreed.

You've probably noticed then that on that same page under 'Sibling 
Nesting' is a statement that gives a 13-row resultset on DB2 whereas in 
15devel that statement yields just 10 rows.  I don't know which is correct.


Erik


> 
> 
> cheers
> 
> 
> andrew
> 
> --
> Andrew Dunstan
> EDB: https://www.enterprisedb.com
> 



Re: SQL/JSON: FOR ORDINALITY bug

От
"David G. Johnston"
Дата:
On Wed, May 4, 2022 at 1:09 PM Erik Rijkers <er@xs4all.nl> wrote:
Op 04-05-2022 om 21:12 schreef Andrew Dunstan:
>
>>>>
>>>> I don't see how rowseq can be anything but 1.  Each invocation of
>>
>>
>> After some further experimentation, I now think you must be right, David.
>>
>> Also, looking at the DB2 docs:
>>    https://www.ibm.com/docs/en/i/7.2?topic=data-using-json-table
>>      (see especially under 'Handling nested information')
>>
>> There, I gathered some example data + statements where one is the case
>> at hand.  I also made them runnable under postgres (attached).
>>
>> I thought that was an instructive example, with those
>> 'outer_ordinality' and 'inner_ordinality' columns.
>>
>>
>
> Yeah, I just reviewed the latest version of that page (7.5) and the
> example seems fairly plain that we are doing the right thing, or if not
> we're in pretty good company, so I guess this is probably a false alarm.
> Looks like ordinality is for the number of the element produced by the
> path expression. So a path of 'lax $' should just produce ordinality of
> 1 in each case, while a path of 'lax $[*]' will produce increasing
> ordinality for each element of the root array.

Agreed.

You've probably noticed then that on that same page under 'Sibling
Nesting' is a statement that gives a 13-row resultset on DB2 whereas in
15devel that statement yields just 10 rows.  I don't know which is correct.


There should be 12 results (minimum would be 8 - 5 of which are used for real matches, plus 4 new row producing matches).

Our result seems internally inconsistent; conceptually there are two kinds of nulls here and we cannot collapse them.

null-val: we are outputting the record from the nested path but there is no actual value to output so we output null-val
null-union: we are not outputting the record for the nested path (we are doing a different one) but we need to output something for this column so we output null-union.

Sally, null-val, null-union
Sally, null-union, null-val

We only have one Sally but need both (11)

We are also missing:

Mary, null-union, null-val (12)

The fact that we agree on John means that we at least agree on UNION meaning we output a pair of rows when there are two nested paths.

I point to relative comparisons for fear of reading the specification here...

David J.


David J.

Re: SQL/JSON: FOR ORDINALITY bug

От
"David G. Johnston"
Дата:
On Wed, May 4, 2022 at 1:43 PM David G. Johnston <david.g.johnston@gmail.com> wrote:
On Wed, May 4, 2022 at 1:09 PM Erik Rijkers <er@xs4all.nl> wrote:
Op 04-05-2022 om 21:12 schreef Andrew Dunstan:
>
>>>>
>>>> I don't see how rowseq can be anything but 1.  Each invocation of
>>
>>
>> After some further experimentation, I now think you must be right, David.
>>
>> Also, looking at the DB2 docs:
>>    https://www.ibm.com/docs/en/i/7.2?topic=data-using-json-table
>>      (see especially under 'Handling nested information')

You've probably noticed then that on that same page under 'Sibling
Nesting' is a statement that gives a 13-row resultset on DB2 whereas in
15devel that statement yields just 10 rows.  I don't know which is correct.


There should be 12 results (minimum would be 8 - 5 of which are used for real matches, plus 4 new row producing matches).

Our result seems internally inconsistent; conceptually there are two kinds of nulls here and we cannot collapse them.
 
null-val: we are outputting the record from the nested path but there is no actual value to output so we output null-val
null-union: we are not outputting the record for the nested path (we are doing a different one) but we need to output something for this column so we output null-union.


Thinking this over - I think the difference is we implemented a FULL OUTER JOIN to combine the siblings - including the behavior of that construct and the absence of rows.  DB2 took the word "UNION" for the plan modifier literally and unioned (actually union all) the two subpaths together using the null concepts above (though somehow ensuring that at least one row was produced from each subpath...).

Thus we are indeed back to seeing whether the standard defines sibling combining as union or join, or some other special construct.  I'm now leaning toward what we've done as at least being the more sane option.

Even if our outer join process is correct the existing wording is odd.

"Use FULL OUTER JOIN ON FALSE, so that both parent and child rows are included into the output, with NULL values inserted into both child and parent columns for all missing values."

I don't think it helps to mention parent here.  This aspect of plan doesn't concern itself with the final output, only the output of the subplan which is then combined with the parent using a join.  I would probably want to phrase the default more like:

"This is the default option for joining the combined child rows to the parent."

David J.

Re: SQL/JSON: FOR ORDINALITY bug

От
Andrew Dunstan
Дата:
On 2022-05-04 We 16:09, Erik Rijkers wrote:
> Op 04-05-2022 om 21:12 schreef Andrew Dunstan:
>>
>>>>>
>>>>> I don't see how rowseq can be anything but 1.  Each invocation of
>>>
>>>
>>> After some further experimentation, I now think you must be right,
>>> David.
>>>
>>> Also, looking at the DB2 docs:
>>>    https://www.ibm.com/docs/en/i/7.2?topic=data-using-json-table
>>>      (see especially under 'Handling nested information')
>>>
>>> There, I gathered some example data + statements where one is the case
>>> at hand.  I also made them runnable under postgres (attached).
>>>
>>> I thought that was an instructive example, with those
>>> 'outer_ordinality' and 'inner_ordinality' columns.
>>>
>>>
>>
>> Yeah, I just reviewed the latest version of that page (7.5) and the
>> example seems fairly plain that we are doing the right thing, or if not
>> we're in pretty good company, so I guess this is probably a false alarm.
>> Looks like ordinality is for the number of the element produced by the
>> path expression. So a path of 'lax $' should just produce ordinality of
>> 1 in each case, while a path of 'lax $[*]' will produce increasing
>> ordinality for each element of the root array.
>
> Agreed.
>
> You've probably noticed then that on that same page under 'Sibling
> Nesting' is a statement that gives a 13-row resultset on DB2 whereas
> in 15devel that statement yields just 10 rows.  I don't know which is
> correct.


Oracle also gives 10 rows for that query according to my testing, so I
suspect either DB2 and/or its docs are wrong.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com