Re: remaining sql/json patches

Поиск
Список
Период
Сортировка
От Alvaro Herrera
Тема Re: remaining sql/json patches
Дата
Msg-id 202312061042.uxymt2swy5fz@alvherre.pgsql
обсуждение исходный текст
Ответ на Re: remaining sql/json patches  (Amit Langote <amitlangote09@gmail.com>)
Ответы Re: remaining sql/json patches  (Amit Langote <amitlangote09@gmail.com>)
Список pgsql-hackers
On 2023-Dec-05, Amit Langote wrote:

> I've attempted to trim down the JSON_TABLE grammar (0004), but this is
> all I've managed so far.  Among other things, I couldn't refactor the
> grammar to do away with the following:
> 
> +%nonassoc  NESTED
> +%left      PATH

To recap, the reason we're arguing about this is that this creates two
new precedence classes, which are higher than everything else.  Judging
by the discussios in thread [1], this is not acceptable.  Without either
those new classes or the two hacks I describe below, the grammar has the
following shift/reduce conflict:

State 6220

  2331 json_table_column_definition: NESTED . path_opt Sconst COLUMNS '(' json_table_column_definition_list ')'
  2332                             | NESTED . path_opt Sconst AS name COLUMNS '(' json_table_column_definition_list
')'
  2636 unreserved_keyword: NESTED .

    PATH  shift, and go to state 6286

    SCONST    reduce using rule 2336 (path_opt)
    PATH      [reduce using rule 2636 (unreserved_keyword)]
    $default  reduce using rule 2636 (unreserved_keyword)

    path_opt  go to state 6287



First, while the grammar uses "NESTED path_opt" in the relevant productions, I
noticed that there's no test that uses NESTED without PATH, so if we break that
case, we won't notice.  I propose we remove the PATH keyword from one of
the tests in jsonb_sqljson.sql in order to make sure the grammar
continues to work after whatever hacking we do:

diff --git a/src/test/regress/expected/jsonb_sqljson.out b/src/test/regress/expected/jsonb_sqljson.out
index 7e8ae6a696..8fd2385cdc 100644
--- a/src/test/regress/expected/jsonb_sqljson.out
+++ b/src/test/regress/expected/jsonb_sqljson.out
@@ -1548,7 +1548,7 @@ HINT:  JSON_TABLE column names must be distinct from one another.
 SELECT * FROM JSON_TABLE(
     jsonb 'null', '$[*]' AS p0
     COLUMNS (
-        NESTED PATH '$' AS p1 COLUMNS (
+        NESTED '$' AS p1 COLUMNS (
             NESTED PATH '$' AS p11 COLUMNS ( foo int ),
             NESTED PATH '$' AS p12 COLUMNS ( bar int )
         ),
diff --git a/src/test/regress/sql/jsonb_sqljson.sql b/src/test/regress/sql/jsonb_sqljson.sql
index ea5db88b40..ea9b4ff8b6 100644
--- a/src/test/regress/sql/jsonb_sqljson.sql
+++ b/src/test/regress/sql/jsonb_sqljson.sql
@@ -617,7 +617,7 @@ SELECT * FROM JSON_TABLE(
 SELECT * FROM JSON_TABLE(
     jsonb 'null', '$[*]' AS p0
     COLUMNS (
-        NESTED PATH '$' AS p1 COLUMNS (
+        NESTED '$' AS p1 COLUMNS (
             NESTED PATH '$' AS p11 COLUMNS ( foo int ),
             NESTED PATH '$' AS p12 COLUMNS ( bar int )
         ),


Having done that, AFAICS there are two possible fixes for the grammar.
One is to keep the idea of assigning precedence explicitly to these
keywords, but do something less hackish -- we can put NESTED together
with UNBOUNDED, and classify PATH in the IDENT group.  This requires no
further changes.  This would make NESTED PATH follow the same rationale
as UNBOUNDED FOLLOWING / UNBOUNDED PRECEDING.  Here's is a preliminary
patch for that (the large comment above needs to be updated.)

diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c15fcf2eb2..1493ac7580 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -887,9 +887,9 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
  * json_predicate_type_constraint and json_key_uniqueness_constraint_opt
  * productions (see comments there).
  */
-%nonassoc    UNBOUNDED        /* ideally would have same precedence as IDENT */
+%nonassoc    UNBOUNDED NESTED        /* ideally would have same precedence as IDENT */
 %nonassoc    IDENT PARTITION RANGE ROWS GROUPS PRECEDING FOLLOWING CUBE ROLLUP
-            SET KEYS OBJECT_P SCALAR VALUE_P WITH WITHOUT
+            SET KEYS OBJECT_P SCALAR VALUE_P WITH WITHOUT PATH
 %left        Op OPERATOR        /* multi-character ops and user-defined operators */
 %left        '+' '-'
 %left        '*' '/' '%'
@@ -911,8 +911,6 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
  */
 %left        JOIN CROSS LEFT FULL RIGHT INNER_P NATURAL
 
-%nonassoc    NESTED
-%left        PATH
 %%
 
 /*


The other thing we can do is use the two-token lookahead trick, by
declaring
%token NESTED_LA
and using the parser.c code to replace NESTED with NESTED_LA when it is
followed by PATH.  This doesn't require assigning precedence to
anything.  We do need to expand the two rules that have "NESTED
opt_path Sconst" to each be two rules, one for "NESTED_LA PATH Sconst"
and another for "NESTED Sconst".  So the opt_path production goes away.
This preliminary patch does that. (I did not touch the ecpg grammar, but
it needs an update too.)

diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c15fcf2eb2..8e4c1d4ebe 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -817,7 +817,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
  * FORMAT_LA, NULLS_LA, WITH_LA, and WITHOUT_LA are needed to make the grammar
  * LALR(1).
  */
-%token        FORMAT_LA NOT_LA NULLS_LA WITH_LA WITHOUT_LA
+%token        FORMAT_LA NESTED_LA NOT_LA NULLS_LA WITH_LA WITHOUT_LA
 
 /*
  * The grammar likewise thinks these tokens are keywords, but they are never
@@ -911,8 +911,6 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
  */
 %left        JOIN CROSS LEFT FULL RIGHT INNER_P NATURAL
 
-%nonassoc    NESTED
-%left        PATH
 %%
 
 /*
@@ -16771,7 +16769,7 @@ json_table_column_definition:
                     n->location = @1;
                     $$ = (Node *) n;
                 }
-            | NESTED path_opt Sconst
+            | NESTED_LA PATH Sconst
                 COLUMNS '('    json_table_column_definition_list ')'
                 {
                     JsonTableColumn *n = makeNode(JsonTableColumn);
@@ -16783,7 +16781,19 @@ json_table_column_definition:
                     n->location = @1;
                     $$ = (Node *) n;
                 }
-            | NESTED path_opt Sconst AS name
+            | NESTED Sconst
+                COLUMNS '('    json_table_column_definition_list ')'
+                {
+                    JsonTableColumn *n = makeNode(JsonTableColumn);
+
+                    n->coltype = JTC_NESTED;
+                    n->pathspec = $2;
+                    n->pathname = NULL;
+                    n->columns = $5;
+                    n->location = @1;
+                    $$ = (Node *) n;
+                }
+            | NESTED_LA PATH Sconst AS name
                 COLUMNS '('    json_table_column_definition_list ')'
                 {
                     JsonTableColumn *n = makeNode(JsonTableColumn);
@@ -16795,6 +16805,19 @@ json_table_column_definition:
                     n->location = @1;
                     $$ = (Node *) n;
                 }
+            | NESTED Sconst AS name
+                COLUMNS '('    json_table_column_definition_list ')'
+                {
+                    JsonTableColumn *n = makeNode(JsonTableColumn);
+
+                    n->coltype = JTC_NESTED;
+                    n->pathspec = $2;
+                    n->pathname = $4;
+                    n->columns = $7;
+                    n->location = @1;
+                    $$ = (Node *) n;
+                }
+
         ;
 
 json_table_column_path_specification_clause_opt:
@@ -16802,11 +16825,6 @@ json_table_column_path_specification_clause_opt:
             | /* EMPTY */                            { $$ = NULL; }
         ;
 
-path_opt:
-            PATH                                { }
-            | /* EMPTY */                        { }
-        ;
-
 json_table_plan_clause_opt:
             PLAN '(' json_table_plan ')'            { $$ = $3; }
             | PLAN DEFAULT '(' json_table_default_plan_choices ')'
diff --git a/src/backend/parser/parser.c b/src/backend/parser/parser.c
index e17c310cc1..e3092f2c3e 100644
--- a/src/backend/parser/parser.c
+++ b/src/backend/parser/parser.c
@@ -138,6 +138,7 @@ base_yylex(YYSTYPE *lvalp, YYLTYPE *llocp, core_yyscan_t yyscanner)
     switch (cur_token)
     {
         case FORMAT:
+        case NESTED:
             cur_token_length = 6;
             break;
         case NOT:
@@ -204,6 +205,16 @@ base_yylex(YYSTYPE *lvalp, YYLTYPE *llocp, core_yyscan_t yyscanner)
             }
             break;
 
+        case NESTED:
+            /* Replace NESTED by NESTED_LA if it's followed by PATH */
+            switch (next_token)
+            {
+                case PATH:
+                    cur_token = NESTED_LA;
+                    break;
+            }
+            break;
+
         case NOT:
             /* Replace NOT by NOT_LA if it's followed by BETWEEN, IN, etc */
             switch (next_token)


I don't know which of the two "fixes" is less bad.  Like Amit, I was not
able to find a solution to the problem by merely attaching precedences
to rules.  (I did not try to mess with the precedence of
unreserved_keyword, because I'm pretty sure that would not be a good
solution even if I could make it work.)

[1] https://postgr.es/m/CADT4RqBPdbsZW7HS1jJP319TMRHs1hzUiP=iRJYR6UqgHCrgNQ@mail.gmail.com

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Kapila
Дата:
Сообщение: Re: logical decoding and replication of sequences, take 2
Следующее
От: jian he
Дата:
Сообщение: Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)