Re: remaining sql/json patches

Поиск
Список
Период
Сортировка
От Amit Langote
Тема Re: remaining sql/json patches
Дата
Msg-id CA+HiwqET8gAhoCqNx_ueELrs4UevhRMNOYSmq=WykOu9THn=og@mail.gmail.com
обсуждение исходный текст
Ответ на Re: remaining sql/json patches  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Ответы Re: remaining sql/json patches  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Re: remaining sql/json patches  (jian he <jian.universality@gmail.com>)
Re: remaining sql/json patches  (Peter Eisentraut <peter@eisentraut.org>)
Список pgsql-hackers
Thanks Alvaro.

On Wed, Dec 6, 2023 at 7:43 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> On 2023-Dec-05, Amit Langote wrote:
>
> > I've attempted to trim down the JSON_TABLE grammar (0004), but this is
> > all I've managed so far.  Among other things, I couldn't refactor the
> > grammar to do away with the following:
> >
> > +%nonassoc  NESTED
> > +%left      PATH
>
> To recap, the reason we're arguing about this is that this creates two
> new precedence classes, which are higher than everything else.  Judging
> by the discussios in thread [1], this is not acceptable.  Without either
> those new classes or the two hacks I describe below, the grammar has the
> following shift/reduce conflict:
>
> State 6220
>
>   2331 json_table_column_definition: NESTED . path_opt Sconst COLUMNS '(' json_table_column_definition_list ')'
>   2332                             | NESTED . path_opt Sconst AS name COLUMNS '(' json_table_column_definition_list
')'
>   2636 unreserved_keyword: NESTED .
>
>     PATH  shift, and go to state 6286
>
>     SCONST    reduce using rule 2336 (path_opt)
>     PATH      [reduce using rule 2636 (unreserved_keyword)]
>     $default  reduce using rule 2636 (unreserved_keyword)
>
>     path_opt  go to state 6287
>
>
>
> First, while the grammar uses "NESTED path_opt" in the relevant productions, I
> noticed that there's no test that uses NESTED without PATH, so if we break that
> case, we won't notice.  I propose we remove the PATH keyword from one of
> the tests in jsonb_sqljson.sql in order to make sure the grammar
> continues to work after whatever hacking we do:
>
> diff --git a/src/test/regress/expected/jsonb_sqljson.out b/src/test/regress/expected/jsonb_sqljson.out
> index 7e8ae6a696..8fd2385cdc 100644
> --- a/src/test/regress/expected/jsonb_sqljson.out
> +++ b/src/test/regress/expected/jsonb_sqljson.out
> @@ -1548,7 +1548,7 @@ HINT:  JSON_TABLE column names must be distinct from one another.
>  SELECT * FROM JSON_TABLE(
>         jsonb 'null', '$[*]' AS p0
>         COLUMNS (
> -               NESTED PATH '$' AS p1 COLUMNS (
> +               NESTED '$' AS p1 COLUMNS (
>                         NESTED PATH '$' AS p11 COLUMNS ( foo int ),
>                         NESTED PATH '$' AS p12 COLUMNS ( bar int )
>                 ),
> diff --git a/src/test/regress/sql/jsonb_sqljson.sql b/src/test/regress/sql/jsonb_sqljson.sql
> index ea5db88b40..ea9b4ff8b6 100644
> --- a/src/test/regress/sql/jsonb_sqljson.sql
> +++ b/src/test/regress/sql/jsonb_sqljson.sql
> @@ -617,7 +617,7 @@ SELECT * FROM JSON_TABLE(
>  SELECT * FROM JSON_TABLE(
>         jsonb 'null', '$[*]' AS p0
>         COLUMNS (
> -               NESTED PATH '$' AS p1 COLUMNS (
> +               NESTED '$' AS p1 COLUMNS (
>                         NESTED PATH '$' AS p11 COLUMNS ( foo int ),
>                         NESTED PATH '$' AS p12 COLUMNS ( bar int )
>                 ),

Fixed the test case like that in the attached.

> Having done that, AFAICS there are two possible fixes for the grammar.
> One is to keep the idea of assigning precedence explicitly to these
> keywords, but do something less hackish -- we can put NESTED together
> with UNBOUNDED, and classify PATH in the IDENT group.  This requires no
> further changes.  This would make NESTED PATH follow the same rationale
> as UNBOUNDED FOLLOWING / UNBOUNDED PRECEDING.  Here's is a preliminary
> patch for that (the large comment above needs to be updated.)
>
> diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
> index c15fcf2eb2..1493ac7580 100644
> --- a/src/backend/parser/gram.y
> +++ b/src/backend/parser/gram.y
> @@ -887,9 +887,9 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
>   * json_predicate_type_constraint and json_key_uniqueness_constraint_opt
>   * productions (see comments there).
>   */
> -%nonassoc      UNBOUNDED               /* ideally would have same precedence as IDENT */
> +%nonassoc      UNBOUNDED NESTED                /* ideally would have same precedence as IDENT */
>  %nonassoc      IDENT PARTITION RANGE ROWS GROUPS PRECEDING FOLLOWING CUBE ROLLUP
> -                       SET KEYS OBJECT_P SCALAR VALUE_P WITH WITHOUT
> +                       SET KEYS OBJECT_P SCALAR VALUE_P WITH WITHOUT PATH
>  %left          Op OPERATOR             /* multi-character ops and user-defined operators */
>  %left          '+' '-'
>  %left          '*' '/' '%'
> @@ -911,8 +911,6 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
>   */
>  %left          JOIN CROSS LEFT FULL RIGHT INNER_P NATURAL
>
> -%nonassoc      NESTED
> -%left          PATH
>  %%
>
>  /*
>
>
> The other thing we can do is use the two-token lookahead trick, by
> declaring
> %token NESTED_LA
> and using the parser.c code to replace NESTED with NESTED_LA when it is
> followed by PATH.  This doesn't require assigning precedence to
> anything.  We do need to expand the two rules that have "NESTED
> opt_path Sconst" to each be two rules, one for "NESTED_LA PATH Sconst"
> and another for "NESTED Sconst".  So the opt_path production goes away.
> This preliminary patch does that. (I did not touch the ecpg grammar, but
> it needs an update too.)
>
> diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
> index c15fcf2eb2..8e4c1d4ebe 100644
> --- a/src/backend/parser/gram.y
> +++ b/src/backend/parser/gram.y
> @@ -817,7 +817,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
>   * FORMAT_LA, NULLS_LA, WITH_LA, and WITHOUT_LA are needed to make the grammar
>   * LALR(1).
>   */
> -%token         FORMAT_LA NOT_LA NULLS_LA WITH_LA WITHOUT_LA
> +%token         FORMAT_LA NESTED_LA NOT_LA NULLS_LA WITH_LA WITHOUT_LA
>
>  /*
>   * The grammar likewise thinks these tokens are keywords, but they are never
> @@ -911,8 +911,6 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
>   */
>  %left          JOIN CROSS LEFT FULL RIGHT INNER_P NATURAL
>
> -%nonassoc      NESTED
> -%left          PATH
>  %%
>
>  /*
> @@ -16771,7 +16769,7 @@ json_table_column_definition:
>                                         n->location = @1;
>                                         $$ = (Node *) n;
>                                 }
> -                       | NESTED path_opt Sconst
> +                       | NESTED_LA PATH Sconst
>                                 COLUMNS '('     json_table_column_definition_list ')'
>                                 {
>                                         JsonTableColumn *n = makeNode(JsonTableColumn);
> @@ -16783,7 +16781,19 @@ json_table_column_definition:
>                                         n->location = @1;
>                                         $$ = (Node *) n;
>                                 }
> -                       | NESTED path_opt Sconst AS name
> +                       | NESTED Sconst
> +                               COLUMNS '('     json_table_column_definition_list ')'
> +                               {
> +                                       JsonTableColumn *n = makeNode(JsonTableColumn);
> +
> +                                       n->coltype = JTC_NESTED;
> +                                       n->pathspec = $2;
> +                                       n->pathname = NULL;
> +                                       n->columns = $5;
> +                                       n->location = @1;
> +                                       $$ = (Node *) n;
> +                               }
> +                       | NESTED_LA PATH Sconst AS name
>                                 COLUMNS '('     json_table_column_definition_list ')'
>                                 {
>                                         JsonTableColumn *n = makeNode(JsonTableColumn);
> @@ -16795,6 +16805,19 @@ json_table_column_definition:
>                                         n->location = @1;
>                                         $$ = (Node *) n;
>                                 }
> +                       | NESTED Sconst AS name
> +                               COLUMNS '('     json_table_column_definition_list ')'
> +                               {
> +                                       JsonTableColumn *n = makeNode(JsonTableColumn);
> +
> +                                       n->coltype = JTC_NESTED;
> +                                       n->pathspec = $2;
> +                                       n->pathname = $4;
> +                                       n->columns = $7;
> +                                       n->location = @1;
> +                                       $$ = (Node *) n;
> +                               }
> +
>                 ;
>
>  json_table_column_path_specification_clause_opt:
> @@ -16802,11 +16825,6 @@ json_table_column_path_specification_clause_opt:
>                         | /* EMPTY */                                                   { $$ = NULL; }
>                 ;
>
> -path_opt:
> -                       PATH                                                            { }
> -                       | /* EMPTY */                                           { }
> -               ;
> -
>  json_table_plan_clause_opt:
>                         PLAN '(' json_table_plan ')'                    { $$ = $3; }
>                         | PLAN DEFAULT '(' json_table_default_plan_choices ')'
> diff --git a/src/backend/parser/parser.c b/src/backend/parser/parser.c
> index e17c310cc1..e3092f2c3e 100644
> --- a/src/backend/parser/parser.c
> +++ b/src/backend/parser/parser.c
> @@ -138,6 +138,7 @@ base_yylex(YYSTYPE *lvalp, YYLTYPE *llocp, core_yyscan_t yyscanner)
>         switch (cur_token)
>         {
>                 case FORMAT:
> +               case NESTED:
>                         cur_token_length = 6;
>                         break;
>                 case NOT:
> @@ -204,6 +205,16 @@ base_yylex(YYSTYPE *lvalp, YYLTYPE *llocp, core_yyscan_t yyscanner)
>                         }
>                         break;
>
> +               case NESTED:
> +                       /* Replace NESTED by NESTED_LA if it's followed by PATH */
> +                       switch (next_token)
> +                       {
> +                               case PATH:
> +                                       cur_token = NESTED_LA;
> +                                       break;
> +                       }
> +                       break;
> +
>                 case NOT:
>                         /* Replace NOT by NOT_LA if it's followed by BETWEEN, IN, etc */
>                         switch (next_token)
>
>
> I don't know which of the two "fixes" is less bad.

I think I'm inclined toward adapting the LA-token fix (attached 0005),
because we've done that before with SQL/JSON constructors patch.
Also, if I understand the concerns that Tom mentioned at [1]
correctly, maybe we'd be better off not assigning precedence to
symbols as much as possible, so there's that too against the approach
#1.

Also I've attached 0006 to add news tests under ECPG for the SQL/JSON
query functions, which I haven't done so far but realized after you
mentioned ECPG.  It also includes the ECPG variant of the LA-token
fix.  I'll eventually merge it into 0003 and 0004 after expanding the
test cases some more.  I do wonder what kinds of tests we normally add
to ECPG suite but not others?

Finally, I also fixed a couple of silly mistakes in 0003 around
transformJsonBehavior() and some further assorted tightening in the ON
ERROR / EMPTY expression coercion handling code.

--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tomas Vondra
Дата:
Сообщение: Re: logical decoding and replication of sequences, take 2
Следующее
От: Junwang Zhao
Дата:
Сообщение: Re: Make COPY format extendable: Extract COPY TO format implementations