Re: jsonpath

From	Alexander Korotkov
Subject	Re: jsonpath
Date	March 22, 2019, 12:14:10
Msg-id	CAPpHfdu6meyXZtyWL6vw0P54tJO8VCUL5my5qcfi29C4DDvACQ@mail.gmail.com
In response to	Re: jsonpath (John Naylor <john.naylor@2ndquadrant.com>)
Responses	Re: jsonpath
List	pgsql-hackers

On Fri, Mar 22, 2019 at 5:38 AM John Naylor <john.naylor@2ndquadrant.com> wrote:
> On Thu, Mar 21, 2019 at 9:59 PM Alexander Korotkov
> <a.korotkov@postgrespro.ru> wrote:
> > Attached are patches improving the jsonpath parser.  The first one
> > introduces cosmetic changes, while the second gets rid of backtracking.
> > I'm also planning to add a high-level comment for both grammar and lexer.
>
> The cosmetic changes look good to me. I just noticed a couple things
> about the comments.
>
> 0001:
>
> +/* Check if current scanstring value constitutes a keyword */
>
> 'is a keyword' is better. 'Constitutes' implies parts of a whole.
>
> + * Resize scanstring for appending of given length.  Reinitilize if required.
>
> s/Reinitilize/Reinitialize/
>
> The first sentence is not entirely clear to me.

Thank you, fixed.

> 0002:
>
> These two rules are not strictly necessary:
>
> <xnq,xq,xvq,xsq>{unicode}+\\ {
> /* throw back the \\, and treat as unicode */
> yyless(yyleng - 1);
> parseUnicode(yytext, yyleng);
> }
>
> <xnq,xq,xvq,xsq>{hex_char}+\\ {
> /* throw back the \\, and treat as hex */
> yyless(yyleng - 1);
> parseHexChars(yytext, yyleng);
> }
>
> ...and only seem to be there because of how these are written:
>
> <xnq,xq,xvq,xsq>{unicode}+ { parseUnicode(yytext, yyleng); }
> <xnq,xq,xvq,xsq>{hex_char}+ { parseHexChars(yytext, yyleng); }
> <xnq,xq,xvq,xsq>{unicode}*{unicodefail} { yyerror(NULL, "Unicode sequence is invalid"); }
> <xnq,xq,xvq,xsq>{hex_char}*{hex_fail} { yyerror(NULL, "Hex character sequence is invalid"); }
>
> I don't understand the reasoning here -- is it a micro-optimization?
> The following is simpler, allows the rules I mentioned to be removed,
> and make check still passes. I would prefer it unless there is a
> performance penalty, in which case a comment to describe the
> additional complexity would be helpful.
>
> <xnq,xq,xvq,xsq>{unicode} { parseUnicode(yytext, yyleng); }
> <xnq,xq,xvq,xsq>{hex_char} { parseHexChars(yytext, yyleng); }
> <xnq,xq,xvq,xsq>{unicodefail} { yyerror(NULL, "Unicode sequence is invalid"); }
> <xnq,xq,xvq,xsq>{hex_fail} { yyerror(NULL, "Hex character sequence is invalid"); }

These rules are needed for Unicode.  Consecutive escaped Unicode
characters may be tied together by a high surrogate value, which has
to be combined with the escape that follows it.  See the
jsonpath_encoding regression test in the attached patch.
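
To illustrate why two adjacent escapes cannot always be handled in
isolation, here is a minimal standalone C sketch of how a UTF-16
surrogate pair maps to a single code point.  The function names are
hypothetical and are not taken from the patch; the arithmetic is the
standard UTF-16 rule.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if cp is a UTF-16 high (leading) surrogate, U+D800..U+DBFF. */
static bool
is_high_surrogate(uint32_t cp)
{
	return cp >= 0xD800 && cp <= 0xDBFF;
}

/* True if cp is a UTF-16 low (trailing) surrogate, U+DC00..U+DFFF. */
static bool
is_low_surrogate(uint32_t cp)
{
	return cp >= 0xDC00 && cp <= 0xDFFF;
}

/*
 * Combine a high/low surrogate pair into one code point in the range
 * U+10000..U+10FFFF.  The caller is assumed to have checked both halves
 * with the predicates above.
 */
static uint32_t
combine_surrogates(uint32_t hi, uint32_t lo)
{
	return 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00);
}
```

For example, the escapes \ud83d and \ude00 only make sense as a pair:
neither half is a valid character by itself, so a scanner that sees
the first one has to wait for the second before emitting anything.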

Regarding hex, I originally wrote it that way for the sake of
uniformity.  But I changed my mind and decided that simpler flex rules
are more important.  So hex escapes are now processed one at a time.
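
As a rough illustration of what handling hex escapes one at a time
involves, here is a small standalone C sketch that decodes a single
escape.  It assumes the escape has the form \xNN (a backslash, 'x',
and exactly two hex digits); the helper names are hypothetical and
this is not the code from the patch.

```c
#include <stdbool.h>

/* Convert one hex digit to its value, or return -1 if it is not one. */
static int
hex_value(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Decode one escape of the assumed form \xNN.  s must be NUL-terminated
 * and point at the backslash.  On success the byte is stored in *out.
 */
static bool
decode_hex_escape(const char *s, unsigned char *out)
{
	int			hi;
	int			lo;

	if (s[0] != '\\' || s[1] != 'x')
		return false;

	hi = hex_value(s[2]);
	if (hi < 0)
		return false;		/* also stops at the terminating NUL */

	lo = hex_value(s[3]);
	if (lo < 0)
		return false;

	*out = (unsigned char) (hi * 16 + lo);
	return true;
}
```

Since each escape yields exactly one byte and does not depend on its
neighbours, there is no need for a rule that matches a whole run.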

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
