Обсуждение: Re: Row pattern recognition
Attached are the v29 patch sets to implement the subset of row pattern recognition feature defined in the SQL standard. In this patch set: - Adjust 0003 and 0004 to deal with the recent changes made by 8b1b342. - Add tests to 0007 for CREATE VIEW/pg_get_viewdef. Best reagards, -- Tatsuo Ishii SRA OSS K.K. English: http://www.sraoss.co.jp/index_en/ Japanese:http://www.sraoss.co.jp
Вложения
- v29-0001-Row-pattern-recognition-patch-for-raw-parser.patch
- v29-0002-Row-pattern-recognition-patch-parse-analysis.patch
- v29-0003-Row-pattern-recognition-patch-rewriter.patch
- v29-0004-Row-pattern-recognition-patch-planner.patch
- v29-0005-Row-pattern-recognition-patch-executor.patch
- v29-0006-Row-pattern-recognition-patch-docs.patch
- v29-0007-Row-pattern-recognition-patch-tests.patch
- v29-0008-Row-pattern-recognition-patch-typedefs.list.patch
- v29-0009-Allow-to-print-raw-parse-tree.patch
Attached are the v30 patches, just adding a patch to change the default io_method parameter to sync, expecting this affects something to the recent CFbot failure. https://commitfest.postgresql.org/patch/4460/ https://cirrus-ci.com/task/6078653164945408 which is similar to: https://www.postgresql.org/message-id/20250422.111139.1502127917165838403.ishii%40postgresql.org (Once the issue resolved, the patch should be removed, of course) Best regards, -- Tatsuo Ishii SRA OSS K.K. English: http://www.sraoss.co.jp/index_en/ Japanese:http://www.sraoss.co.jp
Вложения
- v30-0005-Row-pattern-recognition-patch-executor.patch
- v30-0006-Row-pattern-recognition-patch-docs.patch
- v30-0001-Row-pattern-recognition-patch-for-raw-parser.patch
- v30-0002-Row-pattern-recognition-patch-parse-analysis.patch
- v30-0003-Row-pattern-recognition-patch-rewriter.patch
- v30-0004-Row-pattern-recognition-patch-planner.patch
- v30-0007-Row-pattern-recognition-patch-tests.patch
- v30-0008-Row-pattern-recognition-patch-typedefs.list.patch
- v30-0009-Allow-to-print-raw-parse-tree.patch
- v30-0010-Temporarily-change-IO-method-to-sync.patch
> Attached are the v30 patches, just adding a patch to change the > default io_method parameter to sync, expecting this affects something > to the recent CFbot failure. > https://commitfest.postgresql.org/patch/4460/ > https://cirrus-ci.com/task/6078653164945408 > which is similar to: > https://www.postgresql.org/message-id/20250422.111139.1502127917165838403.ishii%40postgresql.org CFBot has just run tests against v30 patches and now it turns to green again! I guess io_method = sync definitely did the trick. Note that previously only the Windows Server 2019, VS 2019 - Neason & ninja test had failed. Best regards, -- Tatsuo Ishii SRA OSS K.K. English: http://www.sraoss.co.jp/index_en/ Japanese:http://www.sraoss.co.jp
> Attached are the v30 patches, just adding a patch to change the > default io_method parameter to sync, expecting this affects something > to the recent CFbot failure. > https://commitfest.postgresql.org/patch/4460/ > https://cirrus-ci.com/task/6078653164945408 > which is similar to: > https://www.postgresql.org/message-id/20250422.111139.1502127917165838403.ishii%40postgresql.org > > (Once the issue resolved, the patch should be removed, of course) It seems this has turned to green since May 2, 2025. https://commitfest.postgresql.org/patch/5679/. The last time it turned to red was April 16, 2025. Maybe something committed between the period solved the cause of red, but I don't know exactly which commit. Anyway v31 patches now remove the change to default io_method parameter to see if it passes Windows Server 2019, VS 2019 - Meson & ninja test. Best regards, -- Tatsuo Ishii SRA OSS K.K. English: http://www.sraoss.co.jp/index_en/ Japanese:http://www.sraoss.co.jp
Вложения
- v31-0001-Row-pattern-recognition-patch-for-raw-parser.patch
- v31-0002-Row-pattern-recognition-patch-parse-analysis.patch
- v31-0003-Row-pattern-recognition-patch-rewriter.patch
- v31-0004-Row-pattern-recognition-patch-planner.patch
- v31-0005-Row-pattern-recognition-patch-executor.patch
- v31-0006-Row-pattern-recognition-patch-docs.patch
- v31-0007-Row-pattern-recognition-patch-tests.patch
- v31-0008-Row-pattern-recognition-patch-typedefs.list.patch
- v31-0009-Allow-to-print-raw-parse-tree.patch
>> Attached are the v30 patches, just adding a patch to change the >> default io_method parameter to sync, expecting this affects something >> to the recent CFbot failure. >> https://commitfest.postgresql.org/patch/4460/ >> https://cirrus-ci.com/task/6078653164945408 >> which is similar to: >> https://www.postgresql.org/message-id/20250422.111139.1502127917165838403.ishii%40postgresql.org >> >> (Once the issue resolved, the patch should be removed, of course) > > It seems this has turned to green since May 2, 2025. > https://commitfest.postgresql.org/patch/5679/. > > The last time it turned to red was April 16, 2025. Maybe something > committed between the period solved the cause of red, but I don't know > exactly which commit. > > Anyway v31 patches now remove the change to default io_method > parameter to see if it passes Windows Server 2019, VS 2019 - Meson & > ninja test. Now I see it passes the test. https://cirrus-ci.com/build/5671612202090496 Best regards, -- Tatsuo Ishii SRA OSS K.K. English: http://www.sraoss.co.jp/index_en/ Japanese:http://www.sraoss.co.jp
Attached are the v32 patches for Row pattern recognition. Recent changes to doc/src/sgml/func.sgml required v31 to be rebased. Other than that, nothing has been changed. Best regards, -- Tatsuo Ishii SRA OSS K.K. English: http://www.sraoss.co.jp/index_en/ Japanese:http://www.sraoss.co.jp
Вложения
- v32-0001-Row-pattern-recognition-patch-for-raw-parser.patch
- v32-0002-Row-pattern-recognition-patch-parse-analysis.patch
- v32-0003-Row-pattern-recognition-patch-rewriter.patch
- v32-0004-Row-pattern-recognition-patch-planner.patch
- v32-0005-Row-pattern-recognition-patch-executor.patch
- v32-0006-Row-pattern-recognition-patch-docs.patch
- v32-0007-Row-pattern-recognition-patch-tests.patch
- v32-0008-Row-pattern-recognition-patch-typedefs.list.patch
- v32-0009-Allow-to-print-raw-parse-tree.patch
Attached are the v33 patches for Row pattern recognition. The difference from v32 is that the raw parse tree printing patch is not included in v33. This is because now that we have debug_print_raw_parse. Best regards, -- Tatsuo Ishii SRA OSS K.K. English: http://www.sraoss.co.jp/index_en/ Japanese:http://www.sraoss.co.jp
Вложения
- v33-0001-Row-pattern-recognition-patch-for-raw-parser.patch
- v33-0002-Row-pattern-recognition-patch-parse-analysis.patch
- v33-0003-Row-pattern-recognition-patch-rewriter.patch
- v33-0004-Row-pattern-recognition-patch-planner.patch
- v33-0005-Row-pattern-recognition-patch-executor.patch
- v33-0006-Row-pattern-recognition-patch-docs.patch
- v33-0007-Row-pattern-recognition-patch-tests.patch
- v33-0008-Row-pattern-recognition-patch-typedefs.list.patch
Attached are the v34 patches for Row pattern recognition. Notihing has been changed. Just rebased. Best regards, -- Tatsuo Ishii SRA OSS K.K. English: http://www.sraoss.co.jp/index_en/ Japanese:http://www.sraoss.co.jp
Вложения
- v34-0001-Row-pattern-recognition-patch-for-raw-parser.patch
- v34-0002-Row-pattern-recognition-patch-parse-analysis.patch
- v34-0003-Row-pattern-recognition-patch-rewriter.patch
- v34-0004-Row-pattern-recognition-patch-planner.patch
- v34-0005-Row-pattern-recognition-patch-executor.patch
- v34-0006-Row-pattern-recognition-patch-docs.patch
- v34-0007-Row-pattern-recognition-patch-tests.patch
- v34-0008-Row-pattern-recognition-patch-typedefs.list.patch
Attached are the v35 patches for Row pattern recognition. In v34-0001 gram.y patch, %type for RPR were misplaced. v35-0001 fixes this. Other patches are not changed. Best regards, -- Tatsuo Ishii SRA OSS K.K. English: http://www.sraoss.co.jp/index_en/ Japanese:http://www.sraoss.co.jp
Вложения
- v35-0001-Row-pattern-recognition-patch-for-raw-parser.patch
- v35-0002-Row-pattern-recognition-patch-parse-analysis.patch
- v35-0003-Row-pattern-recognition-patch-rewriter.patch
- v35-0004-Row-pattern-recognition-patch-planner.patch
- v35-0005-Row-pattern-recognition-patch-executor.patch
- v35-0006-Row-pattern-recognition-patch-docs.patch
- v35-0007-Row-pattern-recognition-patch-tests.patch
- v35-0008-Row-pattern-recognition-patch-typedefs.list.patch
Hi Tatsuo-san, I just reviewed 0006 docs changes and 0001. I plan to slowly review the patch, maybe one commit a day. > On Nov 18, 2025, at 10:33, Tatsuo Ishii <ishii@postgresql.org> wrote: > > Attached are the v35 patches for Row pattern recognition. In v34-0001 > gram.y patch, %type for RPR were misplaced. v35-0001 fixes this. Other > patches are not changed. > > Best regards, > -- > Tatsuo Ishii > SRA OSS K.K. > English: http://www.sraoss.co.jp/index_en/ > Japanese:http://www.sraoss.co.jp > <v35-0001-Row-pattern-recognition-patch-for-raw-parser.patch><v35-0002-Row-pattern-recognition-patch-parse-analysis.patch><v35-0003-Row-pattern-recognition-patch-rewriter.patch><v35-0004-Row-pattern-recognition-patch-planner.patch><v35-0005-Row-pattern-recognition-patch-executor.patch><v35-0006-Row-pattern-recognition-patch-docs.patch><v35-0007-Row-pattern-recognition-patch-tests.patch><v35-0008-Row-pattern-recognition-patch-typedefs.list.patch> I got a few comments, maybe just questions: 1 - 0001 - kwlist.h ``` +PG_KEYWORD("define", DEFINE, RESERVED_KEYWORD, BARE_LABEL) ``` Why do we add “define” as a reserved keyword? From the SQL example you put in 0006: ``` <programlisting> SELECT company, tdate, price, first_value(price) OVER w, max(price) OVER w, count(price) OVER w FROM stock WINDOW w AS ( PARTITION BY company ORDER BY tdate ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING AFTER MATCH SKIP PAST LAST ROW INITIAL PATTERN (LOWPRICE UP+ DOWN+) DEFINE LOWPRICE AS price <= 100, UP AS price > PREV(price), DOWN AS price < PREV(price) ); </programlisting> ``` PARTITION is at the same level as DEFINE, but it’s not defined as a reserved keyword: ``` PG_KEYWORD("partition", PARTITION, UNRESERVED_KEYWORD, BARE_LABEL) ``` Even in this patch,”initial”,”past”, “pattern” and “seek” are defined as unreserved, why? So I just want to clarify. 2 - 0001 - gram.y ``` opt_row_pattern_initial_or_seek: INITIAL { $$ = true; } | SEEK { ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), errmsg("SEEK is not supported"), errhint("Use INITIAL instead."), parser_errposition(@1))); } | /*EMPTY*/ { $$ = true; } ; ``` As SEEK is specially listed here, I guess it might be supported in future. If that is true, would it be better to defer thesemantic check to later parse phase, which would future work easier. 3 - 0001 - parsenodes.h ``` +/* + * RowPatternCommonSyntax - raw representation of row pattern common syntax + * + */ +typedef struct RPCommonSyntax +{ + NodeTag type; + RPSkipTo rpSkipTo; /* Row Pattern AFTER MATCH SKIP type */ + bool initial; /* true if <row pattern initial or seek> is + * initial */ + List *rpPatterns; /* PATTERN variables (list of A_Expr) */ + List *rpDefs; /* row pattern definitions clause (list of + * ResTarget) */ +} RPCommonSyntax; + /* * WindowDef - raw representation of WINDOW and OVER clauses * @@ -593,6 +618,7 @@ typedef struct WindowDef char *refname; /* referenced window name, if any */ List *partitionClause; /* PARTITION BY expression list */ List *orderClause; /* ORDER BY (list of SortBy) */ + RPCommonSyntax *rpCommonSyntax; /* row pattern common syntax */ ``` RP fields are directly defined in WindowClause, then why do we need a wrapper of RPCommonSyntax? Can we directly define RPfields in WindowRef as well? Best regards, -- Chao Li (Evan) HighGo Software Co., Ltd. https://www.highgo.com/
> On Nov 18, 2025, at 13:03, Chao Li <li.evan.chao@gmail.com> wrote: > > Hi Tatsuo-san, > > I just reviewed 0006 docs changes and 0001. I plan to slowly review the patch, maybe one commit a day. > >> On Nov 18, 2025, at 10:33, Tatsuo Ishii <ishii@postgresql.org> wrote: >> >> Attached are the v35 patches for Row pattern recognition. In v34-0001 >> gram.y patch, %type for RPR were misplaced. v35-0001 fixes this. Other >> patches are not changed. >> >> Best regards, >> -- >> Tatsuo Ishii >> SRA OSS K.K. >> English: http://www.sraoss.co.jp/index_en/ >> Japanese:http://www.sraoss.co.jp >> <v35-0001-Row-pattern-recognition-patch-for-raw-parser.patch><v35-0002-Row-pattern-recognition-patch-parse-analysis.patch><v35-0003-Row-pattern-recognition-patch-rewriter.patch><v35-0004-Row-pattern-recognition-patch-planner.patch><v35-0005-Row-pattern-recognition-patch-executor.patch><v35-0006-Row-pattern-recognition-patch-docs.patch><v35-0007-Row-pattern-recognition-patch-tests.patch><v35-0008-Row-pattern-recognition-patch-typedefs.list.patch> > > I got a few comments, maybe just questions: > > 1 - 0001 - kwlist.h > ``` > +PG_KEYWORD("define", DEFINE, RESERVED_KEYWORD, BARE_LABEL) > ``` > > Why do we add “define” as a reserved keyword? From the SQL example you put in 0006: > ``` > <programlisting> > SELECT company, tdate, price, > first_value(price) OVER w, > max(price) OVER w, > count(price) OVER w > FROM stock > WINDOW w AS ( > PARTITION BY company > ORDER BY tdate > ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING > AFTER MATCH SKIP PAST LAST ROW > INITIAL > PATTERN (LOWPRICE UP+ DOWN+) > DEFINE > LOWPRICE AS price <= 100, > UP AS price > PREV(price), > DOWN AS price < PREV(price) > ); > </programlisting> > ``` > > PARTITION is at the same level as DEFINE, but it’s not defined as a reserved keyword: > ``` > PG_KEYWORD("partition", PARTITION, UNRESERVED_KEYWORD, BARE_LABEL) > ``` > > Even in this patch,”initial”,”past”, “pattern” and “seek” are defined as unreserved, why? > > So I just want to clarify. > > 2 - 0001 - gram.y > ``` > opt_row_pattern_initial_or_seek: > INITIAL { $$ = true; } > | SEEK > { > ereport(ERROR, > (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), > errmsg("SEEK is not supported"), > errhint("Use INITIAL instead."), > parser_errposition(@1))); > } > | /*EMPTY*/ { $$ = true; } > ; > ``` > > As SEEK is specially listed here, I guess it might be supported in future. If that is true, would it be better to deferthe semantic check to later parse phase, which would future work easier. > > 3 - 0001 - parsenodes.h > ``` > +/* > + * RowPatternCommonSyntax - raw representation of row pattern common syntax > + * > + */ > +typedef struct RPCommonSyntax > +{ > + NodeTag type; > + RPSkipTo rpSkipTo; /* Row Pattern AFTER MATCH SKIP type */ > + bool initial; /* true if <row pattern initial or seek> is > + * initial */ > + List *rpPatterns; /* PATTERN variables (list of A_Expr) */ > + List *rpDefs; /* row pattern definitions clause (list of > + * ResTarget) */ > +} RPCommonSyntax; > + > /* > * WindowDef - raw representation of WINDOW and OVER clauses > * > @@ -593,6 +618,7 @@ typedef struct WindowDef > char *refname; /* referenced window name, if any */ > List *partitionClause; /* PARTITION BY expression list */ > List *orderClause; /* ORDER BY (list of SortBy) */ > + RPCommonSyntax *rpCommonSyntax; /* row pattern common syntax */ > ``` > > RP fields are directly defined in WindowClause, then why do we need a wrapper of RPCommonSyntax? Can we directly defineRP fields in WindowRef as well? > 4 - 0001 - parsenodes.h ``` + /* Row Pattern AFTER MACH SKIP clause */ ``` Typo: MACH -> MATCH Best regards, -- Chao Li (Evan) HighGo Software Co., Ltd. https://www.highgo.com/
On 18/11/2025 06:03, Chao Li wrote:
> 1 - 0001 - kwlist.h
> ```
> +PG_KEYWORD("define", DEFINE, RESERVED_KEYWORD, BARE_LABEL)
> ```
>
> Why do we add “define” as a reserved keyword? From the SQL example you put in 0006:
> ```
> <programlisting>
> SELECT company, tdate, price,
> first_value(price) OVER w,
> max(price) OVER w,
> count(price) OVER w
> FROM stock
> WINDOW w AS (
> PARTITION BY company
> ORDER BY tdate
> ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
> AFTER MATCH SKIP PAST LAST ROW
> INITIAL
> PATTERN (LOWPRICE UP+ DOWN+)
> DEFINE
> LOWPRICE AS price <= 100,
> UP AS price > PREV(price),
> DOWN AS price < PREV(price)
> );
> </programlisting>
> ```
>
> PARTITION is at the same level as DEFINE, but it’s not defined as a reserved keyword:
> ```
> PG_KEYWORD("partition", PARTITION, UNRESERVED_KEYWORD, BARE_LABEL)
> ```
>
> Even in this patch,”initial”,”past”, “pattern” and “seek” are defined as unreserved, why?
>
> So I just want to clarify.
Because of position. Without making DEFINE a reserved keyword, how do
you know that it isn't another variable in the PATTERN clause?
--
Vik Fearing
> On Nov 18, 2025, at 19:19, Vik Fearing <vik@postgresfriends.org> wrote:
>
>
> Because of position. Without making DEFINE a reserved keyword, how do you know that it isn't another variable in the
PATTERNclause?
>
Ah, thanks for the clarification. Now I got it.
I’m continue to review 0002.
5 - 0002 - parse_clause.c
```
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("FRAME must start at current row when row patttern recognition is used")));
```
Nit typo: patttern -> pattern
6 - 0002 - parse_clause.c
```
+ /* DEFINE variable name initials */
+ static char *defineVariableInitials = "abcdefghijklmnopqrstuvwxyz”;
```
This string can also be const, so “static const char *”
7 - 0002 - parse_clause.c
```
+ /*
+ * Create list of row pattern DEFINE variable name's initial. We assign
+ * [a-z] to them (up to 26 variable names are allowed).
+ */
+ restargets = NIL;
+ i = 0;
+ initialLen = strlen(defineVariableInitials);
+
+ foreach(lc, windef->rpCommonSyntax->rpDefs)
+ {
+ char initial[2];
+
+ restarget = (ResTarget *) lfirst(lc);
+ name = restarget->name;
```
Looks like “name” is not used inside “foreach”.
8 - 0002 - parse_clause.c
```
+ /*
+ * Create list of row pattern DEFINE variable name's initial. We assign
+ * [a-z] to them (up to 26 variable names are allowed).
+ */
+ restargets = NIL;
+ i = 0;
+ initialLen = strlen(defineVariableInitials);
+
+ foreach(lc, windef->rpCommonSyntax->rpDefs)
+ {
+ char initial[2];
```
I guess this “foreach” for build initial list can be combined into the previous foreach loop of checking duplication.
Ifan def has no dup, then assign an initial to it.
9 - 0002 - parse_clause.c
```
+static void
+transformPatternClause(ParseState *pstate, WindowClause *wc,
+ WindowDef *windef)
+{
+ ListCell *lc;
+
+ /*
+ * Row Pattern Common Syntax clause exists?
+ */
+ if (windef->rpCommonSyntax == NULL)
+ return;
```
Checking (windef->rpCommonSyntax == NULL) seems a duplicate, because transformPatternClause() is only called by
transformRPR(),and transformRPR() has done the check.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
> On 18/11/2025 06:03, Chao Li wrote:
>> 1 - 0001 - kwlist.h
>> ```
>> +PG_KEYWORD("define", DEFINE, RESERVED_KEYWORD, BARE_LABEL)
>> ```
>>
>> Why do we add “define” as a reserved keyword? From the SQL example
>> you put in 0006:
>> ```
>> <programlisting>
>> SELECT company, tdate, price,
>> first_value(price) OVER w,
>> max(price) OVER w,
>> count(price) OVER w
>> FROM stock
>> WINDOW w AS (
>> PARTITION BY company
>> ORDER BY tdate
>> ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
>> AFTER MATCH SKIP PAST LAST ROW
>> INITIAL
>> PATTERN (LOWPRICE UP+ DOWN+)
>> DEFINE
>> LOWPRICE AS price <= 100,
>> UP AS price > PREV(price),
>> DOWN AS price < PREV(price)
>> );
>> </programlisting>
>> ```
>>
>> PARTITION is at the same level as DEFINE, but it’s not defined as a
>> reserved keyword:
>> ```
>> PG_KEYWORD("partition", PARTITION, UNRESERVED_KEYWORD, BARE_LABEL)
>> ```
>>
>> Even in this patch,”initial”,”past”, “pattern” and “seek” are
>> defined as unreserved, why?
>>
>> So I just want to clarify.
>
>
> Because of position. Without making DEFINE a reserved keyword, how do
> you know that it isn't another variable in the PATTERN clause?
I think we don't need to worry about this because PATTERN_P is in the
$nonassoc list in the patch, which gives PATTERN different precedence
from DEFINE.
@@ -888,6 +896,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%nonassoc UNBOUNDED NESTED /* ideally would have same precedence as IDENT */
%nonassoc IDENT PARTITION RANGE ROWS GROUPS PRECEDING FOLLOWING CUBE ROLLUP
SET KEYS OBJECT_P SCALAR VALUE_P WITH WITHOUT PATH
+ AFTER INITIAL SEEK PATTERN_P
And I think we could change DEFINE to an unreserved keyword. Attached
is a patch to do that, on top of v35-0001.
Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index fccc26964a0..8ec9f1f265c 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -17982,6 +17982,7 @@ unreserved_keyword:
| DECLARE
| DEFAULTS
| DEFERRED
+ | DEFINE
| DEFINER
| DELETE_P
| DELIMITER
@@ -18402,7 +18403,6 @@ reserved_keyword:
| CURRENT_USER
| DEFAULT
| DEFERRABLE
- | DEFINE
| DESC
| DISTINCT
| DO
diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h
index 7c60b9b44a8..89dc2a4b95a 100644
--- a/src/include/parser/kwlist.h
+++ b/src/include/parser/kwlist.h
@@ -129,7 +129,7 @@ PG_KEYWORD("default", DEFAULT, RESERVED_KEYWORD, BARE_LABEL)
PG_KEYWORD("defaults", DEFAULTS, UNRESERVED_KEYWORD, BARE_LABEL)
PG_KEYWORD("deferrable", DEFERRABLE, RESERVED_KEYWORD, BARE_LABEL)
PG_KEYWORD("deferred", DEFERRED, UNRESERVED_KEYWORD, BARE_LABEL)
-PG_KEYWORD("define", DEFINE, RESERVED_KEYWORD, BARE_LABEL)
+PG_KEYWORD("define", DEFINE, UNRESERVED_KEYWORD, BARE_LABEL)
PG_KEYWORD("definer", DEFINER, UNRESERVED_KEYWORD, BARE_LABEL)
PG_KEYWORD("delete", DELETE_P, UNRESERVED_KEYWORD, BARE_LABEL)
PG_KEYWORD("delimiter", DELIMITER, UNRESERVED_KEYWORD, BARE_LABEL)
> 2 - 0001 - gram.y
> ```
> opt_row_pattern_initial_or_seek:
> INITIAL { $$ = true; }
> | SEEK
> {
> ereport(ERROR,
> (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> errmsg("SEEK is not supported"),
> errhint("Use INITIAL instead."),
> parser_errposition(@1)));
> }
> | /*EMPTY*/ { $$ = true; }
> ;
> ```
>
> As SEEK is specially listed here, I guess it might be supported in future. If that is true, would it be better to
deferthe semantic check to later parse phase, which would future work easier.
From a comment in gram.y:
/*
* If we see PARTITION, RANGE, ROWS, GROUPS, AFTER, INITIAL, SEEK or PATTERN_P
* as the first token after the '(' of a window_specification, we want the
* assumption to be that there is no existing_window_name; but those keywords
* are unreserved and so could be ColIds. We fix this by making them have the
For this purpose, we want INITIAL and SEEK to be unreserved keywords.
> 3 - 0001 - parsenodes.h
> ```
> +/*
> + * RowPatternCommonSyntax - raw representation of row pattern common syntax
> + *
> + */
> +typedef struct RPCommonSyntax
> +{
> + NodeTag type;
> + RPSkipTo rpSkipTo; /* Row Pattern AFTER MATCH SKIP type */
> + bool initial; /* true if <row pattern initial or seek> is
> + * initial */
> + List *rpPatterns; /* PATTERN variables (list of A_Expr) */
> + List *rpDefs; /* row pattern definitions clause (list of
> + * ResTarget) */
> +} RPCommonSyntax;
> +
> /*
> * WindowDef - raw representation of WINDOW and OVER clauses
> *
> @@ -593,6 +618,7 @@ typedef struct WindowDef
> char *refname; /* referenced window name, if any */
> List *partitionClause; /* PARTITION BY expression list */
> List *orderClause; /* ORDER BY (list of SortBy) */
> + RPCommonSyntax *rpCommonSyntax; /* row pattern common syntax */
> ```
>
> RP fields are directly defined in WindowClause, then why do we need a wrapper of RPCommonSyntax? Can we directly
defineRP fields in WindowRef as well?
The row pattern common syntax defined in the standard is not only used
in the WINDOW clause, but in the FROM clause. If we would support RPR
in FROM clause in the future, it would be better to use the same code
of row pattern common syntax in WINDOW clause as much as
possible. That's the reason I created RPCommonSyntax. In the
parse/analysis phase, I am not sure how the parse/analysis code would
be in FROM clause at this point. So I did not define yet another
struct for it.
Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
> 4 - 0001 - parsenodes.h > ``` > + /* Row Pattern AFTER MACH SKIP clause */ > ``` > > Typo: MACH -> MATCH Thanks for pointing it out. Will fix. Best regards, -- Tatsuo Ishii SRA OSS K.K. English: http://www.sraoss.co.jp/index_en/ Japanese:http://www.sraoss.co.jp
> On Nov 19, 2025, at 12:14, Chao Li <li.evan.chao@gmail.com> wrote:
>
>
> 9 - 0002 - parse_clause.c
I am continuing to review 0003
10 - 0003
```
+ Assert(list_length(patternVariables) == list_length(defineClause));
```
Is this assert true? From what I learned, pattern length doesn’t have to equal to define length. For example, I got an
examplefrom [1]:
```
Example 4
-- This example has three different patterns.
-- Comment two of them, to get error-free query.
SELECT company, price_date, price
FROM stock_price
MATCH_RECOGNIZE (
partition by company
order by price_date
all rows per match
pattern ( limit_50 up up* down down* )
define
limit_50 as price <= 50.00,
up as price > prev(price),
down as price < prev(price)
)
WHERE company = 'B'
ORDER BY price_date;
```
In this example, pattern has 5 elements and define has only 3 elements.
11 - 0004 - plannodes.h
```
+ /* Row Pattern Recognition AFTER MACH SKIP clause */
+ RPSkipTo rpSkipTo; /* Row Pattern Skip To type */
```
Typo: MACH -> MATCH
I’d stop here, and continue 0005 tomorrow.
[1] https://link.springer.com/article/10.1007/s13222-022-00404-3
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
> On Nov 20, 2025, at 15:33, Chao Li <li.evan.chao@gmail.com> wrote:
>
>
> I’d stop here, and continue 0005 tomorrow.
>
I reviewed 0005 today. I'm still not very familiar with the executor code, so going through 0005 has also been a
valuablelearning process for me.
12 - 0005 - nodeWindowAgg.c
```
+ if (string_set_get_size(str_set) == 0)
+ {
+ /* no match found in the first row */
+ register_reduced_frame_map(winstate, original_pos, RF_UNMATCHED);
+ destroyStringInfo(encoded_str);
+ return;
+ }
```
encoded_str should also be destroyed if not entering the “if” clause.
13 - 0005 - nodeWindowAgg.c
```
static
int
do_pattern_match(char *pattern, char *encoded_str, int len)
{
static regex_t *regcache = NULL;
static regex_t preg;
static char patbuf[1024]; /* most recent 'pattern' is cached here */
```
Using static variable in executor seems something I have never seen, which may persistent data across queries. I think
usuallyper query data are stored in state structures.
14 - 0005 - nodeWindowAgg.c
```
+ MemoryContext oldContext = MemoryContextSwitchTo(TopMemoryContext);
+
+ if (regcache != NULL)
+ pg_regfree(regcache); /* free previous re */
+
+ /* we need to convert to char to pg_wchar */
+ plen = strlen(pattern);
+ data = (pg_wchar *) palloc((plen + 1) * sizeof(pg_wchar));
+ data_len = pg_mb2wchar_with_len(pattern, data, plen);
+ /* compile re */
+ sts = pg_regcomp(&preg, /* compiled re */
+ data, /* target pattern */
+ data_len, /* length of pattern */
+ cflags, /* compile option */
+ C_COLLATION_OID /* collation */
+ );
+ pfree(data);
+
+ MemoryContextSwitchTo(oldContext);
```
Here in do_pattern_match, it switches to top memory context and compiles the re and stores to “preg". Based on the
commentof pg_regcomp:
```
/*
* pg_regcomp - compile regular expression
*
* Note: on failure, no resources remain allocated, so pg_regfree()
* need not be applied to re.
*/
int
pg_regcomp(regex_t *re,
const chr *string,
size_t len,
int flags,
Oid collation)
```
“preg” should be freed by pg_regfree(), given the memory is allocated in TopMemoryContext, this looks like a memory
leak.
Okay, I’d stop here and continue to review 0006 next week.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
Hi Chao,
>> On Nov 18, 2025, at 19:19, Vik Fearing <vik@postgresfriends.org> wrote:
>>
>>
>> Because of position. Without making DEFINE a reserved keyword, how do you know that it isn't another variable in the
PATTERNclause?
>>
>
> Ah, thanks for the clarification. Now I got it.
>
> I’m continue to review 0002.
Thanks for the review!
> 5 - 0002 - parse_clause.c
> ```
> + ereport(ERROR,
> + (errcode(ERRCODE_SYNTAX_ERROR),
> + errmsg("FRAME must start at current row when row patttern recognition is used")));
> ```
>
> Nit typo: patttern -> pattern
Will fix (this involves regression test change because this changes
the ERROR messages in the expected file).
> 6 - 0002 - parse_clause.c
> ```
> + /* DEFINE variable name initials */
> + static char *defineVariableInitials = "abcdefghijklmnopqrstuvwxyz”;
> ```
>
> This string can also be const, so “static const char *”
Agreed. Will fix.
> 7 - 0002 - parse_clause.c
> ```
> + /*
> + * Create list of row pattern DEFINE variable name's initial. We assign
> + * [a-z] to them (up to 26 variable names are allowed).
> + */
> + restargets = NIL;
> + i = 0;
> + initialLen = strlen(defineVariableInitials);
> +
> + foreach(lc, windef->rpCommonSyntax->rpDefs)
> + {
> + char initial[2];
> +
> + restarget = (ResTarget *) lfirst(lc);
> + name = restarget->name;
> ```
>
> Looks like “name” is not used inside “foreach”.
Oops. Will fix.
> 8 - 0002 - parse_clause.c
> ```
> + /*
> + * Create list of row pattern DEFINE variable name's initial. We assign
> + * [a-z] to them (up to 26 variable names are allowed).
> + */
> + restargets = NIL;
> + i = 0;
> + initialLen = strlen(defineVariableInitials);
> +
> + foreach(lc, windef->rpCommonSyntax->rpDefs)
> + {
> + char initial[2];
> ```
>
> I guess this “foreach” for build initial list can be combined into the previous foreach loop of checking duplication.
Ifan def has no dup, then assign an initial to it.
You are right. Will change.
> 9 - 0002 - parse_clause.c
> ```
> +static void
> +transformPatternClause(ParseState *pstate, WindowClause *wc,
> + WindowDef *windef)
> +{
> + ListCell *lc;
> +
> + /*
> + * Row Pattern Common Syntax clause exists?
> + */
> + if (windef->rpCommonSyntax == NULL)
> + return;
> ```
>
> Checking (windef->rpCommonSyntax == NULL) seems a duplicate, because transformPatternClause() is only called by
transformRPR(),and transformRPR() has done the check.
Yeah. transformDefineClause() already does the similar check using
Assert. What about using Assert in transPatternClause() as well?
Assert(windef->rpCommonSyntax != NULL);
Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
> On Nov 21, 2025, at 13:25, Chao Li <li.evan.chao@gmail.com> wrote:
>
>
> Okay, I’d stop here and continue to review 0006 next week.
>
I just finished reviewing 0006, and see some problems:
15 - 0006 - select.sgml
```
+[ <replaceable class="parameter">row_pattern_common_syntax</replaceable> ]
```
row_pattern_common_syntax doesn’t look like a good name. I searched over the docs, and couldn't find a name suffixed by
“_syntax”.I think we can just name it as “row_pattern_recognition_clause” or a shorter name just “row_pattern”.
16 - 0006 - select.sgml
```
+ <synopsis>
+ [ AFTER MATCH SKIP PAST LAST ROW | AFTER MATCH SKIP TO NEXT ROW ]
``
I think the two values are mutually exclusive, so curly braces should added surround them:
[ { AFTER MATCH SKIP PAST LAST ROW | AFTER MATCH SKIP TO NEXT ROW } ]
[] means optional, and {} means choose one from all alternatives.
17 - 0006 - select.sgml
```
PATTERN <replaceable class="parameter">pattern_variable_name</replaceable>[+] [, ...]
```
PATTERN definition should be placed inside (), here you missed ()
18 - 0006 - select.sgml
The same code as 17, <replaceable class="parameter">pattern_variable_name</replaceable>[+], do you only put “+” here, I
thinkSQL allows a lot of pattern quantifiers. From 0001, gram.y, looks like +, * and ? Are supported in this patch. So,
maybewe can do:
```
PATTERN ( pattern_element, [pattern_element …] )
and pattern_element = variable name optionally followed by quantifier (reference to somewhere introducing pattern
quantifier).
```
19 - 0006 - select.sgml
I don’t see INITIAL and SEEK are described. Even if SEEK is not supported currently, it’s worthy mentioning that.
20 - 0006 - select.sgml
```
+ after a match found. With <literal>AFTER MATCH SKIP PAST LAST
+ ROW</literal> (the default) next row position is next to the last row of
```
21 - 0006 - select.sgml
```
[ <replaceable class="parameter">frame_clause</replaceable> ]
+[ <replaceable class="parameter">row_pattern_common_syntax</replaceable> ]
```
Looks like row_pattern_common_syntax belongs to frame_clause, and you have lately added row_pattern_common_syntax to
frame_clause:
```
<synopsis>
-{ RANGE | ROWS | GROUPS } <replaceable>frame_start</replaceable> [ <replaceable>frame_exclusion</replaceable> ]
-{ RANGE | ROWS | GROUPS } BETWEEN <replaceable>frame_start</replaceable> AND <replaceable>frame_end</replaceable> [
<replaceable>frame_exclusion</replaceable>]
+{ RANGE | ROWS | GROUPS } <replaceable>frame_start</replaceable> [ <replaceable>frame_exclusion</replaceable> ]
[row_pattern_common_syntax]
+{ RANGE | ROWS | GROUPS } BETWEEN <replaceable>frame_start</replaceable> AND <replaceable>frame_end</replaceable> [
<replaceable>frame_exclusion</replaceable>] [row_pattern_common_syntax]
</synopsis>
```
So I guess row_pattern_common_syntax after frame_clause should not be added.
22 - 0006 - advance.sgml
```
<programlisting>
DEFINE
LOWPRICE AS price <= 100,
UP AS price > PREV(price),
DOWN AS price < PREV(price)
</programlisting>
Note that <function>PREV</function> returns the price column in the
```
In the “Note” line, “price” refers to a column from above example, so I think it should be tagged by <literal>. This
commentapplies to multiple places.
I will proceed 0007 tomorrow.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
Hi Chao,
Thank you for the review!
>> On Nov 20, 2025, at 15:33, Chao Li <li.evan.chao@gmail.com> wrote:
>>
>>
>> I’d stop here, and continue 0005 tomorrow.
>>
>
> I reviewed 0005 today. I'm still not very familiar with the executor code, so going through 0005 has also been a
valuablelearning process for me.
>
> 12 - 0005 - nodeWindowAgg.c
> ```
> + if (string_set_get_size(str_set) == 0)
> + {
> + /* no match found in the first row */
> + register_reduced_frame_map(winstate, original_pos, RF_UNMATCHED);
> + destroyStringInfo(encoded_str);
> + return;
> + }
> ```
>
> encoded_str should also be destroyed if not entering the “if” clause.
Subsequent string_set_discard() will free encode_str since encoded_str
is part of str_set. So no need to call destroyStringInfo(encoded_str)
in not entering "if" clause.
string_set_discard(str_set);
So I think this is Ok.
> 13 - 0005 - nodeWindowAgg.c
> ```
> static
> int
> do_pattern_match(char *pattern, char *encoded_str, int len)
> {
> static regex_t *regcache = NULL;
> static regex_t preg;
> static char patbuf[1024]; /* most recent 'pattern' is cached here */
> ```
>
> Using static variable in executor seems something I have never seen, which may persistent data across queries. I
thinkusually per query data are stored in state structures.
Yeah, such a usage maybe rare. But I hesitate to store the data
(regex_t) in WindowAggState because:
(1) it bloats WindowAggState which we want to avoid if possible
(2) it requires execnodes.h to include regex.h. (maybe this itself is harmless, I am not sure)
> 14 - 0005 - nodeWindowAgg.c
> ```
> + MemoryContext oldContext = MemoryContextSwitchTo(TopMemoryContext);
> +
> + if (regcache != NULL)
> + pg_regfree(regcache); /* free previous re */
> +
> + /* we need to convert to char to pg_wchar */
> + plen = strlen(pattern);
> + data = (pg_wchar *) palloc((plen + 1) * sizeof(pg_wchar));
> + data_len = pg_mb2wchar_with_len(pattern, data, plen);
> + /* compile re */
> + sts = pg_regcomp(&preg, /* compiled re */
> + data, /* target pattern */
> + data_len, /* length of pattern */
> + cflags, /* compile option */
> + C_COLLATION_OID /* collation */
> + );
> + pfree(data);
> +
> + MemoryContextSwitchTo(oldContext);
> ```
>
> Here in do_pattern_match, it switches to top memory context and compiles the re and stores to “preg". Based on the
commentof pg_regcomp:
> ```
> /*
> * pg_regcomp - compile regular expression
> *
> * Note: on failure, no resources remain allocated, so pg_regfree()
> * need not be applied to re.
> */
> int
> pg_regcomp(regex_t *re,
> const chr *string,
> size_t len,
> int flags,
> Oid collation)
> ```
>
> “preg” should be freed by pg_regfree(), given the memory is allocated in TopMemoryContext, this looks like a memory
leak.
I admit current patch leaves the memory unfreed even after a query
finishes. What about adding something like:
static void do_pattern_match_end(void)
{
if (regcache != NULL)
pg_regfree(regcache);
}
and let ExecEndWindowAgg() call it?
> Okay, I’d stop here and continue to review 0006 next week.
>
> Best regards,
> --
> Chao Li (Evan)
> HighGo Software Co., Ltd.
> https://www.highgo.com/
>
>
>
>
Hi Chao,
>> On Nov 21, 2025, at 13:25, Chao Li <li.evan.chao@gmail.com> wrote:
>>
>>
>> Okay, I’d stop here and continue to review 0006 next week.
>>
>
> I just finished reviewing 0006, and see some problems:
>
> 15 - 0006 - select.sgml
> ```
> +[ <replaceable class="parameter">row_pattern_common_syntax</replaceable> ]
> ```
>
> row_pattern_common_syntax doesn’t look like a good name. I searched over the docs, and couldn't find a name suffixed
by“_syntax”. I think we can just name it as “row_pattern_recognition_clause” or a shorter name just “row_pattern”.
I believe these names are based on the SQL standard. All syntax
components do not necessary be suffixed by "clause". For example
in select.sgml,
[ WITH [ RECURSIVE ] <replaceable class="parameter">with_query</replaceable> [, ...] ]
SELECT [ ALL | DISTINCT [ ON ( <replaceable class="parameter">expression</replaceable> [, ...] ) ] ]
[ { * | <replaceable class="parameter">expression</replaceable> [ [ AS ] <replaceable
class="parameter">output_name</replaceable>] } [, ...] ]
[ FROM <replaceable class="parameter">from_item</replaceable> [, ...] ]
[ WHERE <replaceable class="parameter">condition</replaceable> ]
[ GROUP BY { ALL | [ ALL | DISTINCT ] <replaceable class="parameter">grouping_element</replaceable> [, ...] } ]
[ HAVING <replaceable class="parameter">condition</replaceable> ]
[ WINDOW <replaceable class="parameter">window_name</replaceable> AS ( <replaceable
class="parameter">window_definition</replaceable>) [, ...] ]
"window_definition" comes from the standard and does not suffixed by
"clause". If you look into the window clause definition in the
standard, you will see:
<window clause> ::=
WINDOW <window definition list>
<window definition list> ::=
<window definition> [ { <comma> <window definition> }... ]
So the usage of terms in our document matches the standard. Likewise,
we see the definition of row pattern common syntax in the stabdard:
<window clause> ::=
WINDOW <window definition list>
<window definition list> ::=
<window definition> [ { <comma> <window definition> }... ]
<window definition> ::=
<new window name> AS <window specification>
<new window name> ::=
<window name>
<window specification> ::=
<left paren> <window specification details> <right paren>
<window specification details> ::=
[ <existing window name> ]
[ <window partition clause> ]
[ <window order clause> ]
[ <window frame clause> ]
:
:
<window frame clause> ::=
[ <row pattern measures> ]
<window frame units> <window frame extent>
[ <window frame exclusion> ]
[ <row pattern common syntax> ]
So I think "row pattern common syntax" is perfectly appropriate
name. If we change it to “row_pattern_recognition_clause”, it will
just bring confusion. In the standard “row pattern recognition
clause” is defined RPR in FROM clause.
SELECT ... FROM table MATCH RECOGNIZE (....);
Here MATCH RECOGNIZE (....) is the “row pattern recognition clause”.
> 16 - 0006 - select.sgml
> ```
> + <synopsis>
> + [ AFTER MATCH SKIP PAST LAST ROW | AFTER MATCH SKIP TO NEXT ROW ]
> ``
>
> I think the two values are mutually exclusive, so curly braces should added surround them:
>
> [ { AFTER MATCH SKIP PAST LAST ROW | AFTER MATCH SKIP TO NEXT ROW } ]
>
> [] means optional, and {} means choose one from all alternatives.
Agreed. Will fix.
> 17 - 0006 - select.sgml
> ```
> PATTERN <replaceable class="parameter">pattern_variable_name</replaceable>[+] [, ...]
> ```
>
> PATTERN definition should be placed inside (), here you missed ()
Good catch. Will fix.
> 18 - 0006 - select.sgml
> The same code as 17, <replaceable class="parameter">pattern_variable_name</replaceable>[+], do you only put “+” here,
Ithink SQL allows a lot of pattern quantifiers. From 0001, gram.y, looks like +, * and ? Are supported in this patch.
So,maybe we can do:
>
> ```
> PATTERN ( pattern_element, [pattern_element …] )
>
> and pattern_element = variable name optionally followed by quantifier (reference to somewhere introducing pattern
quantifier).
> ```
Currently only *, + and ? are supported and I don't think it's worth
to invent "pattern_element". (Actually the standard defines much more
complex syntax for PATTERN. I think we can revisit this once the basic
support for quantifier *,+ and ? are brought in the core PostgreSQL
code.
> 19 - 0006 - select.sgml
>
> I don’t see INITIAL and SEEK are described. Even if SEEK is not supported currently, it’s worthy mentioning that.
Agreed. Will fix.
> 20 - 0006 - select.sgml
> ```
> + after a match found. With <literal>AFTER MATCH SKIP PAST LAST
> + ROW</literal> (the default) next row position is next to the last row of
> ```
>
> 21 - 0006 - select.sgml
> ```
> [ <replaceable class="parameter">frame_clause</replaceable> ]
> +[ <replaceable class="parameter">row_pattern_common_syntax</replaceable> ]
> ```
>
> Looks like row_pattern_common_syntax belongs to frame_clause, and you have lately added row_pattern_common_syntax to
frame_clause:
> ```
> <synopsis>
> -{ RANGE | ROWS | GROUPS } <replaceable>frame_start</replaceable> [ <replaceable>frame_exclusion</replaceable> ]
> -{ RANGE | ROWS | GROUPS } BETWEEN <replaceable>frame_start</replaceable> AND <replaceable>frame_end</replaceable> [
<replaceable>frame_exclusion</replaceable>]
> +{ RANGE | ROWS | GROUPS } <replaceable>frame_start</replaceable> [ <replaceable>frame_exclusion</replaceable> ]
[row_pattern_common_syntax]
> +{ RANGE | ROWS | GROUPS } BETWEEN <replaceable>frame_start</replaceable> AND <replaceable>frame_end</replaceable> [
<replaceable>frame_exclusion</replaceable>] [row_pattern_common_syntax]
> </synopsis>
> ```
>
> So I guess row_pattern_common_syntax after frame_clause should not be added.
Yes, row_pattern_common_syntax belongs to frame_clause. Will fix.
> 22 - 0006 - advance.sgml
> ```
> <programlisting>
> DEFINE
> LOWPRICE AS price <= 100,
> UP AS price > PREV(price),
> DOWN AS price < PREV(price)
> </programlisting>
>
> Note that <function>PREV</function> returns the price column in the
> ```
>
> In the “Note” line, “price” refers to a column from above example, so I think it should be tagged by <literal>. This
commentapplies to multiple places.
You are right. Will fix.
> I will proceed 0007 tomorrow.
Thanks!
Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
> On Nov 24, 2025, at 08:59, Chao Li <li.evan.chao@gmail.com> wrote: > > I will proceed 0007 tomorrow. I just finished reviewing 0007 and 0008. This patch set really demonstrates the full lifecycle of adding a SQL feature, fromchanging the syntax in gram.y all the way down to the executor, including tests and docs. I learned a lot from it. Thanks! 23 - 0007 As you mentioned earlier, pattern regular expression support *, + and ?, but I don’t see ? is tested. 24 - 0007 I don’t see negative tests for unsupported quantifiers, like PATTERN (A+?). 25 - 0007 ``` +-- basic test with none greedy pattern ``` Typo: “none greedy” -> non-greedy No comment for 0008. Best regards, -- Chao Li (Evan) HighGo Software Co., Ltd. https://www.highgo.com/
> I just finished reviewing 0007 and 0008. This patch set really demonstrates the full lifecycle of adding a SQL
feature,from changing the syntax in gram.y all the way down to the executor, including tests and docs. I learned a lot
fromit. Thanks!
You are welcome!
> 23 - 0007
>
> As you mentioned earlier, pattern regular expression support *, + and ?, but I don’t see ? is tested.
Good catch. I will add the test case. In the mean time I found a bug
with gram.y part which handles '?' quantifier. gram.y can be
successfully compiled but there's no way to put '?' quantifier in a
SQL statement. We cannot write directly "ColId '?'" like other
quantifier '*' and '+' in the grammer because '?' is not a "self"
token. So we let '?' matches Op first then check if it's '?' or
not.
| ColId Op
{
if (strcmp("?", $2))
ereport(ERROR,
errcode(ERRCODE_SYNTAX_ERROR),
errmsg("unsupported quantifier"),
parser_errposition(@2));
$$ = (Node *) makeSimpleA_Expr(AEXPR_OP, "?", (Node *)makeString($1), NULL, @1);
}
Another bug was with executor (nodeWindowAgg.c). The code to check
greedy quantifiers missed the case '?'.
> 24 - 0007
>
> I don’t see negative tests for unsupported quantifiers, like PATTERN (A+?).
I will add such test cases.
> 25 - 0007
> ```
> +-- basic test with none greedy pattern
> ```
>
> Typo: “none greedy” -> non-greedy
Will fix.
> No comment for 0008.
Thanks again for the review. I will post v35 patch set soon. Attached
is the diff from v34.
Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
diff --git a/doc/src/sgml/advanced.sgml b/doc/src/sgml/advanced.sgml
index 7514f2a3848..eec2a0a9346 100644
--- a/doc/src/sgml/advanced.sgml
+++ b/doc/src/sgml/advanced.sgml
@@ -559,13 +559,14 @@ DEFINE
DOWN AS price < PREV(price)
</programlisting>
- Note that <function>PREV</function> returns the price column in the
- previous row if it's called in a context of row pattern recognition. Thus
- in the second line the definition variable "UP" is <literal>TRUE</literal>
- when the price column in the current row is greater than the price column
- in the previous row. Likewise, "DOWN" is <literal>TRUE</literal> when the
- price column in the current row is lower than the price column in the
- previous row.
+ Note that <function>PREV</function> returns the <literal>price</literal>
+ column in the previous row if it's called in a context of row pattern
+ recognition. Thus in the second line the definition variable "UP"
+ is <literal>TRUE</literal> when the price column in the current row is
+ greater than the price column in the previous row. Likewise, "DOWN"
+ is <literal>TRUE</literal> when the
+ <literal>price</literal> column in the current row is lower than
+ the <literal>price</literal> column in the previous row.
</para>
<para>
Once <literal>DEFINE</literal> exists, <literal>PATTERN</literal> can be
@@ -624,10 +625,10 @@ FROM stock
<para>
When a query involves multiple window functions, it is possible to write
out each one with a separate <literal>OVER</literal> clause, but this is
- duplicative and error-prone if the same windowing behavior is wanted
- for several functions. Instead, each windowing behavior can be named
- in a <literal>WINDOW</literal> clause and then referenced in <literal>OVER</literal>.
- For example:
+ duplicative and error-prone if the same windowing behavior is wanted for
+ several functions. Instead, each windowing behavior can be named in
+ a <literal>WINDOW</literal> clause and then referenced
+ in <literal>OVER</literal>. For example:
<programlisting>
SELECT sum(salary) OVER w, avg(salary) OVER w
diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml
index 428bd4f7372..aebc797545e 100644
--- a/doc/src/sgml/ref/select.sgml
+++ b/doc/src/sgml/ref/select.sgml
@@ -937,7 +937,6 @@ WINDOW <replaceable class="parameter">window_name</replaceable> AS ( <replaceabl
[ PARTITION BY <replaceable class="parameter">expression</replaceable> [, ...] ]
[ ORDER BY <replaceable class="parameter">expression</replaceable> [ ASC | DESC | USING <replaceable
class="parameter">operator</replaceable>] [ NULLS { FIRST | LAST } ] [, ...] ]
[ <replaceable class="parameter">frame_clause</replaceable> ]
-[ <replaceable class="parameter">row_pattern_common_syntax</replaceable> ]
</synopsis>
</para>
@@ -1097,8 +1096,9 @@ EXCLUDE NO OTHERS
includes following subclauses.
<synopsis>
-[ AFTER MATCH SKIP PAST LAST ROW | AFTER MATCH SKIP TO NEXT ROW ]
-PATTERN <replaceable class="parameter">pattern_variable_name</replaceable>[+] [, ...]
+[ { AFTER MATCH SKIP PAST LAST ROW | AFTER MATCH SKIP TO NEXT ROW } ]
+[ INITIAL | SEEK ]
+PATTERN ( <replaceable class="parameter">pattern_variable_name</replaceable>[*|+|?] [, ...] )
DEFINE <replaceable class="parameter">definition_varible_name</replaceable> AS <replaceable
class="parameter">expression</replaceable>[, ...]
</synopsis>
<literal>AFTER MATCH SKIP PAST LAST ROW</literal> or <literal>AFTER MATCH
@@ -1107,9 +1107,16 @@ DEFINE <replaceable class="parameter">definition_varible_name</replaceable> AS <
ROW</literal> (the default) next row position is next to the last row of
previous match. On the other hand, with <literal>AFTER MATCH SKIP TO NEXT
ROW</literal> next row position is always next to the last row of previous
- match. <literal>DEFINE</literal> defines definition variables along with a
- boolean expression. <literal>PATTERN</literal> defines a sequence of rows
- that satisfies certain conditions using variables defined
+ match. <literal>INITIAL</literal> or <literal>SEEK</literal> defines how a
+ successful pattern matching starts from which row in a
+ frame. If <literal>INITIAL</literal> is specified, the match must start
+ from the first row in the frame. If <literal>SEEK</literal> is specified,
+ the set of matching rows do not necessarily start from the first row. The
+ default is <literal>INITIAL</literal>. Currently
+ only <literal>INITIAL</literal> is supported. <literal>DEFINE</literal>
+ defines definition variables along with a boolean
+ expression. <literal>PATTERN</literal> defines a sequence of rows that
+ satisfies certain conditions using variables defined
in <literal>DEFINE</literal> clause. If the variable is not defined in
the <literal>DEFINE</literal> clause, it is implicitly assumed following
is defined in the <literal>DEFINE</literal> clause.
diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index 9ce072434b4..d230f43e607 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -4754,7 +4754,7 @@ update_reduced_frame(WindowObject winobj, int64 pos)
{
char *quantifier = strVal(lfirst(lc1));
- if (*quantifier == '+' || *quantifier == '*')
+ if (*quantifier == '+' || *quantifier == '*' || *quantifier == '?')
{
greedy = true;
break;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index fccc26964a0..fdfe14d54e5 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -16825,7 +16825,21 @@ row_pattern_term:
ColId { $$ = (Node *) makeSimpleA_Expr(AEXPR_OP, "", (Node *)makeString($1), NULL, @1); }
| ColId '*' { $$ = (Node *) makeSimpleA_Expr(AEXPR_OP, "*", (Node *)makeString($1), NULL, @1); }
| ColId '+' { $$ = (Node *) makeSimpleA_Expr(AEXPR_OP, "+", (Node *)makeString($1), NULL, @1); }
- | ColId '?' { $$ = (Node *) makeSimpleA_Expr(AEXPR_OP, "?", (Node *)makeString($1), NULL, @1); }
+/*
+ * '?' quantifier. We cannot write this directly "ColId '?'" because '?' is
+ * not a "self" token. So we let '?' matches Op first then check if it's '?'
+ * or not.
+ */
+ | ColId Op
+ {
+ if (strcmp("?", $2))
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("unsupported quantifier"),
+ parser_errposition(@2));
+
+ $$ = (Node *) makeSimpleA_Expr(AEXPR_OP, "?", (Node *)makeString($1), NULL, @1);
+ }
;
row_pattern_definition_list:
@@ -17982,6 +17996,7 @@ unreserved_keyword:
| DECLARE
| DEFAULTS
| DEFERRED
+ | DEFINE
| DEFINER
| DELETE_P
| DELIMITER
@@ -18402,7 +18417,6 @@ reserved_keyword:
| CURRENT_USER
| DEFAULT
| DEFERRABLE
- | DEFINE
| DESC
| DISTINCT
| DO
diff --git a/src/backend/parser/parse_clause.c b/src/backend/parser/parse_clause.c
index 3521d2780e2..8e4dc732861 100644
--- a/src/backend/parser/parse_clause.c
+++ b/src/backend/parser/parse_clause.c
@@ -3920,7 +3920,7 @@ transformRPR(ParseState *pstate, WindowClause *wc, WindowDef *windef,
if ((wc->frameOptions & FRAMEOPTION_START_CURRENT_ROW) == 0)
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("FRAME must start at current row when row patttern recognition is used")));
+ errmsg("FRAME must start at current row when row pattern recognition is used")));
/* Transform AFTER MACH SKIP TO clause */
wc->rpSkipTo = windef->rpCommonSyntax->rpSkipTo;
@@ -3949,7 +3949,7 @@ transformDefineClause(ParseState *pstate, WindowClause *wc, WindowDef *windef,
List **targetlist)
{
/* DEFINE variable name initials */
- static char *defineVariableInitials = "abcdefghijklmnopqrstuvwxyz";
+ static const char *defineVariableInitials = "abcdefghijklmnopqrstuvwxyz";
ListCell *lc,
*l;
@@ -3959,7 +3959,7 @@ transformDefineClause(ParseState *pstate, WindowClause *wc, WindowDef *windef,
List *defineClause;
char *name;
int initialLen;
- int i;
+ int numinitials;
/*
* If Row Definition Common Syntax exists, DEFINE clause must exist. (the
@@ -4030,8 +4030,12 @@ transformDefineClause(ParseState *pstate, WindowClause *wc, WindowDef *windef,
* equivalent.
*/
restargets = NIL;
+ numinitials = 0;
+ initialLen = strlen(defineVariableInitials);
foreach(lc, windef->rpCommonSyntax->rpDefs)
{
+ char initial[2];
+
restarget = (ResTarget *) lfirst(lc);
name = restarget->name;
@@ -4071,26 +4075,12 @@ transformDefineClause(ParseState *pstate, WindowClause *wc, WindowDef *windef,
name),
parser_errposition(pstate, exprLocation((Node *) r))));
}
- restargets = lappend(restargets, restarget);
- }
- list_free(restargets);
-
- /*
- * Create list of row pattern DEFINE variable name's initial. We assign
- * [a-z] to them (up to 26 variable names are allowed).
- */
- restargets = NIL;
- i = 0;
- initialLen = strlen(defineVariableInitials);
-
- foreach(lc, windef->rpCommonSyntax->rpDefs)
- {
- char initial[2];
- restarget = (ResTarget *) lfirst(lc);
- name = restarget->name;
-
- if (i >= initialLen)
+ /*
+ * Create list of row pattern DEFINE variable name's initial. We
+ * assign [a-z] to them (up to 26 variable names are allowed).
+ */
+ if (numinitials >= initialLen)
{
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -4099,12 +4089,16 @@ transformDefineClause(ParseState *pstate, WindowClause *wc, WindowDef *windef,
parser_errposition(pstate,
exprLocation((Node *) restarget))));
}
- initial[0] = defineVariableInitials[i++];
+ initial[0] = defineVariableInitials[numinitials++];
initial[1] = '\0';
wc->defineInitial = lappend(wc->defineInitial,
makeString(pstrdup(initial)));
+
+ restargets = lappend(restargets, restarget);
}
+ list_free(restargets);
+ /* turns a list of ResTarget's into a list of TargetEntry's */
defineClause = transformTargetList(pstate, windef->rpCommonSyntax->rpDefs,
EXPR_KIND_RPR_DEFINE);
@@ -4120,6 +4114,8 @@ transformDefineClause(ParseState *pstate, WindowClause *wc, WindowDef *windef,
/*
* transformPatternClause
* Process PATTERN clause and return PATTERN clause in the raw parse tree
+ *
+ * windef->rpCommonSyntax must exist.
*/
static void
transformPatternClause(ParseState *pstate, WindowClause *wc,
@@ -4127,11 +4123,7 @@ transformPatternClause(ParseState *pstate, WindowClause *wc,
{
ListCell *lc;
- /*
- * Row Pattern Common Syntax clause exists?
- */
- if (windef->rpCommonSyntax == NULL)
- return;
+ Assert(windef->rpCommonSyntax != NULL);
wc->patternVariable = NIL;
wc->patternRegexp = NIL;
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 9e17abbaec5..3cc0ce720b2 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -1621,7 +1621,7 @@ typedef struct WindowClause
Index winref; /* ID referenced by window functions */
/* did we copy orderClause from refname? */
bool copiedOrder pg_node_attr(query_jumble_ignore);
- /* Row Pattern AFTER MACH SKIP clause */
+ /* Row Pattern AFTER MATCH SKIP clause */
RPSkipTo rpSkipTo; /* Row Pattern Skip To type */
bool initial; /* true if <row pattern initial or seek> is
* initial */
diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h
index 7c60b9b44a8..89dc2a4b95a 100644
--- a/src/include/parser/kwlist.h
+++ b/src/include/parser/kwlist.h
@@ -129,7 +129,7 @@ PG_KEYWORD("default", DEFAULT, RESERVED_KEYWORD, BARE_LABEL)
PG_KEYWORD("defaults", DEFAULTS, UNRESERVED_KEYWORD, BARE_LABEL)
PG_KEYWORD("deferrable", DEFERRABLE, RESERVED_KEYWORD, BARE_LABEL)
PG_KEYWORD("deferred", DEFERRED, UNRESERVED_KEYWORD, BARE_LABEL)
-PG_KEYWORD("define", DEFINE, RESERVED_KEYWORD, BARE_LABEL)
+PG_KEYWORD("define", DEFINE, UNRESERVED_KEYWORD, BARE_LABEL)
PG_KEYWORD("definer", DEFINER, UNRESERVED_KEYWORD, BARE_LABEL)
PG_KEYWORD("delete", DELETE_P, UNRESERVED_KEYWORD, BARE_LABEL)
PG_KEYWORD("delimiter", DELIMITER, UNRESERVED_KEYWORD, BARE_LABEL)
diff --git a/src/test/regress/expected/rpr.out b/src/test/regress/expected/rpr.out
index 59cfed180e7..312b62fb489 100644
--- a/src/test/regress/expected/rpr.out
+++ b/src/test/regress/expected/rpr.out
@@ -165,7 +165,45 @@ SELECT company, tdate, price, first_value(price) OVER w, last_value(price) OVER
company2 | 07-10-2023 | 1300 | | |
(20 rows)
--- basic test with none greedy pattern
+-- basic test using PREV. Use '?'
+SELECT company, tdate, price, first_value(price) OVER w, last_value(price) OVER w,
+ nth_value(tdate, 2) OVER w AS nth_second
+ FROM stock
+ WINDOW w AS (
+ PARTITION BY company
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ INITIAL
+ PATTERN (START UP? DOWN+)
+ DEFINE
+ START AS TRUE,
+ UP AS price > PREV(price),
+ DOWN AS price < PREV(price)
+);
+ company | tdate | price | first_value | last_value | nth_second
+----------+------------+-------+-------------+------------+------------
+ company1 | 07-01-2023 | 100 | 100 | 140 | 07-02-2023
+ company1 | 07-02-2023 | 200 | | |
+ company1 | 07-03-2023 | 150 | | |
+ company1 | 07-04-2023 | 140 | | |
+ company1 | 07-05-2023 | 150 | 150 | 90 | 07-06-2023
+ company1 | 07-06-2023 | 90 | | |
+ company1 | 07-07-2023 | 110 | 110 | 120 | 07-08-2023
+ company1 | 07-08-2023 | 130 | | |
+ company1 | 07-09-2023 | 120 | | |
+ company1 | 07-10-2023 | 130 | | |
+ company2 | 07-01-2023 | 50 | 50 | 1400 | 07-02-2023
+ company2 | 07-02-2023 | 2000 | | |
+ company2 | 07-03-2023 | 1500 | | |
+ company2 | 07-04-2023 | 1400 | | |
+ company2 | 07-05-2023 | 1500 | 1500 | 60 | 07-06-2023
+ company2 | 07-06-2023 | 60 | | |
+ company2 | 07-07-2023 | 1100 | 1100 | 1200 | 07-08-2023
+ company2 | 07-08-2023 | 1300 | | |
+ company2 | 07-09-2023 | 1200 | | |
+ company2 | 07-10-2023 | 1300 | | |
+(20 rows)
+
+-- basic test with none-greedy pattern
SELECT company, tdate, price, count(*) OVER w
FROM stock
WINDOW w AS (
@@ -927,7 +965,7 @@ SELECT company, tdate, price, first_value(price) OVER w, last_value(price) OVER
ERROR: aggregate functions are not allowed in DEFINE
LINE 11: LOWPRICE AS price < count(*)
^
--- FRAME must start at current row when row patttern recognition is used
+-- FRAME must start at current row when row pattern recognition is used
SELECT company, tdate, price, first_value(price) OVER w, last_value(price) OVER w
FROM stock
WINDOW w AS (
@@ -941,7 +979,7 @@ SELECT company, tdate, price, first_value(price) OVER w, last_value(price) OVER
UP AS price > PREV(price),
DOWN AS price < PREV(price)
);
-ERROR: FRAME must start at current row when row patttern recognition is used
+ERROR: FRAME must start at current row when row pattern recognition is used
-- SEEK is not supported
SELECT company, tdate, price, first_value(price) OVER w, last_value(price) OVER w
FROM stock
@@ -977,3 +1015,38 @@ SELECT company, tdate, price, first_value(price) OVER w, last_value(price) OVER
DOWN AS price < PREV(1)
);
ERROR: row pattern navigation operation's argument must include at least one column reference
+-- Unsupported quantifier
+SELECT company, tdate, price, first_value(price) OVER w, last_value(price) OVER w
+ FROM stock
+ WINDOW w AS (
+ PARTITION BY company
+ ORDER BY tdate
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP TO NEXT ROW
+ INITIAL
+ PATTERN (START UP~ DOWN+)
+ DEFINE
+ START AS TRUE,
+ UP AS price > PREV(1),
+ DOWN AS price < PREV(1)
+);
+ERROR: unsupported quantifier
+LINE 9: PATTERN (START UP~ DOWN+)
+ ^
+SELECT company, tdate, price, first_value(price) OVER w, last_value(price) OVER w
+ FROM stock
+ WINDOW w AS (
+ PARTITION BY company
+ ORDER BY tdate
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP TO NEXT ROW
+ INITIAL
+ PATTERN (START UP+? DOWN+)
+ DEFINE
+ START AS TRUE,
+ UP AS price > PREV(1),
+ DOWN AS price < PREV(1)
+);
+ERROR: unsupported quantifier
+LINE 9: PATTERN (START UP+? DOWN+)
+ ^
diff --git a/src/test/regress/sql/rpr.sql b/src/test/regress/sql/rpr.sql
index 47e67334994..ffcf5476198 100644
--- a/src/test/regress/sql/rpr.sql
+++ b/src/test/regress/sql/rpr.sql
@@ -75,7 +75,22 @@ SELECT company, tdate, price, first_value(price) OVER w, last_value(price) OVER
DOWN AS price < PREV(price)
);
--- basic test with none greedy pattern
+-- basic test using PREV. Use '?'
+SELECT company, tdate, price, first_value(price) OVER w, last_value(price) OVER w,
+ nth_value(tdate, 2) OVER w AS nth_second
+ FROM stock
+ WINDOW w AS (
+ PARTITION BY company
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ INITIAL
+ PATTERN (START UP? DOWN+)
+ DEFINE
+ START AS TRUE,
+ UP AS price > PREV(price),
+ DOWN AS price < PREV(price)
+);
+
+-- basic test with none-greedy pattern
SELECT company, tdate, price, count(*) OVER w
FROM stock
WINDOW w AS (
@@ -438,7 +453,7 @@ SELECT company, tdate, price, first_value(price) OVER w, last_value(price) OVER
LOWPRICE AS price < count(*)
);
--- FRAME must start at current row when row patttern recognition is used
+-- FRAME must start at current row when row pattern recognition is used
SELECT company, tdate, price, first_value(price) OVER w, last_value(price) OVER w
FROM stock
WINDOW w AS (
@@ -484,3 +499,34 @@ SELECT company, tdate, price, first_value(price) OVER w, last_value(price) OVER
UP AS price > PREV(1),
DOWN AS price < PREV(1)
);
+
+-- Unsupported quantifier
+SELECT company, tdate, price, first_value(price) OVER w, last_value(price) OVER w
+ FROM stock
+ WINDOW w AS (
+ PARTITION BY company
+ ORDER BY tdate
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP TO NEXT ROW
+ INITIAL
+ PATTERN (START UP~ DOWN+)
+ DEFINE
+ START AS TRUE,
+ UP AS price > PREV(1),
+ DOWN AS price < PREV(1)
+);
+
+SELECT company, tdate, price, first_value(price) OVER w, last_value(price) OVER w
+ FROM stock
+ WINDOW w AS (
+ PARTITION BY company
+ ORDER BY tdate
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP TO NEXT ROW
+ INITIAL
+ PATTERN (START UP+? DOWN+)
+ DEFINE
+ START AS TRUE,
+ UP AS price > PREV(1),
+ DOWN AS price < PREV(1)
+);
Hi Chao,
Any comment on this?
>> 13 - 0005 - nodeWindowAgg.c
>> ```
>> static
>> int
>> do_pattern_match(char *pattern, char *encoded_str, int len)
>> {
>> static regex_t *regcache = NULL;
>> static regex_t preg;
>> static char patbuf[1024]; /* most recent 'pattern' is cached here */
>> ```
>>
>> Using static variable in executor seems something I have never seen, which may persistent data across queries. I
thinkusually per query data are stored in state structures.
>
>Yeah, such a usage maybe rare. But I hesitate to store the data
>(regex_t) in WindowAggState because:
>
>(1) it bloats WindowAggState which we want to avoid if possible
>(2) it requires execnodes.h to include regex.h. (maybe this itself is harmless, I am not sure)
>
>> 14 - 0005 - nodeWindowAgg.c
>> ```
>> + MemoryContext oldContext = MemoryContextSwitchTo(TopMemoryContext);
>> +
>> + if (regcache != NULL)
>> + pg_regfree(regcache); /* free previous re */
>> +
>> + /* we need to convert to char to pg_wchar */
>> + plen = strlen(pattern);
>> + data = (pg_wchar *) palloc((plen + 1) * sizeof(pg_wchar));
>> + data_len = pg_mb2wchar_with_len(pattern, data, plen);
>> + /* compile re */
>> + sts = pg_regcomp(&preg, /* compiled re */
>> + data, /* target pattern */
>> + data_len, /* length of pattern */
>> + cflags, /* compile option */
>> + C_COLLATION_OID /* collation */
>> + );
>> + pfree(data);
>> +
>> + MemoryContextSwitchTo(oldContext);
>> ```
>>
>> Here in do_pattern_match, it switches to top memory context and compiles the re and stores to “preg". Based on the
commentof pg_regcomp:
>> ```
>> /*
>> * pg_regcomp - compile regular expression
>> *
>> * Note: on failure, no resources remain allocated, so pg_regfree()
>> * need not be applied to re.
>> */
>> int
>> pg_regcomp(regex_t *re,
>> const chr *string,
>> size_t len,
>> int flags,
>> Oid collation)
>> ```
>>
>> “preg” should be freed by pg_regfree(), given the memory is allocated in TopMemoryContext, this looks like a memory
leak.
>
>I admit current patch leaves the memory unfreed even after a query
>finishes. What about adding something like:
>
>static void do_pattern_match_end(void)
>{
> if (regcache != NULL)
> pg_regfree(regcache);
>}
>
>and let ExecEndWindowAgg() call it?
Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
> On Nov 27, 2025, at 11:10, Tatsuo Ishii <ishii@postgresql.org> wrote:
>
> Hi Chao,
>
> Any comment on this?
>
>>> 13 - 0005 - nodeWindowAgg.c
>>> ```
>>> static
>>> int
>>> do_pattern_match(char *pattern, char *encoded_str, int len)
>>> {
>>> static regex_t *regcache = NULL;
>>> static regex_t preg;
>>> static char patbuf[1024]; /* most recent 'pattern' is cached here */
>>> ```
>>>
>>> Using static variable in executor seems something I have never seen, which may persistent data across queries. I
thinkusually per query data are stored in state structures.
>>
>> Yeah, such a usage maybe rare. But I hesitate to store the data
>> (regex_t) in WindowAggState because:
>>
>> (1) it bloats WindowAggState which we want to avoid if possible
>> (2) it requires execnodes.h to include regex.h. (maybe this itself is harmless, I am not sure)
With the static function-scope variables, those data persist across queries, which is error prone. I am afraid that
evenif I let this pass, other reviewers might have the same concern.
Maybe define a sub structure, say WindowAggCache, and put a pointer of WindowAggCache in WindowAggState, and only
allocatememory when a query needs to.
>>
>>> 14 - 0005 - nodeWindowAgg.c
>>> ```
>>> + MemoryContext oldContext = MemoryContextSwitchTo(TopMemoryContext);
>>> +
>>> + if (regcache != NULL)
>>> + pg_regfree(regcache); /* free previous re */
>>> +
>>> + /* we need to convert to char to pg_wchar */
>>> + plen = strlen(pattern);
>>> + data = (pg_wchar *) palloc((plen + 1) * sizeof(pg_wchar));
>>> + data_len = pg_mb2wchar_with_len(pattern, data, plen);
>>> + /* compile re */
>>> + sts = pg_regcomp(&preg, /* compiled re */
>>> + data, /* target pattern */
>>> + data_len, /* length of pattern */
>>> + cflags, /* compile option */
>>> + C_COLLATION_OID /* collation */
>>> + );
>>> + pfree(data);
>>> +
>>> + MemoryContextSwitchTo(oldContext);
>>> ```
>>>
>>> Here in do_pattern_match, it switches to top memory context and compiles the re and stores to “preg". Based on the
commentof pg_regcomp:
>>> ```
>>> /*
>>> * pg_regcomp - compile regular expression
>>> *
>>> * Note: on failure, no resources remain allocated, so pg_regfree()
>>> * need not be applied to re.
>>> */
>>> int
>>> pg_regcomp(regex_t *re,
>>> const chr *string,
>>> size_t len,
>>> int flags,
>>> Oid collation)
>>> ```
>>>
>>> “preg” should be freed by pg_regfree(), given the memory is allocated in TopMemoryContext, this looks like a memory
leak.
>>
>> I admit current patch leaves the memory unfreed even after a query
>> finishes. What about adding something like:
>>
>> static void do_pattern_match_end(void)
>> {
>> if (regcache != NULL)
>> pg_regfree(regcache);
>> }
>>
>> and let ExecEndWindowAgg() call it?
>
I’m not sure. But I think if we move the data into WindowAggState, then I guess we don’t have to use TopmemoryContext
here.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
Hi Chao, Sorry, I missed this email. >> 9 - 0002 - parse_clause.c > > I am continuing to review 0003 > > 10 - 0003 > ``` > + Assert(list_length(patternVariables) == list_length(defineClause)); > ``` > > Is this assert true? From what I learned, pattern length doesn’t have to equal to define length. For example, I got anexample from [1]: You are right. If PATTERN clause uses the same pattern variable more than once (which is perfectly valid), the assertion fails. I will remove the assersion and fix the subsequent forboth loop. > ``` > Example 4 > -- This example has three different patterns. > -- Comment two of them, to get error-free query. > SELECT company, price_date, price > FROM stock_price > MATCH_RECOGNIZE ( > partition by company > order by price_date > all rows per match > pattern ( limit_50 up up* down down* ) > define > limit_50 as price <= 50.00, > up as price > prev(price), > down as price < prev(price) > ) > WHERE company = 'B' > ORDER BY price_date; > ``` > > In this example, pattern has 5 elements and define has only 3 elements. Yes. > 11 - 0004 - plannodes.h > ``` > + /* Row Pattern Recognition AFTER MACH SKIP clause */ > + RPSkipTo rpSkipTo; /* Row Pattern Skip To type */ > ``` > > Typo: MACH -> MATCH Will fix/ > I’d stop here, and continue 0005 tomorrow. Thanks! > [1] https://link.springer.com/article/10.1007/s13222-022-00404-3 > > Best regards, > -- > Chao Li (Evan) > HighGo Software Co., Ltd. > https://www.highgo.com/ > > > >
>> On Nov 27, 2025, at 11:10, Tatsuo Ishii <ishii@postgresql.org> wrote:
>>
>> Hi Chao,
>>
>> Any comment on this?
>>
>>>> 13 - 0005 - nodeWindowAgg.c
>>>> ```
>>>> static
>>>> int
>>>> do_pattern_match(char *pattern, char *encoded_str, int len)
>>>> {
>>>> static regex_t *regcache = NULL;
>>>> static regex_t preg;
>>>> static char patbuf[1024]; /* most recent 'pattern' is cached here */
>>>> ```
>>>>
>>>> Using static variable in executor seems something I have never seen, which may persistent data across queries. I
thinkusually per query data are stored in state structures.
>>>
>>> Yeah, such a usage maybe rare. But I hesitate to store the data
>>> (regex_t) in WindowAggState because:
>>>
>>> (1) it bloats WindowAggState which we want to avoid if possible
>>> (2) it requires execnodes.h to include regex.h. (maybe this itself is harmless, I am not sure)
>
> With the static function-scope variables, those data persist across queries, which is error prone. I am afraid that
evenif I let this pass, other reviewers might have the same concern.
>
> Maybe define a sub structure, say WindowAggCache, and put a pointer of WindowAggCache in WindowAggState, and only
allocatememory when a query needs to.
>
>>>
>>>> 14 - 0005 - nodeWindowAgg.c
>>>> ```
>>>> + MemoryContext oldContext = MemoryContextSwitchTo(TopMemoryContext);
>>>> +
>>>> + if (regcache != NULL)
>>>> + pg_regfree(regcache); /* free previous re */
>>>> +
>>>> + /* we need to convert to char to pg_wchar */
>>>> + plen = strlen(pattern);
>>>> + data = (pg_wchar *) palloc((plen + 1) * sizeof(pg_wchar));
>>>> + data_len = pg_mb2wchar_with_len(pattern, data, plen);
>>>> + /* compile re */
>>>> + sts = pg_regcomp(&preg, /* compiled re */
>>>> + data, /* target pattern */
>>>> + data_len, /* length of pattern */
>>>> + cflags, /* compile option */
>>>> + C_COLLATION_OID /* collation */
>>>> + );
>>>> + pfree(data);
>>>> +
>>>> + MemoryContextSwitchTo(oldContext);
>>>> ```
>>>>
>>>> Here in do_pattern_match, it switches to top memory context and compiles the re and stores to “preg". Based on the
commentof pg_regcomp:
>>>> ```
>>>> /*
>>>> * pg_regcomp - compile regular expression
>>>> *
>>>> * Note: on failure, no resources remain allocated, so pg_regfree()
>>>> * need not be applied to re.
>>>> */
>>>> int
>>>> pg_regcomp(regex_t *re,
>>>> const chr *string,
>>>> size_t len,
>>>> int flags,
>>>> Oid collation)
>>>> ```
>>>>
>>>> “preg” should be freed by pg_regfree(), given the memory is allocated in TopMemoryContext, this looks like a
memoryleak.
>>>
>>> I admit current patch leaves the memory unfreed even after a query
>>> finishes. What about adding something like:
>>>
>>> static void do_pattern_match_end(void)
>>> {
>>> if (regcache != NULL)
>>> pg_regfree(regcache);
>>> }
>>>
>>> and let ExecEndWindowAgg() call it?
>>
>
> I’m not sure. But I think if we move the data into WindowAggState, then I guess we don’t have to use TopmemoryContext
here.
I decided to add new fields to WindowAggState:
/* regular expression compiled result cache. Used for RPR. */
char *patbuf; /* pattern to be compiled */
regex_t preg; /* compiled re pattern */
Then allocate the memory for them using
winstate->ss.ps.ps_ExprContext->ecxt_per_query_memory;
Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
>> I just finished reviewing 0007 and 0008. This patch set really demonstrates the full lifecycle of adding a SQL
feature,from changing the syntax in gram.y all the way down to the executor, including tests and docs. I learned a lot
fromit. Thanks!
>
> You are welcome!
>
>> 23 - 0007
>>
>> As you mentioned earlier, pattern regular expression support *, + and ?, but I don’t see ? is tested.
>
> Good catch. I will add the test case. In the mean time I found a bug
> with gram.y part which handles '?' quantifier. gram.y can be
> successfully compiled but there's no way to put '?' quantifier in a
> SQL statement. We cannot write directly "ColId '?'" like other
> quantifier '*' and '+' in the grammer because '?' is not a "self"
> token. So we let '?' matches Op first then check if it's '?' or
> not.
>
> | ColId Op
> {
> if (strcmp("?", $2))
> ereport(ERROR,
> errcode(ERRCODE_SYNTAX_ERROR),
> errmsg("unsupported quantifier"),
> parser_errposition(@2));
>
> $$ = (Node *) makeSimpleA_Expr(AEXPR_OP, "?", (Node *)makeString($1), NULL, @1);
> }
>
> Another bug was with executor (nodeWindowAgg.c). The code to check
> greedy quantifiers missed the case '?'.
>
>> 24 - 0007
>>
>> I don’t see negative tests for unsupported quantifiers, like PATTERN (A+?).
>
> I will add such test cases.
>
>> 25 - 0007
>> ```
>> +-- basic test with none greedy pattern
>> ```
>>
>> Typo: “none greedy” -> non-greedy
>
> Will fix.
>
>> No comment for 0008.
>
> Thanks again for the review. I will post v35 patch set soon. Attached
> is the diff from v34.
Attached are the v35 patches for Row pattern recognition. Chao Li
reviewed v34 thoroughly. Thank you! v35 reflects the review
comments. Major differences from v34 include:
- Make "DEFINE" an unreserved keyword. Previously it was a reserved keyword.
- Refactor transformDefineClause() to make two foreach loops into single foreach loop.
- Make '?' quantifier in PATTERN work as advertised. Test for '?' quantifier is added too.
- Unsupported quantifier test added.
- Fix get_rule_define().
- Fix memory leak related to regcomp.
- Move regcomp compiled result cache from static data to WindowAggState.
- Fix several typos.
Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
Вложения
- v35-0001-Row-pattern-recognition-patch-for-raw-parser.patch
- v35-0002-Row-pattern-recognition-patch-parse-analysis.patch
- v35-0003-Row-pattern-recognition-patch-rewriter.patch
- v35-0004-Row-pattern-recognition-patch-planner.patch
- v35-0005-Row-pattern-recognition-patch-executor.patch
- v35-0006-Row-pattern-recognition-patch-docs.patch
- v35-0007-Row-pattern-recognition-patch-tests.patch
- v35-0008-Row-pattern-recognition-patch-typedefs.list.patch