Re: Row pattern recognition
От | Jacob Champion |
---|---|
Тема | Re: Row pattern recognition |
Дата | |
Msg-id | 022a392f-2a9e-3df1-a509-a5bd51160fa5@timescale.com обсуждение исходный текст |
Ответ на | Re: Row pattern recognition (Tatsuo Ishii <ishii@sraoss.co.jp>) |
Ответы |
Re: Row pattern recognition
(Vik Fearing <vik@postgresfriends.org>)
Re: Row pattern recognition (Tatsuo Ishii <ishii@sraoss.co.jp>) Re: Row pattern recognition (Tatsuo Ishii <ishii@sraoss.co.jp>) |
Список | pgsql-hackers |
Hi Ishii-san, On 7/19/23 22:15, Tatsuo Ishii wrote: > Currently my patch has a limitation for the sake of simple > implementation: a pattern like "A+" is parsed and analyzed in the raw > parser. This makes subsequent process much easier because the pattern > element variable (in this case "A") and the quantifier (in this case > "+") is already identified by the raw parser. However there are much > more cases are allowed in the standard as you already pointed out. For > those cases probably we should give up to parse PATTERN items in the > raw parser, and instead the raw parser just accepts the elements as > Sconst? Is there a concern that the PATTERN grammar can't be represented in Bison? I thought it was all context-free... Or is the concern that the parse tree of the pattern is hard to feed into a regex engine? > Any comments, especially on the PREV/NEXT implementation part is > welcome. Currently the DEFINE expression like "price > PREV(price)" is > prepared in ExecInitWindowAgg using ExecInitExpr,tweaking var->varno > in Var node so that PREV uses OUTER_VAR, NEXT uses INNER_VAR. Then > evaluate the expression in ExecWindowAgg using ExecEvalExpr, setting > previous row TupleSlot to ExprContext->ecxt_outertuple, and next row > TupleSlot to ExprContext->ecxt_innertuple. I think this is temporary > hack and should be gotten ride of before v1 is committed. Better idea? I'm not familiar enough with this code yet to offer very concrete suggestions, sorry... But at some point in the future, we need to be able to skip forward and backward from arbitrary points, like DEFINE B AS B.price > PREV(FIRST(A.price), 3) so there won't be just one pair of "previous and next" tuples. Maybe that can help clarify the design? It feels like it'll need to eventually be a "real" function that operates on the window state, even if it doesn't support all of the possible complexities in v1. -- Taking a closer look at the regex engine: It looks like the + qualifier has trouble when it meets the end of the frame. For instance, try removing the last row of the 'stock' table in rpr.sql; some of the final matches will disappear unexpectedly. Or try a pattern like PATTERN ( a+ ) DEFINE a AS TRUE which doesn't seem to match anything in my testing. There's also the issue of backtracking in the face of reclassification, as I think Vik was alluding to upthread. The pattern PATTERN ( a+ b+ ) DEFINE a AS col = 2, b AS col = 2 doesn't match a sequence of values (2 2 ...) with the current implementation, even with a dummy row at the end to avoid the end-of-frame bug. (I've attached two failing tests against v2, to hopefully better illustrate, along with what I _think_ should be the correct results.) I'm not quite understanding the match loop in evaluate_pattern(). It looks like we're building up a string to pass to the regex engine, but by the we call regexp_instr, don't we already know whether or not the pattern will match based on the expression evaluation we've done? Thanks, --Jacob
Вложения
В списке pgsql-hackers по дате отправления: