Re: Row pattern recognition

From Jacob Champion
Subject Re: Row pattern recognition
Date
Msg-id CAAWbhmjq3NY1+Am-QHJ4AFh7mi=2eiiGqj518f3-j-C3EfffPg@mail.gmail.com
In reply to Re: Row pattern recognition  (Tatsuo Ishii <ishii@sraoss.co.jp>)
Responses Re: Row pattern recognition  (Tatsuo Ishii <ishii@sraoss.co.jp>)
List pgsql-hackers
On Sat, Sep 9, 2023 at 4:21 AM Tatsuo Ishii <ishii@sraoss.co.jp> wrote:
> Then we will get for str_set:
> r0: B
> r1: AB
>
> Because r0 only has classifier B, r1 can have A and B.  Problem is,
> r2. If we choose A at r1, then r2 = B. But if we choose B at r1, then
> r2 = AB. I guess this is the issue you pointed out.

Right.
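
To make the r2 problem concrete, here is a toy sketch. It is not taken
from the patch: the candidates() function is hypothetical and simply
hard-codes the three rows from the example above. It shows that once a
row's candidate set depends on the classifier chosen for the previous
row, a flat per-row set (r2 = AB) admits strings that cannot occur.

```python
def candidates(row, prev_choice):
    """Hypothetical stand-in for DEFINE evaluation on rows r0..r2."""
    if row == 0:
        return {"B"}
    if row == 1:
        return {"A", "B"}
    # r2 depends on what was chosen for r1.
    return {"B"} if prev_choice == "A" else {"A", "B"}


def enumerate_paths(n_rows):
    """Enumerate every reachable classifier string, row by row."""
    paths = [""]
    for row in range(n_rows):
        extended = []
        for path in paths:
            prev = path[-1] if path else None
            for var in sorted(candidates(row, prev)):
                extended.append(path + var)
        paths = extended
    return paths


paths = enumerate_paths(3)
```

Here paths is ["BAB", "BBA", "BBB"]. A per-row set of {B}, {A,B}, {A,B}
would also imply "BAA", which is not reachable.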

> Yeah, probably we have to delay evaluation of such pattern variables like
> A, then reevaluate A after the first scan.
>
> What about leaving this (reevaluation) for now? Because:
>
> 1) we don't have CLASSIFIER
> 2) we don't allow CLASSIFIER to be given to PREV as its argument
>
> so I think we don't need to worry about this for now.

Sure. I'm all for deferring features to make it easier to iterate; I
just want to make sure the architecture doesn't hit a dead end. Or at
least, not without being aware of it.

Also: is CLASSIFIER the only way to run into this issue?

> What if we don't follow the standard, instead we follow POSIX EREs?  I
> think this is better for users unless RPR's REs have significant merit
> for users.

Piggybacking off of what Vik wrote upthread, I think we would not be
doing ourselves any favors by introducing a non-compliant
implementation that performs worse than a traditional NFA. Those would
be some awful bug reports.

> > - I think we have to implement a parallel parser regardless (RPR PATTERN
> > syntax looks incompatible with POSIX)
>
> I am not sure if we need to worry about this because of the reason I
> mentioned above.

Even if we adopted POSIX NFA semantics, we'd still have to implement
our own parser for the PATTERN part of the query. I don't think
there's a good way for us to reuse the parser in src/backend/regex.
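
As a rough illustration of why the two grammars differ, here is a
minimal tokenizer sketch. It is an assumption-laden toy, not proposed
code: tokenize_pattern() is a hypothetical name, and it covers only
identifiers plus the operators + * ? | ( ). The point is that the atoms
in PATTERN are whole variable names, whereas a POSIX ERE parser would
read "UP+" as U followed by one or more P.

```python
import re

# One token: optional leading whitespace, then a variable name or an operator.
_TOKEN = re.compile(r"\s*(?:(?P<var>[A-Za-z_][A-Za-z0-9_]*)|(?P<op>[+*?|()]))")


def tokenize_pattern(text):
    """Split a PATTERN body into (kind, value) tokens."""
    text = text.rstrip()
    tokens = []
    pos = 0
    while pos < len(text):
        m = _TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected input at or after offset {pos}")
        if m.lastgroup == "var":
            tokens.append(("VAR", m.group("var")))
        else:
            tokens.append(("OP", m.group("op")))
        pos = m.end()
    return tokens


tokens = tokenize_pattern("START UP+ DOWN+")
```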

> > Does that seem like a workable approach? (Worst-case, my code is just
> > horrible, and we throw it in the trash.)
>
> Yes, it seems workable. I think the first cut of RPR needs at
> least the + quantifier with reasonable performance. The current naive
> implementation seems to have an issue because of exhaustive search.

+1
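
For comparison with exhaustive search, a minimal sketch of set-of-states
NFA simulation for a sequence pattern with + quantifiers. This is only
an illustration under simplifying assumptions, not the patch's
algorithm: longest_match() is a hypothetical name, each row is given as
the set of variables whose DEFINE condition holds for it (so conditions
are assumed independent of earlier choices), and it returns the longest
match length rather than applying the standard's preference rules.

```python
def longest_match(pattern, rows):
    """pattern: list of (variable, is_plus); rows: list of sets of variables.

    State i means "the next row must match pattern[i]". Work per row is
    bounded by the number of states, not by the number of candidate strings.
    """
    accept = len(pattern)
    states = {0}
    best = 0
    for n, row_vars in enumerate(rows, start=1):
        next_states = set()
        for i in states:
            if i == accept:
                continue  # nothing left to match from the accepting state
            var, is_plus = pattern[i]
            if var in row_vars:
                next_states.add(i + 1)   # move on to the next element
                if is_plus:
                    next_states.add(i)   # or stay and repeat this element
        if not next_states:
            break
        if accept in next_states:
            best = n
        states = next_states
    return best


# PATTERN (A+ B) over four rows.
length = longest_match([("A", True), ("B", False)],
                       [{"A"}, {"A", "B"}, {"B"}, {"A"}])
```

Here length is 3: the first three rows match as A A B (or A B is found
after two rows, then extended).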

Thanks!
--Jacob


