Re: Row pattern recognition

Поиск
Список
Период
Сортировка
От Tatsuo Ishii
Тема Re: Row pattern recognition
Дата
Msg-id 20251124.135725.1126381089646688370.ishii@postgresql.org
обсуждение исходный текст
Ответ на Re: Row pattern recognition  (Chao Li <li.evan.chao@gmail.com>)
Список pgsql-hackers
Hi Chao,

Thank you for the review!

>> On Nov 20, 2025, at 15:33, Chao Li <li.evan.chao@gmail.com> wrote:
>> 
>> 
>> I’d stop here, and continue 0005 tomorrow.
>> 
> 
> I reviewed 0005 today. I'm still not very familiar with the executor code, so going through 0005 has also been a
valuablelearning process for me.
 
> 
> 12 - 0005 - nodeWindowAgg.c
> ```
> +    if (string_set_get_size(str_set) == 0)
> +    {
> +        /* no match found in the first row */
> +        register_reduced_frame_map(winstate, original_pos, RF_UNMATCHED);
> +        destroyStringInfo(encoded_str);
> +        return;
> +    }
> ```
> 
> encoded_str should also be destroyed if not entering the “if” clause.

Subsequent string_set_discard() will free encode_str since encoded_str
is part of str_set. So no need to call destroyStringInfo(encoded_str)
in not entering "if" clause.

    string_set_discard(str_set);

So I think this is Ok.

> 13 - 0005 - nodeWindowAgg.c
> ```
> static
> int
> do_pattern_match(char *pattern, char *encoded_str, int len)
> {
> static regex_t *regcache = NULL;
> static regex_t preg;
> static char patbuf[1024]; /* most recent 'pattern' is cached here */
> ```
> 
> Using static variable in executor seems something I have never seen, which may persistent data across queries. I
thinkusually per query data are stored in state structures.
 

Yeah, such a usage maybe rare. But I hesitate to store the data
(regex_t) in WindowAggState because:

(1) it bloats WindowAggState which we want to avoid if possible
(2) it requires execnodes.h to include regex.h. (maybe this itself is harmless, I am not sure)

> 14 - 0005 - nodeWindowAgg.c
> ```
> +        MemoryContext oldContext = MemoryContextSwitchTo(TopMemoryContext);
> +
> +        if (regcache != NULL)
> +            pg_regfree(regcache);    /* free previous re */
> +
> +        /* we need to convert to char to pg_wchar */
> +        plen = strlen(pattern);
> +        data = (pg_wchar *) palloc((plen + 1) * sizeof(pg_wchar));
> +        data_len = pg_mb2wchar_with_len(pattern, data, plen);
> +        /* compile re */
> +        sts = pg_regcomp(&preg, /* compiled re */
> +                         data,    /* target pattern */
> +                         data_len,    /* length of pattern */
> +                         cflags,    /* compile option */
> +                         C_COLLATION_OID    /* collation */
> +            );
> +        pfree(data);
> +
> +        MemoryContextSwitchTo(oldContext);
> ```
> 
> Here in do_pattern_match, it switches to top memory context and compiles the re and stores to “preg". Based on the
commentof pg_regcomp:
 
> ```
> /*
> * pg_regcomp - compile regular expression
> *
> * Note: on failure, no resources remain allocated, so pg_regfree()
> * need not be applied to re.
> */
> int
> pg_regcomp(regex_t *re,
> const chr *string,
> size_t len,
> int flags,
> Oid collation)
> ```
> 
> “preg” should be freed by pg_regfree(), given the memory is allocated in TopMemoryContext, this looks like a memory
leak.

I admit current patch leaves the memory unfreed even after a query
finishes. What about adding something like:

static void do_pattern_match_end(void)
{
    if (regcache != NULL)
        pg_regfree(regcache);
}

and let ExecEndWindowAgg() call it?

> Okay, I’d stop here and continue to review 0006 next week.
> 
> Best regards,
> --
> Chao Li (Evan)
> HighGo Software Co., Ltd.
> https://www.highgo.com/
> 
> 
> 
> 



В списке pgsql-hackers по дате отправления: