Re: Do we want a hashset type?

Поиск
Список
Период
Сортировка
От Joel Jacobson
Тема Re: Do we want a hashset type?
Дата
Msg-id 34881a04-d5ee-4b63-9548-374b350c87d6@app.fastmail.com
обсуждение исходный текст
Ответ на Re: Do we want a hashset type?  (jian he <jian.universality@gmail.com>)
Список pgsql-hackers
On Thu, Jun 15, 2023, at 06:29, jian he wrote:
> I am not sure the following results are correct.
> with cte as (
>     select hashset(x) as x
>             ,hashset_capacity(hashset(x))
>             ,hashset_count(hashset(x))
>     from generate_series(1,10) g(x))
> select *
>         ,'|' as delim
>         , hashset_add(x,11111::int)
>         ,hashset_capacity(hashset_add(x,11111::int))
>         ,hashset_count(hashset_add(x,11111::int))
> from    cte \gx
>
>
> results:  
> -[ RECORD 1 ]----+-----------------------------
> x                | {8,1,10,3,9,4,6,2,11111,5,7}
> hashset_capacity | 64
> hashset_count    | 10
> delim            | |
> hashset_add      | {8,1,10,3,9,4,6,2,11111,5,7}
> hashset_capacity | 64
> hashset_count    | 11

Nice catch, you found a bug!

Fixed in attached patch:

---
Ensure hashset_add and hashset_merge operate on copied data

Previously, the hashset_add() and hashset_merge() functions were
modifying the original hashset in-place. This was leading to unexpected
results because the original data in the hashset was being altered.

This commit introduces the macro PG_GETARG_INT4HASHSET_COPY(), ensuring
a copy of the hashset is created and modified, leaving the original
hashset untouched.

This adjustment ensures hashset_add() and hashset_merge() operate
correctly on the copied hashset and prevent modification of the
original data.

A new regression test file `reported_bugs.sql` has been added to
validate the proper functionality of these changes. Future reported
bugs and their corresponding tests will also be added to this file.
---

I wonder if this function:

static int4hashset_t *
int4hashset_copy(int4hashset_t *src)
{
    return src;
}

...that was previously named hashset_copy(),
should be implemented to actually copy the struct,
instead of just returning the input?

It is being used by int4hashset_agg_combine() like this:

/* copy the hashset into the right long-lived memory context */
oldcontext = MemoryContextSwitchTo(aggcontext);
src = int4hashset_copy(src);
MemoryContextSwitchTo(oldcontext);

/Joel
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Memory leak in incremental sort re-scan
Следующее
От: Masahiko Sawada
Дата:
Сообщение: Re: subscription/033_run_as_table_owner is not listed in the meson.build