Re: BUG #17947: Combination of replslots pgstat issues causes error/assertion failure

From Kyotaro Horiguchi
Subject Re: BUG #17947: Combination of replslots pgstat issues causes error/assertion failure
Date
Msg-id 20230601.084245.1524588196504400333.horikyota.ntt@gmail.com
In reply to BUG #17947: Combination of replslots pgstat issues causes error/assertion failure  (PG Bug reporting form <noreply@postgresql.org>)
Responses Re: BUG #17947: Combination of replslots pgstat issues causes error/assertion failure  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Re: BUG #17947: Combination of replslots pgstat issues causes error/assertion failure  (Andres Freund <andres@anarazel.de>)
List pgsql-bugs
At Fri, 26 May 2023 11:00:01 +0000, PG Bug reporting form <noreply@postgresql.org> wrote in 
> The following bug has been logged on the website:
> 
> Bug reference:      17947
> Logged by:          Alexander Lakhin
> Email address:      exclusion@gmail.com
> PostgreSQL version: 15.3
> Operating system:   Ubuntu 22.04
> Description:        
> 
> The following script:

I was able to reproduce it here.

It looks like pgstat_get_entry_ref_cached() returned a faulty
reference, one that points to a shared entry that has already been
reinitialized for another replication slot. In the problem scenario,
the first backend successfully reuses the entry that was meant to be
dropped, which its cached entry still points to, and then drops it
again. When the second backend obtains a cached entry for another
replication slot, the function returns an entry that points to the
same shared entry as the first backend's. Consequently, the two
backends end up sharing one shared stats entry for two different
slots.
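
To make the hazard concrete, here is a toy model of it, not
PostgreSQL code. Every name in it (ToySharedEntry, ToyCachedRef,
toy_drop, toy_reinit, toy_deref) is hypothetical, and it ignores
locking, refcounts, and the hash tables entirely. It only shows that a
cached pointer handed back without validation follows the shared entry
to whatever slot the entry has been reinitialized for.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a shared stats entry (not PostgreSQL's struct). */
typedef struct ToySharedEntry
{
    int  slot_id;   /* which replication slot the entry currently describes */
    bool dropped;   /* entry has been marked as dropped */
    long counter;   /* some accumulated statistic */
} ToySharedEntry;

/* Hypothetical stand-in for a backend-local cached reference. */
typedef struct ToyCachedRef
{
    ToySharedEntry *shared;
} ToyCachedRef;

/* Mark the shared entry as dropped; cached references are not told. */
static void
toy_drop(ToySharedEntry *e)
{
    e->dropped = true;
}

/* Reuse a dropped entry for a different slot. */
static void
toy_reinit(ToySharedEntry *e, int new_slot_id)
{
    assert(e->dropped);
    e->slot_id = new_slot_id;
    e->dropped = false;
    e->counter = 0;
}

/* Return the shared entry behind a cached reference, with no validation. */
static ToySharedEntry *
toy_deref(const ToyCachedRef *ref)
{
    return ref->shared;
}
```

If one reference is cached while the entry describes slot 1, and the
entry is then dropped and reinitialized for slot 2, toy_deref() on the
old reference returns the slot 2 entry, so statistics meant for slot 1
would land in slot 2's entry.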

The attached ad-hoc patch appears to work for this specific scenario.
(It may well have defects, including possible shared entry leaks.) We
need to find a better approach to prevent the reuse of an
already-reinitialized entry.  I believe it can be fixed by adding a
reuse count to both the cached entry and the shared entry; we could
then compare the two counts to verify the cached entry. However, for
now I can't think of a solution that wouldn't require additional
struct members.  Thus I'm not sure how to fix this for older versions
without them.
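
The reuse-count idea could be sketched roughly as below. This is an
illustration under my own assumptions, not the attached patch and not
PostgreSQL's data structures: DemoSharedEntry, DemoCachedRef,
demo_cache_ref, demo_reinit and demo_ref_is_current are all made-up
names, and the sketch is single-threaded, whereas a real version would
have to read and bump the count under the appropriate lock or with
atomics.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical shared entry carrying a reuse count (an assumed new member). */
typedef struct DemoSharedEntry
{
    uint32_t reuse_count;   /* bumped each time the entry is reinitialized */
    bool     dropped;
    int      slot_id;
} DemoSharedEntry;

/* Hypothetical cached reference remembering the count it was created with. */
typedef struct DemoCachedRef
{
    DemoSharedEntry *shared;
    uint32_t         reuse_count;   /* snapshot of shared->reuse_count */
} DemoCachedRef;

/* Create a cached reference, recording the entry's current reuse count. */
static DemoCachedRef
demo_cache_ref(DemoSharedEntry *e)
{
    DemoCachedRef ref = { e, e->reuse_count };

    return ref;
}

/* Reuse a dropped entry for another slot, invalidating older references. */
static void
demo_reinit(DemoSharedEntry *e, int new_slot_id)
{
    assert(e->dropped);
    e->reuse_count++;
    e->dropped = false;
    e->slot_id = new_slot_id;
}

/* A cached reference is current only if the two counts still match. */
static bool
demo_ref_is_current(const DemoCachedRef *ref)
{
    return ref->reuse_count == ref->shared->reuse_count;
}
```

A lookup in the local cache would call demo_ref_is_current() first and
throw the cached reference away on a mismatch, instead of handing back
a pointer to an entry that now belongs to another slot.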

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments
