Re: Fix bug in multixact Oldest*MXactId initialization and access

Поиск
Список
Период
Сортировка
От Heikki Linnakangas
Тема Re: Fix bug in multixact Oldest*MXactId initialization and access
Дата
Msg-id 366b6f95-613d-4ef8-82e4-bb2fd06d9c4d@iki.fi
обсуждение исходный текст
Ответ на Re: Fix bug in multixact Oldest*MXactId initialization and access  (Yura Sokolov <y.sokolov@postgrespro.ru>)
Ответы Re: Fix bug in multixact Oldest*MXactId initialization and access
Список pgsql-hackers
On 27/02/2026 07:00, Yura Sokolov wrote:
> 26.02.2026 23:44, Sami Imseih пишет:
>>> (Sorry, I've used Google Translate to write this sentence).
>>>
>>> When you write assert, you protect yourself from shooting your leg far in
>>> the future. Believe me.
>>
>> The issue to me seems that we a few code paths that rely on the
>> total proc calculation in several places, and changing one and
>> forgetting to change another is what broke things.
> 
> If there were asserts, then tests would have failed.
> Period.
> 
> There would no the bug. There would no this discussion.
> If only array bounds were checked with asserts.

Yeah, more asserts == good, usually.

Here's my version of this. It's the same basic idea, some minor changes:

- Instead of the MaxChildren, TotalProcs, TotalXactProcs variables, I 
added FIRST_PREPARED_XACT_PROC_NUMBER. Rationale: The variables still 
required you to know the layout of the PGPROCs, i.e. you still had to 
know that aux processes come after regular backends and dummy pgprocs 
after aux processes. An FIRST_PREPARED_XACT_PROC_NUMBER is more explicit 
in the places where it's needed.

   I have been thinking of adding variables like that anyway, because 
the current calculations with MaxBackends et al. are just so confusing. 
So I'm vaguely in favor of doing something like that in the future. But 
I think FIRST_PREPARED_XACT_PROC_NUMBER is more clear for this patch, 
and those variable names as done here (MaxBackends, MaxChildren, 
Totalprocs and TotalXactProcs) were still super confusing.

- I added a slightly different the set of Getter/Setter functions: 
MyOldestMemberMXactIdSlot(), 
PreparedXactOldestMemberMXactIdSlot(ProcNumber), and 
MyOldestVisibleMXactIdSlot(). I think this is slightly less error-prone, 
making it harder to pass proc number to wrong function (although the 
assertions would catch such misuses pretty quick anyway). And these 
functions return a pointer to the slot, so we don't need separate getter 
and setter functions. That's a matter of taste, having a little less 
code looks nicer to me.

What do you think?

I've been pondering what kind of issues this bug could cause. Because 
the OldestVisibleMXactId array is allocated just after the 
OldestMemberMXactId array, you don't get a buffer overflow, but a 
prepared xact can overwrite a backend's value in the 
OldestVisibleMXactId array. That can surely cause trouble later, but not 
sure what exactly, and it seems pretty hard to write a repro script for it.

- Heikki

Вложения

В списке pgsql-hackers по дате отправления: