Re: IPC/MultixactCreation on the Standby server

From Heikki Linnakangas
Subject Re: IPC/MultixactCreation on the Standby server
Msg-id 6eb048a2-239d-4a47-984c-7e5f5e826cc5@iki.fi
In response to Re: IPC/MultixactCreation on the Standby server  (Andrey Borodin <x4mmm@yandex-team.ru>)
Responses Re: IPC/MultixactCreation on the Standby server
Re: IPC/MultixactCreation on the Standby server
List pgsql-hackers
On 30/11/2025 14:15, Andrey Borodin wrote:
> On 29 Nov 2025, at 00:51, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> I didn't understand why the 'kill9' and 'poll_start' stuff is
>> needed. We have plenty of tests that kill the server with regular
>> "$node->stop('immediate')", and restart the server normally. The
>> checkpoint in the middle of the tests seems unnecessary too. I
>> removed all that, and the test still seems to work. Was there a
>> particular reason for them?
> 
> With the current shutdown sequence, the test seems to reproduce the
> corruption without checkpointing. I recollect that in July the standby
> deadlock was reachable without a checkpoint, but the corruption was
> not. But now it seems the test is working.

Ok.

>> I moved the wraparound test to a separate test file and commit.
>> More test coverage is good, but it's quite separate from the
>> bugfix and the wraparound related test shares very little with the
>> other test. The wraparound test needs a little more cleanup: use
>> plain perl instead of 'dd' and 'rm' for the file operations, for
>> example. (I did that with the tests in the 64-bit mxoff patches,
>> so we could copy from there.)
> 
> PFA test version without dd and rm.

Thanks! I will focus on the main patch and TAP test now, but will commit 
the wraparound test separately afterwards. At a quick glance, it looks 
good now.

> Did I get you right that we do not backport the wraparound test, but
> do backport the fixes for the 001_multixact.pl test down to 17, where
> it appeared?
Yes, that's my plan. Except that 001_multixact.pl appeared in v18, not v17.

> First two patches are v13 intact, second pair is my suggestions.

Thanks, here's a new set of patches, now with backpatched versions for 
all the branches. As you said, there were a number of differences 
between branches:

- On master, don't include the compatibility hacks for reading WAL 
generated with older minor versions, because WAL is not compatible 
across major versions anyway.

- REL_18_STABLE didn't have the SimpleLruZeroAndWritePage() function 
(introduced in commit c616785516).

- REL_17_STABLE didn't have the 001_multixact.pl TAP test. So I didn't 
backport the new TAP test to v17 and below either.

- REL_16_STABLE used 32-bit SLRU page numbers, didn't have bank locks, 
and used a simple sleep-loop instead of the condition variable.

- REL_15_STABLE and REL_14_STABLE: no additional conflicts on top of the 
REL_16_STABLE version
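
To illustrate the sleep-loop versus condition-variable difference mentioned 
for REL_16_STABLE, here is a minimal sketch in C. It uses POSIX threads as 
a stand-in and is not the PostgreSQL code: OffsetState, writer_thread, 
wait_sleep_loop, wait_condvar and run_demo are all hypothetical names, and 
PostgreSQL's own ConditionVariable API looks different.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical shared state: "has the value we are waiting for been written?" */
typedef struct
{
    pthread_mutex_t lock;
    pthread_cond_t  cv;
    bool            offset_ready;
} OffsetState;

/* Sleep for the given number of milliseconds. */
static void
sleep_ms(long ms)
{
    struct timespec ts = {ms / 1000, (ms % 1000) * 1000000L};

    nanosleep(&ts, NULL);
}

/* Writer side: publish the value, then wake any condition-variable waiters. */
static void *
writer_thread(void *arg)
{
    OffsetState *st = arg;

    sleep_ms(10);               /* simulate work before the value is ready */
    pthread_mutex_lock(&st->lock);
    st->offset_ready = true;
    pthread_cond_broadcast(&st->cv);
    pthread_mutex_unlock(&st->lock);
    return NULL;
}

/* Sleep-loop style: poll the flag, sleeping 1 ms between checks. */
static int
wait_sleep_loop(OffsetState *st)
{
    int polls = 0;

    for (;;)
    {
        bool ready;

        pthread_mutex_lock(&st->lock);
        ready = st->offset_ready;
        pthread_mutex_unlock(&st->lock);
        polls++;
        if (ready)
            return polls;       /* number of checks it took */
        sleep_ms(1);
    }
}

/* Condition-variable style: block until the writer signals. */
static void
wait_condvar(OffsetState *st)
{
    pthread_mutex_lock(&st->lock);
    while (!st->offset_ready)   /* re-check: wakeups can be spurious */
        pthread_cond_wait(&st->cv, &st->lock);
    pthread_mutex_unlock(&st->lock);
}

/* Run one writer and one waiter; return whether the waiter saw the flag. */
static bool
run_demo(bool use_condvar)
{
    OffsetState st;
    pthread_t   writer;
    int         rc;

    pthread_mutex_init(&st.lock, NULL);
    pthread_cond_init(&st.cv, NULL);
    st.offset_ready = false;

    rc = pthread_create(&writer, NULL, writer_thread, &st);
    assert(rc == 0);

    if (use_condvar)
        wait_condvar(&st);
    else
        (void) wait_sleep_loop(&st);

    pthread_join(writer, NULL);
    return st.offset_ready;
}
```

Both variants return only once the flag is set. The sleep-loop wakes up 
repeatedly and can notice the change up to one sleep interval late, while 
the condition-variable waiter is woken by the writer directly.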

All of those conflicts were pretty straightforward to handle, but it's 
enough code churn for silly mistakes to slip in, especially when the TAP 
test didn't apply. So if you have a chance, please help to review and 
test each of these backpatched versions too.

In addition to the backpatching, I did some more cosmetic cleanups to 
the TAP test.

- Heikki
