Re: pg_rewind with cascade standby doesn't work well

Поиск

Список

Период

Сортировка

От	Kuwamura Masaki
Тема	Re: pg_rewind with cascade standby doesn't work well
Дата	20 сентября 2023 г. 05:46:45
Msg-id	CAMyC8qryE7mKyvPvGHCt5GpANAmp8sS_tRbraqXcPBx14viy6g@mail.gmail.com обсуждение исходный текст
Ответ на	Re: pg_rewind with cascade standby doesn't work well (Michael Paquier <michael@paquier.xyz>)
Ответы	Re: pg_rewind with cascade standby doesn't work well (Michael Paquier <michael@paquier.xyz>) Re: pg_rewind with cascade standby doesn't work well (Aleksander Alekseev <aleksander@timescale.com>)
Список	pgsql-hackers

Дерево обсуждения

>> IMO a test is needed that makes sure no one is going to break this in
>> the future.
>
> You definitely need more complex test scenarios for that. If you can
> come up with new ways to make the TAP tests of pg_rewind mode modular
> in handling more complicated node setups, that would be a nice
> addition, for example.

I'm sorry for lacking tests. For now, I started off with a simple test
that cause the problem I mentioned. The updated WIP patch 0001 includes
the new test for pg_rewind.
And also, I'm afraid that I'm not sure what kind of tests I have to make
for fix this behavior. Would you mind giving me some advice?

> This patch is at least incorrect in its handling of crash recovery,
> because these two should *never* be set in this case as we want to
> replay up to the end of WAL. For example, see xlog.c or the top of
> xlogrecovery.c about the assumptions behind these variables:
> /* crash recovery should always recover to the end of WAL */
> ControlFile->minRecoveryPoint = InvalidXLogRecPtr;
> ControlFile->minRecoveryPointTLI = 0;
>
> If an end-of-recovery record is replayed during crash recovery, these
> assumptions are plain broken.

That make sense! I really appreciate your knowledgeable review.

> One thing that we could consider is to be more aggressive with
> restartpoints when replaying this record for a standby, see a few
> lines above the lines added by your patch, for example. And we could
> potentially emulate a post-promotion restart point to get a refresh of
> the control file as it should, with the correct code paths involved in
> the updates of minRecoveryPoint when the checkpointer does the job.

I'm not confident but you meant we could make restartpoint
(i.e., call `RequestCheckpoint()`) instead of my old patch?
The patch 0001 also contains my understanding.

I also found a bug (maybe). If we call `CreateRestartPoint()` during
crash-recovery, the assertion fails at ComputeXidHorizon() in procarray.c.
It's inherently orthogonal to the problem I already reported. So you can
reproduce this at HEAD with this procedure.

1. Start primary and standby server
2. Modify checkpoint_timeout to 1h on standby
3. Insert 10^10 records and concurrently run CHECKPOINT every second on primary
4. Do an immediate stop on both standby and primary at the end of the insert
5. Modify checkpoint_timeout to 30 on standby
6. Remove standby.signal on standby
7. Restart standby (it will start crash-recovery)
8. Assertion failure is raised shortly

I think this is because `TruncateSUBTRANS();` in `CreateRestartPoint()` is
called but `StartupSUBTRANS()` isn't called yet. In `StartupXLOG()`, we call
`StartupSUBTRANS()` if `(ArchiveRecoveryRequested && EnableHotStandby)`.
However, in `CreateRestartPoint()`, we call `TruncateSUBTRANS()` if
`(EnableHotStandby)`. I guess the difference causes this bug. The latter
possibly be called even crash-recovery while former isn't.
The attached patch 0002 fixes it. I think we could discuss about this bug in
another thread if needed.

Best regards!

Masaki Kuwamura

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее

От: Nathan Bossart
Дата: 20 сентября 2023 г., 05:30:33
Сообщение: Re: Inefficiency in parallel pg_restore with many tables

Следующее

От: Paul Jungwirth
Дата: 20 сентября 2023 г., 05:50:10
Сообщение: Re: SQL:2011 application time

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: pg_rewind with cascade standby doesn't work well

Вложения

Предыдущее

Следующее