pg_rewind race condition just after promotion

From: Heikki Linnakangas
Subject: pg_rewind race condition just after promotion
Msg-id: 9f568c97-87fe-a716-bd39-65299b8a60f4@iki.fi
Responses: Re: pg_rewind race condition just after promotion  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
List: pgsql-hackers

There's a race condition between the checkpoint at promotion and
pg_rewind. When a server is promoted, the startup process writes an
end-of-recovery record that includes the new TLI, and the server is
immediately opened for business. The startup process also requests the
checkpointer process to perform a checkpoint, but that can take a few
seconds or more to complete. If you run pg_rewind in the meantime, using
the just-promoted server as the source, pg_rewind will think that the
server is still on the old timeline, because it only looks at the TLI in
the control file's copy of the checkpoint record. That copy is not
updated until the checkpoint has finished.
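
To make the window concrete, here is a minimal sketch of the sequence of
events in C. It is a toy model, not PostgreSQL code: the PromotionState
struct and the three functions are hypothetical names, and the only thing
modelled is the TLI stored in the control file's copy of the last
checkpoint record.

```c
#include <stdint.h>

typedef uint32_t TimeLineID;

/* Hypothetical toy model of a server around promotion. */
typedef struct PromotionState
{
    TimeLineID running_tli;         /* timeline the server is actually on */
    TimeLineID checkpoint_copy_tli; /* TLI in the control file's copy of
                                     * the last checkpoint record */
} PromotionState;

/* Promotion: the server switches timelines and opens for business,
 * but the checkpoint has only been requested, not completed. */
static void
promote(PromotionState *s, TimeLineID new_tli)
{
    s->running_tli = new_tli;
    /* checkpoint_copy_tli is deliberately left untouched here */
}

/* The requested checkpoint completes, possibly seconds later. */
static void
finish_checkpoint(PromotionState *s)
{
    s->checkpoint_copy_tli = s->running_tli;
}

/* What pg_rewind sees if it only consults the checkpoint copy. */
static TimeLineID
tli_seen_by_rewind(const PromotionState *s)
{
    return s->checkpoint_copy_tli;
}
```

Between promote() and finish_checkpoint(), tli_seen_by_rewind() still
returns the old timeline, which is the race described above.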

This isn't a new issue. Stephen Frost first reported it back in 2015 [1].
Back then, it was deemed just a small annoyance, and we just worked
around it in the tests by issuing a checkpoint command after promotion,
to wait for the checkpoint to finish. I just ran into it again today,
with the new pg_rewind test, and silenced it in a similar way.

I think we should fix this properly. I'm not sure if it can lead to a
broken cluster, but at least it can cause pg_rewind to fail
unnecessarily and in a user-unfriendly way. And it is actually pretty
simple to fix. pg_rewind looks at the control file to find out the
timeline the server is on. When promotion happens, the startup process
updates the minRecoveryPoint and minRecoveryPointTLI fields in the
control file. We just need to read the timeline from there. Patch
attached.
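
As a rough illustration of the idea (a sketch, not the attached patch),
the source timeline could be derived from both fields. The
ControlFileSketch struct and source_timeline() below are hypothetical
stand-ins for the real control file structure. Taking the larger of the
two values is an assumption made here: it also covers the case where
minRecoveryPointTLI is not set (zero) because the server is not, or is
no longer, in recovery.

```c
#include <stdint.h>

typedef uint32_t TimeLineID;

/* Hypothetical, simplified view of the two control file fields that
 * matter for this problem. */
typedef struct ControlFileSketch
{
    TimeLineID checkpoint_copy_tli;    /* TLI of the last completed
                                        * checkpoint */
    TimeLineID min_recovery_point_tli; /* updated by the startup process
                                        * at promotion; assumed 0 when
                                        * unset */
} ControlFileSketch;

/* Pick the timeline the source server is on. Right after promotion the
 * checkpoint copy is stale, but min_recovery_point_tli already carries
 * the new TLI, so the larger value wins. */
static TimeLineID
source_timeline(const ControlFileSketch *cf)
{
    if (cf->min_recovery_point_tli > cf->checkpoint_copy_tli)
        return cf->min_recovery_point_tli;
    return cf->checkpoint_copy_tli;
}
```

With the checkpoint copy still at timeline 1 and minRecoveryPointTLI
already at 2, this returns 2. Once the checkpoint has completed, the
checkpoint copy alone gives the right answer.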

I think we should also backpatch this. Back in 2015, we decided that we 
can live with this, but it's always been a bit bogus, and seems simple 
enough to fix.

Thoughts?

[1] 
https://www.postgresql.org/message-id/20150428180253.GU30322%40tamriel.snowman.net

- Heikki
