pgsql: Fix slotsync worker blocking promotion when stuck in wait
| From | Fujii Masao |
|---|---|
| Subject | pgsql: Fix slotsync worker blocking promotion when stuck in wait |
| Date | |
| Msg-id | E1wAIcU-003U3V-1o@gemulon.postgresql.org |
| List | pgsql-committers |

Fix slotsync worker blocking promotion when stuck in wait

Previously, on standby promotion, the startup process sent SIGUSR1 to
the slotsync worker (or a backend performing slot synchronization) and
waited for it to exit. This worked in most cases, but if the process was
blocked waiting for a response from the primary (e.g., due to a network
failure), SIGUSR1 would not interrupt the wait. As a result, the process
could remain stuck, causing the startup process to wait for a long time
and delaying promotion.

This commit fixes the issue by introducing a new procsignal reason,
PROCSIG_SLOTSYNC_MESSAGE. On promotion, the startup process sends this
signal, and the handler sets interrupt flags so the process exits (or
errors out) promptly at CHECK_FOR_INTERRUPTS(), allowing promotion to
complete without delay.

Backpatch to v17, where slotsync was introduced.

Author: Nisha Moond <nisha.moond412@gmail.com>
Reviewed-by: shveta malik <shveta.malik@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAHGQGwFzNYroAxSoyJhqTU-pH=t4Ej6RyvhVmBZ91Exj_TPMMQ@mail.gmail.com
Backpatch-through: 17

Branch
------
REL_17_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/15910b1c363f47b3984d24a91ed75ddac36070d8

Modified Files
--------------
src/backend/replication/logical/slotsync.c | 138 ++++++++++++++++++++---------
src/backend/storage/ipc/procsignal.c | 4 +
src/backend/tcop/postgres.c | 4 +
src/include/replication/slotsync.h | 7 ++
src/include/storage/procsignal.h | 1 +
5 files changed, 111 insertions(+), 43 deletions(-)
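
The pattern the commit message describes (a signal handler that only sets a flag, plus a wait that checks the flag at well-defined points) can be illustrated with a small standalone C sketch. This is not the code from the commit and uses none of PostgreSQL's internals: `stop_requested`, `handle_stop_signal`, `stop_pending`, and `wait_for_reply` are hypothetical names, SIGTERM stands in for the procsignal delivery, and the "wait" is simulated as a loop of bounded slices rather than a real network read.

```c
#include <assert.h>
#include <signal.h>

/* Hypothetical stand-in for the interrupt flag set by the handler. */
static volatile sig_atomic_t stop_requested = 0;

/* The handler does nothing except set a flag, which is async-signal-safe. */
static void handle_stop_signal(int signo)
{
    (void) signo;
    stop_requested = 1;
}

/* Rough analogue of an interrupt check: nonzero once a stop was requested. */
static int stop_pending(void)
{
    return stop_requested != 0;
}

/*
 * Simulate waiting for a reply that never arrives. The wait is split into
 * max_slices bounded slices, and the flag is checked in every slice.
 * signal_at_slice is the slice in which the (simulated) promoting process
 * delivers its signal; pass -1 for "never".
 *
 * Returns the slice at which the wait was abandoned, or -1 if all slices
 * elapsed without a stop request.
 */
int wait_for_reply(int max_slices, int signal_at_slice)
{
    assert(max_slices >= 0);

    stop_requested = 0;
    signal(SIGTERM, handle_stop_signal);

    for (int slice = 0; slice < max_slices; slice++)
    {
        if (slice == signal_at_slice)
            raise(SIGTERM);     /* stands in for the promotion signal */

        if (stop_pending())
            return slice;       /* give up the wait promptly */

        /* A real wait would block here with a bounded timeout. */
    }

    return -1;
}
```

In this sketch, `wait_for_reply(10, 3)` returns 3 because the flag is noticed in the same slice in which the signal arrives, while `wait_for_reply(10, -1)` runs through all ten slices and returns -1. The point of the design is that the handler itself never tries to unwind the wait; it only records the request, and the waiting code decides where it is safe to act on it.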