Improve read_local_xlog_page_guts by replacing polling with latch-based waiting
От | Xuneng Zhou |
---|---|
Тема | Improve read_local_xlog_page_guts by replacing polling with latch-based waiting |
Дата | |
Msg-id | CABPTF7UNft368x-RgOXkfj475OwEbp+VVO-wEXz7StgjD_=6sw@mail.gmail.com обсуждение исходный текст |
Ответы |
Re: Improve read_local_xlog_page_guts by replacing polling with latch-based waiting
|
Список | pgsql-hackers |
Hi hackers, During a performance run [1], I observed heavy polling in read_local_xlog_page_guts(). Heikki’s comment from a few months ago suggests replacing the current check–sleep–repeat loop with the condition-variable (CV) infrastructure used by the walsender: 1) Problem and Background /* * Loop waiting for xlog to be available if necessary * * TODO: The walsender has its own version of this function, which uses a * condition variable to wake up whenever WAL is flushed. We could use the * same infrastructure here, instead of the check/sleep/repeat style of * loop. */ Because read_local_xlog_page_guts() waits for a specific flush or replay LSN, polling becomes inefficient when waits are long. I built a POC patch that swaps polling for CVs, but a single global CV (or even separate “flush” and “replay” CVs) isn’t ideal: • The wake-up routines don’t know which LSN each waiter cares about, so they would need to broadcast on every flush/replay. • Caching the minimum outstanding target LSN could reduce spurious wake-ups but won’t eliminate them when multiple backends wait for different LSNs simultaneously. • The walsender accepts some broadcast overhead via two CVs for different waiters. A more precise approach would require a request queue that maps waiters to target LSNs and issues targeted wake-ups—adding complexity. 2) Proposal I came across the thread “Implement waiting for WAL LSN replay: reloaded” [2] by Alexander. The “Implement WAIT FOR” patch in that thread provides a well-established infrastructure for waiting on WAL replay in backends. With modest adjustments, it could be generalized. Main changes in patch v1 Improve read_local_xlog_page_guts by replacing polling with latch-based waiting: • Introduce WaitForLSNFlush, analogous to WaitForLSNReplay from the “WAIT FOR” work. • Replace the busy-wait in read_local_xlog_page_guts() with WaitForLSNFlush and WaitForLSNReplay. • Add wake-up calls in XLogFlush and XLogBackgroundFlush. Edge Case: Timeline Switch During Wait /* * Check which timeline to get the record from. * * We have to do it each time through the loop because if we're in * recovery as a cascading standby, the current timeline might've * become historical. We can't rely on RecoveryInProgress() because in * a standby configuration like * * A => B => C * * if we're a logical decoding session on C, and B gets promoted, our * timeline will change while we remain in recovery. * * We can't just keep reading from the old timeline as the last WAL * archive in the timeline will get renamed to .partial by * StartupXLOG(). read_local_xlog_page_guts() re-evaluates the active timeline on each loop iteration because, on a cascading standby, the current timeline can become historical. Once that happens, there’s no need to keep waiting for that timeline. A timeline switch could therefore render an in-progress wait unnecessary. One option is to add a wake-up at the point where the timeline switch occurs, so waiting processes exit promptly. The current approach chooses not to do this, given that most waits are short and timeline changes in cascading standby are rare. Supporting timeline-switch wake-ups would also require additional handling in both WaitForLSNFlush and WaitForLSNReplay, increasing complexity. Comments and suggestions are welcome. [1] https://www.postgresql.org/message-id/CABPTF7VuFYm9TtA9vY8ZtS77qsT+yL_HtSDxUFnW3XsdB5b9ew@mail.gmail.com [2] https://www.postgresql.org/message-id/flat/CAPpHfdsjtZLVzxjGT8rJHCYbM0D5dwkO%2BBBjcirozJ6nYbOW8Q%40mail.gmail.com Best, Xuneng
Вложения
В списке pgsql-hackers по дате отправления: