Re: Improving the latch handling between logical replication launcher and worker processes.
From | Alexander Lakhin |
---|---|
Subject | Re: Improving the latch handling between logical replication launcher and worker processes. |
Date | |
Msg-id | e4b7f246-c0a4-421d-aac7-91049378d82d@gmail.com |
In reply to | Re: Improving the latch handling between logical replication launcher and worker processes. (Heikki Linnakangas <hlinnaka@iki.fi>) |
List | pgsql-hackers |
Hello,

04.09.2024 16:53, Heikki Linnakangas wrote:
> On 04/09/2024 14:24, vignesh C wrote:
>>
>> I agree that this approach is more simple than the other approach. How
>> about something like the attached patch to handle the same.
>
> I haven't looked at these new patches from the last few days, but please also note the work at
> https://www.postgresql.org/message-id/476672e7-62f1-4cab-a822-f3a8e949dd3f%40iki.fi. If those "interrupts" patches are
> committed, this is pretty straightforward to fix by using a separate interrupt bit for this, as the patch on that
> thread does.
>

I'd also like to add that this issue leads to buildfarm test failures,
because of the race condition between
    #define DEFAULT_NAPTIME_PER_CYCLE 180000L
and
    $timeout_default = 180

That is, in the situation described above "the apply worker does not get
started for the new subscription created immediately and gets started
after the timeout of 180 seconds", 014_binary.pl can fail if
wait_for_log()'s 180 seconds passed sooner:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=snakefly&dt=2025-02-09%2011%3A45%3A05

I reproduced this failure locally when running 50 014_binary tests in
parallel and got failures on iterations 4, 14, 10.

But with PG_TEST_TIMEOUT_DEFAULT=190, 30 iterations passed for me (8 of them
took 180+ seconds).

Best regards,
Alexander Lakhin
Neon (https://neon.tech)