Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug
From | hubert depesz lubaczewski |
---|---|
Subject | Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug |
Date | |
Msg-id | aKiNDmLNsNe0OEio@depesz.com |
In reply to | Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug
Re: Streaming replica hangs periodically for ~ 1 second - how to diagnose/debug |
List | pgsql-general |
On Fri, Aug 22, 2025 at 11:21:22AM -0400, Tom Lane wrote:
> hubert depesz lubaczewski <depesz@depesz.com> writes:
> > I got repeatable case today. It is breaking on its own every
> > ~ 5 minutes.
>
> Interesting. That futex call is presumably caused by interaction
> with some other process within the standby server, and the only
> plausible candidate really is the startup process (which is replaying
> WAL received from the primary). There are cases where WAL replay
> will take locks that can block queries on the standby. Can you
> correlate the delays on the standby server with any DDL events
> occurring on the primary?

Nope. Plus, there is a certain repetition to these cases, so even if I
missed *some* create table/alter, it just isn't going to be happening
every 4-5 minutes.

For example, looking at the logs for the last ~2h, and checking only the
cases where there are more than 20 messages in the same millisecond,
I can see:

    108 14:02:03.149
     25 14:04:01.619
    110 14:05:36.924
     77 14:05:36.925
    108 14:09:28.155
     38 14:13:52.481
     63 14:13:52.482
     73 14:13:52.484
    146 14:18:19.338
     39 14:18:19.339
     24 14:20:01.694
     82 14:23:07.352
     55 14:23:07.353
     37 14:23:07.353
     45 14:27:44.125
    132 14:27:44.126
    109 14:31:41.593
     70 14:31:41.594
     24 14:32:01.205
     21 14:34:01.477
     79 14:35:36.761
    104 14:35:36.762
     22 14:39:49.541
    151 14:39:49.542
     22 14:39:49.543
    112 14:44:15.607
     73 14:44:15.608
     28 14:48:01.256
     50 14:48:25.588
    131 14:48:25.589
    139 14:52:44.391
     74 14:57:02.369
    117 14:57:02.370
     20 15:00:02.008
    137 15:00:43.982
     34 15:00:43.983
     20 15:01:01.110
     22 15:04:21.037
    153 15:04:21.038
     20 15:08:01.136
     31 15:08:55.798
    126 15:08:55.799
     76 15:13:46.654
     83 15:13:46.655
     20 15:17:01.700
    107 15:18:42.112
     72 15:18:42.113
    124 15:23:48.689
     32 15:23:48.690
     25 15:23:48.690
     28 15:24:01.397

So, while there are outliers, I'd say that most of the problems happen
every 3-5 minutes.

depesz
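For anyone wanting to repeat this kind of check, here is a minimal sketch of the "count messages per millisecond, then look at the spacing" idea. It is not the command used to produce the list above. It assumes each log line contains an `HH:MM:SS.mmm` timestamp (as PostgreSQL writes when `log_line_prefix` includes `%m`) and that all lines fall on the same day. The function names, the inclusive threshold (the list above contains counts of exactly 20), and the 1-second merge window are illustrative choices.

```python
import re
from collections import Counter
from datetime import datetime, timedelta

# Assumption: timestamps have exactly millisecond precision, e.g. "14:02:03.149".
TS_RE = re.compile(r"\b(\d{2}:\d{2}:\d{2}\.\d{3})\b")


def busy_milliseconds(lines, threshold=20):
    """Return sorted (timestamp, count) pairs for milliseconds that have
    at least `threshold` log lines."""
    counts = Counter()
    for line in lines:
        match = TS_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return sorted((ts, n) for ts, n in counts.items() if n >= threshold)


def burst_gaps(timestamps, merge_window=timedelta(seconds=1)):
    """Merge timestamps closer than `merge_window` into one burst and
    return the time between the starts of consecutive bursts.

    Assumes all timestamps belong to the same day (no midnight wrap).
    """
    times = sorted(datetime.strptime(ts, "%H:%M:%S.%f") for ts in timestamps)
    starts = []
    last = None
    for t in times:
        # A new burst begins when the gap to the previous timestamp is large.
        if not starts or t - last > merge_window:
            starts.append(t)
        last = t
    return [b - a for a, b in zip(starts, starts[1:])]
```

Feeding the lines of a log file to `busy_milliseconds` and then passing the resulting timestamps to `burst_gaps` gives the intervals between bursts, which makes it easier to judge whether the spacing is regular or merely clustered.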