Обсуждение: could sent_lsn be lower than write/flush/replay_lsn?
Hi, We have a customer that for the second time have most of its logical replicas (13 of 16) in a catchcup state, they have been working fine for some time now and suddenly the pg_stat_replication view shows something like this for all of the replicas in catchup state: """ pid | 2667517 state | catchup sent_lsn | 38B4/67C403A8 write_lsn | 38B7/D2C9C038 flush_lsn | 38B7/D2C9C038 replay_lsn | 38B7/D2C9C038 """ This doesn't make sense for me. This is 16.9 btw. The pg_stat_activity says: """ wait_event_type | IO wait_event | ReorderBufferWrite state | active backend_xid | backend_xmin | query_id | query | START_REPLICATION SLOT "sub_down_tables" LOGICAL 38B7/CEBC9330 (proto_version '4', origin 'any', publication_names '"pub_down_tables"') backend_type | walsender """ And the logs keeps showing this: """ 2025-12-26 12:17:41.861 -05 [pid=2667517;l=1;tx=0] LOG: 38B7/CEBC9330 has been already streamed, forwarding to 38B7/D2C9C038 2025-12-26 12:17:41.861 -05 [pid=2667517;l=2;tx=0] STATEMENT: START_REPLICATION SLOT "sub_down_tables_central_trx001" LOGICAL 38B7/CEBC9330 (proto_version '4', origin 'any', publication_names '"pub_elipsys_cresio_down_tables"') 2025-12-26 12:17:41.867 -05 [pid=2667517;l=3;tx=0] LOG: starting logical decoding for slot "sub_down_tables_central_trx001" 2025-12-26 12:17:41.867 -05 [pid=2667517;l=4;tx=0] DETAIL: Streaming transactions committing after 38B7/D2C9C038, reading WAL from 38B0/2261B890. 2025-12-26 12:17:41.867 -05 [pid=2667517;l=5;tx=0] STATEMENT: START_REPLICATION SLOT "sub_down_tables_central_trx001" LOGICAL 38B7/CEBC9330 (proto_version '4', origin 'any', publication_names '"pub_elipsys_cresio_down_tables"') 2025-12-26 12:17:41.868 -05 [pid=2667517;l=6;tx=0] LOG: logical decoding found consistent point at 38B0/2261B890 2025-12-26 12:17:41.868 -05 [pid=2667517;l=7;tx=0] DETAIL: Logical decoding will begin using saved snapshot. 2025-12-26 12:17:41.868 -05 [pid=2667517;l=8;tx=0] STATEMENT: START_REPLICATION SLOT "sub_down_tables_central_trx001" LOGICAL 38B7/CEBC9330 (proto_version '4', origin 'any', publication_names '"pub_elipsys_cresio_down_tables"') 2025-12-26 12:30:35.953 -05 [pid=2678504;l=1;tx=0] ERROR: replication slot "sub_down_tables_central_trx001" is active for PID 2667517 2025-12-26 12:30:40.959 -05 [pid=2678564;l=1;tx=0] ERROR: replication slot "sub_down_tables_central_trx001" is active for PID 2667517 """ any idea what to check? -- Jaime Casanova SYSTEMGUARDS S.A.
Hi Jaime, The $subject can happen during logical replication start, it's just fine. > 2025-12-26 12:17:41.861 -05 [pid=2667517;l=1;tx=0] LOG: 38B7/CEBC9330 > has been already streamed, forwarding to 38B7/D2C9C038 The logical replica requests starting at 38B7/CEBC9330, but the slot's confirmed_flush_lsn is at 38B7/D2C9C038, so forwarding to it. > 2025-12-26 12:17:41.867 -05 [pid=2667517;l=4;tx=0] DETAIL: Streaming > transactions committing after 38B7/D2C9C038, reading WAL from > 38B0/2261B890. The publisher must read WAL from the slot's restart_lsn (38B0/2261B890), but only transactions committed after 38B7/D2C9C038 will be sent to the logical replica. The sent_lsn is just where the wal sender currently reading, so it could be lower than write/flush/replay_lsn. Hope this can answer your question. -- Regards, ChangAo Chen
Hi Jaime, On Sat, Dec 27, 2025 at 1:18 PM cca5507 <2624345507@qq.com> wrote: > > Hi Jaime, > > The $subject can happen during logical replication start, it's just fine. > > > 2025-12-26 12:17:41.861 -05 [pid=2667517;l=1;tx=0] LOG: 38B7/CEBC9330 > > has been already streamed, forwarding to 38B7/D2C9C038 > > The logical replica requests starting at 38B7/CEBC9330, but the slot's confirmed_flush_lsn > is at 38B7/D2C9C038, so forwarding to it. > > > 2025-12-26 12:17:41.867 -05 [pid=2667517;l=4;tx=0] DETAIL: Streaming > > transactions committing after 38B7/D2C9C038, reading WAL from > > 38B0/2261B890. > > The publisher must read WAL from the slot's restart_lsn (38B0/2261B890), but only > transactions committed after 38B7/D2C9C038 will be sent to the logical replica. > > The sent_lsn is just where the wal sender currently reading, so it could be lower than > write/flush/replay_lsn. +1. I guess, the logical replication is restarting in a loop. If that's the case, you will find multiple errors happening in the loop. Another guess is it's because of the walsender/receiver timeout. Do you see timeout error from the corresponding background workers? What's downstream? -- Best Wishes, Ashutosh Bapat
On Mon, Dec 29, 2025 at 2:13 AM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote: > > On Sat, Dec 27, 2025 at 1:18 PM cca5507 <2624345507@qq.com> wrote: > > > > The sent_lsn is just where the wal sender currently reading, so it could be lower than > > write/flush/replay_lsn. > > +1. > > I guess, the logical replication is restarting in a loop. If that's > the case, you will find multiple errors happening in the loop. Another > guess is it's because of the walsender/receiver timeout. Do you see > timeout error from the corresponding background workers? What's > downstream? > Thanks both of you for clarifying this, it was actually a timeout error. It seems for some reason all the subscriber got disconnected from provider and for a problem we had some years ago (when using pglogical in this same customer) wal_sender_timeout was set to 1 hour... which AFAIU made the wal_sender process keep active for 1 hour while the subscriber tried to reconnect ans saw a walsender already connected to another (the oldest already died) PID. We returned wal_sender_timeout to its original value and everything started to flow... -- Jaime Casanova SYSTEMGUARDS S.A.