Обсуждение: could sent_lsn be lower than write/flush/replay_lsn?

Поиск
Список
Период
Сортировка

could sent_lsn be lower than write/flush/replay_lsn?

От
Jaime Casanova
Дата:
Hi,

We have a customer that for the second time have most of its logical
replicas (13 of 16) in a catchcup state, they have been working fine
for some time now and suddenly the pg_stat_replication view shows
something like this for all of the replicas in catchup state:

"""
pid              | 2667517
state            | catchup
sent_lsn         | 38B4/67C403A8
write_lsn        | 38B7/D2C9C038
flush_lsn        | 38B7/D2C9C038
replay_lsn       | 38B7/D2C9C038
"""

This doesn't make sense for me. This is 16.9 btw.

The pg_stat_activity says:

"""
wait_event_type  | IO
wait_event       | ReorderBufferWrite
state            | active
backend_xid      |
backend_xmin     |
query_id         |
query            | START_REPLICATION SLOT "sub_down_tables" LOGICAL
38B7/CEBC9330 (proto_version '4', origin 'any', publication_names
'"pub_down_tables"')
backend_type     | walsender
"""

And the logs keeps showing this:

"""
2025-12-26 12:17:41.861 -05 [pid=2667517;l=1;tx=0] LOG:  38B7/CEBC9330
has been already streamed, forwarding to 38B7/D2C9C038
2025-12-26 12:17:41.861 -05 [pid=2667517;l=2;tx=0] STATEMENT:
START_REPLICATION SLOT "sub_down_tables_central_trx001" LOGICAL
38B7/CEBC9330 (proto_version '4', origin 'any', publication_names
'"pub_elipsys_cresio_down_tables"')
2025-12-26 12:17:41.867 -05 [pid=2667517;l=3;tx=0] LOG:  starting
logical decoding for slot "sub_down_tables_central_trx001"
2025-12-26 12:17:41.867 -05 [pid=2667517;l=4;tx=0] DETAIL:  Streaming
transactions committing after 38B7/D2C9C038, reading WAL from
38B0/2261B890.
2025-12-26 12:17:41.867 -05 [pid=2667517;l=5;tx=0] STATEMENT:
START_REPLICATION SLOT "sub_down_tables_central_trx001" LOGICAL
38B7/CEBC9330 (proto_version '4', origin 'any', publication_names
'"pub_elipsys_cresio_down_tables"')
2025-12-26 12:17:41.868 -05 [pid=2667517;l=6;tx=0] LOG:  logical
decoding found consistent point at 38B0/2261B890
2025-12-26 12:17:41.868 -05 [pid=2667517;l=7;tx=0] DETAIL:  Logical
decoding will begin using saved snapshot.
2025-12-26 12:17:41.868 -05 [pid=2667517;l=8;tx=0] STATEMENT:
START_REPLICATION SLOT "sub_down_tables_central_trx001" LOGICAL
38B7/CEBC9330 (proto_version '4', origin 'any', publication_names
'"pub_elipsys_cresio_down_tables"')
2025-12-26 12:30:35.953 -05 [pid=2678504;l=1;tx=0] ERROR:  replication
slot "sub_down_tables_central_trx001" is active for PID 2667517
2025-12-26 12:30:40.959 -05 [pid=2678564;l=1;tx=0] ERROR:  replication
slot "sub_down_tables_central_trx001" is active for PID 2667517
"""

any idea what to check?

--
Jaime Casanova
SYSTEMGUARDS S.A.



Re: could sent_lsn be lower than write/flush/replay_lsn?

От
"cca5507"
Дата:
Hi Jaime,

The $subject can happen during logical replication start, it's just fine.

> 2025-12-26 12:17:41.861 -05 [pid=2667517;l=1;tx=0] LOG:  38B7/CEBC9330
> has been already streamed, forwarding to 38B7/D2C9C038

The logical replica requests starting at 38B7/CEBC9330, but the slot's confirmed_flush_lsn
is at 38B7/D2C9C038, so forwarding to it.

> 2025-12-26 12:17:41.867 -05 [pid=2667517;l=4;tx=0] DETAIL:  Streaming
> transactions committing after 38B7/D2C9C038, reading WAL from
> 38B0/2261B890.

The publisher must read WAL from the slot's restart_lsn (38B0/2261B890), but only
transactions committed after 38B7/D2C9C038 will be sent to the logical replica.

The sent_lsn is just where the wal sender currently reading, so it could be lower than
write/flush/replay_lsn.

Hope this can answer your question.

--
Regards,
ChangAo Chen

Re: could sent_lsn be lower than write/flush/replay_lsn?

От
Ashutosh Bapat
Дата:
Hi Jaime,

On Sat, Dec 27, 2025 at 1:18 PM cca5507 <2624345507@qq.com> wrote:
>
> Hi Jaime,
>
> The $subject can happen during logical replication start, it's just fine.
>
> > 2025-12-26 12:17:41.861 -05 [pid=2667517;l=1;tx=0] LOG:  38B7/CEBC9330
> > has been already streamed, forwarding to 38B7/D2C9C038
>
> The logical replica requests starting at 38B7/CEBC9330, but the slot's confirmed_flush_lsn
> is at 38B7/D2C9C038, so forwarding to it.
>
> > 2025-12-26 12:17:41.867 -05 [pid=2667517;l=4;tx=0] DETAIL:  Streaming
> > transactions committing after 38B7/D2C9C038, reading WAL from
> > 38B0/2261B890.
>
> The publisher must read WAL from the slot's restart_lsn (38B0/2261B890), but only
> transactions committed after 38B7/D2C9C038 will be sent to the logical replica.
>
> The sent_lsn is just where the wal sender currently reading, so it could be lower than
> write/flush/replay_lsn.

+1.

I guess, the logical replication is restarting in a loop. If that's
the case, you will find multiple errors happening in the loop. Another
guess is it's because of the walsender/receiver timeout. Do you see
timeout error from the corresponding background workers? What's
downstream?


--
Best Wishes,
Ashutosh Bapat



Re: could sent_lsn be lower than write/flush/replay_lsn?

От
Jaime Casanova
Дата:
On Mon, Dec 29, 2025 at 2:13 AM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> On Sat, Dec 27, 2025 at 1:18 PM cca5507 <2624345507@qq.com> wrote:
> >
> > The sent_lsn is just where the wal sender currently reading, so it could be lower than
> > write/flush/replay_lsn.
>
> +1.
>
> I guess, the logical replication is restarting in a loop. If that's
> the case, you will find multiple errors happening in the loop. Another
> guess is it's because of the walsender/receiver timeout. Do you see
> timeout error from the corresponding background workers? What's
> downstream?
>

Thanks both of you for clarifying this, it was actually a timeout
error. It seems for some reason all the subscriber got disconnected
from provider and for a problem we had some years ago (when using
pglogical in this same customer) wal_sender_timeout was set to 1
hour... which AFAIU made the wal_sender process keep active for 1 hour
while the subscriber tried to reconnect ans saw a walsender already
connected to another (the oldest already died) PID.

We returned wal_sender_timeout to its original value and everything
started to flow...


--
Jaime Casanova
SYSTEMGUARDS S.A.