Обсуждение: pg_recvlogical: Prevent flushed data from being re-sent after restarting replication

Поиск
Список
Период
Сортировка

pg_recvlogical: Prevent flushed data from being re-sent after restarting replication

От
Fujii Masao
Дата:
Hi,

When pg_recvlogical loses connection, it reconnects and restarts
replication unless
--no-loop option is used. I noticed that in this scenario, data that
has already been
flushed can be re-sent after restarting replication. This happens
because the replication
start position used when restarting replication is taken from the write position
in the last status update message, which may be older than the actual
position of
the last flushed data. As a result, some flushed data could exist newer than
the replication start position and be re-sent. Is this a bug?

To fix this issue, I'd like to propose the attached patch that fixes
this by ensuring
all written data is flushed to disk before restarting replication and by using
the last flushed position as the replication start point. This prevents already
flushed data from being re-sent.

Additionally, when the --no-loop option is used, I found that pg_recvlogical
could previously exit without flushing written data, risking data loss.
The attached patch fixes this issue by also ensuring that all data is flushed
to disk before exiting with --no-loop.

Thought?

Regards,


-- 
Fujii Masao

Вложения

Re: pg_recvlogical: Prevent flushed data from being re-sent after restarting replication

От
Mircea Cadariu
Дата:
Hi Fujii,


I see what you mean. For reviewing I started with writing a test that 
just reproduces the bug and documents the current behaviour.

I expected that by applying your patch the test would fail, but then we 
would just update the test accordingly. Surprisingly the test continues 
to pass.

I attached the test for your consideration. I'll have a look again 
tomorrow at both.


Kind regards,

Mircea Cadariu

Вложения

Re: pg_recvlogical: Prevent flushed data from being re-sent after restarting replication

От
Mircea Cadariu
Дата:
Hi,


Update: I added another test to the attached test-only patch. This new 
test uses pg_terminate_backend to trigger reconnection.

Assuming the tests are fully correct (your input appreciated on this) we 
can use them to validate the solution.


Kind regards,

Mircea Cadariu


Вложения

Re: pg_recvlogical: Prevent flushed data from being re-sent after restarting replication

От
Fujii Masao
Дата:
On Wed, Nov 19, 2025 at 11:36 PM Mircea Cadariu
<cadariu.mircea@gmail.com> wrote:
>
> Hi,
>
>
> Update: I added another test to the attached test-only patch. This new
> test uses pg_terminate_backend to trigger reconnection.
>
> Assuming the tests are fully correct (your input appreciated on this) we
> can use them to validate the solution.

Thanks for testing!

BTW, I reproduced the issue as follows:

#1. Start the server

#2. Start pg_recvlogical:
$ pg_recvlogical -S myslot -d postgres --create-slot --start -f test.out

#3. Insert data every second:
$ psql
=# create table t(i serial);
=# insert into t values(default);
=# \watch 1

#4. In a separate session, terminate walsender to force pg_recvlogical
to restart replication:
$ psql
=# select pg_terminate_backend(pid) from pg_stat_replication;

#5. Wait for pg_recvlogical to restart replication

#6. You will see duplicate records written to the output file, for example:
$ cat test.out
BEGIN 798
table public.t: INSERT: i[integer]:42
COMMIT 798
BEGIN 799
table public.t: INSERT: i[integer]:43
COMMIT 799
BEGIN 792
table public.t: INSERT: i[integer]:36
COMMIT 792
BEGIN 793
table public.t: INSERT: i[integer]:37
COMMIT 793
BEGIN 794
table public.t: INSERT: i[integer]:38
COMMIT 794
BEGIN 795
table public.t: INSERT: i[integer]:39
COMMIT 795
BEGIN 796
table public.t: INSERT: i[integer]:40
COMMIT 796
BEGIN 797
table public.t: INSERT: i[integer]:41
COMMIT 797
BEGIN 798
table public.t: INSERT: i[integer]:42
COMMIT 798
BEGIN 799
table public.t: INSERT: i[integer]:43
COMMIT 799


With the patch applied, these duplicate records no longer appear in
the pg_recvlogical output.

Regards,

--
Fujii Masao



Re: pg_recvlogical: Prevent flushed data from being re-sent after restarting replication

От
Mircea Cadariu
Дата:
On 19/11/2025 14:54, Fujii Masao wrote:

> With the patch applied, these duplicate records no longer appear in
> the pg_recvlogical output.

Thanks! Works like a charm. I confirm duplicates no longer appear with 
the patch applied.

You can consider adding a test about this in "030_pg_recvlogical.pl", 
proposal below:


use IPC::Run qw(start);
my $outfile = $node->basedir . '/reconnect.out';

$node->command_ok(
     [
         'pg_recvlogical',
         '--slot' => 'reconnect_test',
         '--dbname' => $node->connstr('postgres'),
         '--create-slot',
     ],
     'slot created for reconnection test');

$node->safe_psql('postgres', 'CREATE TABLE t(x int);');
$node->safe_psql('postgres', 'INSERT INTO t VALUES (1);');

my $recv = start [
     'pg_recvlogical',
     '--slot', 'reconnect_test',
     '--dbname', $node->connstr('postgres'),
     '--start',
     '--file', $outfile,
     '--fsync-interval', '1',
     '--status-interval', '60',
     '--verbose'
], '>', \my $out, '2>', \my $err;

sleep(3);

my $backend_pid = $node->safe_psql('postgres',
     "SELECT active_pid FROM pg_replication_slots WHERE slot_name = 
'reconnect_test'");

if ($backend_pid ne '')
{
     $node->safe_psql('postgres', "SELECT 
pg_terminate_backend($backend_pid)");
}

sleep(6);

$node->safe_psql('postgres', 'INSERT INTO t VALUES (2);');

sleep(3);

$recv->signal('TERM');
$recv->finish();

open(my $file, '<', $outfile);
my $count = 0;
while (<$file>) {
     if (/INSERT/) {
         $count = $count + 1;
     }
}
close($file);

cmp_ok($count, '==', 2, 'two INSERTs');

$node->command_ok(
     [
         'pg_recvlogical',
         '--slot' => 'reconnect_test',
         '--dbname' => $node->connstr('postgres'),
         '--drop-slot'
     ],
     'reconnect_test slot dropped');


-- 
Regards,
Mircea Cadariu




Re: pg_recvlogical: Prevent flushed data from being re-sent after restarting replication

От
Mircea Cadariu
Дата:
Hi,


An update: I have two topics from the review perspective.

On the test I proposed to be added to 030_pg_recvlogical.pl, I found a 
way to write it without using sleeps (which risk flakyness in CI). I've 
attached it as a patch for your consideration. I checked the test in the 
following way: on master it fails, but with your patch it passes.

Secondly I noticed in function sendFeedback at line 166, the startpos is 
set to output_written_lsn. This seems to counter conceptually the change 
you made in the patch, however it seems to not affect correctness. Shall 
we remove this line as to avoid confusion?

Вложения

Re: pg_recvlogical: Prevent flushed data from being re-sent after restarting replication

От
Fujii Masao
Дата:
On Mon, Nov 24, 2025 at 6:03 PM Mircea Cadariu <cadariu.mircea@gmail.com> wrote:
>
> Hi,
>
>
> An update: I have two topics from the review perspective.
>
> On the test I proposed to be added to 030_pg_recvlogical.pl, I found a
> way to write it without using sleeps (which risk flakyness in CI). I've
> attached it as a patch for your consideration. I checked the test in the
> following way: on master it fails, but with your patch it passes.

Thanks for writing the test case and turning it into a patch. I agree that
we should add a regression test to ensure the reported issue doesn't recur.

It looks like the v1 patch you attached accidentally includes
the patch file itself. Could you remove that?

After applying the patch, git diff --check reported trailing whitespace.
Could you fix that?

+    '--fsync-interval', '1',
+    '--status-interval', '1',

Wouldn't it be safer to use a larger value (e.g., 100) for --status-interval?
With a very small interval, the status feedback might happen before
the walsender is terminated and pg_recvlogical reconnects, which could
prevent the duplicate data from appearing even without the patch.

+use IPC::Run qw(start);
<snip>
+my $recv = start [

For simplicity, would it be better to avoid "use IPC::Run qw(start)" and
just call "IPC::Run::start" directly?

+# Wait only for initial connection
+$node->poll_query_until('postgres',
+    "SELECT active_pid IS NOT NULL FROM pg_replication_slots WHERE
slot_name = 'reconnect_test'");

This is unlikely, but pg_recvlogical's connection could be terminated
immediately after connecting, before receiving any data. If that happens,
the test might behave unexpectedly. To make the test more robust,
should we instead poll on:

        SELECT pg_read_file('$outfile') ~ 'INSERT'

instead, to ensure that some data has actually been received before
terminating the backend?


> Secondly I noticed in function sendFeedback at line 166, the startpos is
> set to output_written_lsn. This seems to counter conceptually the change
> you made in the patch, however it seems to not affect correctness. Shall
> we remove this line as to avoid confusion?

Isn't this necessary when - is specified for --file, causing OutputFsync() to
be skipped?

Regards,

--
Fujii Masao



Re: pg_recvlogical: Prevent flushed data from being re-sent after restarting replication

От
Mircea Cadariu
Дата:
Hi,

On 25/11/2025 17:16, Fujii Masao wrote:
> Thanks for writing the test case and turning it into a patch. I agree that
> we should add a regression test to ensure the reported issue doesn't recur.
Thanks for your feedback, updated patch is attached. Again, I checked 
that it fails in master, but passes with your patch.
> It looks like the v1 patch you attached accidentally includes
> the patch file itself. Could you remove that?
Whoops, not sure what happened there, fixed.
> +    '--fsync-interval', '1',
> +    '--status-interval', '1',
>
> Wouldn't it be safer to use a larger value (e.g., 100) for --status-interval?
> With a very small interval, the status feedback might happen before
> the walsender is terminated and pg_recvlogical reconnects, which could
> prevent the duplicate data from appearing even without the patch.
Yes indeed good one. I actually had it set to 60 in the previous version 
I sent earlier.

> +use IPC::Run qw(start);
> <snip>
> +my $recv = start [
>
> For simplicity, would it be better to avoid "use IPC::Run qw(start)" and
> just call "IPC::Run::start" directly?
Indeed, done.

> +# Wait only for initial connection
> +$node->poll_query_until('postgres',
> +    "SELECT active_pid IS NOT NULL FROM pg_replication_slots WHERE
> slot_name = 'reconnect_test'");
>
> This is unlikely, but pg_recvlogical's connection could be terminated
> immediately after connecting, before receiving any data. If that happens,
> the test might behave unexpectedly. To make the test more robust,
> should we instead poll on:
>
>          SELECT pg_read_file('$outfile') ~ 'INSERT'
>
> instead, to ensure that some data has actually been received before
> terminating the backend?

> +# Wait only for initial connection
> +$node->poll_query_until('postgres',
> +    "SELECT active_pid IS NOT NULL FROM pg_replication_slots WHERE
> slot_name = 'reconnect_test'");
>
> This is unlikely, but pg_recvlogical's connection could be terminated
> immediately after connecting, before receiving any data. If that happens,
> the test might behave unexpectedly. To make the test more robust,
> should we instead poll on:
>
>          SELECT pg_read_file('$outfile') ~ 'INSERT'
>
> instead, to ensure that some data has actually been received before
> terminating the backend?
>
>
>
>> Secondly I noticed in function sendFeedback at line 166, the startpos is
>> set to output_written_lsn. This seems to counter conceptually the change
>> you made in the patch, however it seems to not affect correctness. Shall
>> we remove this line as to avoid confusion?
> Isn't this necessary when - is specified for --file, causing 
> OutputFsync() to be skipped? Regards, 
Yes, for sure. Would really like to avoid introducing flake in CI due to 
this test.

> Isn't this necessary when - is specified for --file, causing 
> OutputFsync() to be skipped? 
Upon another look, indeed. When writing to a regular file (--file -) 
that assignment is redundant but harmless. But like you said, when 
writing to stdout, without that line, startpos would never be updated.

> Additionally, when the --no-loop option is used, I found that 
> pg_recvlogical
> could previously exit without flushing written data, risking data loss.
> The attached patch fixes this issue by also ensuring that all data is 
> flushed
> to disk before exiting with --no-loop.

Should we think of some kind of test also for this part?

-- 
Thanks,
Mircea Cadariu

Вложения

Re: pg_recvlogical: Prevent flushed data from being re-sent after restarting replication

От
Fujii Masao
Дата:
On Wed, Nov 26, 2025 at 10:25 PM Mircea Cadariu
<cadariu.mircea@gmail.com> wrote:
>
> Hi,
>
> On 25/11/2025 17:16, Fujii Masao wrote:
> > Thanks for writing the test case and turning it into a patch. I agree that
> > we should add a regression test to ensure the reported issue doesn't recur.
> Thanks for your feedback, updated patch is attached. Again, I checked
> that it fails in master, but passes with your patch.

Thanks for updating the patch and testing!

I've made a few minor adjustments to the test patch.
The updated version is attached.

Changes include:
- Tweaked and added some comments in the test.
- Ran pgperltidy to clean up the formatting of 030_pg_recvlogical.pl.
- Reused the existing table test_table instead of creating a new table t.
    (While considering a better name for t, I noticed test_table was
already available)
- Used the "option => value" style in IPC::Run::start() for
consistency with other tests.
- Simplified the SQL used to wait for INSERT to appear in
pg_recvlogical's output file.
- Switched from open() to slurp_file(), since other tests use
slurp_file() for reading files.

Thought?


> > Additionally, when the --no-loop option is used, I found that
> > pg_recvlogical
> > could previously exit without flushing written data, risking data loss.
> > The attached patch fixes this issue by also ensuring that all data is
> > flushed
> > to disk before exiting with --no-loop.
>
> Should we think of some kind of test also for this part?

I'm not sure if it's really worth adding such test...

Regards,

--
Fujii Masao

Вложения

Re: pg_recvlogical: Prevent flushed data from being re-sent after restarting replication

От
Mircea Cadariu
Дата:

Hi,

On 28/11/2025 02:15, Fujii Masao wrote:
I've made a few minor adjustments to the test patch.
The updated version is attached.

Thanks for the updated patch! Nice improvements.

Two futher proposals for the current version of the test. 

Shall we use slurp_file then everywhere we need file reads? (instead of pg_read_file)

The following can be seen as nits for your consideration. 

We can consider making the string provided in the "or die" to be consistent with the comment. We can pick one of the options below and specify the same for each.  

* receive and write the first INSERT / receive first INSERT

* establish a new connection / to reconnect

* receive and write  / receive

If we are mentioning multiple INSERTs instead of just one, might read better if we add the "s" at the end. This might be just my preference though, I leave it up to you. 

-- 
Thanks,
Mircea Cadariu

Re:Re: pg_recvlogical: Prevent flushed data from being re-sent after restarting replication

От
"Yilin Zhang"
Дата:
On 28/11/2025 02:15, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> I've made a few minor adjustments to the test patch.
> The updated version is attached.

Hi,
I was reading your code and had a question about the new code you added in the main() function of pg_recvlogical.c:
  if (outfd != -1 && strcmp(outfile, "-") != 0)
   OutputFsync(feGetCurrentTimestamp());
In the stream loop, the StreamLogicalLog() function already contains similar code:
  if (outfd != -1 &&
   feTimestampDifferenceExceeds(output_last_fsync, now,
           fsync_interval))
  {
   if (!OutputFsync(now))
    goto error;
  }

If the outfile becomes unwritable due to external reasons, would the error reporting here be redundant with the error handling in StreamLogicalLog()?

Best regards,
--
Yilin Zhang

Re: pg_recvlogical: Prevent flushed data from being re-sent after restarting replication

От
Fujii Masao
Дата:
On Wed, Dec 3, 2025 at 5:59 PM Mircea Cadariu <cadariu.mircea@gmail.com> wrote:
>
> Hi,
>
> On 28/11/2025 02:15, Fujii Masao wrote:
>
> I've made a few minor adjustments to the test patch.
> The updated version is attached.
>
> Thanks for the updated patch! Nice improvements.
>
> Two futher proposals for the current version of the test.
>
> Shall we use slurp_file then everywhere we need file reads? (instead of pg_read_file)

Maybe it's better to use slurp_file(). We already have wait_for_log() to
wait for a message in the cluster's log file, but there's no helper function
to wait for specific content to appear in an arbitrary file.

To support waiting for output in pg_recvlogical's output file,
I added a new helper that uses slurp_file() (see the attached 0002 patch).
I also updated the 0003 patch (the pg_recvlogical reconnection test) to
use this helper instead of pg_read_file(). Thoughts?


> The following can be seen as nits for your consideration.
>
> We can consider making the string provided in the "or die" to be consistent with the comment. We can pick one of the
optionsbelow and specify the same for each. 
>
> * receive and write the first INSERT / receive first INSERT
>
> * establish a new connection / to reconnect
>
> * receive and write  / receive

As a result of removing poll_query_until() with pg_read_file(), the patch
now contains only one "or die" code. In that case, I chose "to reconnect"
rather than "establish a new connection".


> If we are mentioning multiple INSERTs instead of just one, might read better if we add the "s" at the end. This might
bejust my preference though, I leave it up to you. 

Agreed,  I've applied that change.

Regards,

--
Fujii Masao

Вложения

Re: Re: pg_recvlogical: Prevent flushed data from being re-sent after restarting replication

От
Fujii Masao
Дата:
On Thu, Dec 4, 2025 at 6:41 PM Yilin Zhang <jiezhilove@126.com> wrote:
>
> On 28/11/2025 02:15, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> > I've made a few minor adjustments to the test patch.
> > The updated version is attached.
>
> Hi,
> I was reading your code and had a question about the new code you added in the main() function of pg_recvlogical.c:
>   if (outfd != -1 && strcmp(outfile, "-") != 0)
>    OutputFsync(feGetCurrentTimestamp());
> In the stream loop, the StreamLogicalLog() function already contains similar code:
>   if (outfd != -1 &&
>    feTimestampDifferenceExceeds(output_last_fsync, now,
>            fsync_interval))
>   {
>    if (!OutputFsync(now))
>     goto error;
>   }
>
> If the outfile becomes unwritable due to external reasons, would the error reporting here be redundant with the error
handlingin StreamLogicalLog()? 

Are you suggesting that the existing code checks the return
value of OutputFsync(), but since it never returns false,
that check is unnecessary and can be removed? If so, I agree.
The attached 0004 patch does that.

Regards,

--
Fujii Masao

Вложения

Re: pg_recvlogical: Prevent flushed data from being re-sent after restarting replication

От
Mircea Cadariu
Дата:

Hi, 

Thanks for the patch updates.  

On 26/12/2025 10:28, Fujii Masao wrote:
Maybe it's better to use slurp_file(). We already have wait_for_log() to
wait for a message in the cluster's log file, but there's no helper function
to wait for specific content to appear in an arbitrary file.

To support waiting for output in pg_recvlogical's output file,
I added a new helper that uses slurp_file() (see the attached 0002 patch).
I also updated the 0003 patch (the pg_recvlogical reconnection test) to
use this helper instead of pg_read_file(). Thoughts?

Agreed, nice addition. 

I applied the v3-000* patch set and it builds successfully and passes the tests on my laptop. 

However the CI seems not completely happy yet, with previous 2 runs not green for Windows. Could it be there's an issue with executing the test on Windows? 

-- 
Thanks,
Mircea Cadariu
On Mon, Dec 29, 2025 at 9:45 PM Mircea Cadariu <cadariu.mircea@gmail.com> wrote:
>
> Hi,
>
> Thanks for the patch updates.
>
> On 26/12/2025 10:28, Fujii Masao wrote:
>
> Maybe it's better to use slurp_file(). We already have wait_for_log() to
> wait for a message in the cluster's log file, but there's no helper function
> to wait for specific content to appear in an arbitrary file.
>
> To support waiting for output in pg_recvlogical's output file,
> I added a new helper that uses slurp_file() (see the attached 0002 patch).
> I also updated the 0003 patch (the pg_recvlogical reconnection test) to
> use this helper instead of pg_read_file(). Thoughts?
>
> Agreed, nice addition.
>
> I applied the v3-000* patch set and it builds successfully and passes the tests on my laptop.
>
> However the CI seems not completely happy yet, with previous 2 runs not green for Windows. Could it be there's an
issuewith executing the test on Windows? 

Thanks for the report!

The TAP test failed on Windows because it attempted to terminate
pg_recvlogical using a TERM signal, which isn't available there.
As a result, the test waited indefinitely for pg_recvlogical to exit
and finally timed out.

To address this, I updated the 0003 patch so that the test passes
--endpos to pg_recvlogical on Windows only. This allows pg_recvlogical
to terminate without signals, by generating WAL until the current
position reaches the specified end position. OTOH, on non-Windows
platforms, the test continues to use signals to terminate pg_recvlogical.

This approach may be somewhat unstable. If there's a more robust
way to terminate pg_recvlogical on Windows, I'd be happy to switch
to it, but I couldn't come up with a better option.

Updated patches are attached.

Regards,

--
Fujii Masao

Вложения

Hi Fujii,

On 07/01/2026 03:36, Fujii Masao wrote:
This approach may be somewhat unstable. If there's a more robust
way to terminate pg_recvlogical on Windows, I'd be happy to switch
to it, but I couldn't come up with a better option.

Thanks for the updated patch set!

I applied them and it builds correctly and the test passes on my laptop.

About termination on Windows, I've noticed the following in file 006_logical_decoding.pl: 

# some Windows Perls at least don't like IPC::Run's start/kill_kill regime.
skip "Test fails on Windows perl", 2 if $Config{osname} eq 'MSWin32'; 
It seems that previously, in a similar situation the decision was to skip it on Windows. 

If we adopt the same approach for this test it should be fine, the production code in the patch is not Windows-specific.

Alternatively if we keep the approach in v4, we can consider updating 006_logical_decoding.pl as well so they're consistent. 

What do you think? 

-- 
Thanks,
Mircea Cadariu
On Sat, Jan 10, 2026 at 12:53 AM Mircea Cadariu
<cadariu.mircea@gmail.com> wrote:
> Thanks for the updated patch set!
>
> I applied them and it builds correctly and the test passes on my laptop.

Thanks for the test! Barring any objections, I will commit the patches.


> About termination on Windows, I've noticed the following in file 006_logical_decoding.pl:
>
> # some Windows Perls at least don't like IPC::Run's start/kill_kill regime.
> skip "Test fails on Windows perl", 2 if $Config{osname} eq 'MSWin32';
>
> It seems that previously, in a similar situation the decision was to skip it on Windows.
>
> If we adopt the same approach for this test it should be fine, the production code in the patch is not
Windows-specific.

Yes, I also noticed that Windows-specific handling while investigating
the test failure. Since IMO it's worth verifying that pg_recvlogical behaves
as expected on Windows, I'd prefer the approach taken in the v4 patch.


> Alternatively if we keep the approach in v4, we can consider updating 006_logical_decoding.pl as well so they're
consistent.

That's possible. But TBH I'm not sure how much effort is justified here.
The test uses pg_recvlogical to activate the slot and doesn't really test
pg_recvlogical itself. It's unclear how valuable it is to additionally run
this test on Windows...

Regards,

--
Fujii Masao



Hi Fujii,
At 2026-01-11 17:21:19, "Fujii Masao" <masao.fujii@gmail.com> wrote:
>That's possible. But TBH I'm not sure how much effort is justified here.
>The test uses pg_recvlogical to activate the slot and doesn't really test
>pg_recvlogical itself. It's unclear how valuable it is to additionally run
>this test on Windows...
>
I applied the V4 patch and tested it on a CentOS 7 x86_64 platform. The test steps are as follows:
1. Create a table:
`create table test_id(id integer);`
2. Create a function to close the connection:
`create or replace function test_f(id integer) returns integer as $$
declare
var1 integer;
begin
SELECT active_pid into var1 FROM pg_replication_slots WHERE slot_name = 'reconnect_test';
perform pg_terminate_backend(var1);
return 1;
end; $$ language plpgsql;`
3. Execute the command to receive logs:
`./pg_recvlogical --create-slot --slot reconnect_test --dbname postgres --start --file decoding.out --fsync-interval 200 --status-interval 100 --verbose`
4. Execute the following shell script:
`while true
do
./psql -d postgres<<EOF
select test_f(1);
\q
EOF
done`
5. Execute data insertion using psql:
`insert into test_id values(1);
insert into test_id values(2);`
6. `tail -f decoding.out`
I found duplicate insert statements in the file.
I don't know if this is a problem.
Additionally, I tried moving the two lines involving `Stream LogicalLog` outside the loop
in the `main` function, and then it worked correctly.
`output_written_lsn = InvalidXLogRecPtr;`
`output_fsync_lsn = InvalidXLogRecPtr;`


> On Jan 7, 2026, at 11:36, Fujii Masao <masao.fujii@gmail.com> wrote:
>
> On Mon, Dec 29, 2025 at 9:45 PM Mircea Cadariu <cadariu.mircea@gmail.com> wrote:
>>
>> Hi,
>>
>> Thanks for the patch updates.
>>
>> On 26/12/2025 10:28, Fujii Masao wrote:
>>
>> Maybe it's better to use slurp_file(). We already have wait_for_log() to
>> wait for a message in the cluster's log file, but there's no helper function
>> to wait for specific content to appear in an arbitrary file.
>>
>> To support waiting for output in pg_recvlogical's output file,
>> I added a new helper that uses slurp_file() (see the attached 0002 patch).
>> I also updated the 0003 patch (the pg_recvlogical reconnection test) to
>> use this helper instead of pg_read_file(). Thoughts?
>>
>> Agreed, nice addition.
>>
>> I applied the v3-000* patch set and it builds successfully and passes the tests on my laptop.
>>
>> However the CI seems not completely happy yet, with previous 2 runs not green for Windows. Could it be there's an
issuewith executing the test on Windows? 
>
> Thanks for the report!
>
> The TAP test failed on Windows because it attempted to terminate
> pg_recvlogical using a TERM signal, which isn't available there.
> As a result, the test waited indefinitely for pg_recvlogical to exit
> and finally timed out.
>
> To address this, I updated the 0003 patch so that the test passes
> --endpos to pg_recvlogical on Windows only. This allows pg_recvlogical
> to terminate without signals, by generating WAL until the current
> position reaches the specified end position. OTOH, on non-Windows
> platforms, the test continues to use signals to terminate pg_recvlogical.
>
> This approach may be somewhat unstable. If there's a more robust
> way to terminate pg_recvlogical on Windows, I'd be happy to switch
> to it, but I couldn't come up with a better option.
>
> Updated patches are attached.
>
> Regards,
>
> --
> Fujii Masao
>
<v4-0004-pg_recvlogical-remove-unnecessary-OutputFsync-ret.patch><v4-0003-Add-test-for-pg_recvlogical-reconnection-behavior.patch><v4-0001-pg_recvlogical-Prevent-flushed-data-from-being-re.patch><v4-0002-Add-a-new-helper-function-wait_for_file-to-Utils..patch>

Hi Fujii-san,

Thanks for the patch. Here are my comments on v4.

1 - 0001
```
+    /*
+     * Save the last flushed position as the replication start point. On
+     * reconnect, replication resumes from there to avoid re-sending flushed
+     * data.
+     */
+    startpos = output_fsync_lsn;
```

Looking at function OutputFsync(), fsync() may fail and there a few branches to return early without fsync(), so should
weonly update startpos after fsync()? 

2 - 0001
```
+        if (outfd != -1 && strcmp(outfile, "-") != 0)
+            OutputFsync(feGetCurrentTimestamp());
```

Do we need to check return code of OutputFsync()? I checked this file, the only caller that doesn’t check return code
ofOutputFsync() has a comment for why: 
```
        /* no need to jump to error on failure here, we're finishing anyway */
        OutputFsync(t);
```

I saw 0004 has changed OutputFsync to return nothing. So, it’s ok to not adding a comment here. But I just feel that,
ifwe want to make the commit self-contained, it would be better to add a comment here, but that’s not a strong opinion. 


3 - 0001

No a direct issue of this patch. I noticed that, everywhere that calls OutputFsync(), it checks outfd != -1 &&
strcmp(outfile,"-") != 0. However, two places miss the check: 

```
        if (outfd != -1 &&
            feTimestampDifferenceExceeds(output_last_fsync, now,
                                         fsync_interval))
        {
            if (!OutputFsync(now))
                goto error;
        }
```
This place doesn’t check strcmp(outfile, "-") != 0.

```
static bool
flushAndSendFeedback(PGconn *conn, TimestampTz *now)
{
    /* flush data to disk, so that we send a recent flush pointer */
    if (!OutputFsync(*now))
        return false;
    *now = feGetCurrentTimestamp();
    if (!sendFeedback(conn, *now, true, false))
        return false;

    return true;
}
```
flushAndSendFeedback() doesn’t check the both conditions. I don’t why the checks can be skipped.

I want to hear your opinion. If you consider this is a problem, then you may address it in this patch; or you want me
toaddress it in a separate patch? 

4 - 0002
```
+    croak "timed out waiting for match: $regexp”;
```

Is it more helpful to include filename in the error message?

5 - 0003
```
+my ($stdout, $stderr);
+my $recv = IPC::Run::start(
+    [@pg_recvlogical_cmd],
+    '>' => \$stdout,
+    '2>' => \$stderr);
```

$stdout and $stderr are never used.

6 - 0004 LGTM. As OutputFsync() only returned true, making it as void makes the code neater.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/







On Sun, Jan 11, 2026 at 6:49 PM Dewei Dai <daidewei1970@163.com> wrote:
>
> Hi Fujii,
>
> At 2026-01-11 17:21:19, "Fujii Masao" <masao.fujii@gmail.com> wrote:
> >That's possible. But TBH I'm not sure how much effort is justified here.
> >The test uses pg_recvlogical to activate the slot and doesn't really test
> >pg_recvlogical itself. It's unclear how valuable it is to additionally run
> >this test on Windows...
> >
> I applied the V4 patch and tested it on a CentOS 7 x86_64 platform. The test steps are as follows:
>
> 1. Create a table:
> `create table test_id(id integer);`
> 2. Create a function to close the connection:
> `create or replace function test_f(id integer) returns integer as $$
> declare
>  var1 integer;
> begin
>   SELECT active_pid into var1 FROM pg_replication_slots WHERE slot_name = 'reconnect_test';
>   perform pg_terminate_backend(var1);
>   return 1;
> end; $$ language plpgsql;`
>
> 3. Execute the command to receive logs:
> `./pg_recvlogical --create-slot --slot reconnect_test --dbname postgres --start --file decoding.out --fsync-interval
200--status-interval 100 --verbose` 
> 4. Execute the following shell script:
> `while true
> do
>  ./psql -d postgres<<EOF
> select test_f(1);
> \q
> EOF
> done`
>
> 5. Execute data insertion using psql:
> `insert into test_id values(1);
> insert into test_id values(2);`
> 6. `tail -f decoding.out`
> I found duplicate insert statements in the file.
> I don't know if this is a problem.
> Additionally, I tried moving the two lines involving `Stream LogicalLog` outside the loop
>  in the `main` function, and then it worked correctly.
> `output_written_lsn = InvalidXLogRecPtr;`
> `output_fsync_lsn = InvalidXLogRecPtr;`

Thanks for the test and the investigation!

I was able to reproduce the issue as well. It occurs when the pg_recvlogical
connection is terminated before it has received any messages. The problematic
sequence is roughly:

1. The pg_recvlogical connection is terminated after running for some time.
2. StreamLogicalLog() is called again and initializes
output_written_lsn to InvalidXLogRecPtr.
3. pg_recvlogical reconnects and starts replication from valid startpos.
4. The connection is terminated again.
5. StreamLogicalLog() exits and OutputFsync() sets startpos to
output_written_lsn (i.e., InvalidXLogRecPtr).

As a result, the next StreamLogicalLog() starts replication with
startpos = InvalidXLogRecPtr, which can cause the server to resend
already-streamed data and lead to duplicate output.

The root cause is that StreamLogicalLog() reinitializes output_written_lsn and
output_fsync_lsn on every call. As you suggested, removing that initialization
fixes the issue.

I’ve updated the 0001 patch accordingly.

Attached is the updated version of the patches.

Regards,

--
Fujii Masao

Вложения
On Mon, Jan 12, 2026 at 4:08 PM Chao Li <li.evan.chao@gmail.com> wrote:
> Thanks for the patch. Here are my comments on v4.

Thanks for the review!


> 1 - 0001
> ```
> +       /*
> +        * Save the last flushed position as the replication start point. On
> +        * reconnect, replication resumes from there to avoid re-sending flushed
> +        * data.
> +        */
> +       startpos = output_fsync_lsn;
> ```
>
> Looking at function OutputFsync(), fsync() may fail and there a few branches to return early without fsync(), so
shouldwe only update startpos after fsync()? 

Maybe not, but I might be missing something. Could you clarify what
concrete scenario would be problematic with the current code?


> 2 - 0001
> ```
> +               if (outfd != -1 && strcmp(outfile, "-") != 0)
> +                       OutputFsync(feGetCurrentTimestamp());
> ```
>
> Do we need to check return code of OutputFsync()? I checked this file, the only caller that doesn’t check return code
ofOutputFsync() has a comment for why: 
> ```
>         /* no need to jump to error on failure here, we're finishing anyway */
>         OutputFsync(t);
> ```
>
> I saw 0004 has changed OutputFsync to return nothing. So, it’s ok to not adding a comment here. But I just feel that,
ifwe want to make the commit self-contained, it would be better to add a comment here, but that’s not a strong opinion. 

Yeah, I think that making the patch "self-contained" in that sense isn't
really worth the extra effort.


> 3 - 0001
>
> No a direct issue of this patch. I noticed that, everywhere that calls OutputFsync(), it checks outfd != -1 &&
strcmp(outfile,"-") != 0. However, two places miss the check: 

The 0001 patch updates pg_recvlogical to call OutputFsync() before
restarting replication. That call is guarded by strcmp(outfile, "-") != 0.
Your comment makes me reconsider this: when "--file -" is used,
OutputFsync() would be skipped, so startpos would not be updated before
restarting replication. That can lead to duplicate output on stdout,
which is clearly problematic. For this reason, I removed that check
in the latest 0001 patch.

In other places where we check strcmp(outfile, "-") != 0, such as:

if (outfd != -1 && output_reopen && strcmp(outfile, "-") != 0)
{
now = feGetCurrentTimestamp();
OutputFsync(now);
close(outfd);
outfd = -1;
}

on second thought, the check seems necessary for close(outfd) and
"outfd = -1", but not for calling OutputFsync(). If that understanding is
correct, it might make sense to adjust where the check is applied.
However, I think that should be handled in a separate patch.


> 4 - 0002
> ```
> +       croak "timed out waiting for match: $regexp”;
> ```
>
> Is it more helpful to include filename in the error message?

OK, I've updated the message to include the filename.


> 5 - 0003
> ```
> +my ($stdout, $stderr);
> +my $recv = IPC::Run::start(
> +       [@pg_recvlogical_cmd],
> +       '>' => \$stdout,
> +       '2>' => \$stderr);
> ```
>
> $stdout and $stderr are never used.

Yes, but I'm fine with keeping them as they are.

I've attached the updated version of the patches upthread.

Regards,

--
Fujii Masao




> On Jan 14, 2026, at 09:26, Fujii Masao <masao.fujii@gmail.com> wrote:
>
> On Mon, Jan 12, 2026 at 4:08 PM Chao Li <li.evan.chao@gmail.com> wrote:
>> Thanks for the patch. Here are my comments on v4.
>
> Thanks for the review!
>
>
>> 1 - 0001
>> ```
>> +       /*
>> +        * Save the last flushed position as the replication start point. On
>> +        * reconnect, replication resumes from there to avoid re-sending flushed
>> +        * data.
>> +        */
>> +       startpos = output_fsync_lsn;
>> ```
>>
>> Looking at function OutputFsync(), fsync() may fail and there a few branches to return early without fsync(), so
shouldwe only update startpos after fsync()? 
>
> Maybe not, but I might be missing something. Could you clarify what
> concrete scenario would be problematic with the current code?
>

I just reviewed the patch again, and I think I was wrong wrt this comment:

* If fsync() fails, the process will fail out, no reconnect will happen, so wether or not updating startpos doesn’t
matter;
* if (fsync_interval <= 0), fsync is not required, but we still need to update startpos
* if (!output_needs_fsync), meaning nothing new to fsync, but we still need to update startpos if startpos has not been
updated

So, I withdraw this comment.

V5 LGTM.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/







On Wed, Jan 14, 2026 at 5:47 PM Chao Li <li.evan.chao@gmail.com> wrote:
>
>
>
> > On Jan 14, 2026, at 09:26, Fujii Masao <masao.fujii@gmail.com> wrote:
> >
> > On Mon, Jan 12, 2026 at 4:08 PM Chao Li <li.evan.chao@gmail.com> wrote:
> >> Thanks for the patch. Here are my comments on v4.
> >
> > Thanks for the review!
> >
> >
> >> 1 - 0001
> >> ```
> >> +       /*
> >> +        * Save the last flushed position as the replication start point. On
> >> +        * reconnect, replication resumes from there to avoid re-sending flushed
> >> +        * data.
> >> +        */
> >> +       startpos = output_fsync_lsn;
> >> ```
> >>
> >> Looking at function OutputFsync(), fsync() may fail and there a few branches to return early without fsync(), so
shouldwe only update startpos after fsync()? 
> >
> > Maybe not, but I might be missing something. Could you clarify what
> > concrete scenario would be problematic with the current code?
> >
>
> I just reviewed the patch again, and I think I was wrong wrt this comment:
>
> * If fsync() fails, the process will fail out, no reconnect will happen, so wether or not updating startpos doesn’t
matter;
> * if (fsync_interval <= 0), fsync is not required, but we still need to update startpos
> * if (!output_needs_fsync), meaning nothing new to fsync, but we still need to update startpos if startpos has not
beenupdated 
>
> So, I withdraw this comment.
>
> V5 LGTM.

Thanks for the review! I've pushed the patches.

Regards,

--
Fujii Masao