Обсуждение: pg_recvlogical: Prevent flushed data from being re-sent after restarting replication
pg_recvlogical: Prevent flushed data from being re-sent after restarting replication
Hi, When pg_recvlogical loses connection, it reconnects and restarts replication unless --no-loop option is used. I noticed that in this scenario, data that has already been flushed can be re-sent after restarting replication. This happens because the replication start position used when restarting replication is taken from the write position in the last status update message, which may be older than the actual position of the last flushed data. As a result, some flushed data could exist newer than the replication start position and be re-sent. Is this a bug? To fix this issue, I'd like to propose the attached patch that fixes this by ensuring all written data is flushed to disk before restarting replication and by using the last flushed position as the replication start point. This prevents already flushed data from being re-sent. Additionally, when the --no-loop option is used, I found that pg_recvlogical could previously exit without flushing written data, risking data loss. The attached patch fixes this issue by also ensuring that all data is flushed to disk before exiting with --no-loop. Thought? Regards, -- Fujii Masao
Вложения
Re: pg_recvlogical: Prevent flushed data from being re-sent after restarting replication
Hi Fujii, I see what you mean. For reviewing I started with writing a test that just reproduces the bug and documents the current behaviour. I expected that by applying your patch the test would fail, but then we would just update the test accordingly. Surprisingly the test continues to pass. I attached the test for your consideration. I'll have a look again tomorrow at both. Kind regards, Mircea Cadariu
Вложения
Re: pg_recvlogical: Prevent flushed data from being re-sent after restarting replication
Hi, Update: I added another test to the attached test-only patch. This new test uses pg_terminate_backend to trigger reconnection. Assuming the tests are fully correct (your input appreciated on this) we can use them to validate the solution. Kind regards, Mircea Cadariu
Вложения
On Wed, Nov 19, 2025 at 11:36 PM Mircea Cadariu <cadariu.mircea@gmail.com> wrote: > > Hi, > > > Update: I added another test to the attached test-only patch. This new > test uses pg_terminate_backend to trigger reconnection. > > Assuming the tests are fully correct (your input appreciated on this) we > can use them to validate the solution. Thanks for testing! BTW, I reproduced the issue as follows: #1. Start the server #2. Start pg_recvlogical: $ pg_recvlogical -S myslot -d postgres --create-slot --start -f test.out #3. Insert data every second: $ psql =# create table t(i serial); =# insert into t values(default); =# \watch 1 #4. In a separate session, terminate walsender to force pg_recvlogical to restart replication: $ psql =# select pg_terminate_backend(pid) from pg_stat_replication; #5. Wait for pg_recvlogical to restart replication #6. You will see duplicate records written to the output file, for example: $ cat test.out BEGIN 798 table public.t: INSERT: i[integer]:42 COMMIT 798 BEGIN 799 table public.t: INSERT: i[integer]:43 COMMIT 799 BEGIN 792 table public.t: INSERT: i[integer]:36 COMMIT 792 BEGIN 793 table public.t: INSERT: i[integer]:37 COMMIT 793 BEGIN 794 table public.t: INSERT: i[integer]:38 COMMIT 794 BEGIN 795 table public.t: INSERT: i[integer]:39 COMMIT 795 BEGIN 796 table public.t: INSERT: i[integer]:40 COMMIT 796 BEGIN 797 table public.t: INSERT: i[integer]:41 COMMIT 797 BEGIN 798 table public.t: INSERT: i[integer]:42 COMMIT 798 BEGIN 799 table public.t: INSERT: i[integer]:43 COMMIT 799 With the patch applied, these duplicate records no longer appear in the pg_recvlogical output. Regards, -- Fujii Masao
Re: pg_recvlogical: Prevent flushed data from being re-sent after restarting replication
On 19/11/2025 14:54, Fujii Masao wrote:
> With the patch applied, these duplicate records no longer appear in
> the pg_recvlogical output.
Thanks! Works like a charm. I confirm duplicates no longer appear with
the patch applied.
You can consider adding a test about this in "030_pg_recvlogical.pl",
proposal below:
use IPC::Run qw(start);
my $outfile = $node->basedir . '/reconnect.out';
$node->command_ok(
[
'pg_recvlogical',
'--slot' => 'reconnect_test',
'--dbname' => $node->connstr('postgres'),
'--create-slot',
],
'slot created for reconnection test');
$node->safe_psql('postgres', 'CREATE TABLE t(x int);');
$node->safe_psql('postgres', 'INSERT INTO t VALUES (1);');
my $recv = start [
'pg_recvlogical',
'--slot', 'reconnect_test',
'--dbname', $node->connstr('postgres'),
'--start',
'--file', $outfile,
'--fsync-interval', '1',
'--status-interval', '60',
'--verbose'
], '>', \my $out, '2>', \my $err;
sleep(3);
my $backend_pid = $node->safe_psql('postgres',
"SELECT active_pid FROM pg_replication_slots WHERE slot_name =
'reconnect_test'");
if ($backend_pid ne '')
{
$node->safe_psql('postgres', "SELECT
pg_terminate_backend($backend_pid)");
}
sleep(6);
$node->safe_psql('postgres', 'INSERT INTO t VALUES (2);');
sleep(3);
$recv->signal('TERM');
$recv->finish();
open(my $file, '<', $outfile);
my $count = 0;
while (<$file>) {
if (/INSERT/) {
$count = $count + 1;
}
}
close($file);
cmp_ok($count, '==', 2, 'two INSERTs');
$node->command_ok(
[
'pg_recvlogical',
'--slot' => 'reconnect_test',
'--dbname' => $node->connstr('postgres'),
'--drop-slot'
],
'reconnect_test slot dropped');
--
Regards,
Mircea Cadariu
Re: pg_recvlogical: Prevent flushed data from being re-sent after restarting replication
Hi, An update: I have two topics from the review perspective. On the test I proposed to be added to 030_pg_recvlogical.pl, I found a way to write it without using sleeps (which risk flakyness in CI). I've attached it as a patch for your consideration. I checked the test in the following way: on master it fails, but with your patch it passes. Secondly I noticed in function sendFeedback at line 166, the startpos is set to output_written_lsn. This seems to counter conceptually the change you made in the patch, however it seems to not affect correctness. Shall we remove this line as to avoid confusion?
Вложения
On Mon, Nov 24, 2025 at 6:03 PM Mircea Cadariu <cadariu.mircea@gmail.com> wrote:
>
> Hi,
>
>
> An update: I have two topics from the review perspective.
>
> On the test I proposed to be added to 030_pg_recvlogical.pl, I found a
> way to write it without using sleeps (which risk flakyness in CI). I've
> attached it as a patch for your consideration. I checked the test in the
> following way: on master it fails, but with your patch it passes.
Thanks for writing the test case and turning it into a patch. I agree that
we should add a regression test to ensure the reported issue doesn't recur.
It looks like the v1 patch you attached accidentally includes
the patch file itself. Could you remove that?
After applying the patch, git diff --check reported trailing whitespace.
Could you fix that?
+ '--fsync-interval', '1',
+ '--status-interval', '1',
Wouldn't it be safer to use a larger value (e.g., 100) for --status-interval?
With a very small interval, the status feedback might happen before
the walsender is terminated and pg_recvlogical reconnects, which could
prevent the duplicate data from appearing even without the patch.
+use IPC::Run qw(start);
<snip>
+my $recv = start [
For simplicity, would it be better to avoid "use IPC::Run qw(start)" and
just call "IPC::Run::start" directly?
+# Wait only for initial connection
+$node->poll_query_until('postgres',
+ "SELECT active_pid IS NOT NULL FROM pg_replication_slots WHERE
slot_name = 'reconnect_test'");
This is unlikely, but pg_recvlogical's connection could be terminated
immediately after connecting, before receiving any data. If that happens,
the test might behave unexpectedly. To make the test more robust,
should we instead poll on:
SELECT pg_read_file('$outfile') ~ 'INSERT'
instead, to ensure that some data has actually been received before
terminating the backend?
> Secondly I noticed in function sendFeedback at line 166, the startpos is
> set to output_written_lsn. This seems to counter conceptually the change
> you made in the patch, however it seems to not affect correctness. Shall
> we remove this line as to avoid confusion?
Isn't this necessary when - is specified for --file, causing OutputFsync() to
be skipped?
Regards,
--
Fujii Masao
Re: pg_recvlogical: Prevent flushed data from being re-sent after restarting replication
Hi,
On 25/11/2025 17:16, Fujii Masao wrote:
> Thanks for writing the test case and turning it into a patch. I agree that
> we should add a regression test to ensure the reported issue doesn't recur.
Thanks for your feedback, updated patch is attached. Again, I checked
that it fails in master, but passes with your patch.
> It looks like the v1 patch you attached accidentally includes
> the patch file itself. Could you remove that?
Whoops, not sure what happened there, fixed.
> + '--fsync-interval', '1',
> + '--status-interval', '1',
>
> Wouldn't it be safer to use a larger value (e.g., 100) for --status-interval?
> With a very small interval, the status feedback might happen before
> the walsender is terminated and pg_recvlogical reconnects, which could
> prevent the duplicate data from appearing even without the patch.
Yes indeed good one. I actually had it set to 60 in the previous version
I sent earlier.
> +use IPC::Run qw(start);
> <snip>
> +my $recv = start [
>
> For simplicity, would it be better to avoid "use IPC::Run qw(start)" and
> just call "IPC::Run::start" directly?
Indeed, done.
> +# Wait only for initial connection
> +$node->poll_query_until('postgres',
> + "SELECT active_pid IS NOT NULL FROM pg_replication_slots WHERE
> slot_name = 'reconnect_test'");
>
> This is unlikely, but pg_recvlogical's connection could be terminated
> immediately after connecting, before receiving any data. If that happens,
> the test might behave unexpectedly. To make the test more robust,
> should we instead poll on:
>
> SELECT pg_read_file('$outfile') ~ 'INSERT'
>
> instead, to ensure that some data has actually been received before
> terminating the backend?
> +# Wait only for initial connection
> +$node->poll_query_until('postgres',
> + "SELECT active_pid IS NOT NULL FROM pg_replication_slots WHERE
> slot_name = 'reconnect_test'");
>
> This is unlikely, but pg_recvlogical's connection could be terminated
> immediately after connecting, before receiving any data. If that happens,
> the test might behave unexpectedly. To make the test more robust,
> should we instead poll on:
>
> SELECT pg_read_file('$outfile') ~ 'INSERT'
>
> instead, to ensure that some data has actually been received before
> terminating the backend?
>
>
>
>> Secondly I noticed in function sendFeedback at line 166, the startpos is
>> set to output_written_lsn. This seems to counter conceptually the change
>> you made in the patch, however it seems to not affect correctness. Shall
>> we remove this line as to avoid confusion?
> Isn't this necessary when - is specified for --file, causing
> OutputFsync() to be skipped? Regards,
Yes, for sure. Would really like to avoid introducing flake in CI due to
this test.
> Isn't this necessary when - is specified for --file, causing
> OutputFsync() to be skipped?
Upon another look, indeed. When writing to a regular file (--file -)
that assignment is redundant but harmless. But like you said, when
writing to stdout, without that line, startpos would never be updated.
> Additionally, when the --no-loop option is used, I found that
> pg_recvlogical
> could previously exit without flushing written data, risking data loss.
> The attached patch fixes this issue by also ensuring that all data is
> flushed
> to disk before exiting with --no-loop.
Should we think of some kind of test also for this part?
--
Thanks,
Mircea Cadariu
Вложения
On Wed, Nov 26, 2025 at 10:25 PM Mircea Cadariu
<cadariu.mircea@gmail.com> wrote:
>
> Hi,
>
> On 25/11/2025 17:16, Fujii Masao wrote:
> > Thanks for writing the test case and turning it into a patch. I agree that
> > we should add a regression test to ensure the reported issue doesn't recur.
> Thanks for your feedback, updated patch is attached. Again, I checked
> that it fails in master, but passes with your patch.
Thanks for updating the patch and testing!
I've made a few minor adjustments to the test patch.
The updated version is attached.
Changes include:
- Tweaked and added some comments in the test.
- Ran pgperltidy to clean up the formatting of 030_pg_recvlogical.pl.
- Reused the existing table test_table instead of creating a new table t.
(While considering a better name for t, I noticed test_table was
already available)
- Used the "option => value" style in IPC::Run::start() for
consistency with other tests.
- Simplified the SQL used to wait for INSERT to appear in
pg_recvlogical's output file.
- Switched from open() to slurp_file(), since other tests use
slurp_file() for reading files.
Thought?
> > Additionally, when the --no-loop option is used, I found that
> > pg_recvlogical
> > could previously exit without flushing written data, risking data loss.
> > The attached patch fixes this issue by also ensuring that all data is
> > flushed
> > to disk before exiting with --no-loop.
>
> Should we think of some kind of test also for this part?
I'm not sure if it's really worth adding such test...
Regards,
--
Fujii Masao
Вложения
Re: pg_recvlogical: Prevent flushed data from being re-sent after restarting replication
Hi,
I've made a few minor adjustments to the test patch. The updated version is attached.
Thanks for the updated patch! Nice improvements.
Two futher proposals for the current version of the test.
Shall we use slurp_file then everywhere we need file reads? (instead of pg_read_file)
The following can be seen as nits for your consideration.
We can consider making the string provided in the "or die" to be consistent with the comment. We can pick one of the options below and specify the same for each.
* receive and write the first INSERT / receive first INSERT
* establish a new connection / to reconnect
* receive and write / receive
If we are mentioning multiple INSERTs instead of just one, might read better if we add the "s" at the end. This might be just my preference though, I leave it up to you.
-- Thanks, Mircea Cadariu
Re:Re: pg_recvlogical: Prevent flushed data from being re-sent after restarting replication
On Wed, Dec 3, 2025 at 5:59 PM Mircea Cadariu <cadariu.mircea@gmail.com> wrote: > > Hi, > > On 28/11/2025 02:15, Fujii Masao wrote: > > I've made a few minor adjustments to the test patch. > The updated version is attached. > > Thanks for the updated patch! Nice improvements. > > Two futher proposals for the current version of the test. > > Shall we use slurp_file then everywhere we need file reads? (instead of pg_read_file) Maybe it's better to use slurp_file(). We already have wait_for_log() to wait for a message in the cluster's log file, but there's no helper function to wait for specific content to appear in an arbitrary file. To support waiting for output in pg_recvlogical's output file, I added a new helper that uses slurp_file() (see the attached 0002 patch). I also updated the 0003 patch (the pg_recvlogical reconnection test) to use this helper instead of pg_read_file(). Thoughts? > The following can be seen as nits for your consideration. > > We can consider making the string provided in the "or die" to be consistent with the comment. We can pick one of the optionsbelow and specify the same for each. > > * receive and write the first INSERT / receive first INSERT > > * establish a new connection / to reconnect > > * receive and write / receive As a result of removing poll_query_until() with pg_read_file(), the patch now contains only one "or die" code. In that case, I chose "to reconnect" rather than "establish a new connection". > If we are mentioning multiple INSERTs instead of just one, might read better if we add the "s" at the end. This might bejust my preference though, I leave it up to you. Agreed, I've applied that change. Regards, -- Fujii Masao
Вложения
On Thu, Dec 4, 2025 at 6:41 PM Yilin Zhang <jiezhilove@126.com> wrote:
>
> On 28/11/2025 02:15, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> > I've made a few minor adjustments to the test patch.
> > The updated version is attached.
>
> Hi,
> I was reading your code and had a question about the new code you added in the main() function of pg_recvlogical.c:
> if (outfd != -1 && strcmp(outfile, "-") != 0)
> OutputFsync(feGetCurrentTimestamp());
> In the stream loop, the StreamLogicalLog() function already contains similar code:
> if (outfd != -1 &&
> feTimestampDifferenceExceeds(output_last_fsync, now,
> fsync_interval))
> {
> if (!OutputFsync(now))
> goto error;
> }
>
> If the outfile becomes unwritable due to external reasons, would the error reporting here be redundant with the error
handlingin StreamLogicalLog()?
Are you suggesting that the existing code checks the return
value of OutputFsync(), but since it never returns false,
that check is unnecessary and can be removed? If so, I agree.
The attached 0004 patch does that.
Regards,
--
Fujii Masao
Вложения
Re: pg_recvlogical: Prevent flushed data from being re-sent after restarting replication
Hi,
Thanks for the patch updates.
Maybe it's better to use slurp_file(). We already have wait_for_log() to wait for a message in the cluster's log file, but there's no helper function to wait for specific content to appear in an arbitrary file. To support waiting for output in pg_recvlogical's output file, I added a new helper that uses slurp_file() (see the attached 0002 patch). I also updated the 0003 patch (the pg_recvlogical reconnection test) to use this helper instead of pg_read_file(). Thoughts?
Agreed, nice addition.
I applied the v3-000* patch set and it builds successfully and passes the tests on my laptop.
However the CI seems not completely happy yet, with previous 2 runs not green for Windows. Could it be there's an issue with executing the test on Windows?
-- Thanks, Mircea Cadariu
On Mon, Dec 29, 2025 at 9:45 PM Mircea Cadariu <cadariu.mircea@gmail.com> wrote: > > Hi, > > Thanks for the patch updates. > > On 26/12/2025 10:28, Fujii Masao wrote: > > Maybe it's better to use slurp_file(). We already have wait_for_log() to > wait for a message in the cluster's log file, but there's no helper function > to wait for specific content to appear in an arbitrary file. > > To support waiting for output in pg_recvlogical's output file, > I added a new helper that uses slurp_file() (see the attached 0002 patch). > I also updated the 0003 patch (the pg_recvlogical reconnection test) to > use this helper instead of pg_read_file(). Thoughts? > > Agreed, nice addition. > > I applied the v3-000* patch set and it builds successfully and passes the tests on my laptop. > > However the CI seems not completely happy yet, with previous 2 runs not green for Windows. Could it be there's an issuewith executing the test on Windows? Thanks for the report! The TAP test failed on Windows because it attempted to terminate pg_recvlogical using a TERM signal, which isn't available there. As a result, the test waited indefinitely for pg_recvlogical to exit and finally timed out. To address this, I updated the 0003 patch so that the test passes --endpos to pg_recvlogical on Windows only. This allows pg_recvlogical to terminate without signals, by generating WAL until the current position reaches the specified end position. OTOH, on non-Windows platforms, the test continues to use signals to terminate pg_recvlogical. This approach may be somewhat unstable. If there's a more robust way to terminate pg_recvlogical on Windows, I'd be happy to switch to it, but I couldn't come up with a better option. Updated patches are attached. Regards, -- Fujii Masao
Вложения
Re: pg_recvlogical: Prevent flushed data from being re-sent after restarting replication
Hi Fujii,
This approach may be somewhat unstable. If there's a more robust way to terminate pg_recvlogical on Windows, I'd be happy to switch to it, but I couldn't come up with a better option.
Thanks for the updated patch set!
I applied them and it builds correctly and the test passes on my laptop.
About termination on Windows, I've noticed the following in file 006_logical_decoding.pl:
# some Windows Perls at least don't like IPC::Run's start/kill_kill regime.
skip "Test fails on Windows perl", 2 if $Config{osname} eq 'MSWin32'; It seems that previously, in a similar situation the decision was to skip it on Windows. If we adopt the same approach for this test it should be fine, the production code in the patch is not Windows-specific.
Alternatively if we keep the approach in v4, we can consider updating 006_logical_decoding.pl as well so they're consistent.
What do you think?
-- Thanks, Mircea Cadariu
On Sat, Jan 10, 2026 at 12:53 AM Mircea Cadariu
<cadariu.mircea@gmail.com> wrote:
> Thanks for the updated patch set!
>
> I applied them and it builds correctly and the test passes on my laptop.
Thanks for the test! Barring any objections, I will commit the patches.
> About termination on Windows, I've noticed the following in file 006_logical_decoding.pl:
>
> # some Windows Perls at least don't like IPC::Run's start/kill_kill regime.
> skip "Test fails on Windows perl", 2 if $Config{osname} eq 'MSWin32';
>
> It seems that previously, in a similar situation the decision was to skip it on Windows.
>
> If we adopt the same approach for this test it should be fine, the production code in the patch is not
Windows-specific.
Yes, I also noticed that Windows-specific handling while investigating
the test failure. Since IMO it's worth verifying that pg_recvlogical behaves
as expected on Windows, I'd prefer the approach taken in the v4 patch.
> Alternatively if we keep the approach in v4, we can consider updating 006_logical_decoding.pl as well so they're
consistent.
That's possible. But TBH I'm not sure how much effort is justified here.
The test uses pg_recvlogical to activate the slot and doesn't really test
pg_recvlogical itself. It's unclear how valuable it is to additionally run
this test on Windows...
Regards,
--
Fujii Masao
At 2026-01-11 17:21:19, "Fujii Masao" <masao.fujii@gmail.com> wrote: >That's possible. But TBH I'm not sure how much effort is justified here. >The test uses pg_recvlogical to activate the slot and doesn't really test >pg_recvlogical itself. It's unclear how valuable it is to additionally run >this test on Windows...>I applied the V4 patch and tested it on a CentOS 7 x86_64 platform. The test steps are as follows:1. Create a table:`create table test_id(id integer);`2. Create a function to close the connection:`create or replace function test_f(id integer) returns integer as $$declarevar1 integer;beginSELECT active_pid into var1 FROM pg_replication_slots WHERE slot_name = 'reconnect_test';perform pg_terminate_backend(var1);return 1;end; $$ language plpgsql;`3. Execute the command to receive logs:`./pg_recvlogical --create-slot --slot reconnect_test --dbname postgres --start --file decoding.out --fsync-interval 200 --status-interval 100 --verbose`4. Execute the following shell script:`while truedo./psql -d postgres<<EOFselect test_f(1);\qEOFdone`5. Execute data insertion using psql:`insert into test_id values(1);insert into test_id values(2);`6. `tail -f decoding.out`I found duplicate insert statements in the file.I don't know if this is a problem.Additionally, I tried moving the two lines involving `Stream LogicalLog` outside the loopin the `main` function, and then it worked correctly.`output_written_lsn = InvalidXLogRecPtr;``output_fsync_lsn = InvalidXLogRecPtr;`
> On Jan 7, 2026, at 11:36, Fujii Masao <masao.fujii@gmail.com> wrote:
>
> On Mon, Dec 29, 2025 at 9:45 PM Mircea Cadariu <cadariu.mircea@gmail.com> wrote:
>>
>> Hi,
>>
>> Thanks for the patch updates.
>>
>> On 26/12/2025 10:28, Fujii Masao wrote:
>>
>> Maybe it's better to use slurp_file(). We already have wait_for_log() to
>> wait for a message in the cluster's log file, but there's no helper function
>> to wait for specific content to appear in an arbitrary file.
>>
>> To support waiting for output in pg_recvlogical's output file,
>> I added a new helper that uses slurp_file() (see the attached 0002 patch).
>> I also updated the 0003 patch (the pg_recvlogical reconnection test) to
>> use this helper instead of pg_read_file(). Thoughts?
>>
>> Agreed, nice addition.
>>
>> I applied the v3-000* patch set and it builds successfully and passes the tests on my laptop.
>>
>> However the CI seems not completely happy yet, with previous 2 runs not green for Windows. Could it be there's an
issuewith executing the test on Windows?
>
> Thanks for the report!
>
> The TAP test failed on Windows because it attempted to terminate
> pg_recvlogical using a TERM signal, which isn't available there.
> As a result, the test waited indefinitely for pg_recvlogical to exit
> and finally timed out.
>
> To address this, I updated the 0003 patch so that the test passes
> --endpos to pg_recvlogical on Windows only. This allows pg_recvlogical
> to terminate without signals, by generating WAL until the current
> position reaches the specified end position. OTOH, on non-Windows
> platforms, the test continues to use signals to terminate pg_recvlogical.
>
> This approach may be somewhat unstable. If there's a more robust
> way to terminate pg_recvlogical on Windows, I'd be happy to switch
> to it, but I couldn't come up with a better option.
>
> Updated patches are attached.
>
> Regards,
>
> --
> Fujii Masao
>
<v4-0004-pg_recvlogical-remove-unnecessary-OutputFsync-ret.patch><v4-0003-Add-test-for-pg_recvlogical-reconnection-behavior.patch><v4-0001-pg_recvlogical-Prevent-flushed-data-from-being-re.patch><v4-0002-Add-a-new-helper-function-wait_for_file-to-Utils..patch>
Hi Fujii-san,
Thanks for the patch. Here are my comments on v4.
1 - 0001
```
+ /*
+ * Save the last flushed position as the replication start point. On
+ * reconnect, replication resumes from there to avoid re-sending flushed
+ * data.
+ */
+ startpos = output_fsync_lsn;
```
Looking at function OutputFsync(), fsync() may fail and there a few branches to return early without fsync(), so should
weonly update startpos after fsync()?
2 - 0001
```
+ if (outfd != -1 && strcmp(outfile, "-") != 0)
+ OutputFsync(feGetCurrentTimestamp());
```
Do we need to check return code of OutputFsync()? I checked this file, the only caller that doesn’t check return code
ofOutputFsync() has a comment for why:
```
/* no need to jump to error on failure here, we're finishing anyway */
OutputFsync(t);
```
I saw 0004 has changed OutputFsync to return nothing. So, it’s ok to not adding a comment here. But I just feel that,
ifwe want to make the commit self-contained, it would be better to add a comment here, but that’s not a strong opinion.
3 - 0001
No a direct issue of this patch. I noticed that, everywhere that calls OutputFsync(), it checks outfd != -1 &&
strcmp(outfile,"-") != 0. However, two places miss the check:
```
if (outfd != -1 &&
feTimestampDifferenceExceeds(output_last_fsync, now,
fsync_interval))
{
if (!OutputFsync(now))
goto error;
}
```
This place doesn’t check strcmp(outfile, "-") != 0.
```
static bool
flushAndSendFeedback(PGconn *conn, TimestampTz *now)
{
/* flush data to disk, so that we send a recent flush pointer */
if (!OutputFsync(*now))
return false;
*now = feGetCurrentTimestamp();
if (!sendFeedback(conn, *now, true, false))
return false;
return true;
}
```
flushAndSendFeedback() doesn’t check the both conditions. I don’t why the checks can be skipped.
I want to hear your opinion. If you consider this is a problem, then you may address it in this patch; or you want me
toaddress it in a separate patch?
4 - 0002
```
+ croak "timed out waiting for match: $regexp”;
```
Is it more helpful to include filename in the error message?
5 - 0003
```
+my ($stdout, $stderr);
+my $recv = IPC::Run::start(
+ [@pg_recvlogical_cmd],
+ '>' => \$stdout,
+ '2>' => \$stderr);
```
$stdout and $stderr are never used.
6 - 0004 LGTM. As OutputFsync() only returned true, making it as void makes the code neater.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
On Sun, Jan 11, 2026 at 6:49 PM Dewei Dai <daidewei1970@163.com> wrote: > > Hi Fujii, > > At 2026-01-11 17:21:19, "Fujii Masao" <masao.fujii@gmail.com> wrote: > >That's possible. But TBH I'm not sure how much effort is justified here. > >The test uses pg_recvlogical to activate the slot and doesn't really test > >pg_recvlogical itself. It's unclear how valuable it is to additionally run > >this test on Windows... > > > I applied the V4 patch and tested it on a CentOS 7 x86_64 platform. The test steps are as follows: > > 1. Create a table: > `create table test_id(id integer);` > 2. Create a function to close the connection: > `create or replace function test_f(id integer) returns integer as $$ > declare > var1 integer; > begin > SELECT active_pid into var1 FROM pg_replication_slots WHERE slot_name = 'reconnect_test'; > perform pg_terminate_backend(var1); > return 1; > end; $$ language plpgsql;` > > 3. Execute the command to receive logs: > `./pg_recvlogical --create-slot --slot reconnect_test --dbname postgres --start --file decoding.out --fsync-interval 200--status-interval 100 --verbose` > 4. Execute the following shell script: > `while true > do > ./psql -d postgres<<EOF > select test_f(1); > \q > EOF > done` > > 5. Execute data insertion using psql: > `insert into test_id values(1); > insert into test_id values(2);` > 6. `tail -f decoding.out` > I found duplicate insert statements in the file. > I don't know if this is a problem. > Additionally, I tried moving the two lines involving `Stream LogicalLog` outside the loop > in the `main` function, and then it worked correctly. > `output_written_lsn = InvalidXLogRecPtr;` > `output_fsync_lsn = InvalidXLogRecPtr;` Thanks for the test and the investigation! I was able to reproduce the issue as well. It occurs when the pg_recvlogical connection is terminated before it has received any messages. The problematic sequence is roughly: 1. The pg_recvlogical connection is terminated after running for some time. 2. StreamLogicalLog() is called again and initializes output_written_lsn to InvalidXLogRecPtr. 3. pg_recvlogical reconnects and starts replication from valid startpos. 4. The connection is terminated again. 5. StreamLogicalLog() exits and OutputFsync() sets startpos to output_written_lsn (i.e., InvalidXLogRecPtr). As a result, the next StreamLogicalLog() starts replication with startpos = InvalidXLogRecPtr, which can cause the server to resend already-streamed data and lead to duplicate output. The root cause is that StreamLogicalLog() reinitializes output_written_lsn and output_fsync_lsn on every call. As you suggested, removing that initialization fixes the issue. I’ve updated the 0001 patch accordingly. Attached is the updated version of the patches. Regards, -- Fujii Masao
Вложения
On Mon, Jan 12, 2026 at 4:08 PM Chao Li <li.evan.chao@gmail.com> wrote:
> Thanks for the patch. Here are my comments on v4.
Thanks for the review!
> 1 - 0001
> ```
> + /*
> + * Save the last flushed position as the replication start point. On
> + * reconnect, replication resumes from there to avoid re-sending flushed
> + * data.
> + */
> + startpos = output_fsync_lsn;
> ```
>
> Looking at function OutputFsync(), fsync() may fail and there a few branches to return early without fsync(), so
shouldwe only update startpos after fsync()?
Maybe not, but I might be missing something. Could you clarify what
concrete scenario would be problematic with the current code?
> 2 - 0001
> ```
> + if (outfd != -1 && strcmp(outfile, "-") != 0)
> + OutputFsync(feGetCurrentTimestamp());
> ```
>
> Do we need to check return code of OutputFsync()? I checked this file, the only caller that doesn’t check return code
ofOutputFsync() has a comment for why:
> ```
> /* no need to jump to error on failure here, we're finishing anyway */
> OutputFsync(t);
> ```
>
> I saw 0004 has changed OutputFsync to return nothing. So, it’s ok to not adding a comment here. But I just feel that,
ifwe want to make the commit self-contained, it would be better to add a comment here, but that’s not a strong opinion.
Yeah, I think that making the patch "self-contained" in that sense isn't
really worth the extra effort.
> 3 - 0001
>
> No a direct issue of this patch. I noticed that, everywhere that calls OutputFsync(), it checks outfd != -1 &&
strcmp(outfile,"-") != 0. However, two places miss the check:
The 0001 patch updates pg_recvlogical to call OutputFsync() before
restarting replication. That call is guarded by strcmp(outfile, "-") != 0.
Your comment makes me reconsider this: when "--file -" is used,
OutputFsync() would be skipped, so startpos would not be updated before
restarting replication. That can lead to duplicate output on stdout,
which is clearly problematic. For this reason, I removed that check
in the latest 0001 patch.
In other places where we check strcmp(outfile, "-") != 0, such as:
if (outfd != -1 && output_reopen && strcmp(outfile, "-") != 0)
{
now = feGetCurrentTimestamp();
OutputFsync(now);
close(outfd);
outfd = -1;
}
on second thought, the check seems necessary for close(outfd) and
"outfd = -1", but not for calling OutputFsync(). If that understanding is
correct, it might make sense to adjust where the check is applied.
However, I think that should be handled in a separate patch.
> 4 - 0002
> ```
> + croak "timed out waiting for match: $regexp”;
> ```
>
> Is it more helpful to include filename in the error message?
OK, I've updated the message to include the filename.
> 5 - 0003
> ```
> +my ($stdout, $stderr);
> +my $recv = IPC::Run::start(
> + [@pg_recvlogical_cmd],
> + '>' => \$stdout,
> + '2>' => \$stderr);
> ```
>
> $stdout and $stderr are never used.
Yes, but I'm fine with keeping them as they are.
I've attached the updated version of the patches upthread.
Regards,
--
Fujii Masao
> On Jan 14, 2026, at 09:26, Fujii Masao <masao.fujii@gmail.com> wrote: > > On Mon, Jan 12, 2026 at 4:08 PM Chao Li <li.evan.chao@gmail.com> wrote: >> Thanks for the patch. Here are my comments on v4. > > Thanks for the review! > > >> 1 - 0001 >> ``` >> + /* >> + * Save the last flushed position as the replication start point. On >> + * reconnect, replication resumes from there to avoid re-sending flushed >> + * data. >> + */ >> + startpos = output_fsync_lsn; >> ``` >> >> Looking at function OutputFsync(), fsync() may fail and there a few branches to return early without fsync(), so shouldwe only update startpos after fsync()? > > Maybe not, but I might be missing something. Could you clarify what > concrete scenario would be problematic with the current code? > I just reviewed the patch again, and I think I was wrong wrt this comment: * If fsync() fails, the process will fail out, no reconnect will happen, so wether or not updating startpos doesn’t matter; * if (fsync_interval <= 0), fsync is not required, but we still need to update startpos * if (!output_needs_fsync), meaning nothing new to fsync, but we still need to update startpos if startpos has not been updated So, I withdraw this comment. V5 LGTM. Best regards, -- Chao Li (Evan) HighGo Software Co., Ltd. https://www.highgo.com/
On Wed, Jan 14, 2026 at 5:47 PM Chao Li <li.evan.chao@gmail.com> wrote: > > > > > On Jan 14, 2026, at 09:26, Fujii Masao <masao.fujii@gmail.com> wrote: > > > > On Mon, Jan 12, 2026 at 4:08 PM Chao Li <li.evan.chao@gmail.com> wrote: > >> Thanks for the patch. Here are my comments on v4. > > > > Thanks for the review! > > > > > >> 1 - 0001 > >> ``` > >> + /* > >> + * Save the last flushed position as the replication start point. On > >> + * reconnect, replication resumes from there to avoid re-sending flushed > >> + * data. > >> + */ > >> + startpos = output_fsync_lsn; > >> ``` > >> > >> Looking at function OutputFsync(), fsync() may fail and there a few branches to return early without fsync(), so shouldwe only update startpos after fsync()? > > > > Maybe not, but I might be missing something. Could you clarify what > > concrete scenario would be problematic with the current code? > > > > I just reviewed the patch again, and I think I was wrong wrt this comment: > > * If fsync() fails, the process will fail out, no reconnect will happen, so wether or not updating startpos doesn’t matter; > * if (fsync_interval <= 0), fsync is not required, but we still need to update startpos > * if (!output_needs_fsync), meaning nothing new to fsync, but we still need to update startpos if startpos has not beenupdated > > So, I withdraw this comment. > > V5 LGTM. Thanks for the review! I've pushed the patches. Regards, -- Fujii Masao