Обсуждение: Making pg_rewind faster

Поиск
Список
Период
Сортировка

Making pg_rewind faster

От
vignesh ravichandran
Дата:
Hi Hackers,

I have been using pg_rewind in production for 2 years. One of the things that I noticed in pg_rewind is if it doesn't know what to do with a file "it copies". I understand it's the more safer option. After all, the alternative, pg_basebackup copies all the files from source to target.

However, this is making pg_rewind inefficient when we have a high number of WAL files. Majority of the data (in most of my cases 95%+) that it copies are WAL files which are anyway same between the source and target. Skipping those same WAL files from copying will improve the speed of pg_rewind a lot.

1. Does pg_rewind need to copy WAL files before the WAL that contains the last common check point?

Heikki's presentation https://pgsessions.com/assets/archives/pg_rewind-presentation-paris.pdf gave me a good overview and also explained the behavior what I mentioned.

Thanks,
Vignesh

Re: Making pg_rewind faster

От
Justin Kwan
Дата:
Hi everyone!

Here's the attached patch submission to optimize pg_rewind performance when many WAL files are retained on server. This patch avoids replaying (copying over) older WAL segment files that fall before the point of divergence between the source and target servers.

Thanks,
Justin

From: Justin Kwan <jkwan@cloudflare.com>
Sent: July 15, 2022 6:13 PM
To: vignesh ravichandran <admin@viggy28.dev>
Cc: pgsql-hackers <pgsql-hackers@postgresql.org>; vignesh <vignesh@cloudflare.com>; justinpkwan@outlook.com <justinpkwan@outlook.com>
Subject: Re: Making pg_rewind faster
 
Looping in my other email.

On Thu, Jun 30, 2022 at 6:22 AM vignesh ravichandran <admin@viggy28.dev> wrote:
Hi Hackers,

I have been using pg_rewind in production for 2 years. One of the things that I noticed in pg_rewind is if it doesn't know what to do with a file "it copies". I understand it's the more safer option. After all, the alternative, pg_basebackup copies all the files from source to target.

However, this is making pg_rewind inefficient when we have a high number of WAL files. Majority of the data (in most of my cases 95%+) that it copies are WAL files which are anyway same between the source and target. Skipping those same WAL files from copying will improve the speed of pg_rewind a lot.

1. Does pg_rewind need to copy WAL files before the WAL that contains the last common check point?

Heikki's presentation https://pgsessions.com/assets/archives/pg_rewind-presentation-paris.pdf gave me a good overview and also explained the behavior what I mentioned.

Thanks,
Vignesh

Вложения

Re: Making pg_rewind faster

От
Justin Kwan
Дата:
Hi everyone!

I've also attached the pg_rewind optimization patch file for Postgres version 14.4. The previous patch file targets version Postgres version 15 Beta 1/2.

Thanks,
Justin

From: Justin Kwan <jkwan@cloudflare.com>
Sent: July 15, 2022 6:13 PM
To: vignesh ravichandran <admin@viggy28.dev>
Cc: pgsql-hackers <pgsql-hackers@postgresql.org>; vignesh <vignesh@cloudflare.com>; justinpkwan@outlook.com <justinpkwan@outlook.com>
Subject: Re: Making pg_rewind faster
 
Looping in my other email.

On Thu, Jun 30, 2022 at 6:22 AM vignesh ravichandran <admin@viggy28.dev> wrote:
Hi Hackers,

I have been using pg_rewind in production for 2 years. One of the things that I noticed in pg_rewind is if it doesn't know what to do with a file "it copies". I understand it's the more safer option. After all, the alternative, pg_basebackup copies all the files from source to target.

However, this is making pg_rewind inefficient when we have a high number of WAL files. Majority of the data (in most of my cases 95%+) that it copies are WAL files which are anyway same between the source and target. Skipping those same WAL files from copying will improve the speed of pg_rewind a lot.

1. Does pg_rewind need to copy WAL files before the WAL that contains the last common check point?

Heikki's presentation https://pgsessions.com/assets/archives/pg_rewind-presentation-paris.pdf gave me a good overview and also explained the behavior what I mentioned.

Thanks,
Vignesh

Вложения

Re: Making pg_rewind faster

От
Tom Lane
Дата:
Justin Kwan <justinpkwan@outlook.com> writes:
> I've also attached the pg_rewind optimization patch file for Postgres version 14.4. The previous patch file targets
versionPostgres version 15 Beta 1/2. 

It's very unlikely that we would consider committing such changes into
released branches.  In fact, it's too late even for v15.  You should
be submitting non-bug-fix patches against master (v16-to-be).

            regards, tom lane



Re: Making pg_rewind faster

От
Justin Kwan
Дата:
Hi Tom,

Thank you for taking a look at this and that sounds good. I will send over a patch compatible with Postgres v16.

Justin

From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: July 17, 2022 2:40 PM
To: Justin Kwan <justinpkwan@outlook.com>
Cc: pgsql-hackers <pgsql-hackers@postgresql.org>; vignesh <vignesh@cloudflare.com>; jkwan@cloudflare.com <jkwan@cloudflare.com>; vignesh ravichandran <admin@viggy28.dev>; hlinnaka@iki.fi <hlinnaka@iki.fi>
Subject: Re: Making pg_rewind faster
 
Justin Kwan <justinpkwan@outlook.com> writes:
> I've also attached the pg_rewind optimization patch file for Postgres version 14.4. The previous patch file targets version Postgres version 15 Beta 1/2.

It's very unlikely that we would consider committing such changes into
released branches.  In fact, it's too late even for v15.  You should
be submitting non-bug-fix patches against master (v16-to-be).

                        regards, tom lane

Re: Making pg_rewind faster

От
Michael Paquier
Дата:
On Mon, Jul 18, 2022 at 05:14:00PM +0000, Justin Kwan wrote:
> Thank you for taking a look at this and that sounds good. I will
> send over a patch compatible with Postgres v16.

+$node_2->psql(
+       'postgres',
+       "SELECT extract(epoch from modification) FROM pg_stat_file('pg_wal/000000010000000000000003');",
+       stdout => \my $last_common_tli1_wal_last_modified_at);
Please note that you should not rely on the FS-level stats for
anything that touches the WAL segments.  A rough guess about what you
could here to make sure that only the set of WAL segments you are
looking for is being copied over would be to either:
- Scan the logs produced by pg_rewind and see if the segments are
copied or not, depending on the divergence point (aka the last
checkpoint before WAL forked).
- Clean up pg_wal/ in the target node before running pg_rewind,
checking that only the segments you want are available once the
operation completes.
--
Michael

Вложения

Re: Making pg_rewind faster

От
Justin Kwan
Дата:
Hi Michael,

Thank you for your feedback, I've incorporated your suggestions by scanning the logs produced from pg_rewind when asserting that certain WAL segment files were skipped from being copied over to the target server.

I've also updated the pg_rewind patch file to target the Postgres master branch (version 16 to be). Please see attached.

Thanks,
Justin




From: Michael Paquier
Sent: Tuesday, July 19, 2022 1:36 AM
To: Justin Kwan
Cc: Tom Lane; pgsql-hackers; vignesh; jkwan@cloudflare.com; vignesh ravichandran; hlinnaka@iki.fi
Subject: Re: Making pg_rewind faster

On Mon, Jul 18, 2022 at 05:14:00PM +0000, Justin Kwan wrote:
> Thank you for taking a look at this and that sounds good. I will
> send over a patch compatible with Postgres v16.

+$node_2->psql(
+       'postgres',
+       "SELECT extract(epoch from modification) FROM pg_stat_file('pg_wal/000000010000000000000003');",
+       stdout => \my $last_common_tli1_wal_last_modified_at);
Please note that you should not rely on the FS-level stats for
anything that touches the WAL segments.  A rough guess about what you
could here to make sure that only the set of WAL segments you are
looking for is being copied over would be to either:
- Scan the logs produced by pg_rewind and see if the segments are
copied or not, depending on the divergence point (aka the last
checkpoint before WAL forked).
- Clean up pg_wal/ in the target node before running pg_rewind,
checking that only the segments you want are available once the
operation completes.
--
Michael
Вложения

Re: Making pg_rewind faster

От
Justin Kwan
Дата:
Hi Michael,

Not sure if this email went through previously but thank you for your feedback, I've incorporated your suggestions by scanning the logs produced from pg_rewind when asserting that certain WAL segment files were skipped from being copied over to the target server.

I've also updated the pg_rewind patch file to target the Postgres master branch (version 16 to be). Please see attached.

Thanks,
Justin


From: Justin Kwan <justinpkwan@outlook.com>
Sent: July 18, 2022 1:14 PM
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: pgsql-hackers <pgsql-hackers@postgresql.org>; vignesh <vignesh@cloudflare.com>; jkwan@cloudflare.com <jkwan@cloudflare.com>; vignesh ravichandran <admin@viggy28.dev>; hlinnaka@iki.fi <hlinnaka@iki.fi>
Subject: Re: Making pg_rewind faster
 
Hi Tom,

Thank you for taking a look at this and that sounds good. I will send over a patch compatible with Postgres v16.

Justin

From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: July 17, 2022 2:40 PM
To: Justin Kwan <justinpkwan@outlook.com>
Cc: pgsql-hackers <pgsql-hackers@postgresql.org>; vignesh <vignesh@cloudflare.com>; jkwan@cloudflare.com <jkwan@cloudflare.com>; vignesh ravichandran <admin@viggy28.dev>; hlinnaka@iki.fi <hlinnaka@iki.fi>
Subject: Re: Making pg_rewind faster
 
Justin Kwan <justinpkwan@outlook.com> writes:
> I've also attached the pg_rewind optimization patch file for Postgres version 14.4. The previous patch file targets version Postgres version 15 Beta 1/2.

It's very unlikely that we would consider committing such changes into
released branches.  In fact, it's too late even for v15.  You should
be submitting non-bug-fix patches against master (v16-to-be).

                        regards, tom lane
Вложения

Re: Making pg_rewind faster

От
Alexander Korotkov
Дата:
Hi, Justin!

On Fri, Jul 29, 2022 at 1:05 PM Justin Kwan <justinpkwan@outlook.com> wrote:
> Not sure if this email went through previously but thank you for your feedback, I've incorporated your suggestions by
scanningthe logs produced from pg_rewind when asserting that certain WAL segment files were skipped from being copied
overto the target server. 
>
> I've also updated the pg_rewind patch file to target the Postgres master branch (version 16 to be). Please see
attached.

Thank you for the revision.

I've taken a look at this patch. Overall it looks good to me. I also
don't see any design objections in the thread.

A couple of points from me:
1) I would prefer to evade hard-coded names for WAL segments in the
tap tests. Could we calculate the names in the tap tests based on the
diverge point, etc.?
2) Patch contains some indentation with spaces, which should be done
in tabs. Please consider either manually fixing this or running
pgindent over modified files.

------
Regards,
Alexander Korotkov



Re: Making pg_rewind faster

От
Andres Freund
Дата:
Hi,

On 2022-09-13 20:50:20 +0300, Alexander Korotkov wrote:
> On Fri, Jul 29, 2022 at 1:05 PM Justin Kwan <justinpkwan@outlook.com> wrote:
> > Not sure if this email went through previously but thank you for your feedback, I've incorporated your suggestions
byscanning the logs produced from pg_rewind when asserting that certain WAL segment files were skipped from being
copiedover to the target server.
 
> >
> > I've also updated the pg_rewind patch file to target the Postgres master branch (version 16 to be). Please see
attached.
> 
> Thank you for the revision.
> 
> I've taken a look at this patch. Overall it looks good to me. I also
> don't see any design objections in the thread.
> 
> A couple of points from me:
> 1) I would prefer to evade hard-coded names for WAL segments in the
> tap tests. Could we calculate the names in the tap tests based on the
> diverge point, etc.?
> 2) Patch contains some indentation with spaces, which should be done
> in tabs. Please consider either manually fixing this or running
> pgindent over modified files.

This patch currently fails because it hasn't been adjusted for
commit c47885bd8b6
Author: Andres Freund <andres@anarazel.de>
Date:   2022-09-19 18:03:17 -0700
 
    Split TESTDIR into TESTLOGDIR and TESTDATADIR

The adjustment is trivial. Attached, together with also producing an error
message rather than just dying wordlessly.


It doesn't seem quite right to read pg_rewind's logs by reading
regress_log_001_basic. Too easy to confuse different runs of pg_rewind
etc. I'd suggest trying to redirect the log to a different file.


With regard to Alexander's point about whitespace:

.git/rebase-apply/patch:25: indent with spaces.
                /* Handle WAL segment file. */
.git/rebase-apply/patch:26: indent with spaces.
                const char  *fname;
.git/rebase-apply/patch:27: indent with spaces.
                char        *slash;
.git/rebase-apply/patch:29: indent with spaces.
                /* Split filepath into directory & filename. */
.git/rebase-apply/patch:30: indent with spaces.
                slash = strrchr(path, '/');
warning: squelched 29 whitespace errors
warning: 34 lines add whitespace errors.

Greetings,

Andres Freund

Вложения

Re: Making pg_rewind faster

От
Michael Paquier
Дата:
On Sun, Oct 02, 2022 at 10:44:25AM -0700, Andres Freund wrote:
> It doesn't seem quite right to read pg_rewind's logs by reading
> regress_log_001_basic. Too easy to confuse different runs of pg_rewind
> etc. I'd suggest trying to redirect the log to a different file.

Hardcoding log file names in the test increases the overall
maintenance, even if renaming these would be easy to track and fix if
the naming convention is changed.  Anyway, I think that what this
patch should do is to use command_checks_all() in RewindTest.pm as it
is the test API able to check after a status and multiple expected
outputs, which is what the changes in 001 and 008 are doing.
RewindTest::run_pg_rewind() needs to be a bit tweaked to accept these
regex patterns in input.

+    if (file_segno < last_common_segno)
+    {
+        pg_log_debug("WAL file entry \"%s\" not copied to target", fname);
+        return FILE_ACTION_NONE;
+    }
There may be something I am missing here, but there is no need to care
about segments with a TLI older than lastcommontliIndex, no?

decide_wal_file_action() assumes that the WAL segment exists on the
target and the source.  This looks bug-prone to me without at least an
assertion.

file_entry_t has an entry to track if a file is a relation file.  I
think that it would be much cleaner to track if we are handling a WAL
segment when inserting an entry in insert_filehash_entry(), so
isrelfile could be replaced by an enum with three values: relation
file, WAL segment and the rest.
--
Michael

Вложения

Re: Making pg_rewind faster

От
Michael Paquier
Дата:
On Thu, Oct 06, 2022 at 04:08:45PM +0900, Michael Paquier wrote:
> file_entry_t has an entry to track if a file is a relation file.  I
> think that it would be much cleaner to track if we are handling a WAL
> segment when inserting an entry in insert_filehash_entry(), so
> isrelfile could be replaced by an enum with three values: relation
> file, WAL segment and the rest.

This review has been done a few weeks ago, and there has been no
update since, so I am marking this entry as returned with feedback in
the CF app.
--
Michael

Вложения

Re: Making pg_rewind faster

От
Japin Li
Дата:
On Tue, 01 Jul 2025 at 11:13, John H <johnhyvr@gmail.com> wrote:
> Hi,
>
> I've attached an updated version of the patch against master with the changes
> suggested.
>
> On Tue, Nov 29, 2022 at 10:03 PM Michael Paquier <michael@paquier.xyz> wrote:
>>
>> On Thu, Oct 06, 2022 at 04:08:45PM +0900, Michael Paquier wrote:
>>>
>>>  There may be something I am missing here, but there is no need to care
>>> about segments with a TLI older than lastcommontliIndex, no?
>
> Hard to say. pg_rewind is intended to make the same "copy" of the cluster which
> implies pg_wal/ should look the same. There might be use cases around logical
> replication where you would want these WAL files to still exist even
> across promotions?
>
>>> decide_wal_file_action() assumes that the WAL segment exists on the
>>> target and the source.  This looks bug-prone to me without at least an
>>> assertion.
>
> From previous refactors there is now an Assertion in filemap.c
> decide_file_action that handles this.
>
>> Assert(entry->target_exists && entry->source_exists);
>
> decide_wal_file_action is called after the assertion.
>

Hi, John

Thanks for updating the patch.

1.
+/* Determine the type of file content (relation, WAL, or other) */
+static file_content_type_t
+getFileType(const char *path)

Considering the existence of file_type_t, would getFileContentType() be a
suitable function for handling file content types?

2.
Perhaps decide_wal_file_action() could be defined in filemap.c.


While this is unrelated to WAL logging, it could also contribute to faster
pg_rewind operations.  Should we consider ignoring log files under PGDATA
(e.g., those in the default log/ directory)?

--
Regards,
Japin Li



Re: Making pg_rewind faster

От
wenhui qiu
Дата:
HI 
>  2.
> Perhaps decide_wal_file_action() could be defined in filemap.c.


>  While this is unrelated to WAL logging, it could also contribute to faster
> pg_rewind operations.  Should we consider ignoring log files under PGDATA
> (e.g., those in the default log/ directory)?
Agree ,Usually the log file directory also takes up a lot of space, and the number of log files is quite large

On Wed, Jul 2, 2025 at 10:21 AM Japin Li <japinli@hotmail.com> wrote:
On Tue, 01 Jul 2025 at 11:13, John H <johnhyvr@gmail.com> wrote:
> Hi,
>
> I've attached an updated version of the patch against master with the changes
> suggested.
>
> On Tue, Nov 29, 2022 at 10:03 PM Michael Paquier <michael@paquier.xyz> wrote:
>>
>> On Thu, Oct 06, 2022 at 04:08:45PM +0900, Michael Paquier wrote:
>>>
>>>  There may be something I am missing here, but there is no need to care
>>> about segments with a TLI older than lastcommontliIndex, no?
>
> Hard to say. pg_rewind is intended to make the same "copy" of the cluster which
> implies pg_wal/ should look the same. There might be use cases around logical
> replication where you would want these WAL files to still exist even
> across promotions?
>
>>> decide_wal_file_action() assumes that the WAL segment exists on the
>>> target and the source.  This looks bug-prone to me without at least an
>>> assertion.
>
> From previous refactors there is now an Assertion in filemap.c
> decide_file_action that handles this.
>
>> Assert(entry->target_exists && entry->source_exists);
>
> decide_wal_file_action is called after the assertion.
>

Hi, John

Thanks for updating the patch.

1.
+/* Determine the type of file content (relation, WAL, or other) */
+static file_content_type_t
+getFileType(const char *path)

Considering the existence of file_type_t, would getFileContentType() be a
suitable function for handling file content types?

2.
Perhaps decide_wal_file_action() could be defined in filemap.c.


While this is unrelated to WAL logging, it could also contribute to faster
pg_rewind operations.  Should we consider ignoring log files under PGDATA
(e.g., those in the default log/ directory)?

--
Regards,
Japin Li


Re: Making pg_rewind faster

От
Japin Li
Дата:
On Wed, 02 Jul 2025 at 11:21, John H <johnhyvr@gmail.com> wrote:
> Hi,
>
> Thanks for the quick review.
>
> On Tue, Jul 1, 2025 at 8:16 PM wenhui qiu <qiuwenhuifx@gmail.com> wrote:
>> > Perhaps decide_wal_file_action() could be defined in filemap.c.
>>
>
> That's a good point. I updated the patch to reflect that.
>

Thanks for updating the patch.

>> >  While this is unrelated to WAL logging, it could also contribute to faster
>> > pg_rewind operations.  Should we consider ignoring log files under PGDATA
>> > (e.g., those in the default log/ directory)?
>> Agree ,Usually the log file directory also takes up a lot of space, and the number of log files is quite large
>>
>
> Should we handle this use case? I do agree that for the more common
> use-cases of pg_rewind which is synchronizing an old writer to the new
> leader after failover, avoiding syncing the logging directory is
> useful.
> At the same time, pg_rewind is intended to make the same copy of the
> source cluster as efficiently as possible which would include "all"
> directories if they exist in $PGDATA. By default logging_collector is
> off as well. I'd also think you would want to avoid putting the logs
> in $PGDATA to have smaller backups if you are using tools like
> pg_basebackup.
>

Splitting the logs from $PGDATA is definitely better. The question is whether
it's worth implementing this directly in core or if a prominent note in the
documentation would suffice.

>> On Wed, Jul 2, 2025 at 10:21 AM Japin Li <japinli@hotmail.com> wrote:
>>>
>>> Hi, John
>>>
>>> Thanks for updating the patch.
>>>
>>> 1.
>>> +/* Determine the type of file content (relation, WAL, or other) */
>>> +static file_content_type_t
>>> +getFileType(const char *path)
>>>
>>> Considering the existence of file_type_t, would getFileContentType() be a
>>> suitable function for handling file content types?
>
> Do you mean naming getFileType to getFileContentType?
>

Exactly!  It's confusing that getFileType() returns file_content_type_t
instead of file_type_t.

For v5 patch:

1.
We could simply use the global WalSegSz variable within decide_file_action(),
eliminating the need to pass wal_segsz_bytes as an argument.

2.
For last_common_segno, we could implement it similarly to WalSegSz, avoiding a
signature change for decide_file_actions() and decide_file_action().  I'm not
insisting on this approach, however.

--
Regards,
Japin Li



Re: Making pg_rewind faster

От
John H
Дата:
Hi,

On Wed, Jul 2, 2025 at 6:40 PM Japin Li <japinli@hotmail.com> wrote:
>
> >
>
> Splitting the logs from $PGDATA is definitely better. The question is whether
> it's worth implementing this directly in core or if a prominent note in the
> documentation would suffice.
>

I can work on the documentation update as a separate patch if folks
think this is worthwhile.

> >> On Wed, Jul 2, 2025 at 10:21 AM Japin Li <japinli@hotmail.com> wrote:
>
> Exactly!  It's confusing that getFileType() returns file_content_type_t
> instead of file_type_t.
>

Ah yes that is confusing, updated in patch.

> For v5 patch:
>
> 1.
> We could simply use the global WalSegSz variable within decide_file_action(),
> eliminating the need to pass wal_segsz_bytes as an argument.
>

Good point.

> 2.
> For last_common_segno, we could implement it similarly to WalSegSz, avoiding a
> signature change for decide_file_actions() and decide_file_action().  I'm not
> insisting on this approach, however.
>

I made it a global as well, and had to include access/xlog_internal.h
in pg_rewind.h but I don't feel strongly about it either way.

Thanks,

--
John Hsu - Amazon Web Services

Вложения

Re: Making pg_rewind faster

От
Japin Li
Дата:
On Thu, 03 Jul 2025 at 12:59, John H <johnhyvr@gmail.com> wrote:
> Hi,
>
> On Wed, Jul 2, 2025 at 6:40 PM Japin Li <japinli@hotmail.com> wrote:
>>
>> >
>>
>> Splitting the logs from $PGDATA is definitely better. The question is whether
>> it's worth implementing this directly in core or if a prominent note in the
>> documentation would suffice.
>>
>
> I can work on the documentation update as a separate patch if folks
> think this is worthwhile.
>
>> >> On Wed, Jul 2, 2025 at 10:21 AM Japin Li <japinli@hotmail.com> wrote:
>>
>> Exactly!  It's confusing that getFileType() returns file_content_type_t
>> instead of file_type_t.
>>
>
> Ah yes that is confusing, updated in patch.
>
>> For v5 patch:
>>
>> 1.
>> We could simply use the global WalSegSz variable within decide_file_action(),
>> eliminating the need to pass wal_segsz_bytes as an argument.
>>
>
> Good point.
>
>> 2.
>> For last_common_segno, we could implement it similarly to WalSegSz, avoiding a
>> signature change for decide_file_actions() and decide_file_action().  I'm not
>> insisting on this approach, however.
>>
>
> I made it a global as well, and had to include access/xlog_internal.h
> in pg_rewind.h but I don't feel strongly about it either way.
>

Thanks, LGTM.

--
Regards,
Japin Li



Re: Making pg_rewind faster

От
wenhui qiu
Дата:
Hi 
> Thanks, LGTM.
I think it's possible to register to https://commitfest.postgresql.org/54, https://commitfest.postgresql.org/53 will closed soon



Thanks

On Fri, Jul 4, 2025 at 10:50 AM Japin Li <japinli@hotmail.com> wrote:
On Thu, 03 Jul 2025 at 12:59, John H <johnhyvr@gmail.com> wrote:
> Hi,
>
> On Wed, Jul 2, 2025 at 6:40 PM Japin Li <japinli@hotmail.com> wrote:
>>
>> >
>>
>> Splitting the logs from $PGDATA is definitely better. The question is whether
>> it's worth implementing this directly in core or if a prominent note in the
>> documentation would suffice.
>>
>
> I can work on the documentation update as a separate patch if folks
> think this is worthwhile.
>
>> >> On Wed, Jul 2, 2025 at 10:21 AM Japin Li <japinli@hotmail.com> wrote:
>>
>> Exactly!  It's confusing that getFileType() returns file_content_type_t
>> instead of file_type_t.
>>
>
> Ah yes that is confusing, updated in patch.
>
>> For v5 patch:
>>
>> 1.
>> We could simply use the global WalSegSz variable within decide_file_action(),
>> eliminating the need to pass wal_segsz_bytes as an argument.
>>
>
> Good point.
>
>> 2.
>> For last_common_segno, we could implement it similarly to WalSegSz, avoiding a
>> signature change for decide_file_actions() and decide_file_action().  I'm not
>> insisting on this approach, however.
>>
>
> I made it a global as well, and had to include access/xlog_internal.h
> in pg_rewind.h but I don't feel strongly about it either way.
>

Thanks, LGTM.

--
Regards,
Japin Li

Re: Making pg_rewind faster

От
John H
Дата:
Hi,

Thanks for the review.

On Sun, Jul 6, 2025 at 7:12 PM wenhui qiu <qiuwenhuifx@gmail.com> wrote:
>
> Hi
> > Thanks, LGTM.
> I think it's possible to register to https://commitfest.postgresql.org/54, https://commitfest.postgresql.org/53 will
closedsoon 
>

I've created the commitfest entry here:
https://commitfest.postgresql.org/patch/5902/

I realized I wasn't testing properly in 0006. I've updated the latest
one to move it into a separate test file to make the results cleaner.

Thanks,

--
John Hsu - Amazon Web Services

Вложения

Re: Making pg_rewind faster

От
Robert Haas
Дата:
On Mon, Jul 7, 2025 at 2:46 PM John H <johnhyvr@gmail.com> wrote:
> I've created the commitfest entry here:
> https://commitfest.postgresql.org/patch/5902/
>
> I realized I wasn't testing properly in 0006. I've updated the latest
> one to move it into a separate test file to make the results cleaner.

Thanks to everyone who has worked on this patch. I think this is a
real problem about which we ought to try to do something. I think you
could even defend calling this a back-patchable bug fix, since copying
a lot of files that we know have to be identical is a serious
efficiency issue that can affect the usability of pg_upgrade. There
are more aggressive things that we could try to do here that might be
even better, e.g. we could copy just the WAL segment or segments that
contain the last checkpoint record and let the server fetch the rest
as needed after it starts up. However, that would be a definitional
change, which we certainly wouldn't want to back-patch, and which
might have downsides for some people, so perhaps it would need to be
optional behavior. But the patch as proposed is just an optimization
that shouldn't change the outcome for anyone. Perhaps it's still best
to treat it as a master-only improvement, but calling it a fix for a
performance bug isn't completely crazy either, IMHO.

Moving on to the patch contents, the approach taken makes sense to me.
Any WAL files that exist in both source and target from prior to the
point of divergence should not need to be copied, because they should
necessarily be identical. If anyone sees a flaw in this reasoning,
please point it out! While I agree with the basic approach, I think
that the patch could be hardened in two ways. First, isRelDataFile
tests the complete pathname to decide whether it has a relation file,
but getFileContentType only tests the final component of the pathname.
I think it should test the whole pathname also, in case there's a
stray file with a file name that looks like a WAL file in some other
directory. In fact, I think that these two functions should be
integrated. Let's rename the isRelDataFile() to getFileContentType()
and put all the logic in that function, including appropriate tests of
the path name. Second, in decide_file_action(), I think we should
check that entry->target_size == entry->source_size and return
FILE_ACTION_COPY if not. This case really shouldn't happen, but it's
very cheap to check and might offer a bit of protection in some weird
scenario.

I am not a huge fan of the way that the patch stuffs last_common_segno
into a global variable that is then accessed by
decide_wal_file_action. I think it would be better to find a way to
pass this information down through the call stack, either as a
separate argument or as part of some struct we're already passing, if
there's a reasonable and not-too-invasive way to do that. If it looks
too ugly, maybe this is the best option, but I think we should try to
find something better.

It does not appear to me that the test case tests what it purports to
test, and I don't think that what it purports to test is the right
thing anyway. It claims that it is testing that a certain WAL file
(which it erroneously calls a "WAL file entry") is not copied to the
target, but I see no logic to actually check this. I also don't think
it's the correct behavior. I think what the patch is doing and should
do is ensure that we don't copy the file when it's already present. If
the file isn't present in the target cluster, it should still be
copied. Granted, I haven't verified that this is the actual behavior,
but the "Handle cases where the file is missing from one of the
systems" code in decide_file_actiomn() seems to be higher up than the
code the patch changes, so I assume that logic should be applied
first. If that's correct, I don't quite know what a test case here can
usefully verify, since the only difference would be performance, but
maybe I'm misunderstanding something?

As an administrative note, the naming convention for the patches
posted to the thread is not consistent and not correct. Use "git
format-patch -v$VERSION" to generate patches. In a name like
v7-0003-do-something.patch, "v7" is the version of the patch set as a
whole, and "0003" identifies a specific patch within that version of
the patch set. In other words, that would be the third patch in the
series from the seventh time that the patch set was posted. This patch
set so far contains only one patch, so the file name should be
vX-0001-whatever.patch, where whatever is the subject of the patch and
X is a number that increases by one every time it's reposted. Let git
format-patch do the work for you, though.

On a related note, something a committer would need to do if they
wanted to commit this patch is figure out who should be credited with
writing it. I see that a number of different people have posted
versions of the patch; it would be nice if "Author:" or
"Co-authored-by:" tags were added to the commit message to reflect
whose code is present in a given version.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Making pg_rewind faster

От
John H
Дата:
Hi Robert,

Thanks for taking a look.

On Thu, Sep 25, 2025 at 11:35 AM Robert Haas <robertmhaas@gmail.com> wrote:
> ...
> integrated. Let's rename the isRelDataFile() to getFileContentType()
> and put all the logic in that function, including appropriate tests of
> the path name. Second, in decide_file_action(), I think we should
> check that entry->target_size == entry->source_size and return
> FILE_ACTION_COPY if not. This case really shouldn't happen, but it's
> very cheap to check and might offer a bit of protection in some weird
> scenario.

Updated the patch to reflect that.

> I am not a huge fan of the way that the patch stuffs last_common_segno
> into a global variable that is then accessed by
> ...

I forgot why I bothered with a global instead since it was straightforward
to actually pass in the argument. Updated.

> It does not appear to me that the test case tests what it purports to
> test, and I don't think that what it purports to test is the right
> thing anyway. It claims that it is testing that a certain WAL file
> (which it erroneously calls a "WAL file entry") is not copied to the
> target, but I see no logic to actually check this.
> ...

That's a fair point. The test does:

1. Write some data -> WAL 1
2. SELECT pg_switch_wal() -> pg_current_wal_lsn() is at WAL 2
3. CHECKPOINT; -> Makes WAL 2 the last common checkpoint

Then it checks that "WAL 1 not copied over" was logged which isn't
the best since if it was logged elsewhere then it would still pass.

> ...
> first. If that's correct, I don't quite know what a test case here can
> usefully verify, since the only difference would be performance, but
> maybe I'm misunderstanding something?

I updated the test to run stat and get the modification time of the common
WAL segment before and after pg_rewind and verify it is the same.

In filemap.c in decide_wal_file_action, if you comment out
> return FILE_ACTION_NONE;

the test fails.

>
> As an administrative note, the naming convention for the patches
> posted to the thread is not consistent and not correct. Use "git
> format-patch -v$VERSION" to generate patches. In a name like
> ...

Fixed patch formatting.

> ...
> writing it. I see that a number of different people have posted
> versions of the patch; it would be nice if "Author:" or
> "Co-authored-by:" tags were added to the commit message to reflect

I added Justin as the Co-Author in the commit message but I realized
in their original patch they had "Author: Justin Kwan, Vignesh Ravichandran".
Andres also had the latest patch series before mine so I'll leave it up to
the committer how they want to handle this.

Thanks,
--
John Hsu - Amazon Web Services

Вложения

Re: Making pg_rewind faster

От
Srinath Reddy Sadipiralla
Дата:

Hi,

just a second late :( i was about to post a patch addressing the refactors which Robert mentioned  ,anyway will have a look at your latest patch John thanks :), curious about the tap test.

while i was writing the patch something suddenly struck me , that is why we are even depending on last_common_segno ,because once we reached decide_wal_file_action it means that the file exists in both target and source ,AFAIK this can only happen with wal segments older than or equal to last_common_segno because once the promotion competes the filename of the WAL files gets changed with the new timelineID(2), for ex: if the last_common_segno is 000000010000000000000003 then based on the rules in XLogInitNewTimeline 
1) if the timeline switch happens in middle of segment ,copy data from the last WAL segment and create WAL file with same segno but different timelineID,in this case the starting WAL file for the new timeline will be 000000020000000000000003
2) if the timeline switch happens at segment boundary , just create next segment for this case the starting WAL file for the new timeline will be 000000020000000000000004

so basically the files which exists in source and not in target like the new timeline WAL segments will be copied to target in total before we reach decide_wal_file_action , so i think we don't need to think about copying WAL files after divergence point by calculating and checking against last_common_segno which we are doing in our current approach , i think we can just do 

in decide_wal_file_action

if (entry->target_size != entry->source_size)
{
pg_log_debug("WAL segment file \"%s\" is copied to target", fname);
return FILE_ACTION_COPY;
}
return FILE_ACTION_NONE;

thoughts?

--
Thanks,
Srinath Reddy Sadipiralla
EDB: https://www.enterprisedb.com/

Re: Making pg_rewind faster

От
John H
Дата:
Hey Srinath

On Thu, Oct 9, 2025 at 12:09 PM Srinath Reddy Sadipiralla
<srinath2133@gmail.com> wrote:
> ...
> 1) if the timeline switch happens in middle of segment ,copy data from the last WAL segment and create WAL file with
samesegno but different timelineID,in this case the starting WAL file for the new timeline will be
000000020000000000000003
> 2) if the timeline switch happens at segment boundary , just create next segment for this case the starting WAL file
forthe new timeline will be 000000020000000000000004 
>
> so basically the files which exists in source and not in target like the new timeline WAL segments will be copied to
targetin total before we reach decide_wal_file_action , so i think we don't need to think about copying WAL files after
divergencepoint by calculating and checking against last_common_segno which we are doing in our current approach , i
thinkwe can just do 
>
> ...

That's a great point. I need to think about it some more but the
reasoning makes sense to me.
I think 'last_common_segno ' is only useful as another sanity check
but we already have the size
ones.

Thanks,

--
John Hsu - Amazon Web Services



Re: Making pg_rewind faster

От
Srinath Reddy Sadipiralla
Дата:
Hi John , I have reviewed your latest patch(v8-0001).

On Thu, Oct 9, 2025 at 11:27 PM John H <johnhyvr@gmail.com> wrote:
> ...
> integrated. Let's rename the isRelDataFile() to getFileContentType()
> and put all the logic in that function, including appropriate tests of
> the path name. Second, in decide_file_action(), I think we should
> check that entry->target_size == entry->source_size and return
> FILE_ACTION_COPY if not. This case really shouldn't happen, but it's
> very cheap to check and might offer a bit of protection in some weird
> scenario.

Updated the patch to reflect that.

in getFileContentType as you moved the logic of validating
whether they are xlog files or not using IsXLogFileName ,
i think it's better to validate it the kind of similar way as its
done for relfiles by parsing the path using sscanf this will
help to get the tli,logno,segno of the xlog file and using
XLogFromFileName which gives us logSegNo ,so that we
can rebuild the xlog file path again using these values with
XLogFilePath , then validate this new path with the given path
,this helps to catch invalid xlog files like pg_wal/00000001FFFFFFFFFFFFFF10.

maybe like this
+ if (strncmp("pg_wal/", path, 7) == 0)
+ {
+ const char *filename = path + 7;
+ XLogFromFileName(filename,&tli,logSegNo,WalSegSz);
+ XLogFilePath(check_path,tli,*logSegNo,WalSegSz);
+ if (strcmp(check_path, path) == 0)
+ return FILE_CONTENT_TYPE_WAL;
+ else
+ return FILE_CONTENT_TYPE_OTHER;
+ }
 

> I am not a huge fan of the way that the patch stuffs last_common_segno
> into a global variable that is then accessed by
> ...

I forgot why I bothered with a global instead since it was straightforward
to actually pass in the argument. Updated.

instead of passing last_common_segno down the call stack directly,
we can have a struct maybe called "rewindState" which will have the common
information related to both clusters involved in rewind ,so that in future
if we want to pass more such information down we won't be increasing
the number of arguments that need to be passed down to all functions in stack.
 

> It does not appear to me that the test case tests what it purports to
> test, and I don't think that what it purports to test is the right
> thing anyway. It claims that it is testing that a certain WAL file
> (which it erroneously calls a "WAL file entry") is not copied to the
> target, but I see no logic to actually check this.
> ...

That's a fair point. The test does:

1. Write some data -> WAL 1
2. SELECT pg_switch_wal() -> pg_current_wal_lsn() is at WAL 2
3. CHECKPOINT; -> Makes WAL 2 the last common checkpoint

Then it checks that "WAL 1 not copied over" was logged which isn't
the best since if it was logged elsewhere then it would still pass.

we can improve the TAP test file header comment as 
# Test the situation where we skip copying the same WAL files from source to target
# except if WAL file size differs.

let me put it this way
1) WALs are written to 000000010000000000000002 in primary.
2) then SELECT pg_switch_wal() now the current WAL is 000000010000000000000003
3) CHECKPOINT; this is the last common checkpoint and is written in
000000010000000000000003.
4) until this point we have the same WAL files in primary and standby.
5) promote standby ,this leads to timeline switch and lets say the divergence point
is at segment boundary , so the first WAL segment in the new timeline is 000000020000000000000004.
6) when we stop standby then primary , in primary we get checkpoint shutdown which
is greater than the divergence point ,means there is divergence so pg_rewind is needed.
7) so the last common WAL segment was 000000010000000000000003.

I have mentioned below about how this updated TAP test is testing whether
common WAL file is being copied or not and about one more test we need to cover
in this TAP test.
 

> ...
> first. If that's correct, I don't quite know what a test case here can
> usefully verify, since the only difference would be performance, but
> maybe I'm misunderstanding something?

I updated the test to run stat and get the modification time of the common
WAL segment before and after pg_rewind and verify it is the same.

In filemap.c in decide_wal_file_action, if you comment out
> return FILE_ACTION_NONE;

the test fails.

This test makes sense to me, by checking whether
last common segment(000000010000000000000003 in our case)
been copied or not using the stat->mtime after the pg_rewind, if its not
copied then stat->mtime will be the same ,else it will differ with the
one we noted before rewind, but we also need to test the case
where entry->target_size != entry->source_size and we do copy,
like if stat->mtime is changed (which means file has been copied)
and target_stat->size != soucrce_stat->size here no error expected.


--
Thanks,
Srinath Reddy Sadipiralla
EDB: https://www.enterprisedb.com/

Re: Making pg_rewind faster

От
John H
Дата:
Hi,

On Fri, Oct 10, 2025 at 12:45 AM Srinath Reddy Sadipiralla
<srinath2133@gmail.com> wrote:
>
> ...
> XLogFilePath , then validate this new path with the given path
> ,this helps to catch invalid xlog files like pg_wal/00000001FFFFFFFFFFFFFF10.
> ...

Are you concerned that somehow these files, which are named like XLog
files but actually
aren't, are somehow created therefore we should sync them in case?
I'm trying to understand how these files would be generated in the first place.

>
> instead of passing last_common_segno down the call stack directly,
> we can have a struct maybe called "rewindState" which will have the common
> information related to both clusters involved in rewind ,so that in future
> if we want to pass more such information down we won't be increasing
> the number of arguments that need to be passed down to all functions in stack.
>

I don't feel too strongly about this. Not sure how we view
future-proofing things,
as in do we proactively wrap around structs or wait until there's an
actual need.

This is the first time it's been touched since it was introduced in 14 for what
it's worth.

>
> we can improve the TAP test file header comment as
> # Test the situation where we skip copying the same WAL files from source to target
> # except if WAL file size differs.
>
> let me put it this way
> 1) WALs are written to 000000010000000000000002 in primary.
> ...
> 7) so the last common WAL segment was 000000010000000000000003.

Updated comments.

>
> ...
> where entry->target_size != entry->source_size and we do copy,
> like if stat->mtime is changed (which means file has been copied)
> and target_stat->size != soucrce_stat->size here no error expected.
>

Updated patch with one additional segment has a byte appended to it
on target but not source.


Thanks,

John H

Вложения

Re: Making pg_rewind faster

От
Srinath Reddy Sadipiralla
Дата:
Hi,

On Wed, Oct 15, 2025 at 4:49 AM John H <johnhyvr@gmail.com> wrote:
Hi,

On Fri, Oct 10, 2025 at 12:45 AM Srinath Reddy Sadipiralla
<srinath2133@gmail.com> wrote:
>
> ...
> XLogFilePath , then validate this new path with the given path
> ,this helps to catch invalid xlog files like pg_wal/00000001FFFFFFFFFFFFFF10.
> ...

Are you concerned that somehow these files, which are named like XLog
files but actually
aren't, are somehow created therefore we should sync them in case?
I'm trying to understand how these files would be generated in the first place.

the problem is not when server generates them because
filename is properly calculated using the XLogRecPtr in
XLogWrite->XLogFileInit->XLogFileInitInternal->XLogFilePath
,the main problem is when if someone manually places an invalid WAL file
in pg_wal like 00000001FFFFFFFFFFFFFF10, IsXLogFileName will
consider it as valid ,so with the approach as i mentioned earlier we can
catch such cases.
 

>
> instead of passing last_common_segno down the call stack directly,
> we can have a struct maybe called "rewindState" which will have the common
> information related to both clusters involved in rewind ,so that in future
> if we want to pass more such information down we won't be increasing
> the number of arguments that need to be passed down to all functions in stack.
>

I don't feel too strongly about this. Not sure how we view
future-proofing things,
as in do we proactively wrap around structs or wait until there's an
actual need.

This is the first time it's been touched since it was introduced in 14 for what
it's worth.

TBH first it seemed to me a good coding practice for future proofing,
then i checked if there's any place in postgres we are doing
something similar ,then found 1 in src/include/access/gist.h

typedef struct GISTDeletedPageContents
{
/* last xid which could see the page in a scan */
FullTransactionId deleteXid;
} GISTDeletedPageContents;
 

>
> we can improve the TAP test file header comment as
> # Test the situation where we skip copying the same WAL files from source to target
> # except if WAL file size differs.
>
> let me put it this way
> 1) WALs are written to 000000010000000000000002 in primary.
> ...
> 7) so the last common WAL segment was 000000010000000000000003.

Updated comments.

>
> ...
> where entry->target_size != entry->source_size and we do copy,
> like if stat->mtime is changed (which means file has been copied)
> and target_stat->size != soucrce_stat->size here no error expected.
>

Updated patch with one additional segment has a byte appended to it
on target but not source.

thanks for updating, i have attached a diff patch on
top of v9-0001 patch , where i tried to add more tests
to improve the validation that we copy the WAL file even
when it exists on both source and target but the size differs.
I also tried to rename some variables for readability,while
testing this out i found that usleep(1000); is not enough
to show that there has been a copy which changed the
stat->mtime of the file because i think the copy was very
fast (less than 1 millisecond) so the mtime of the
file after rewind's copy didn't change or filesystem precision
didn't caught the change, the copying of the file is confirmed
because the same file has different size before rewind and
same after rewind which these tests proved

ok($corrupt_wal_seg_stat_before_rewind->size != $corrupt_wal_seg_source_stat->size,"WAL segment $corrupt_wal_seg has different size in source vs target before rewind");
ok 11 - WAL segment 000000010000000000000004 has different size in source vs target before rewind

ok($corrupt_wal_seg_stat_after_rewind->size == $corrupt_wal_seg_source_stat->size, "WAL segment $corrupt_wal_seg was copied: file sizes are same between target and source after rewind");
ok 12 - WAL segment 000000010000000000000004 was copied: file sizes are same between target and source after rewind

and more over the pg_rewind confirmed this with a new debug
message which I added in this diff patch

diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c
index 69d7728ce44..743484f4833 100644
--- a/src/bin/pg_rewind/filemap.c
+++ b/src/bin/pg_rewind/filemap.c
@@ -732,11 +732,11 @@ decide_wal_file_action(const char *fname, XLogSegNo last_common_segno,
  */
  if (file_segno < last_common_segno && source_size == target_size)
  {
  pg_log_debug("WAL segment \"%s\" not copied to target", fname);
  return FILE_ACTION_NONE;
  }
 
+ pg_log_debug("WAL segment \"%s\" is copied to target", fname);
  return FILE_ACTION_COPY;
 }
ok 6 - run pg_rewind stderr /(?^:WAL segment \"000000010000000000000004\" is copied to target)/

also did this instead of using catfile
-my $wal_skipped_path = File::Spec->catfile($node_primary->data_dir, 'pg_wal', $wal_seg_skipped);
+my $wal_skipped_path = $node_primary->data_dir . '/pg_wal/' . $wal_seg_skipped;
seems consistent with other TAP tests.

--
Thanks,
Srinath Reddy Sadipiralla
EDB: https://www.enterprisedb.com/
Вложения

Re: Making pg_rewind faster

От
Srinath Reddy Sadipiralla
Дата:


On Wed, Oct 15, 2025 at 7:56 PM Srinath Reddy Sadipiralla <srinath2133@gmail.com> wrote:

thanks for updating, i have attached a diff patch on
top of v9-0001 patch , where i tried to add more tests
to improve the validation that we copy the WAL file even
when it exists on both source and target but the size differs.
I also tried to rename some variables for readability,while
testing this out i found that usleep(1000); is not enough
to show that there has been a copy which changed the
stat->mtime of the file because i think the copy was very
fast (less than 1 millisecond) so the mtime of the
file after rewind's copy didn't change or filesystem precision
didn't caught the change, the copying of the file is confirmed
because the same file has different size before rewind and
same after rewind which these tests proved  
 
ok($corrupt_wal_seg_stat_before_rewind->size != $corrupt_wal_seg_source_stat->size,"WAL segment $corrupt_wal_seg has different size in source vs target before rewind");
ok 11 - WAL segment 000000010000000000000004 has different size in source vs target before rewind

ok($corrupt_wal_seg_stat_after_rewind->size == $corrupt_wal_seg_source_stat->size, "WAL segment $corrupt_wal_seg was copied: file sizes are same between target and source after rewind");
ok 12 - WAL segment 000000010000000000000004 was copied: file sizes are same between target and source after rewind

and more over the pg_rewind confirmed this with a new debug
message which I added in this diff patch

diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c
index 69d7728ce44..743484f4833 100644
--- a/src/bin/pg_rewind/filemap.c
+++ b/src/bin/pg_rewind/filemap.c
@@ -732,11 +732,11 @@ decide_wal_file_action(const char *fname, XLogSegNo last_common_segno,
  */
  if (file_segno < last_common_segno && source_size == target_size)
  {
  pg_log_debug("WAL segment \"%s\" not copied to target", fname);
  return FILE_ACTION_NONE;
  }
 
+ pg_log_debug("WAL segment \"%s\" is copied to target", fname);
  return FILE_ACTION_COPY;
 }
ok 6 - run pg_rewind stderr /(?^:WAL segment \"000000010000000000000004\" is copied to target)/
 
 
 
btw i forgot to let you know that i have increased the
usleep(1000) to usleep(1000000) for the above reason
which helped to show the difference in mtime after rewind
adding the below extra validation that the file has been copied from
source to target ,this is already done in the v9-0001.diff patch. 

ok($corrupt_wal_seg_stat_after_rewind->mtime > $corrupt_wal_seg_stat_before_rewind->mtime,"WAL segment $corrupt_wal_seg was copied: mtime after rewind > mtime before rewind");
 ok 12 - WAL segment 000000010000000000000004 was copied: mtime after rewind > mtime before rewind


On Fri, Oct 10, 2025 at 3:18 AM John H <johnhyvr@gmail.com> wrote:
Hey Srinath

On Thu, Oct 9, 2025 at 12:09 PM Srinath Reddy Sadipiralla
<srinath2133@gmail.com> wrote:
> ...
> 1) if the timeline switch happens in middle of segment ,copy data from the last WAL segment and create WAL file with same segno but different timelineID,in this case the starting WAL file for the new timeline will be 000000020000000000000003
> 2) if the timeline switch happens at segment boundary , just create next segment for this case the starting WAL file for the new timeline will be 000000020000000000000004
>
> so basically the files which exists in source and not in target like the new timeline WAL segments will be copied to target in total before we reach decide_wal_file_action , so i think we don't need to think about copying WAL files after divergence point by calculating and checking against last_common_segno which we are doing in our current approach , i think we can just do
>
> ...

That's a great point. I need to think about it some more but the
reasoning makes sense to me.
I think 'last_common_segno ' is only useful as another sanity check
but we already have the size
ones.

it would be awesome to know ,if you have any update on this.

--
Thanks,
Srinath Reddy Sadipiralla
EDB: https://www.enterprisedb.com/