Обсуждение: pg_rewind docs correction

Поиск
Список
Период
Сортировка

pg_rewind docs correction

От
James Coleman
Дата:
The pg_rewind docs assert that the state of the target's data directory
after rewind is equivalent to the source's data directory. But that
isn't true both because the base state is further back in time and
because the target's data directory will include the current state on
the source of any copied blocks.

So I've attached a patch to summarize more correctly as well as
document clearly the state of the cluster after the operation and also
the operation sequencing dangers caused by copying configuration
files from the source.

James Coleman

Вложения

Re: pg_rewind docs correction

От
Michael Paquier
Дата:
On Fri, Sep 13, 2019 at 01:47:03PM -0400, James Coleman wrote:
> So I've attached a patch to summarize more correctly as well as
> document clearly the state of the cluster after the operation and also
> the operation sequencing dangers caused by copying configuration
> files from the source.

+   After a successful rewind, the target data directory is equivalent
to the
+   to the state of the data directory at the point at which the
source and
+   target diverged plus the current state on the source of any blocks
changed
+   on the target after that divergence. While only changed blocks
from relation
+   files are copied; all other files are copied in full, including
configuration
+   files and WAL segments. The advantage of
<application>pg_rewind</application>
+   over taking a new base backup, or tools like
<application>rsync</application>,
+   is that <application>pg_rewind</application> does not require
comparing or
+   copying unchanged relation blocks in the cluster. As such the
rewind operation
+   is significantly faster than other approaches when the database is
large and
+   only a small fraction of blocks differ between the clusters.

The point of divergence could be defined as the LSN position where WAL
has forked on the new timeline, but the block diffs are copied from
actually the last checkpoint just before WAL has forked.  So this new
paragraph brings confusion about the actual divergence point.

Regarding the relation files, if the file does not exist on the target
but does exist on the source, it is also copied fully, so the second
sentence is wrong here to mention as relation files could also be
copied fully.
--
Michael

Вложения

Re: pg_rewind docs correction

От
James Coleman
Дата:
On Sat, Sep 14, 2019 at 12:20 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Fri, Sep 13, 2019 at 01:47:03PM -0400, James Coleman wrote:
> > So I've attached a patch to summarize more correctly as well as
> > document clearly the state of the cluster after the operation and also
> > the operation sequencing dangers caused by copying configuration
> > files from the source.
>
> +   After a successful rewind, the target data directory is equivalent
> to the
> +   to the state of the data directory at the point at which the
> source and
> +   target diverged plus the current state on the source of any blocks
> changed
> +   on the target after that divergence. While only changed blocks
> from relation
> +   files are copied; all other files are copied in full, including
> configuration
> +   files and WAL segments. The advantage of
> <application>pg_rewind</application>
> +   over taking a new base backup, or tools like
> <application>rsync</application>,
> +   is that <application>pg_rewind</application> does not require
> comparing or
> +   copying unchanged relation blocks in the cluster. As such the
> rewind operation
> +   is significantly faster than other approaches when the database is
> large and
> +   only a small fraction of blocks differ between the clusters.
>
> The point of divergence could be defined as the LSN position where WAL
> has forked on the new timeline, but the block diffs are copied from
> actually the last checkpoint just before WAL has forked.  So this new
> paragraph brings confusion about the actual divergence point.
>
> Regarding the relation files, if the file does not exist on the target
> but does exist on the source, it is also copied fully, so the second
> sentence is wrong here to mention as relation files could also be
> copied fully.

Updated (plus some additional wordsmithing).

James Coleman

Вложения

Re: pg_rewind docs correction

От
Michael Paquier
Дата:
On Sat, Sep 14, 2019 at 07:00:54PM -0400, James Coleman wrote:
> Updated (plus some additional wordsmithing).

+    The rewind operation is not expected to result in a consistent data
+    directory state either internally to the node or with respect to the rest
+    of the cluster. Instead the resulting data directory will only be consistent
+    after WAL replay has completed to at least the LSN at which changed blocks
+    copied from the source were originally written on the source.

That's not necessarily true.  pg_rewind enforces in the control file
of the target the minimum consistency LSN to be
pg_current_wal_insert_lsn() when using a live source or the last
checkpoint LSN for a stopped source, so while that sounds true from
the point of view of all the blocks copied, the control file may still
cause a complain that the target recovering has not reached its
consistent point even if all the blocks are already at a position
not-so-far from what has been registered in the control file.

+   the point at which the WAL timelines of the source and target diverged plus
+   the current state on the source of any blocks changed on the target after
+   that divergence. While only changed blocks from existing relation files are

And here we could mention that all the blocks copied from the source
are the ones which are found in the WAL records of the target until
the end of WAL of its timeline.  Still, that's basically what is
mentioned in the first part of "How It Works", which explains things
better.  I honestly don't really see that all this paragraph is an
improvement over the simplicity of the original when it comes to
understand the global idea of what pg_rewind does.

+  <para>
+    Because <application>pg_rewind</application> copies configuration files
+    entirely from the source, correcting recovery configuration options before
+    restarting the server is necessary if you intend to re-introduce the target
+    as a replica of the source. If you restart the server after the rewind
+    operation has finished but without configuring recovery, the target will
+    again diverge from the primary.
+   </para>

No objections regarding that part.  Now it seems to me that we had
better apply that to the last part of "How it works" instead?  I kind
of agree that the last paragraph could provide more details regarding
the risks of overwriting the wanted configuration.  The existing docs
also mention that pg_rewind only creates a backup_label file to start
recovery, perhaps we could mention up to which point recovery happens
in this section?  There is a bit more here than just "apply the WAL".
--
Michael

Вложения

Re: pg_rewind docs correction

От
James Coleman
Дата:
On Sun, Sep 15, 2019 at 10:25 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Sat, Sep 14, 2019 at 07:00:54PM -0400, James Coleman wrote:
> > Updated (plus some additional wordsmithing).
>
> +    The rewind operation is not expected to result in a consistent data
> +    directory state either internally to the node or with respect to the rest
> +    of the cluster. Instead the resulting data directory will only be consistent
> +    after WAL replay has completed to at least the LSN at which changed blocks
> +    copied from the source were originally written on the source.
>
> That's not necessarily true.  pg_rewind enforces in the control file
> of the target the minimum consistency LSN to be
> pg_current_wal_insert_lsn() when using a live source or the last
> checkpoint LSN for a stopped source, so while that sounds true from
> the point of view of all the blocks copied, the control file may still
> cause a complain that the target recovering has not reached its
> consistent point even if all the blocks are already at a position
> not-so-far from what has been registered in the control file.

I could just say "after WAL replay has completed to a consistent state"?

> +   the point at which the WAL timelines of the source and target diverged plus
> +   the current state on the source of any blocks changed on the target after
> +   that divergence. While only changed blocks from existing relation files are
>
> And here we could mention that all the blocks copied from the source
> are the ones which are found in the WAL records of the target until
> the end of WAL of its timeline.  Still, that's basically what is
> mentioned in the first part of "How It Works", which explains things
> better.  I honestly don't really see that all this paragraph is an
> improvement over the simplicity of the original when it comes to
> understand the global idea of what pg_rewind does.

The problem with the original is that while simple, it's actually
incorrect in that simplicity. Pg_rewind does *not* result in the data
directory on the target matching the data directory on the source.

> +  <para>
> +    Because <application>pg_rewind</application> copies configuration files
> +    entirely from the source, correcting recovery configuration options before
> +    restarting the server is necessary if you intend to re-introduce the target
> +    as a replica of the source. If you restart the server after the rewind
> +    operation has finished but without configuring recovery, the target will
> +    again diverge from the primary.
> +   </para>
>
> No objections regarding that part.  Now it seems to me that we had
> better apply that to the last part of "How it works" instead?  I kind
> of agree that the last paragraph could provide more details regarding
> the risks of overwriting the wanted configuration.  The existing docs
> also mention that pg_rewind only creates a backup_label file to start
> recovery, perhaps we could mention up to which point recovery happens
> in this section?  There is a bit more here than just "apply the WAL".

I'll look to see if there's a better place to put this.

James Coleman



Re: pg_rewind docs correction

От
Michael Paquier
Дата:
On Sun, Sep 15, 2019 at 10:36:04AM -0400, James Coleman wrote:
> On Sun, Sep 15, 2019 at 10:25 AM Michael Paquier <michael@paquier.xyz> wrote:
>> +    The rewind operation is not expected to result in a consistent data
>> +    directory state either internally to the node or with respect to the rest
>> +    of the cluster. Instead the resulting data directory will only be consistent
>> +    after WAL replay has completed to at least the LSN at which changed blocks
>> +    copied from the source were originally written on the source.
>>
>> That's not necessarily true.  pg_rewind enforces in the control file
>> of the target the minimum consistency LSN to be
>> pg_current_wal_insert_lsn() when using a live source or the last
>> checkpoint LSN for a stopped source, so while that sounds true from
>> the point of view of all the blocks copied, the control file may still
>> cause a complain that the target recovering has not reached its
>> consistent point even if all the blocks are already at a position
>> not-so-far from what has been registered in the control file.
>
> I could just say "after WAL replay has completed to a consistent state"?

I still would not change this paragraph.  The first sentence means
that we have an equivalency, because that's the case if you think
about it as we make sure that the target is able to sync with the
source, and the target gets into a state where it as an on-disk state
equivalent to the target up to the minimum consistency point defined
in the control file once the tool has done its work (this last point
is too precise to be included in a global description to be honest).
And the second sentence makes clear what are the actual diffs are.

>> +   the point at which the WAL timelines of the source and target diverged plus
>> +   the current state on the source of any blocks changed on the target after
>> +   that divergence. While only changed blocks from existing relation files are
>>
>> And here we could mention that all the blocks copied from the source
>> are the ones which are found in the WAL records of the target until
>> the end of WAL of its timeline.  Still, that's basically what is
>> mentioned in the first part of "How It Works", which explains things
>> better.  I honestly don't really see that all this paragraph is an
>> improvement over the simplicity of the original when it comes to
>> understand the global idea of what pg_rewind does.
>
> The problem with the original is that while simple, it's actually
> incorrect in that simplicity. Pg_rewind does *not* result in the data
> directory on the target matching the data directory on the source.

That's not what I get from the original docs, but I may be too much
used to it.

>> +  <para>
>> +    Because <application>pg_rewind</application> copies configuration files
>> +    entirely from the source, correcting recovery configuration options before
>> +    restarting the server is necessary if you intend to re-introduce the target
>> +    as a replica of the source. If you restart the server after the rewind
>> +    operation has finished but without configuring recovery, the target will
>> +    again diverge from the primary.
>> +   </para>
>>
>> No objections regarding that part.  Now it seems to me that we had
>> better apply that to the last part of "How it works" instead?  I kind
>> of agree that the last paragraph could provide more details regarding
>> the risks of overwriting the wanted configuration.  The existing docs
>> also mention that pg_rewind only creates a backup_label file to start
>> recovery, perhaps we could mention up to which point recovery happens
>> in this section?  There is a bit more here than just "apply the WAL".
>
> I'll look to see if there's a better place to put this.

Thanks.  From what I can see, we could improve further the doc part
about how the tool works in details, especially regarding the
configuration files which may get overwritten and be more precise
about that.
--
Michael

Вложения

Re: pg_rewind docs correction

От
James Coleman
Дата:
On Tue, Sep 17, 2019 at 3:51 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Sun, Sep 15, 2019 at 10:36:04AM -0400, James Coleman wrote:
> > On Sun, Sep 15, 2019 at 10:25 AM Michael Paquier <michael@paquier.xyz> wrote:
> >> +    The rewind operation is not expected to result in a consistent data
> >> +    directory state either internally to the node or with respect to the rest
> >> +    of the cluster. Instead the resulting data directory will only be consistent
> >> +    after WAL replay has completed to at least the LSN at which changed blocks
> >> +    copied from the source were originally written on the source.
> >>
> >> That's not necessarily true.  pg_rewind enforces in the control file
> >> of the target the minimum consistency LSN to be
> >> pg_current_wal_insert_lsn() when using a live source or the last
> >> checkpoint LSN for a stopped source, so while that sounds true from
> >> the point of view of all the blocks copied, the control file may still
> >> cause a complain that the target recovering has not reached its
> >> consistent point even if all the blocks are already at a position
> >> not-so-far from what has been registered in the control file.
> >
> > I could just say "after WAL replay has completed to a consistent state"?
>
> I still would not change this paragraph.  The first sentence means
> that we have an equivalency, because that's the case if you think
> about it as we make sure that the target is able to sync with the
> source, and the target gets into a state where it as an on-disk state
> equivalent to the target up to the minimum consistency point defined
> in the control file once the tool has done its work (this last point
> is too precise to be included in a global description to be honest).
> And the second sentence makes clear what are the actual diffs are.
> >> +   the point at which the WAL timelines of the source and target diverged plus
> >> +   the current state on the source of any blocks changed on the target after
> >> +   that divergence. While only changed blocks from existing relation files are
> >>
> >> And here we could mention that all the blocks copied from the source
> >> are the ones which are found in the WAL records of the target until
> >> the end of WAL of its timeline.  Still, that's basically what is
> >> mentioned in the first part of "How It Works", which explains things
> >> better.  I honestly don't really see that all this paragraph is an
> >> improvement over the simplicity of the original when it comes to
> >> understand the global idea of what pg_rewind does.
> >
> > The problem with the original is that while simple, it's actually
> > incorrect in that simplicity. Pg_rewind does *not* result in the data
> > directory on the target matching the data directory on the source.
>
> That's not what I get from the original docs, but I may be too much
> used to it.

I don't agree that that's a valid equivalency. I myself spent a lot of
time trying to understand how this could possibly be true a while
back, and even looked at source code to be certain. I've asked other
people and found the same confusion.

As I read it the 2nd second sentence doesn't actually tell you the
differences; it makes a quick attempt at summarizing *how* the first
sentence is true, but if the first sentence isn't accurate, then it's
hard to read the 2nd one as helping.

If you'd prefer something less detailed at this point at that point in
the docs, then something along the lines of "results in a data
directory state which can then be safely replayed from the source" or
some such.

The docs shouldn't be correct just for someone how already understands
the intricacies. And the end  user shouldn't have to read the "how it
works" (which incidentally is kinda hidden at the bottom underneath
the CLI args -- perhaps we could move that?) to extrapolate things in
the primary documentation.

James Coleman



Re: pg_rewind docs correction

От
Michael Paquier
Дата:
On Tue, Sep 17, 2019 at 08:38:18AM -0400, James Coleman wrote:
> I don't agree that that's a valid equivalency. I myself spent a lot of
> time trying to understand how this could possibly be true a while
> back, and even looked at source code to be certain. I've asked other
> people and found the same confusion.
>
> As I read it the 2nd second sentence doesn't actually tell you the
> differences; it makes a quick attempt at summarizing *how* the first
> sentence is true, but if the first sentence isn't accurate, then it's
> hard to read the 2nd one as helping.

Well, then it comes back to the part where I am used to the existing
docs :)

> If you'd prefer something less detailed at this point at that point in
> the docs, then something along the lines of "results in a data
> directory state which can then be safely replayed from the source" or
> some such.

Actually this is a good suggestion, and could replace the first
sentence of this paragraph.

> The docs shouldn't be correct just for someone how already understands
> the intricacies. And the end user shouldn't have to read the "how it
> works" (which incidentally is kinda hidden at the bottom underneath
> the CLI args -- perhaps we could move that?) to extrapolate things in
> the primary documentation.

Perhaps.  This doc page is not that long either.
--
Michael

Вложения

Re: pg_rewind docs correction

От
James Coleman
Дата:


On Tue, Sep 17, 2019 at 9:41 PM Michael Paquier <michael@paquier.xyz> wrote:
On Tue, Sep 17, 2019 at 08:38:18AM -0400, James Coleman wrote:
> I don't agree that that's a valid equivalency. I myself spent a lot of
> time trying to understand how this could possibly be true a while
> back, and even looked at source code to be certain. I've asked other
> people and found the same confusion.
>
> As I read it the 2nd second sentence doesn't actually tell you the
> differences; it makes a quick attempt at summarizing *how* the first
> sentence is true, but if the first sentence isn't accurate, then it's
> hard to read the 2nd one as helping.

Well, then it comes back to the part where I am used to the existing
docs :)

> If you'd prefer something less detailed at this point at that point in
> the docs, then something along the lines of "results in a data
> directory state which can then be safely replayed from the source" or
> some such.

Actually this is a good suggestion, and could replace the first
sentence of this paragraph.

> The docs shouldn't be correct just for someone how already understands
> the intricacies. And the end user shouldn't have to read the "how it
> works" (which incidentally is kinda hidden at the bottom underneath
> the CLI args -- perhaps we could move that?) to extrapolate things in
> the primary documentation.

Perhaps.  This doc page is not that long either.

I'd set this aside for quite a while, but I was looking at it again this afternoon, and I've come to see your concern about the opening paragraphs remaining relatively simple. To that end I believe I've come up with a patch that's a good compromise: retaining that simplicity and being more clear and accurate at the same time.

In the first paragraph I've updated it to refer to both "successful rewind and subsequent WAL replay" and the result I describe as being equivalent to the result of a base backup, since that's more technically correct anyway (the current text could be read as implying a full out copy of the data directory, but that's not really true just as it isn't with pg_basebackup).

I've added the information about how the backup label control file is written, and updated the How It Works steps to refer to that separately from restart.

Additionally the How It Works is updated to include WAL segments and new relation files in the list of files copied wholesale, since that was previously stated but somewhat contradicted there.

I realized I didn't previously add this to the CF; since it's not a new patch I've added it to the current CF, but if this is incorrect please let me know.

Thanks,
James

Вложения

Re: pg_rewind docs correction

От
James Coleman
Дата:
On Sun, Mar 8, 2020 at 5:13 PM James Coleman <jtc331@gmail.com> wrote:

I realized I didn't previously add this to the CF; since it's not a new patch I've added it to the current CF, but if this is incorrect please let me know.

 Hmm, looks like I can't add it to the current one. I added it to the next one. I think it could probably go now, since the patch is really 6 months old, but either way is fine -- it's just a docs patch.

James

Re: pg_rewind docs correction

От
Michael Paquier
Дата:
On Sun, Mar 08, 2020 at 05:13:21PM -0400, James Coleman wrote:
> I've added the information about how the backup label control file is
> written, and updated the How It Works steps to refer to that separately
> from restart.
>
> Additionally the How It Works is updated to include WAL segments and new
> relation files in the list of files copied wholesale, since that was
> previously stated but somewhat contradicted there.

-   The result is equivalent to replacing the target data directory with the
-   source one. Only changed blocks from relation files are copied;
-   all other files are copied in full, including configuration files. The
-   advantage of <application>pg_rewind</application> over taking a new base backup, or
-   tools like <application>rsync</application>, is that <application>pg_rewind</application> does
-   not require reading through unchanged blocks in the cluster. This makes
-   it a lot faster when the database is large and only a small
-   fraction of blocks differ between the clusters.
+   After a successful rewind and subsequent WAL replay, the target data
+   directory is equivalent to a base backup of the source data directory. While
+   only changed blocks from existing relation files are copied; all other files
+   are copied in full, including new relation files, configuration files, and WAL
+   segments. The advantage of <application>pg_rewind</application> over taking a

The first sentence you are adding refers to "subsequent WAL replay".
However, this paragraph emphasizes with the state of the target
cluster after running pg_rewind but *before* make the target cluster
start recovery.  So shouldn't you just remove the part "and subsequent
WAL replay" from your first new sentence?

In the same paragraph, I think that you should remove the "While" from
"While only changed blocks", as the second part of the sentence refers
to the other files, WAl segments, etc.

The second paragraph of the docs regarding timeline lookup is
unchanged, which is fine.

-   When the target server is started for the first time after running
-   <application>pg_rewind</application>, it will go into recovery mode and replay all
-   WAL generated in the source server after the point of divergence.
+   After running <application>pg_rewind</application> the data directory is
+   not immediately in a consistent state. However
+   <application>pg_rewind</application> configures the control file so that when
+   the target server is started again it will enter recovery mode and replay all
+   WAL generated in the source server after the point of divergence.

The second part of the third paragraph is not changed, and the
modification you are doing here is about the control file.  I am
still unconvinced that this is a good change, because mentioning the
control file would be actually more adapted to the part "How it
works", where you are adding details about the backup_label file, and
already include details about the minimum consistency LSN itself
stored in the control file.

+   <para>
+    Because <application>pg_rewind</application> copies configuration files
+    entirely from the source, correcting recovery configuration options before
+    restarting the server is necessary if you intend to re-introduce the target
+    as a replica of the source. If you restart the server after the rewind
+    operation has finished but without configuring recovery, the target will
+    again diverge from the primary.
+   </para>

True that this is not outlined enough.

+      The relation files are now to their state at the last checkpoint completed
+      prior to the point at which the WAL timelines of the source and target
+      diverged plus the current state on the source of any blocks changed on the
+      target after that divergence.

"Relation files are now in a state equivalent to the moment of the
last completed checkpoint prior to the point.."?

-      <filename>pg_stat_tmp/</filename>, and
-      <filename>pg_subtrans/</filename> are omitted from the data copied
-      from the source cluster. Any file or directory beginning with
-      <filename>pgsql_tmp</filename> is omitted, as well as are
+      <filename>pg_stat_tmp/</filename>, and <filename>pg_subtrans/</filename>
+      are omitted from the data copied from the source cluster. The files

This is just reorganizing an existing list, why?

+      Create a backup label file to begin WAL replay at the checkpoint created
+      at failover and  a minimum consistency LSN using
+      <literal>pg_current_wal_insert_lsn()</literal>, when using a live source
+      and the last checkpoint LSN, when using a stopped source.

Now would be the moment to mention the control file.

> I realized I didn't previously add this to the CF; since it's not a new
> patch I've added it to the current CF, but if this is incorrect please let
> me know.

The last CF of Postgres 13 began at the beginning of February :(
--
Michael

Вложения

Re: pg_rewind docs correction

От
James Coleman
Дата:
On Mon, Mar 9, 2020 at 2:59 AM Michael Paquier <michael@paquier.xyz> wrote:
On Sun, Mar 08, 2020 at 05:13:21PM -0400, James Coleman wrote:
> I've added the information about how the backup label control file is
> written, and updated the How It Works steps to refer to that separately
> from restart.
>
> Additionally the How It Works is updated to include WAL segments and new
> relation files in the list of files copied wholesale, since that was
> previously stated but somewhat contradicted there.

-   The result is equivalent to replacing the target data directory with the
-   source one. Only changed blocks from relation files are copied;
-   all other files are copied in full, including configuration files. The
-   advantage of <application>pg_rewind</application> over taking a new base backup, or
-   tools like <application>rsync</application>, is that <application>pg_rewind</application> does
-   not require reading through unchanged blocks in the cluster. This makes
-   it a lot faster when the database is large and only a small
-   fraction of blocks differ between the clusters.
+   After a successful rewind and subsequent WAL replay, the target data
+   directory is equivalent to a base backup of the source data directory. While
+   only changed blocks from existing relation files are copied; all other files
+   are copied in full, including new relation files, configuration files, and WAL
+   segments. The advantage of <application>pg_rewind</application> over taking a

The first sentence you are adding refers to "subsequent WAL replay".
However, this paragraph emphasizes with the state of the target
cluster after running pg_rewind but *before* make the target cluster
start recovery.  So shouldn't you just remove the part "and subsequent
WAL replay" from your first new sentence?

I'd originally typed this:
I'm not sure I follow. After pg_rewind but before replay the directory is *not* equivalent to a base backup. I don't see how paragraph is clearly limited to describing what pg_rewind does. While the 2nd sentence is about pg_rewind steps specifically, the paragraph (even in the original) goes on to compare it to a base backup so we're talking about the operation in totality not just the one tool.
 
But I realized while typing it that I was probably missing something of what you were getting at: is the hangup on calling out the WAL replay that a base backup (or rsync even) *also* requires WAL reply to reach a consistent state? I hadn't thought of that while writing this initially, so I've updated the patch to eliminate that part but also to make the analogy to base backups more direct, since it's helpful in understanding what result the tool is trying to accomplish and how it differs.

In the same paragraph, I think that you should remove the "While" from
"While only changed blocks", as the second part of the sentence refers
to the other files, WAl segments, etc.

Fixed as part of the above.
 
The second paragraph of the docs regarding timeline lookup is
unchanged, which is fine.

-   When the target server is started for the first time after running
-   <application>pg_rewind</application>, it will go into recovery mode and replay all
-   WAL generated in the source server after the point of divergence.
+   After running <application>pg_rewind</application> the data directory is
+   not immediately in a consistent state. However
+   <application>pg_rewind</application> configures the control file so that when
+   the target server is started again it will enter recovery mode and replay all
+   WAL generated in the source server after the point of divergence.

The second part of the third paragraph is not changed, and the
modification you are doing here is about the control file.  I am
still unconvinced that this is a good change, because mentioning the
control file would be actually more adapted to the part "How it
works", where you are adding details about the backup_label file, and
already include details about the minimum consistency LSN itself
stored in the control file.

I've removed the control file reference and instead continued the analogy to base backups.
 
+   <para>
+    Because <application>pg_rewind</application> copies configuration files
+    entirely from the source, correcting recovery configuration options before
+    restarting the server is necessary if you intend to re-introduce the target
+    as a replica of the source. If you restart the server after the rewind
+    operation has finished but without configuring recovery, the target will
+    again diverge from the primary.
+   </para>

True that this is not outlined enough.

Thanks.
 
+      The relation files are now to their state at the last checkpoint completed
+      prior to the point at which the WAL timelines of the source and target
+      diverged plus the current state on the source of any blocks changed on the
+      target after that divergence.

"Relation files are now in a state equivalent to the moment of the
last completed checkpoint prior to the point.."?

Updated.
 
-      <filename>pg_stat_tmp/</filename>, and
-      <filename>pg_subtrans/</filename> are omitted from the data copied
-      from the source cluster. Any file or directory beginning with
-      <filename>pgsql_tmp</filename> is omitted, as well as are
+      <filename>pg_stat_tmp/</filename>, and <filename>pg_subtrans/</filename>
+      are omitted from the data copied from the source cluster. The files

This is just reorganizing an existing list, why?

The grammar seemed a bit awkward to me, so while I was already reworking this paragraph I tried to clean that up a bit.
 
+      Create a backup label file to begin WAL replay at the checkpoint created
+      at failover and  a minimum consistency LSN using
+      <literal>pg_current_wal_insert_lsn()</literal>, when using a live source
+      and the last checkpoint LSN, when using a stopped source.

Now would be the moment to mention the control file.

I made that more explicit here, and also referenced the filenames directly (and with tags).
 
> I realized I didn't previously add this to the CF; since it's not a new
> patch I've added it to the current CF, but if this is incorrect please let
> me know.

The last CF of Postgres 13 began at the beginning of February :(

Still ongoing, correct? I guess I mentally think of them as being only one month, but I guess that's not actually true. Regardless I'm not sure what policy is for patches that have been in flight in hackers for a while but just missed being added to the CF app.

Thanks,
James
Вложения

Re: pg_rewind docs correction

От
Michael Paquier
Дата:
On Mon, Mar 09, 2020 at 09:26:17AM -0400, James Coleman wrote:
>> -      <filename>pg_stat_tmp/</filename>, and
>> -      <filename>pg_subtrans/</filename> are omitted from the data copied
>> -      from the source cluster. Any file or directory beginning with
>> -      <filename>pgsql_tmp</filename> is omitted, as well as are
>> +      <filename>pg_stat_tmp/</filename>, and
>> <filename>pg_subtrans/</filename>
>> +      are omitted from the data copied from the source cluster. The files
>>
>> This is just reorganizing an existing list, why?
>>
>
> The grammar seemed a bit awkward to me, so while I was already reworking
> this paragraph I tried to clean that up a bit.

Thanks for the new patch, and sorry for the delay.

Okay, I saw what you were coming at here, with one sentence for
directories, and one for files.

> Still ongoing, correct? I guess I mentally think of them as being only one
> month, but I guess that's not actually true. Regardless I'm not sure what
> policy is for patches that have been in flight in hackers for a while but
> just missed being added to the CF app.

This is a documentation patch, so improving this part of the docs now
is fine by me, particularly as this is an improvement.  Here are more
notes from me:
- I have removed the "As with a base backup" at the beginning of the
second paragraph you modified.  The first paragraph modified already
references a base backup, so one reference is enough IMO.
- WAL replay does not happen from the WAL position where WAL diverged,
but from the last checkpoint before WAL diverged.
- Did some tweaks about the new part for configuration files, as it
may actually not be necessary to update the configuration for recovery
to complete (depending on the settings of the source, the target may
just require the creation of a standby.signal file in its data
directory particularly with a common archive location for multiple
clusters).
- Some word-smithing in the step-by-step description.

Is the updated version fine for you?
--
Michael

Вложения

Re: pg_rewind docs correction

От
James Coleman
Дата:
On Tue, Apr 28, 2020 at 12:31 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Mon, Mar 09, 2020 at 09:26:17AM -0400, James Coleman wrote:
> >> -      <filename>pg_stat_tmp/</filename>, and
> >> -      <filename>pg_subtrans/</filename> are omitted from the data copied
> >> -      from the source cluster. Any file or directory beginning with
> >> -      <filename>pgsql_tmp</filename> is omitted, as well as are
> >> +      <filename>pg_stat_tmp/</filename>, and
> >> <filename>pg_subtrans/</filename>
> >> +      are omitted from the data copied from the source cluster. The files
> >>
> >> This is just reorganizing an existing list, why?
> >>
> >
> > The grammar seemed a bit awkward to me, so while I was already reworking
> > this paragraph I tried to clean that up a bit.
>
> Thanks for the new patch, and sorry for the delay.
>
> Okay, I saw what you were coming at here, with one sentence for
> directories, and one for files.
>
> > Still ongoing, correct? I guess I mentally think of them as being only one
> > month, but I guess that's not actually true. Regardless I'm not sure what
> > policy is for patches that have been in flight in hackers for a while but
> > just missed being added to the CF app.
>
> This is a documentation patch, so improving this part of the docs now
> is fine by me, particularly as this is an improvement.  Here are more
> notes from me:
> - I have removed the "As with a base backup" at the beginning of the
> second paragraph you modified.  The first paragraph modified already
> references a base backup, so one reference is enough IMO.
> - WAL replay does not happen from the WAL position where WAL diverged,
> but from the last checkpoint before WAL diverged.
> - Did some tweaks about the new part for configuration files, as it
> may actually not be necessary to update the configuration for recovery
> to complete (depending on the settings of the source, the target may
> just require the creation of a standby.signal file in its data
> directory particularly with a common archive location for multiple
> clusters).
> - Some word-smithing in the step-by-step description.
>
> Is the updated version fine for you?

In your revised patched the follow paragraph:

+   <para>
+    As <application>pg_rewind</application> copies configuration files
+    entirely from the source, it may be required to correct the configuration
+    used for recovery before restarting the target server, especially the
+    the target is reintroduced as a standby of the source. If you restart
+    the server after the rewind operation has finished but without configuring
+    recovery, the target may again diverge from the primary.
+   </para>

I think is missing a word. Instead of "especially the the target"
should be "especially if the target".

In this block:

+      Create a <filename>backup_label</filename> file to begin WAL replay at
+      the checkpoint created at failover and configure the
+      <filename>pg_control</filename> file with a minimum consistency LSN
+      defined as the result of <literal>pg_current_wal_insert_lsn()</literal>
+      when rewinding from a live source and using the last checkpoint LSN
+      when rewinding from a stopped source.
+     </para>

Perhaps change "and using the last checkpoint LSN" to "or the last
checkpoint LSN". Alternatively you could make the grammar parallel by
changing to "and defined as the last checkpoint LSN", but that seems
wordy, and the "defined as [item or item]" is already a good grammar
construction.

Other than those two small things, your proposed revision looks good to me.

Thanks,
James



Re: pg_rewind docs correction

От
Michael Paquier
Дата:
On Tue, Apr 28, 2020 at 12:13:38PM -0400, James Coleman wrote:
> I think is missing a word. Instead of "especially the the target"
> should be "especially if the target".

Thanks, fixed.

> In this block:
>
> +      Create a <filename>backup_label</filename> file to begin WAL replay at
> +      the checkpoint created at failover and configure the
> +      <filename>pg_control</filename> file with a minimum consistency LSN
> +      defined as the result of <literal>pg_current_wal_insert_lsn()</literal>
> +      when rewinding from a live source and using the last checkpoint LSN
> +      when rewinding from a stopped source.
> +     </para>
>
> Perhaps change "and using the last checkpoint LSN" to "or the last
> checkpoint LSN". Alternatively you could make the grammar parallel by
> changing to "and defined as the last checkpoint LSN", but that seems
> wordy, and the "defined as [item or item]" is already a good grammar
> construction.

Using your first suggestion of "or the last checkpoint LSN" sounds
more natural as of this morning, so updated the patch with that.

I am letting that aside for a couple of days to see if others have
more comments, and will likely commit it after an extra lookup.
--
Michael

Вложения

Re: pg_rewind docs correction

От
Michael Paquier
Дата:
On Wed, Apr 29, 2020 at 09:15:06AM +0900, Michael Paquier wrote:
> I am letting that aside for a couple of days to see if others have
> more comments, and will likely commit it after an extra lookup.

And applied after an extra lookup.  Thanks for the discussion, James.
--
Michael

Вложения

Re: pg_rewind docs correction

От
James Coleman
Дата:
On Fri, May 1, 2020 at 4:46 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Wed, Apr 29, 2020 at 09:15:06AM +0900, Michael Paquier wrote:
> > I am letting that aside for a couple of days to see if others have
> > more comments, and will likely commit it after an extra lookup.
>
> And applied after an extra lookup.  Thanks for the discussion, James.

Yep. Thanks for pushing to make sure it was as correct as possible
while improving it.

James