Обсуждение: Two pg_rewind patches (auto generate recovery conf and ensure clean shutdown)

Поиск
Список
Период
Сортировка

Two pg_rewind patches (auto generate recovery conf and ensure clean shutdown)

От
Paul Guo
Дата:
Hello, Postgres hackers,

Please see the attached patches.

The first patch adds an option to automatically generate recovery conf contents in related files, following pg_basebackup. In the patch, GenerateRecoveryConf(), WriteRecoveryConf() and escape_quotes() are almost same as them on pg_basebackup. The main difference is due to replication slot support and code (variables) limit. It seems that we could slightly refactor later to put some common code into another file after aligning pg_rewind with pg_basebackup. This was tested manually and was done by Jimmy (cc-ed), Ashiwin (cc-ed) and me.

Another patch does automatic clean shutdown by running a single mode postgres instance if the target was not clean shut down since that is required by pg_rewind. This was manually tested and was done by Jimmy (cc-ed) and me. I'm not sure if we want a test case for that though.

Thanks.
Вложения

Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Michael Paquier
Дата:
On Tue, Mar 19, 2019 at 02:09:03PM +0800, Paul Guo wrote:
> The first patch adds an option to automatically generate recovery conf
> contents in related files, following pg_basebackup. In the patch,
> GenerateRecoveryConf(), WriteRecoveryConf() and escape_quotes() are almost
> same as them on pg_basebackup. The main difference is due to replication
> slot support and code (variables) limit. It seems that we could slightly
> refactor later to put some common code into another file after aligning
> pg_rewind with pg_basebackup. This was tested manually and was done by
> Jimmy (cc-ed), Ashiwin (cc-ed) and me.


Interesting.  The two routines have really the same logic, I would
recommend to have a first patch which does the refactoring and have
pg_rewind use it, and then a second patch which writes recovery.conf
and uses the first patch to get the contents.  Please note that the
common routine needs to be version-aware as pg_basebackup requires
compatibility with past versions, but you could just pass the version
number from the connection, and have pg_rewind pass the compiled-in
version value.

> Another patch does automatic clean shutdown by running a single mode
> postgres instance if the target was not clean shut down since that is
> required by pg_rewind. This was manually tested and was done by Jimmy
> (cc-ed) and me. I'm not sure if we want a test case for that though.

I am not sure that I see the value in that.  I'd rather let the
required service start and stop out of pg_rewind and not introduce
dependencies with other binaries.  This step can also take quite some
time depending on the amount of WAL to replay post-crash at recovery
and the shutdown checkpoint which is required to reach a consistent
on-disk state.
--
Michael

Вложения

Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Paul Guo
Дата:


On Tue, Mar 19, 2019 at 2:18 PM Michael Paquier <michael@paquier.xyz> wrote:
On Tue, Mar 19, 2019 at 02:09:03PM +0800, Paul Guo wrote:
> The first patch adds an option to automatically generate recovery conf
> contents in related files, following pg_basebackup. In the patch,
> GenerateRecoveryConf(), WriteRecoveryConf() and escape_quotes() are almost
> same as them on pg_basebackup. The main difference is due to replication
> slot support and code (variables) limit. It seems that we could slightly
> refactor later to put some common code into another file after aligning
> pg_rewind with pg_basebackup. This was tested manually and was done by
> Jimmy (cc-ed), Ashiwin (cc-ed) and me.


Interesting.  The two routines have really the same logic, I would
recommend to have a first patch which does the refactoring and have
pg_rewind use it, and then a second patch which writes recovery.conf
and uses the first patch to get the contents.  Please note that the

This is a good suggestion also. Will do it.
 
common routine needs to be version-aware as pg_basebackup requires
compatibility with past versions, but you could just pass the version
number from the connection, and have pg_rewind pass the compiled-in
version value.

> Another patch does automatic clean shutdown by running a single mode
> postgres instance if the target was not clean shut down since that is
> required by pg_rewind. This was manually tested and was done by Jimmy
> (cc-ed) and me. I'm not sure if we want a test case for that though.

I am not sure that I see the value in that.  I'd rather let the
required service start and stop out of pg_rewind and not introduce
dependencies with other binaries.  This step can also take quite some

This makes recovery more automatically. Yes, it will add the dependency on the postgres
binary, but it seems that most time pg_rewind should be shipped as postgres
in the same install directory. From my experience of manually testing pg_rewind,
I feel that this besides auto-recovery-conf writing really alleviate my burden. I'm not sure how
other users usually do before running pg_rewind when the target is not cleanly shut down,
but probably we can add an argument to pg_rewind to give those people who want to
handle target separately another chance? default on or off whatever.
 
time depending on the amount of WAL to replay post-crash at recovery
and the shutdown checkpoint which is required to reach a consistent
on-disk state.

The time is still required for people who want to make the target ready for pg_rewind in another way.

Thanks.

Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Michael Paquier
Дата:
On Wed, Mar 20, 2019 at 12:48:52PM +0800, Paul Guo wrote:
> This is a good suggestion also. Will do it.

Please note also that we don't care about recovery.conf since v12 as
recovery parameters are now GUCs.  I would suggest appending those
extra parameters to postgresql.auto.conf, which is what pg_basebackup
does.
--
Michael

Вложения

Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Paul Guo
Дата:


On Wed, Mar 20, 2019 at 1:20 PM Michael Paquier <michael@paquier.xyz> wrote:
On Wed, Mar 20, 2019 at 12:48:52PM +0800, Paul Guo wrote:
> This is a good suggestion also. Will do it.

Please note also that we don't care about recovery.conf since v12 as
recovery parameters are now GUCs.  I would suggest appending those
extra parameters to postgresql.auto.conf, which is what pg_basebackup
does.
Yes, the recovery conf patch in the first email did like this, i.e. writing postgresql.auto.conf & standby.signal
 

Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Paul Guo
Дата:
Hi Michael,

I updated the patches as attached following previous discussions.

The two patches:
v2-0001-Extact-common-functions-from-pg_basebackup-into-s.patch
v2-0002-Add-option-to-write-recovery-configuration-inform.patch

1. 0001 does move those common functions & variables to two new files (one .c and one .h) for both pg_rewind and pg_basebackup use,
note the functions are slightly modified (e.g. because conn is probably NULL on pg_rewind). I do not know where is more proper to put the
new files. Currently, they are under pg_basebackup and are used in pg_rewind (Makefile modified to support that).

2. 0002 adds the option to write recovery conf.

The below patch runs single mode Postgres if needed to make sure the target is cleanly shutdown. A new option is added (off by default).
v2-0001-Ensure-target-clean-shutdown-at-beginning-of-pg_r.patch

I've manually tested them and installcheck passes.

Thanks.

On Wed, Mar 20, 2019 at 1:23 PM Paul Guo <pguo@pivotal.io> wrote:


On Wed, Mar 20, 2019 at 1:20 PM Michael Paquier <michael@paquier.xyz> wrote:
On Wed, Mar 20, 2019 at 12:48:52PM +0800, Paul Guo wrote:
> This is a good suggestion also. Will do it.

Please note also that we don't care about recovery.conf since v12 as
recovery parameters are now GUCs.  I would suggest appending those
extra parameters to postgresql.auto.conf, which is what pg_basebackup
does.
Yes, the recovery conf patch in the first email did like this, i.e. writing postgresql.auto.conf & standby.signal
 
Вложения

Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Thomas Munro
Дата:
On Fri, Apr 19, 2019 at 3:41 PM Paul Guo <pguo@pivotal.io> wrote:
> I updated the patches as attached following previous discussions.

Hi Paul,

Could we please have a fresh rebase now that the CF is here?

Thanks,

-- 
Thomas Munro
https://enterprisedb.com



Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Alvaro Herrera
Дата:
On 2019-Apr-19, Paul Guo wrote:

> The below patch runs single mode Postgres if needed to make sure the target
> is cleanly shutdown. A new option is added (off by default).
> v2-0001-Ensure-target-clean-shutdown-at-beginning-of-pg_r.patch

Why do we need an option for this?  Is there a reason not to do this
unconditionally?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Paul Guo
Дата:
Rebased, aligned with recent changes in pg_rewind/pg_basebackup and then retested. Thanks.

On Mon, Jul 1, 2019 at 7:35 PM Thomas Munro <thomas.munro@gmail.com> wrote:
On Fri, Apr 19, 2019 at 3:41 PM Paul Guo <pguo@pivotal.io> wrote:
> I updated the patches as attached following previous discussions.

Hi Paul,

Could we please have a fresh rebase now that the CF is here?

Thanks,

--
Thomas Munro
https://urldefense.proofpoint.com/v2/url?u=https-3A__enterprisedb.com&d=DwIBaQ&c=lnl9vOaLMzsy2niBC8-h_K-7QJuNJEsFrzdndhuJ3Sw&r=Usi0ex6Ch92MsB5QQDgYFw&m=mxictY8xxFIFvsyxFYx4bXwF4PfnGWWRuYLLX4R1yhs&s=qvC2BI2OhKkBz71FO1w2XNy6dvfhIeGHT3X0yU3XDlU&e=
Вложения

Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Paul Guo
Дата:


On Tue, Jul 2, 2019 at 12:35 AM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
On 2019-Apr-19, Paul Guo wrote:

> The below patch runs single mode Postgres if needed to make sure the target
> is cleanly shutdown. A new option is added (off by default).
> v2-0001-Ensure-target-clean-shutdown-at-beginning-of-pg_r.patch

Why do we need an option for this?  Is there a reason not to do this
unconditionally?

There is concern about this (see previous emails in this thread). On greenplum (MPP DB based on Postgres),
we unconditionally do this. I'm not sure about usually how Postgres users do this when there is an unclean shutdown,
but providing an option seem to be safer to avoid breaking existing script/service whatever. If many people
think this option is unnecessary, I'm fine to remove the option and keep the code logic.
 
 

Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Thomas Munro
Дата:
On Tue, Jul 2, 2019 at 5:46 PM Paul Guo <pguo@pivotal.io> wrote:
> Rebased, aligned with recent changes in pg_rewind/pg_basebackup and then retested. Thanks.

Hi Paul,

A minor build problem on Windows:

src/bin/pg_rewind/pg_rewind.c(32): fatal error C1083: Cannot open
include file: 'backup_common.h': No such file or directory
[C:\projects\postgresql\pg_rewind.vcxproj]

https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.46422
http://cfbot.cputube.org/paul-guo.html

-- 
Thomas Munro
https://enterprisedb.com



Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Paul Guo
Дата:
Yes, the patches changed Makefile so that pg_rewind and pg_basebackup could use some common code, but for Windows build, I'm not sure where are those window build files. Does anyone know about that? Thanks.

Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Michael Paquier
Дата:
On Tue, Jul 09, 2019 at 10:48:49PM +0800, Paul Guo wrote:
> Yes, the patches changed Makefile so that pg_rewind and pg_basebackup could
> use some common code, but for Windows build, I'm not sure where are those
> window build files. Does anyone know about that? Thanks.

The VS scripts are located in src/tools/msvc/.  You will likely need
to tweak things like $frontend_extraincludes or variables in the same
area for this patch (please see Mkvcbuild.pm).
--
Michael

Вложения

Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Paul Guo
Дата:


On Wed, Jul 10, 2019 at 3:28 PM Michael Paquier <michael@paquier.xyz> wrote:
On Tue, Jul 09, 2019 at 10:48:49PM +0800, Paul Guo wrote:
> Yes, the patches changed Makefile so that pg_rewind and pg_basebackup could
> use some common code, but for Windows build, I'm not sure where are those
> window build files. Does anyone know about that? Thanks.

The VS scripts are located in src/tools/msvc/.  You will likely need
to tweak things like $frontend_extraincludes or variables in the same
area for this patch (please see Mkvcbuild.pm).

Thanks. Both Mkvcbuild.pm and pg_rewind/Makefile are modified to make Windows build pass in a
local environment (Hopefully this passes the CI testing), also now pg_rewind/Makefile does not
create soft link for backup_common.h anymore. Instead -I is used to specify the header directory.

I also noticed that doc change is needed so modified documents for the two new options accordingly.
Please see the attached new patches.
Вложения

Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Alvaro Herrera
Дата:
On 2019-Jul-15, Paul Guo wrote:

> Thanks. Both Mkvcbuild.pm and pg_rewind/Makefile are modified to make
> Windows build pass in a
> local environment (Hopefully this passes the CI testing), also now
> pg_rewind/Makefile does not
> create soft link for backup_common.h anymore. Instead -I is used to specify
> the header directory.

It seems there's minor breakage in the build,  per CFbot.  Can you
please rebase this?

Thanks

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Paul Guo
Дата:

It seems there's minor breakage in the build,  per CFbot.  Can you
please rebase this?

There is a code conflict. See attached for the new version. Thanks. 
Вложения

Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Alvaro Herrera
Дата:
Thank for rebasing.

I didn't like 0001 very much.

* It seems now would be the time to stop pretending we're using a file
called recovery.conf; I know we still support older server versions that
use that file, but it sounds like we should take the opportunity to
rename the function to be less misleading once those versions vanish out
of existance.

* disconnect_and_exit seems a bit out of place compared to the other
parts of this new module.  I think you only put it there so that the
'conn' can be a global, and that you can stop passing 'conn' as a
variable to GenerateRecoveryConf.  It seems more modular to me to keep
it as a separate variable in each program and have it passed down to the
routine that writes the file.

* From modularity also seems better to me to avoid a global variable
'recoveryconfcontents' and instead return the string from
GenerateRecoveryConf to pass as a param to WriteRecoveryConf.
(In fact, I wonder why the current structure is like it is, namely to
have ReceiveAndUnpackTarFile write the file; why wouldn't be its caller
be responsible for writing it?)

I wonder about putting this new file in src/fe_utils instead of keeping
it in pg_basebackup and symlinking to pg_rewind.  Maybe if we make it a
true module (recovery_config_gen.c) it makes more sense there.

0002 seems okay as far as it goes.


0003:

I still don't understand why we need a command-line option to do this.
Why should it not be the default behavior?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Paul Guo
Дата:
Thank for rebasing.

I didn't like 0001 very much.

* It seems now would be the time to stop pretending we're using a file
called recovery.conf; I know we still support older server versions that
use that file, but it sounds like we should take the opportunity to
rename the function to be less misleading once those versions vanish out
of existance.

How about renaming the function names to 
GenerateRecoveryConf -> GenerateRecoveryConfContents 
WriteRecoveryConf -> WriteRecoveryConfInfo    <- it writes standby.signal on pg12+, so function name WriteRecoveryConfContents is not accurate.
and
variable writerecoveryconf -> write_recovery_conf_info?


* disconnect_and_exit seems a bit out of place compared to the other
parts of this new module.  I think you only put it there so that the
'conn' can be a global, and that you can stop passing 'conn' as a
variable to GenerateRecoveryConf.  It seems more modular to me to keep
it as a separate variable in each program and have it passed down to the
routine that writes the file.

* From modularity also seems better to me to avoid a global variable
'recoveryconfcontents' and instead return the string from
GenerateRecoveryConf to pass as a param to WriteRecoveryConf.
(In fact, I wonder why the current structure is like it is, namely to
have ReceiveAndUnpackTarFile write the file; why wouldn't be its caller
be responsible for writing it?)
 
Reasonable to make common code include less variables. I can try modifying
the patches to remove the previously added variables below in the common code.

+/* Contents of configuration file to be generated */
+extern PQExpBuffer recoveryconfcontents;
+
+extern bool writerecoveryconf;
+extern char *replication_slot;
+PGconn    *conn;
 

I wonder about putting this new file in src/fe_utils instead of keeping
it in pg_basebackup and symlinking to pg_rewind.  Maybe if we make it a
true module (recovery_config_gen.c) it makes more sense there.

I thought some about where to put the common code also. It seems pg_rewind
and pg_basebackup are the only consumers of the small common code.  I doubt
it deserves a separate file under src/fe_utils.
 

0003:

I still don't understand why we need a command-line option to do this.
Why should it not be the default behavior?
 
This was discussed but frankly speaking I do not know how other postgres
users or enterprise providers handle this (probably some have own scripts?).
I could easily remove the option code if more and more people agree on that
or at least we could turn it on by default?

Thanks

Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Alvaro Herrera from 2ndQuadrant
Дата:
On 2019-Sep-09, Paul Guo wrote:

> >
> > Thank for rebasing.
> >
> > I didn't like 0001 very much.
> >
> > * It seems now would be the time to stop pretending we're using a file
> > called recovery.conf; I know we still support older server versions that
> > use that file, but it sounds like we should take the opportunity to
> > rename the function to be less misleading once those versions vanish out
> > of existance.
> 
> How about renaming the function names to
> GenerateRecoveryConf -> GenerateRecoveryConfContents
> WriteRecoveryConf -> WriteRecoveryConfInfo    <- it writes standby.signal
> on pg12+, so function name WriteRecoveryConfContents is not accurate.

GenerateRecoveryConfig / WriteRecoveryConfig ?

> > I wonder about putting this new file in src/fe_utils instead of keeping
> > it in pg_basebackup and symlinking to pg_rewind.  Maybe if we make it a
> > true module (recovery_config_gen.c) it makes more sense there.
> >
> I thought some about where to put the common code also. It seems pg_rewind
> and pg_basebackup are the only consumers of the small common code.  I doubt
> it deserves a separate file under src/fe_utils.

Hmm, but other things there are also used by only two programs, say
psqlscan.l and conditional.c are just for psql and pgbench.

> > 0003:
> >
> > I still don't understand why we need a command-line option to do this.
> > Why should it not be the default behavior?
> 
> This was discussed but frankly speaking I do not know how other postgres
> users or enterprise providers handle this (probably some have own scripts?).
> I could easily remove the option code if more and more people agree on that
> or at least we could turn it on by default?

Well, I've seen no contrary votes, and frankly I see no use for the
opposite (current) behavior.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Paul Guo
Дата:

Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Paul Guo
Дата:
The patch series failed on Windows CI. I modified the Windows build file to fix that. See attached for the v7 version.

Вложения

Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Alvaro Herrera
Дата:
On 2019-Sep-20, Paul Guo wrote:

> The patch series failed on Windows CI. I modified the Windows build file to
> fix that. See attached for the v7 version.

Thanks.

Question about 0003.  If we specify --skip-clean-shutdown and the cluter
was not cleanly shut down, shouldn't we error out instead of trying to
press on?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Paul Guo
Дата:
On 2019-Sep-20, Paul Guo wrote:

> The patch series failed on Windows CI. I modified the Windows build file to
> fix that. See attached for the v7 version.

Thanks.

Question about 0003.  If we specify --skip-clean-shutdown and the cluter
was not cleanly shut down, shouldn't we error out instead of trying to
press on?

pg_rewind would error out in this case, see sanityChecks().
Users are expected to clean up themselves if they use this argument.

Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Alvaro Herrera
Дата:
CC Alexey for reasons that become clear below.

On 2019-Sep-25, Paul Guo wrote:

> > Question about 0003.  If we specify --skip-clean-shutdown and the cluter
> > was not cleanly shut down, shouldn't we error out instead of trying to
> > press on?
> 
> pg_rewind would error out in this case, see sanityChecks().
> Users are expected to clean up themselves if they use this argument.

Ah, good.  We should have a comment about that below the relevant
stanza, I suggest.  (Or maybe in the same comment that ends in line
272).

I pushed 0001 with a few tweaks.  Nothing really substantial, just my
OCD that doesn't leave me alone ... but this means your subsequent
patches need to be adjusted.  One thing is that that patch touched
pg_rewind for no reason (those changes should have been in 0002) --
dropped those.

Another thing in 0002 is that you're adding a "-R" switch to pg_rewind,
but we have another patch in the commitfest using the same switch for a
different purpose.  Maybe you guys need to get to an agreement over who
uses the letter :-)  Also, it would be super helpful if you review
Alexey's patch: https://commitfest.postgresql.org/24/1849/


This line is far too long:

+   printf(_("  -s, --skip-clean-shutdown      skip running single-mode postgres if needed to make sure target is clean
shutdown\n"));

Can we make the description more concise?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Laurenz Albe
Дата:
On Wed, 2019-09-25 at 14:48 -0300, Alvaro Herrera wrote:
> Another thing in 0002 is that you're adding a "-R" switch to
> pg_rewind, but we have another patch in the commitfest using the same
> switch for a different purpose.  Maybe you guys need to get to an
> agreement over who uses the letter :-)  Also, it would be super
> helpful if you review Alexey's patch:
> https://commitfest.postgresql.org/24/1849/

I believe that -R should be reserved for creating recovery.conf,
similar to pg_basebackup.

Everything else would be confusing.

I've been missing pg_rewind -R!

Yours,
Laurenz Albe




Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
a.kondratov@postgrespro.ru
Дата:
On 2019-09-25 20:48, Alvaro Herrera wrote:
> CC Alexey for reasons that become clear below.
> 
> Another thing in 0002 is that you're adding a "-R" switch to pg_rewind,
> but we have another patch in the commitfest using the same switch for a
> different purpose.  Maybe you guys need to get to an agreement over who
> uses the letter :-)
> 

Thank you for mentioning me. I've been monitoring silently this thread 
and was ready to modify my patch if this one will proceed faster. It 
seems like it's time :)

On 2019-09-25 22:26, Laurenz Albe wrote:
> 
> I believe that -R should be reserved for creating recovery.conf,
> similar to pg_basebackup.
> 
> Everything else would be confusing.
> 
> I've been missing pg_rewind -R!
> 

Yes, -R is already used in pg_basebackup for the same functionality, so 
it seems natural to use it here as well for consistency.

I will review options naming in my own patch and update it accordingly. 
Maybe -w/-W or -a/-A options will be good, since it's about WALs 
retrieval from archive.


Regards
--
Alexey

P.S. Just noticed that in v12 fullname of -R option in pg_basebackup is 
still --write-recovery-conf, which is good for a backward compatibility, 
but looks a little bit awkward, since recovery.conf doesn't exist 
already, doesn't it? However, one may read it as 
'write-recovery-configuration', then it seems fine.




Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Paul Guo
Дата:


On Thu, Sep 26, 2019 at 1:48 AM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
CC Alexey for reasons that become clear below.

On 2019-Sep-25, Paul Guo wrote:

> > Question about 0003.  If we specify --skip-clean-shutdown and the cluter
> > was not cleanly shut down, shouldn't we error out instead of trying to
> > press on?
>
> pg_rewind would error out in this case, see sanityChecks().
> Users are expected to clean up themselves if they use this argument.

Ah, good.  We should have a comment about that below the relevant
stanza, I suggest.  (Or maybe in the same comment that ends in line
272).

I pushed 0001 with a few tweaks.  Nothing really substantial, just my
OCD that doesn't leave me alone ... but this means your subsequent
patches need to be adjusted.  One thing is that that patch touched
pg_rewind for no reason (those changes should have been in 0002) --
dropped those.

Another thing in 0002 is that you're adding a "-R" switch to pg_rewind,
but we have another patch in the commitfest using the same switch for a
different purpose.  Maybe you guys need to get to an agreement over who
uses the letter :-)  Also, it would be super helpful if you review
Alexey's patch: https://commitfest.postgresql.org/24/1849/


This line is far too long:

+   printf(_("  -s, --skip-clean-shutdown      skip running single-mode postgres if needed to make sure target is clean shutdown\n"));

Can we make the description more concise?

Thanks. I've updated the reset two patches and attached as v8.

Note in the 2nd patch, the long option is changed as below. Both the option and description
now seems to be more concise since we want db state as either DB_SHUTDOWNED or
DB_SHUTDOWNED_IN_RECOVERY. 

"-s, --no-ensure-shutdowned     do not auto-fix unclean shutdown"

Вложения

Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Paul Guo
Дата:

Yes, -R is already used in pg_basebackup for the same functionality, so
it seems natural to use it here as well for consistency.

I will review options naming in my own patch and update it accordingly.
Maybe -w/-W or -a/-A options will be good, since it's about WALs
retrieval from archive.


Thanks
 

Regards
--
Alexey

P.S. Just noticed that in v12 fullname of -R option in pg_basebackup is
still --write-recovery-conf, which is good for a backward compatibility,
but looks a little bit awkward, since recovery.conf doesn't exist
already, doesn't it? However, one may read it as
'write-recovery-configuration', then it seems fine.


Yes, here is the description
"--write-recovery-conf      write configuration for replication"
So we do not mention that is the file recovery.conf. People do not know
about the recovery.conf history might really not be confused since
postgresql has various configuration files.
 

Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Alvaro Herrera
Дата:
> Thanks. I've updated the reset two patches and attached as v8.

Great, thanks.

> Note in the 2nd patch, the long option is changed as below. Both the option
> and description
> now seems to be more concise since we want db state as either DB_SHUTDOWNED
> or
> DB_SHUTDOWNED_IN_RECOVERY.
> 
> "-s, --no-ensure-shutdowned     do not auto-fix unclean shutdown"

Note that "shutdowned" is incorrect English; we've let
it live in the code because it's not user-visible, but we should
certainly not immortalize it where it becomes so.  I suppose
"--no-ensure-shutdown" is okay, although I think some may prefer
"--no-ensure-shut-down".  Opinions from native speakers would be
welcome.  Also, let's expand "auto-fix" to "automatically fix" (or
"repair" if there's room in the line?  Not sure. Can be bikeshedded to
death I guess.)

Secondarily, I see no reason to test connstr_source rather than just
"conn" in the other patch; doing it the other way is more natural, since
it's that thing that's tested as an argument.

pg_rewind.c: Please put the new #include line keeping the alphabetical
order.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Paul Guo
Дата:

> Note in the 2nd patch, the long option is changed as below. Both the option
> and description
> now seems to be more concise since we want db state as either DB_SHUTDOWNED
> or
> DB_SHUTDOWNED_IN_RECOVERY.
>
> "-s, --no-ensure-shutdowned     do not auto-fix unclean shutdown"

Note that "shutdowned" is incorrect English; we've let
it live in the code because it's not user-visible, but we should
certainly not immortalize it where it becomes so.  I suppose
"--no-ensure-shutdown" is okay, although I think some may prefer
"--no-ensure-shut-down".  Opinions from native speakers would be
welcome.  Also, let's expand "auto-fix" to "automatically fix" (or
"repair" if there's room in the line?  Not sure. Can be bikeshedded to
death I guess.)

I choose that one from the below tree.

--no-ensure-shutdown
--no-ensure-shutdowned
--no-ensure-clean-shutdown

Now I agree for user experience we should not use the 2nd one. For
--no-ensure-clean-shutdown or -no-ensure-shut-down, seems too many -.

I'm using --no-ensure-shutdown in the new version unless there are better suggestions.
 

Secondarily, I see no reason to test connstr_source rather than just
"conn" in the other patch; doing it the other way is more natural, since
it's that thing that's tested as an argument.

pg_rewind.c: Please put the new #include line keeping the alphabetical
order.

Agreed to the above suggestions. I attached the v9.

Thanks.
Вложения

Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Alexey Kondratov
Дата:
On 27.09.2019 6:27, Paul Guo wrote:
>
>
>     Secondarily, I see no reason to test connstr_source rather than just
>     "conn" in the other patch; doing it the other way is more natural,
>     since
>     it's that thing that's tested as an argument.
>
>     pg_rewind.c: Please put the new #include line keeping the alphabetical
>     order.
>
>
> Agreed to the above suggestions. I attached the v9.
>

I went through the remaining two patches and they seem to be very clear 
and concise. However, there are two points I could complain about:

1) Maybe I've missed it somewhere in the thread above, but currently 
pg_rewind allows to run itself with -R and --source-pgdata. In that case 
-R option is just swallowed and neither standby.signal, nor 
postgresql.auto.conf is written, which is reasonable though. Should it 
be stated somehow in the docs that -R option always has to go altogether 
with --source-server? Or should pg_rewind notify user that options are 
incompatible and no recovery configuration will be written?

2) Are you going to leave -R option completely without tap-tests? 
Attached is a small patch, which tests -R option along with the existing 
'remote' case. If needed it may be split into two separate cases. First, 
it tests that pg_rewind is able to succeed with minimal permissions 
according to the Michael's patch d9f543e [1]. Next, it checks presence 
of standby.signal and adds REPLICATION permission to rewind_user to test 
that new standby is able to start with generated recovery configuration.

[1] 
https://github.com/postgres/postgres/commit/d9f543e9e9be15f92abdeaf870e57ef289020191


Regards

-- 
Alexey Kondratov

Postgres Professional https://www.postgrespro.com
Russian Postgres Company


Вложения

Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Alvaro Herrera
Дата:
On 2019-Sep-27, Alexey Kondratov wrote:

> 1) Maybe I've missed it somewhere in the thread above, but currently
> pg_rewind allows to run itself with -R and --source-pgdata. In that case -R
> option is just swallowed and neither standby.signal, nor
> postgresql.auto.conf is written, which is reasonable though. Should it be
> stated somehow in the docs that -R option always has to go altogether with
> --source-server? Or should pg_rewind notify user that options are
> incompatible and no recovery configuration will be written?

Hmm I think it should throw an error, yeah.  Ignoring options is not
good.

> +        # Now, when pg_rewind apparently succeeded with minimal permissions,
> +        # add REPLICATION privilege.  So we could test that new standby
> +        # is able to connect to the new master with generated config.
> +        $node_standby->psql(
> +            'postgres', "ALTER ROLE rewind_user WITH REPLICATION;");

I think this better use safe_psql.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Alvaro Herrera
Дата:
On 2019-Sep-27, Paul Guo wrote:

> I'm using --no-ensure-shutdown in the new version unless there are better
> suggestions.

That sounds sufficiently good.  I pushed this patch, after fixing a few
smallish problems, such as an assertion failure because of the
terminating \n in the error message when "postgres --single" fails
(which I tested by introducing a typo in the command).  I also removed
the short option, because I doubt that this option is useful enough to
warrant using up such an important shorthand (Maybe if it had been
-\ or -% or -& I would have let it through, since I doubt anybody would
have wanted to use those for anything else).  But if somebody disagrees,
they can send a patch to restore it, and we can then discuss the merits
of individual chars to use.

I also added quotes to DEVNULL, because we do that everywhere.  Maybe
there exists a system somewhere that requires this ... !!??

Finally, I split out the command in the error message in case it fails.

Thanks.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Paul Guo
Дата:

I went through the remaining two patches and they seem to be very clear
and concise. However, there are two points I could complain about:

1) Maybe I've missed it somewhere in the thread above, but currently
pg_rewind allows to run itself with -R and --source-pgdata. In that case
-R option is just swallowed and neither standby.signal, nor
postgresql.auto.conf is written, which is reasonable though. Should it
be stated somehow in the docs that -R option always has to go altogether
with --source-server? Or should pg_rewind notify user that options are
incompatible and no recovery configuration will be written?

I modified code & doc to address this. In code, pg_rewind will error out for the local case.
 
2) Are you going to leave -R option completely without tap-tests?
Attached is a small patch, which tests -R option along with the existing
'remote' case. If needed it may be split into two separate cases. First,
it tests that pg_rewind is able to succeed with minimal permissions
according to the Michael's patch d9f543e [1]. Next, it checks presence
of standby.signal and adds REPLICATION permission to rewind_user to test
that new standby is able to start with generated recovery configuration.

[1]
https://github.com/postgres/postgres/commit/d9f543e9e9be15f92abdeaf870e57ef289020191

 
It seems that we could further disabling recovery info setting code for the 'remote' test case?

-   my $port_standby = $node_standby->port;
-   $node_master->append_conf(
-       'postgresql.conf', qq(
-primary_conninfo='port=$port_standby'
-));
+   if ($test_mode ne "remote")
+   {
+       my $port_standby = $node_standby->port;
+       $node_master->append_conf(
+           'postgresql.conf',
+           qq(primary_conninfo='port=$port_standby'));

-   $node_master->set_standby_mode();
+       $node_master->set_standby_mode();
+   }

Thanks.

Вложения

Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Alexey Kondratov
Дата:
On 27.09.2019 17:28, Alvaro Herrera wrote:
>
>> +        # Now, when pg_rewind apparently succeeded with minimal permissions,
>> +        # add REPLICATION privilege.  So we could test that new standby
>> +        # is able to connect to the new master with generated config.
>> +        $node_standby->psql(
>> +            'postgres', "ALTER ROLE rewind_user WITH REPLICATION;");
> I think this better use safe_psql.
>

Yes, indeed.

On 30.09.2019 10:07, Paul Guo wrote:
>
>     2) Are you going to leave -R option completely without tap-tests?
>     Attached is a small patch, which tests -R option along with the
>     existing
>     'remote' case. If needed it may be split into two separate cases.
>     First,
>     it tests that pg_rewind is able to succeed with minimal permissions
>     according to the Michael's patch d9f543e [1]. Next, it checks
>     presence
>     of standby.signal and adds REPLICATION permission to rewind_user
>     to test
>     that new standby is able to start with generated recovery
>     configuration.
>
>     [1]
>     https://github.com/postgres/postgres/commit/d9f543e9e9be15f92abdeaf870e57ef289020191
>
> It seems that we could further disabling recovery info setting code 
> for the 'remote' test case?
>
> -   my $port_standby = $node_standby->port;
> -   $node_master->append_conf(
> -       'postgresql.conf', qq(
> -primary_conninfo='port=$port_standby'
> -));
> +   if ($test_mode ne "remote")
> +   {
> +       my $port_standby = $node_standby->port;
> +       $node_master->append_conf(
> +           'postgresql.conf',
> +           qq(primary_conninfo='port=$port_standby'));
>
> -   $node_master->set_standby_mode();
> +       $node_master->set_standby_mode();
> +   }
>
>

Yeah, it makes sense. It is excessive for remote if we add '-R' there. 
I've updated and attached my test adding patch.



-- 
Alexey Kondratov

Postgres Professional https://www.postgrespro.com
Russian Postgres Company


Вложения

Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Alvaro Herrera
Дата:
OK, I pushed this patch as well as Alexey's test patch.  It all works
for me, and the coverage report shows that we're doing the new thing ...
though only in the case that rewind *is* required.  There is no test to
verify the case where rewind is *not* required.  I guess it'd also be
good to test the case when we throw the new error, if only for
completeness ...

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Alvaro Herrera
Дата:
On 2019-Mar-19, Paul Guo wrote:

> Hello, Postgres hackers,
> 
> Please see the attached patches.

BTW in the future if you have two separate patches, please post them in
separate threads and use separate commitfest items for each, even if
they have minor conflicts.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Paul Guo
Дата:
BTW in the future if you have two separate patches, please post them in
separate threads and use separate commitfest items for each, even if
they have minor conflicts.

Sure. Thanks. 

Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Alexey Kondratov
Дата:
Hi Alvaro,

On 30.09.2019 20:13, Alvaro Herrera wrote:
> OK, I pushed this patch as well as Alexey's test patch.  It all works
> for me, and the coverage report shows that we're doing the new thing ...
> though only in the case that rewind *is* required.  There is no test to
> verify the case where rewind is *not* required.  I guess it'd also be
> good to test the case when we throw the new error, if only for
> completeness ...

I've directly followed your guess and tried to elaborate pg_rewind test 
cases and... It seems I've caught a few bugs:

1) --dry-run actually wasn't completely 'dry'. It did update target 
controlfile, which could cause repetitive pg_rewind calls to fail after 
dry-run ones.

2) --no-ensure-shutdown flag was broken, it simply didn't turn off this 
new feature.

3) --write-recovery-conf didn't obey the --dry-run flag.

Thus, it was definitely a good idea to add new tests. Two patches are 
attached:

1) First one fixes all the issues above;

2) Second one slightly increases pg_rewind overall code coverage from 
74% to 78.6%.

Should I put this fix on the next commitfest?


Regards

-- 
Alexey Kondratov

Postgres Professional https://www.postgrespro.com
Russian Postgres Company


P.S. My apologies that I've missed two of these bugs during review.


Вложения

Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Michael Paquier
Дата:
On Wed, Oct 02, 2019 at 08:28:09PM +0300, Alexey Kondratov wrote:
> I've directly followed your guess and tried to elaborate pg_rewind test
> cases and... It seems I've caught a few bugs:
>
> 1) --dry-run actually wasn't completely 'dry'. It did update target
> controlfile, which could cause repetitive pg_rewind calls to fail after
> dry-run ones.

I have just paid attention to this thread, but this is a bug which
goes down to 12 actually so let's treat it independently of the rest.
The control file was not written thanks to the safeguards in
write_target_range() in past versions, but the recent refactoring
around control file handling broke that promise.  Another thing which
is not completely exact is the progress reporting which should be
reported even if the dry-run mode runs.  That's less critical, but
let's make things consistent.

Patch 0001 also forgot that recovery.conf should not be written either
when no rewind is needed.

I have reworked your first patch as per the attached.  What do you
think about it?  The part with the control file needs to go down to
v12, and I would likely split that into two commits on HEAD: one for
the control file and a second for the recovery.conf portion with the
fix for --no-ensure-shutdown to keep a cleaner history.

+               # Check that incompatible options error out.
+               command_fails(
+                       [
+                               'pg_rewind', "--debug",
+                               "--source-pgdata=$standby_pgdata",
+                               "--target-pgdata=$master_pgdata", "-R",
+                               "--no-ensure-shutdown"
+                       ],
+                       'pg_rewind local with -R');
Incompatible options had better be checked within a separate perl
script?  We generally do that for the other binaries.
--
Michael

Вложения

Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Alexey Kondratov
Дата:
On 03.10.2019 6:07, Michael Paquier wrote:
> On Wed, Oct 02, 2019 at 08:28:09PM +0300, Alexey Kondratov wrote:
>> I've directly followed your guess and tried to elaborate pg_rewind test
>> cases and... It seems I've caught a few bugs:
>>
>> 1) --dry-run actually wasn't completely 'dry'. It did update target
>> controlfile, which could cause repetitive pg_rewind calls to fail after
>> dry-run ones.
> I have just paid attention to this thread, but this is a bug which
> goes down to 12 actually so let's treat it independently of the rest.
> The control file was not written thanks to the safeguards in
> write_target_range() in past versions, but the recent refactoring
> around control file handling broke that promise.  Another thing which
> is not completely exact is the progress reporting which should be
> reported even if the dry-run mode runs.  That's less critical, but
> let's make things consistent.

I also thought about v12, though didn't check whether it's affected.

> Patch 0001 also forgot that recovery.conf should not be written either
> when no rewind is needed.

Yes, definitely, I forgot this code path, thanks.

> I have reworked your first patch as per the attached.  What do you
> think about it?  The part with the control file needs to go down to
> v12, and I would likely split that into two commits on HEAD: one for
> the control file and a second for the recovery.conf portion with the
> fix for --no-ensure-shutdown to keep a cleaner history.

It looks fine for me excepting the progress reporting part. It now adds 
PG_CONTROL_FILE_SIZE to fetch_done. However, I cannot find that control 
file is either included into filemap and fetch_size or counted during 
calculate_totals(). Maybe I've missed something, but now it looks like 
we report something that wasn't planned for progress reporting, doesn't it?

> +               # Check that incompatible options error out.
> +               command_fails(
> +                       [
> +                               'pg_rewind', "--debug",
> +                               "--source-pgdata=$standby_pgdata",
> +                               "--target-pgdata=$master_pgdata", "-R",
> +                               "--no-ensure-shutdown"
> +                       ],
> +                       'pg_rewind local with -R');
> Incompatible options had better be checked within a separate perl
> script?  We generally do that for the other binaries.

Yes, it makes sense. I've reworked the patch with tests and added a 
couple of extra cases.


-- 
Alexey Kondratov

Postgres Professional https://www.postgrespro.com
Russian Postgres Company


Вложения

Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Michael Paquier
Дата:
On Thu, Oct 03, 2019 at 12:43:37PM +0300, Alexey Kondratov wrote:
> On 03.10.2019 6:07, Michael Paquier wrote:
>> I have reworked your first patch as per the attached.  What do you
>> think about it?  The part with the control file needs to go down to
>> v12, and I would likely split that into two commits on HEAD: one for
>> the control file and a second for the recovery.conf portion with the
>> fix for --no-ensure-shutdown to keep a cleaner history.
>
> It looks fine for me excepting the progress reporting part. It now adds
> PG_CONTROL_FILE_SIZE to fetch_done. However, I cannot find that control file
> is either included into filemap and fetch_size or counted during
> calculate_totals(). Maybe I've missed something, but now it looks like we
> report something that wasn't planned for progress reporting, doesn't
> it?

Right.  The pre-12 code actually handles that incorrecly as it assumed
that any files written through file_ops.c should be part of the
progress.  So I went with the simplest solution, and backpatched this
part with 6f3823b.  I have also committed the set of fixes for the new
options so as we have a better base of work than what's on HEAD
currently.

>> +               # Check that incompatible options error out.
>> +               command_fails(
>> +                       [
>> +                               'pg_rewind', "--debug",
>> +                               "--source-pgdata=$standby_pgdata",
>> +                               "--target-pgdata=$master_pgdata", "-R",
>> +                               "--no-ensure-shutdown"
>> +                       ],
>> +                       'pg_rewind local with -R');
>> Incompatible options had better be checked within a separate perl
>> script?  We generally do that for the other binaries.
>
> Yes, it makes sense. I've reworked the patch with tests and added a couple
> of extra cases.

Regarding the tests, adding a --dry-run command is a good idea.
However I think that there is more value to automate the use of the
single user mode automatically in the tests as that's more critical
from the point of view of rewind run, and stopping the cluster with
immediate mode causes, as expected, the next --dry-run command to
fail.

Another thing is that I think that we should use -F with --single.
This makes recovery faster, and the target data folder is synced
at the end of pg_rewind anyway.

Using the long option names makes the tests easier to follow in this
case, so I have switched -R to --write-recovery-conf.

Some comments and the docs have been using some confusing wording, so
I have reworked what I found (like many "it" in a single sentence
referring different things).

+command_fails(
+    [
+        'pg_rewind', "--debug",
+        "--source-pgdata=$standby_pgdata",
+        "--target-pgdata=$master_pgdata",
+        "--no-ensure-shutdown"
+    ],
+    'pg_rewind local without source shutdown');
Regarding all the set of incompatible options, we have much more of
that after the initial option parsing so I think that we should group
all the cheap ones together.  Let's tackle that as a separate patch.
We can also just check after --no-ensure-shutdown directly in
RewindTest.pm as I have switched the cluster to not be cleanly shut
down anymore to stress the automatic recovery path, and trigger that
before running pg_rewind for the local and remote mode.

Attached is an updated patch with all I found.  What do you think?
--
Michael

Вложения

Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Alexey Kondratov
Дата:
On 04.10.2019 11:37, Michael Paquier wrote:
> On Thu, Oct 03, 2019 at 12:43:37PM +0300, Alexey Kondratov wrote:
>> On 03.10.2019 6:07, Michael Paquier wrote:
>>> I have reworked your first patch as per the attached.  What do you
>>> think about it?  The part with the control file needs to go down to
>>> v12, and I would likely split that into two commits on HEAD: one for
>>> the control file and a second for the recovery.conf portion with the
>>> fix for --no-ensure-shutdown to keep a cleaner history.
>> It looks fine for me excepting the progress reporting part. It now adds
>> PG_CONTROL_FILE_SIZE to fetch_done. However, I cannot find that control file
>> is either included into filemap and fetch_size or counted during
>> calculate_totals(). Maybe I've missed something, but now it looks like we
>> report something that wasn't planned for progress reporting, doesn't
>> it?
> Right.  The pre-12 code actually handles that incorrecly as it assumed
> that any files written through file_ops.c should be part of the
> progress.  So I went with the simplest solution, and backpatched this
> part with 6f3823b.  I have also committed the set of fixes for the new
> options so as we have a better base of work than what's on HEAD
> currently.

Great, thanks.

>
> Regarding the tests, adding a --dry-run command is a good idea.
> However I think that there is more value to automate the use of the
> single user mode automatically in the tests as that's more critical
> from the point of view of rewind run, and stopping the cluster with
> immediate mode causes, as expected, the next --dry-run command to
> fail.
>
> Another thing is that I think that we should use -F with --single.
> This makes recovery faster, and the target data folder is synced
> at the end of pg_rewind anyway.
>
> Using the long option names makes the tests easier to follow in this
> case, so I have switched -R to --write-recovery-conf.
>
> Some comments and the docs have been using some confusing wording, so
> I have reworked what I found (like many "it" in a single sentence
> referring different things).

I agree with all the points. Shutting down target server using 
'immediate' mode is a good way to test ensureCleanShutdown automatically.

> Regarding all the set of incompatible options, we have much more of
> that after the initial option parsing so I think that we should group
> all the cheap ones together.  Let's tackle that as a separate patch.
> We can also just check after --no-ensure-shutdown directly in
> RewindTest.pm as I have switched the cluster to not be cleanly shut
> down anymore to stress the automatic recovery path, and trigger that
> before running pg_rewind for the local and remote mode.
>
> Attached is an updated patch with all I found.  What do you think?

I've checked your patch, but it seems that it cannot be applied as is, 
since it e.g. adds a comment to 005_same_timeline.pl without actually 
changing the test. So I've slightly modified your patch and tried to fit 
both dry-run and ensureCleanShutdown testing together. It works just 
fine and fails immediately if any of recent fixes is reverted. I still 
think that dry-run testing is worth adding, since it helped to catch 
this v12 refactoring issue, but feel free to throw it way if it isn't 
commitable right now, of course.

As for incompatible options and sanity checks testing, yes, I agree that 
it is a matter of different patch. I attached it as a separate WIP patch 
just for history. Maybe I will try to gather more cases there later.

-- 
Alexey Kondratov

Postgres Professional https://www.postgrespro.com
Russian Postgres Company


Вложения

Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Michael Paquier
Дата:
On Fri, Oct 04, 2019 at 05:21:25PM +0300, Alexey Kondratov wrote:
> I've checked your patch, but it seems that it cannot be applied as is, since
> it e.g. adds a comment to 005_same_timeline.pl without actually changing the
> test. So I've slightly modified your patch and tried to fit both dry-run and
> ensureCleanShutdown testing together. It works just fine and fails
> immediately if any of recent fixes is reverted. I still think that dry-run
> testing is worth adding, since it helped to catch this v12 refactoring
> issue, but feel free to throw it way if it isn't commitable right now, of
> course.

I can guarantee the last patch I sent can be applied on top of HEAD:
https://www.postgresql.org/message-id/20191004083721.GA1829@paquier.xyz

It would be nice to add the --dry-run part, though I think that we
could just make that part of one of the existing tests, and stop the
target server first (got to think about that part, please see below).

> As for incompatible options and sanity checks testing, yes, I agree that it
> is a matter of different patch. I attached it as a separate WIP patch just
> for history. Maybe I will try to gather more cases there later.

Thanks.  I have applied the first patch for the various improvements
around --no-ensure-shutdown.

Regarding the rest, I have hacked my way through as per the attached.
The previous set of patches did the following, which looked either
overkill or not necessary:
- Why running test 005 with the remote mode?
- --dry-run coverage is basically the same with the local and remote
modes, so it seems like a waste of resource to run it for all the
tests and all the modes.  I tend to think that this would live better
as part of another existing test, only running for say the local mode.
It is also possible to group all your tests from patch 2 and
006_actions.pl in this area.
- There is no need for the script checking for options combinations to
initialize a data folder.  It is important to design the tests to be
cheap and meaningful.

Patch v3-0002 also had a test to make sure that the source server is
shut down cleanly before using it.  I have included that part as
well, as the flow feels right.

So, Alexey, what do you think?
--
Michael

Вложения

Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Alexey Kondratov
Дата:
On 07.10.2019 4:06, Michael Paquier wrote:
> On Fri, Oct 04, 2019 at 05:21:25PM +0300, Alexey Kondratov wrote:
>> I've checked your patch, but it seems that it cannot be applied as is, since
>> it e.g. adds a comment to 005_same_timeline.pl without actually changing the
>> test. So I've slightly modified your patch and tried to fit both dry-run and
>> ensureCleanShutdown testing together. It works just fine and fails
>> immediately if any of recent fixes is reverted. I still think that dry-run
>> testing is worth adding, since it helped to catch this v12 refactoring
>> issue, but feel free to throw it way if it isn't commitable right now, of
>> course.
> I can guarantee the last patch I sent can be applied on top of HEAD:
> https://www.postgresql.org/message-id/20191004083721.GA1829@paquier.xyz

Yes, it did, but my comment was about these lines:

diff --git a/src/bin/pg_rewind/t/005_same_timeline.pl 
b/src/bin/pg_rewind/t/005_same_timeline.pl
index 40dbc44caa..df469d3939 100644
--- a/src/bin/pg_rewind/t/005_same_timeline.pl
+++ b/src/bin/pg_rewind/t/005_same_timeline.pl
@@ -1,3 +1,7 @@
+#
+# Test that running pg_rewind with the source and target clusters
+# on the same timeline runs successfully.
+#

You have added this new comment section, but kept the old one, which was 
pretty much the same [1].

> Regarding the rest, I have hacked my way through as per the attached.
> The previous set of patches did the following, which looked either
> overkill or not necessary:
> - Why running test 005 with the remote mode?

OK, it was definitely an overkill, since remote control file fetch will 
be also tested in any other remote test case.

> - --dry-run coverage is basically the same with the local and remote
> modes, so it seems like a waste of resource to run it for all the
> tests and all the modes.

My point was to test --dry-run + --write-recover-conf in remote, since 
the last one may cause recovery configuration write without doing any 
actual work, due to some wrong refactoring for example.

> - There is no need for the script checking for options combinations to
> initialize a data folder.  It is important to design the tests to be
> cheap and meaningful.

Yes, I agree, moving some of those tests to just a 001_basic seems to be 
a proper optimization.

> Patch v3-0002 also had a test to make sure that the source server is
> shut down cleanly before using it.  I have included that part as
> well, as the flow feels right.
>
> So, Alexey, what do you think?

It looks good for me. Two minor remarks:

+    # option combinations.  As the code paths taken by those tests
+    # does not change for the "local" and "remote" modes, just run them

I am far from being fluent in English, but should it be 'do not change' 
instead?

+command_fails(
+    [
+        'pg_rewind',     '--target-pgdata',
+        $primary_pgdata, '--source-pgdata',
+        $standby_pgdata, 'extra_arg1'
+    ],

Here and below I would prefer traditional options ordering "'--key', 
'value'". It should be easier to recognizefrom the reader perspective:

+command_fails(
+    [
+        'pg_rewind',
+        '--target-pgdata', $primary_pgdata,
+        '--source-pgdata', $standby_pgdata,
+        'extra_arg1'
+    ],


[1] 

https://github.com/postgres/postgres/blob/caa078353ecd1f3b3681c0d4fa95ad4bb8c2308a/src/bin/pg_rewind/t/005_same_timeline.pl#L15


-- 
Alexey Kondratov

Postgres Professional https://www.postgrespro.com
Russian Postgres Company




Re: Two pg_rewind patches (auto generate recovery conf and ensureclean shutdown)

От
Michael Paquier
Дата:
On Mon, Oct 07, 2019 at 03:31:45PM +0300, Alexey Kondratov wrote:
> On 07.10.2019 4:06, Michael Paquier wrote:
>> - --dry-run coverage is basically the same with the local and remote
>> modes, so it seems like a waste of resource to run it for all the
>> tests and all the modes.
>
> My point was to test --dry-run + --write-recover-conf in remote, since the
> last one may cause recovery configuration write without doing any actual
> work, due to some wrong refactoring for example.

Yes, that's possible.  I agree that it would be nice to have an extra
test for that, still I would avoid making that run in all the tests.

>> Patch v3-0002 also had a test to make sure that the source server is
>> shut down cleanly before using it.  I have included that part as
>> well, as the flow feels right.
>>
>> So, Alexey, what do you think?
>
> It looks good for me. Two minor remarks:
>
> +    # option combinations.  As the code paths taken by those tests
> +    # does not change for the "local" and "remote" modes, just run them
>
> I am far from being fluent in English, but should it be 'do not change'
> instead?

That was wrong, fixed.

> +command_fails(
> +    [
> +        'pg_rewind',     '--target-pgdata',
> +        $primary_pgdata, '--source-pgdata',
> +        $standby_pgdata, 'extra_arg1'
> +    ],
>
> Here and below I would prefer traditional options ordering "'--key',
> 'value'". It should be easier to recognizefrom the reader perspective:

While I agree with you, the perl indentation we use has formatted the
code this way.  There is also an argument for keeping it at the end
for clarity (I recall that Windows also requires extra args to be
last when parsing options).  Anyway, I have used a trick by adding
--debug to reach command, which is still useful, so the order of the
options is better at the end.
--
Michael

Вложения