Обсуждение: How to start slave after pg_basebackup. Why min_wal_size and wal_keep_segments are duplicated

Поиск

Список

Период

Сортировка

How to start slave after pg_basebackup. Why min_wal_size and wal_keep_segments are duplicated

От

"Andrus"

Дата:

31 мая 2020 г., 22:47:44

Hi!

pg_basebackup takes 8 hours.
After it is finished,  replication slave does not start:

LOG:  consistent recovery state reached at 2DE/985A5BE0
LOG:  database system is ready to accept read only connections
LOG:  started streaming WAL from primary at 2DE/99000000 on timeline 1
replikaator@[unknown] LOG:  received replication command: SHOW data_directory_mode
replikaator@[unknown] LOG:  received replication command: IDENTIFY_SYSTEM
replikaator@[unknown] LOG:  received replication command: START_REPLICATION 2CF/E9000000 TIMELIN
replikaator@[unknown] ERROR:  requested WAL segment 00000001000002CF000000E9 has already been re
replikaator@[unknown] LOG:  received replication command: SHOW data_directory_mode
replikaator@[unknown] LOG:  received replication command: IDENTIFY_SYSTEM
replikaator@[unknown] LOG:  received replication command: START_REPLICATION 2CF/E9000000 TIMELIN
replikaator@[unknown] ERROR:  requested WAL segment 00000001000002CF000000E9 has already been removed
...

i tried it again and same error occured.

How to force replication to start?

I increased wal parameters in master to

wal_compression=on
max_wal_size = 5GB
min_wal_size = 4GB # was 80MB
wal_keep_segments= 360 # was 180

Will this allow replication to start after pg_basebackup ?

According to doc min_wal_size and wal_keep_segments both keep the minimum number of wal segments for replication.
Why those parameters are duplicated?

Andrus.

Re: How to start slave after pg_basebackup. Why min_wal_size andwal_keep_segments are duplicated

От

Adrian Klaver

Дата:

31 мая 2020 г., 22:59:18

On 5/31/20 12:47 PM, Andrus wrote:
> Hi!
> 
> pg_basebackup takes 8 hours.
> After it is finished,  replication slave does not start:
> 
> LOG:  consistent recovery state reached at 2DE/985A5BE0
> LOG:  database system is ready to accept read only connections
> LOG:  started streaming WAL from primary at 2DE/99000000 on timeline 1
> replikaator@[unknown] LOG:  received replication command: SHOW 
> data_directory_mode
> replikaator@[unknown] LOG:  received replication command: IDENTIFY_SYSTEM
> replikaator@[unknown] LOG:  received replication command: 
> START_REPLICATION 2CF/E9000000 TIMELIN
> replikaator@[unknown] ERROR:  requested WAL segment 
> 00000001000002CF000000E9 has already been re
> replikaator@[unknown] LOG:  received replication command: SHOW 
> data_directory_mode
> replikaator@[unknown] LOG:  received replication command: IDENTIFY_SYSTEM
> replikaator@[unknown] LOG:  received replication command: 
> START_REPLICATION 2CF/E9000000 TIMELIN
> replikaator@[unknown] ERROR:  requested WAL segment 
> 00000001000002CF000000E9 has already been removed

There's your problem ^

> ...
> 
> i tried it again and same error occured.
> 
> How to force replication to start?

If the WAL is gone you can't.

More below.

> 
> I increased wal parameters in master to
> 
> wal_compression=on
> max_wal_size = 5GB
> min_wal_size = 4GB # was 80MB
> wal_keep_segments= 360 # was 180
> 
> Will this allow replication to start after pg_basebackup ?
> 
> According to doc min_wal_size and wal_keep_segments both keep the 
> minimum number of wal segments for replication.

No it doesn't:

https://www.postgresql.org/docs/12/runtime-config-replication.html

"wal_keep_segments (integer)

     Specifies the minimum number of past log file segments kept in the 
pg_wal directory, in case a standby server needs to fetch them for 
streaming replication. Each segment is normally 16 megabytes. If a 
standby server connected to the sending server falls behind by more than 
wal_keep_segments segments, the sending server might remove a WAL 
segment still needed by the standby, in which case the replication
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
connection will be terminated. Downstream connections will also
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
eventually fail as a result. (However, the standby server can recover by 
fetching the segment from archive, if WAL archiving is in use.)

...

https://www.postgresql.org/docs/12/runtime-config-wal.html

"min_wal_size (integer)

     As long as WAL disk usage stays below this setting, old WAL files 
are always recycled for future use at a checkpoint, rather than removed. 
This can be used to ensure that enough WAL space is reserved to handle 
spikes in WAL usage, for example when running large batch jobs. If this 
value is specified without units, it is taken as megabytes. The default 
is 80 MB. This parameter can only be set in the postgresql.conf file or 
on the server command line.
"

I'm guessing are looking for:

https://www.postgresql.org/docs/12/runtime-config-replication.html

"
26.2.6. Replication Slots

Replication slots provide an automated way to ensure that the master 
does not remove WAL segments until they have been received by all 
standbys, and that the master does not remove rows which could cause a 
recovery conflict even when the standby is disconnected.

...
"

This is spelled out here:

https://www.postgresql.org/docs/12/warm-standby.html#STREAMING-REPLICATION

"If you use streaming replication without file-based continuous 
archiving, the server might recycle old WAL segments before the standby 
has received them. If this occurs, the standby will need to be 
reinitialized from a new base backup. You can avoid this by setting 
wal_keep_segments to a value large enough to ensure that WAL segments 
are not recycled too early, or by configuring a replication slot for the 
standby. If you set up a WAL archive that's accessible from the standby, 
these solutions are not required, since the standby can always use the 
archive to catch up provided it retains enough segments."



> Why those parameters are duplicated?
> 
> Andrus.
> 
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com

Re: How to start slave after pg_basebackup. Why min_wal_size andwal_keep_segments are duplicated

От

Paul Förster

Дата:

31 мая 2020 г., 23:01:43

Hi Andrus,

> On 31. May, 2020, at 21:47, Andrus <kobruleht2@hot.ee> wrote:
> replikaator@[unknown] ERROR:  requested WAL segment 00000001000002CF000000E9 has already been removed

the message says it all. You need to copy the WAL file 00000001000002CF000000E9 and newer to the replica's pg_wal
directorybecause it has been removed already on the master site. Obviously, you can only do that if the files have been
archived.Otherwise, you'd have to fully reinitiate the replica. 

Replication will start again as soon as the requested WALs are copied over to the replica.

Use a replication slot to avoid this situation. If you use a replication slot, the master will only remove WAL files
whichare not needed by any one replica. 

Hope this helps.

Cheers,
Paul

Re: How to start slave after pg_basebackup. Why min_wal_size and wal_keep_segments are duplicated

От

"Andrus"

Дата:

31 мая 2020 г., 23:56:05

Hi!

> On 31. May, 2020, at 21:47, Andrus <kobruleht2@hot.ee> wrote:
> replikaator@[unknown] ERROR:  requested WAL segment 00000001000002CF000000E9 has already been removed
>the message says it all. You need to copy the WAL file 00000001000002CF000000E9 and newer to the replica's pg_wal
directorybecause 

>it has been removed >already on the master site.
>Obviously, you can only do that if the files have been archived.

wal files are not archieved.

>Otherwise, you'd have to fully reinitiate the replica.

I have tried to re-initiate replica serveral times in low-use time but this error occurs again.

>Replication will start again as soon as the requested WALs are copied over to the replica.
>Use a replication slot to avoid this situation. If you use a replication slot, the master will only remove WAL files
whichare not 

>needed by any one replica.

If named replication slot is used commands like

vacuumdb --all --full

will cause main server crash due to disk space limit. pg_wal directory will occupy free disk space. After that main
serverstops.

I tried using wal_keep_segments =180
Will setting wal_keep_segments to higher value allw replication start after pg_basebackup ?

Andrus.

Re: How to start slave after pg_basebackup. Why min_wal_size and wal_keep_segments are duplicated

От

"Andrus"

Дата:

01 июня 2020 г., 00:03:30

Hi!

>I'm guessing are looking for:
>https://www.postgresql.org/docs/12/runtime-config-replication.html
>26.2.6. Replication Slots
>Replication slots provide an automated way to ensure that the master 
>does not remove WAL segments until they have been received by all 
>standbys, and that the master does not remove rows which could cause a 
>recovery conflict even when the standby is disconnected.

Using replication slot can cause pg_wal directoy to occupy all free disk space and after that server stop respondig.

>This is spelled out here:
> https://www.postgresql.org/docs/12/warm-standby.html#STREAMING-REPLICATION
>"If you use streaming replication without file-based continuous 
>archiving, the server might recycle old WAL segments before the standby 
>has received them. If this occurs, the standby will need to be 
>reinitialized from a new base backup. You can avoid this by setting 
>wal_keep_segments to a value large enough to ensure that WAL segments 
>are not recycled too early, or by configuring a replication slot for the 
>standby. 

Will wal_keep_segments  keep segments also if named replication slot is lot used ?

Andrus.

Re: How to start slave after pg_basebackup. Why min_wal_size andwal_keep_segments are duplicated

От

Adrian Klaver

Дата:

01 июня 2020 г., 00:18:46

On 5/31/20 2:03 PM, Andrus wrote:
> Hi!
> 
>> I'm guessing are looking for:
>> https://www.postgresql.org/docs/12/runtime-config-replication.html
>> 26.2.6. Replication Slots
>> Replication slots provide an automated way to ensure that the master 
>> does not remove WAL segments until they have been received by all 
>> standbys, and that the master does not remove rows which could cause a 
>> recovery conflict even when the standby is disconnected.
> 
> Using replication slot can cause pg_wal directoy to occupy all free disk 
> space and after that server stop respondig.
> 
>> This is spelled out here:
>> https://www.postgresql.org/docs/12/warm-standby.html#STREAMING-REPLICATION 
>>
>> "If you use streaming replication without file-based continuous 
>> archiving, the server might recycle old WAL segments before the 
>> standby has received them. If this occurs, the standby will need to be 
>> reinitialized from a new base backup. You can avoid this by setting 
>> wal_keep_segments to a value large enough to ensure that WAL segments 
>> are not recycled too early, or by configuring a replication slot for 
>> the standby. 
> 
> Will wal_keep_segments  keep segments also if named replication slot is 
> lot used ?

Well if you are using a replication slot there is no point in using 
wal_keep_segments. Slots where created in, part at least, so you did not 
have to guess at a wal_keep_segments number.

To really answer this we will need to see the exact commands you are 
using and the sequence they are done in.

> 
> Andrus.


-- 
Adrian Klaver
adrian.klaver@aklaver.com

Re: How to start slave after pg_basebackup. Why min_wal_size andwal_keep_segments are duplicated

От

Adrian Klaver

Дата:

01 июня 2020 г., 00:26:28

On 5/31/20 2:03 PM, Andrus wrote:
> Hi!
> 
>> I'm guessing are looking for:
>> https://www.postgresql.org/docs/12/runtime-config-replication.html
>> 26.2.6. Replication Slots
>> Replication slots provide an automated way to ensure that the master 
>> does not remove WAL segments until they have been received by all 
>> standbys, and that the master does not remove rows which could cause a 
>> recovery conflict even when the standby is disconnected.
> 
> Using replication slot can cause pg_wal directoy to occupy all free disk 
> space and after that server stop respondig.
> 
>> This is spelled out here:
>> https://www.postgresql.org/docs/12/warm-standby.html#STREAMING-REPLICATION 
>>
>> "If you use streaming replication without file-based continuous 
>> archiving, the server might recycle old WAL segments before the 
>> standby has received them. If this occurs, the standby will need to be 
>> reinitialized from a new base backup. You can avoid this by setting 
>> wal_keep_segments to a value large enough to ensure that WAL segments 
>> are not recycled too early, or by configuring a replication slot for 
>> the standby. 
> 
> Will wal_keep_segments  keep segments also if named replication slot is 
> lot used ?

In addition to my most recent questions:

What are you trying to achieve?

In other words why do a pg_basebackup if you have a standby receiving WALs?

> 
> Andrus.


-- 
Adrian Klaver
adrian.klaver@aklaver.com

Re: How to start slave after pg_basebackup. Why min_wal_size andwal_keep_segments are duplicated

От

Paul Förster

Дата:

01 июня 2020 г., 00:35:33

Hi Andrus,

> On 31. May, 2020, at 22:56, Andrus <kobruleht2@hot.ee> wrote:
> wal files are not archieved.

IMHO a bad decision. They should be. Now you're in the situation where you see why.

> I have tried to re-initiate replica serveral times in low-use time but this error occurs again.

remove the whole replica's PGDATA/* and do a pg_basebackup again. But before that, make sure wal_keep_segments in big
enoughon the master and, just as much important, do a vacuumdb -a (takes much space during the process) and use
archiving!

> If named replication slot is used commands like
> vacuumdb --all --full
> will cause main server crash due to disk space limit. pg_wal directory will occupy free disk space. After that main
serverstops. 

if you have disk constraints you will run into trouble sooner or later anyway. Make sure, you have enough disk space.
There'sno way around that anyway. 

> I tried using wal_keep_segments =180
> Will setting wal_keep_segments to higher value allw replication start after pg_basebackup ?

it depends. If you start the replica immediately and don't wait for hours or days, you should be good to go. But that
dependson different factors, for example, how many WAL files are written during the pg_basebackup and pg_ctl start of
thereplica. If more than 180 WALs have gone by on the master because it is really busy, then you're probably lost
again.Point being, you'll have to launch the replica before WALs are expired! 

Again: Make sure you have enough disk space, use archiving and use a replication slot.

Cheers,
Paul

Re: How to start slave after pg_basebackup. Why min_wal_size and wal_keep_segments are duplicated

От

"Andrus"

Дата:

01 июня 2020 г., 00:43:02

Hi!

>In addition to my most recent questions:
>What are you trying to achieve?

I want to create  hot standby async server using 

/etc/init.d/postgresql stop
mv /var/lib/postgresql/12/main /var/lib/postgresql/12/mainold
pg_basebackup --write-recovery-conf -D /var/lib/postgresql/12/main
chmod --recursive --verbose 0700 /var/lib/postgresql/12/main
chown -Rv postgres:postgres /var/lib/postgresql/12/main
/etc/init.d/postgresql start

>In other words why do a pg_basebackup if you have a standby receiving WALs?

I dont receive WALs.

Andrus.

Re: How to start slave after pg_basebackup. Why min_wal_size and wal_keep_segments are duplicated

От

"Andrus"

Дата:

01 июня 2020 г., 00:45:17

Hi!

>> Will wal_keep_segments  keep segments also if named replication slot is 
>> lot used ?
>Well if you are using a replication slot there is no point in using 
>wal_keep_segments. Slots where created in, part at least, so you did not 
>have to guess at a wal_keep_segments number.

I dont use slot.

>To really answer this we will need to see the exact commands you are 
>using and the sequence they are done in.

Replication server is created using

/etc/init.d/postgresql stop
mv /var/lib/postgresql/12/main /var/lib/postgresql/12/mainold
pg_basebackup --write-recovery-conf -D /var/lib/postgresql/12/main
chmod --recursive --verbose 0700 /var/lib/postgresql/12/main
chown -Rv postgres:postgres /var/lib/postgresql/12/main
/etc/init.d/postgresql start

Andrus.

Re: How to start slave after pg_basebackup. Why min_wal_size andwal_keep_segments are duplicated

От

Adrian Klaver

Дата:

01 июня 2020 г., 03:29:05

On 5/31/20 2:43 PM, Andrus wrote:
> Hi!
> 
>> In addition to my most recent questions:
>> What are you trying to achieve?
> 
> I want to create  hot standby async server using
> /etc/init.d/postgresql stop
> mv /var/lib/postgresql/12/main /var/lib/postgresql/12/mainold
> pg_basebackup --write-recovery-conf -D /var/lib/postgresql/12/main

I don't see where the base backup is being taken from just where it is 
going.

> chmod --recursive --verbose 0700 /var/lib/postgresql/12/main
> chown -Rv postgres:postgres /var/lib/postgresql/12/main
> /etc/init.d/postgresql start
> 
>> In other words why do a pg_basebackup if you have a standby receiving 
>> WALs?
> 
> I dont receive WALs.

If you are doing binary replication then you are receiving WALs. It just 
a matter of whether you are streaming them or shipping them over complete.

> 
> Andrus.


-- 
Adrian Klaver
adrian.klaver@aklaver.com

Re: How to start slave after pg_basebackup. Why min_wal_size and wal_keep_segments are duplicated

От

"Andrus"

Дата:

01 июня 2020 г., 08:50:38

Hi!

>> I want to create  hot standby async server using
>> /etc/init.d/postgresql stop
>> mv /var/lib/postgresql/12/main /var/lib/postgresql/12/mainold
>> pg_basebackup --write-recovery-conf -D /var/lib/postgresql/12/main
>I don't see where the base backup is being taken from just where it is going.

It is taken from VPS server over 20 Mbit public internet connection.
Both servers are running Debian Linux.

>> I dont receive WALs.
>If you are doing binary replication then you are receiving WALs. It just a matter of whether you are streaming them or
shipping
 
>them over complete.

Using wal_keep_segments=360 also causes same wal file not found error after pg_basebackup.
master server has 6GB wal files.  wal log during pg_basebackup is much slower than 360.
Maybe pg_basebackup skips wal segments . Maybe using wal_compression=on causes the issue.

How to fix this ?
How to create base backup so that cluster is tranferred over internet faster? Maybe it can transferred in compressed
formover 
 
internet.

Andrus.

Re: How to start slave after pg_basebackup. Why min_wal_size and wal_keep_segments are duplicated

От

"Andrus"

Дата:

01 июня 2020 г., 11:17:19

Hi!

> I have tried to re-initiate replica serveral times in low-use time but this error occurs again.
>remove the whole replica's PGDATA/* and do a pg_basebackup again. But before that, make sure wal_keep_segments in big
enoughon the 
 
>master and,

I renamed whole cluster before pg_basebackup

>just as much important, do a vacuumdb -a (takes much space during the process) and use archiving!

I run vacuumdb --full --all before pg_basebackup

> If named replication slot is used commands like
> vacuumdb --all --full
> will cause main server crash due to disk space limit. pg_wal directory will occupy free disk space. After that main
serverstops.
 
>>if you have disk constraints you will run into trouble sooner or later anyway. Make sure, you have enough disk space.
There'sno 
 
>>way around that anyway.

This space is sufficient for base backup and replication.

>> I tried using wal_keep_segments =180
>> Will setting wal_keep_segments to higher value allw replication start after pg_basebackup ?
>it depends. If you start the replica immediately and don't wait for hours or days, you should be good to go. But that
dependson 
 
>different factors, for example, how >many WAL files are written during the pg_basebackup and pg_ctl start of the
replica.If more 
 
>than 180 WALs have gone by on the master because it is really busy, >then you're probably lost again. Point being,
you'llhave to 
 
>launch the replica before WALs are expired!
>Again: Make sure you have enough disk space, use archiving and use a replication slot.

I tried with wal_keep_segments=360 but problem persisists.
Server generates lot of less than 300 wal files.

Shell script starts server after pg_basebackup completes automatically:

PGHOST=example.com
PGPASSWORD=mypass
PGUSER=replikaator
export PGHOST  PGPASSWORD PGUSER
/etc/init.d/postgresql stop
mv /var/lib/postgresql/12/main /var/lib/postgresql/12/mainennebaasbakuppi
pg_basebackup --verbose --progress --write-recovery-conf -D /var/lib/postgresql/12/main
chmod --recursive --verbose 0700 /var/lib/postgresql/12/main
chown -Rv postgres:postgres /var/lib/postgresql/12/main
/etc/init.d/postgresql start

How to create replication server ?

Andrus.

Re: How to start slave after pg_basebackup. Why min_wal_size andwal_keep_segments are duplicated

От

Paul Förster

Дата:

01 июня 2020 г., 12:06:37

Hi Andrus,

> On 01. Jun, 2020, at 10:17, Andrus <kobruleht2@hot.ee> wrote:
> Shell script starts server after pg_basebackup completes automatically:
>
> PGHOST=example.com
> PGPASSWORD=mypass
> PGUSER=replikaator
> export PGHOST  PGPASSWORD PGUSER
> /etc/init.d/postgresql stop
> mv /var/lib/postgresql/12/main /var/lib/postgresql/12/mainennebaasbakuppi
> pg_basebackup --verbose --progress --write-recovery-conf -D /var/lib/postgresql/12/main
> chmod --recursive --verbose 0700 /var/lib/postgresql/12/main
> chown -Rv postgres:postgres /var/lib/postgresql/12/main
> /etc/init.d/postgresql start
>
> How to create replication server ?

I always do it this way and it work for me:

$ pg_basebackup -h ${PGHOST} -p ${PGPORT} -U replicator -W -R -D ${PGDATA} -P -v -Fp -Xs

After that, I edit ${PGDATA}/postgresql.conf and (w/ PostgreSQL 11 and older ${PGDATA}/recovery.conf) to make it do
whatI want and then I just launch it: 

$ pg_ctl start

From that moment onward, it replicates and applies to the replica. Checks in pg_stat_replication on the master and
pg_stat_wal_receiveron the replica confirm that. They also show the WAL switches. 

To provoke a WAL switch I always do:

postgres=# checkpoint; select pg_switch_wal();
CHECKPOINT
 pg_switch_wal
---------------
 C/50000128
(1 row)

I just don't understand what you're trying to achieve here. My guess is, you want to stop and backup the old database
cluster,then create a new one in its old directory, right? In this case, you probably need to change your script to
somethinglike this: 

PGHOST=remote.example.com
PGPASSWORD=mypass
PGUSER=replikaator
PGDATA=/var/lib/postgresql/12/main
export PGHOST PGPASSWORD PGUSER PGDATA

/etc/init.d/postgresql stop
mv ${PGDATA} /var/lib/postgresql/12/mainennebaasbakuppi
pg_basebackup -h ${PGHOST} -p ${PGPORT} -U ${PGUSER} -W -R -D ${PGDATA} -P -v -Fp -Xs
/etc/init.d/postgresql start

Note that my invocation of pg_basebackup asks for the replicator password. This is intended. You'd probably want to
changethat. 

Also, no need to play around with ownership and permissions. Do it as "postgres", not as "root".

Cheers,
Paul

Re: How to start slave after pg_basebackup. Why min_wal_size andwal_keep_segments are duplicated

От

Magnus Hagander

Дата:

01 июня 2020 г., 12:13:58

On Mon, Jun 1, 2020 at 10:17 AM Andrus <kobruleht2@hot.ee> wrote:

Hi!

> I have tried to re-initiate replica serveral times in low-use time but this error occurs again.
>remove the whole replica's PGDATA/* and do a pg_basebackup again. But before that, make sure wal_keep_segments in big enough on the
>master and,

I renamed whole cluster before pg_basebackup

>just as much important, do a vacuumdb -a (takes much space during the process) and use archiving!

I run vacuumdb --full --all before pg_basebackup

> If named replication slot is used commands like
> vacuumdb --all --full
> will cause main server crash due to disk space limit. pg_wal directory will occupy free disk space. After that main server stops.
>>if you have disk constraints you will run into trouble sooner or later anyway. Make sure, you have enough disk space. There's no
>>way around that anyway.

This space is sufficient for base backup and replication.

>> I tried using wal_keep_segments =180
>> Will setting wal_keep_segments to higher value allw replication start after pg_basebackup ?
>it depends. If you start the replica immediately and don't wait for hours or days, you should be good to go. But that depends on
>different factors, for example, how >many WAL files are written during the pg_basebackup and pg_ctl start of the replica. If more
>than 180 WALs have gone by on the master because it is really busy, >then you're probably lost again. Point being, you'll have to
>launch the replica before WALs are expired!
>Again: Make sure you have enough disk space, use archiving and use a replication slot.

I tried with wal_keep_segments=360 but problem persisists.
Server generates lot of less than 300 wal files.

Have you verified that wal_keep_segments actually end up at 360, by connecting to the database and issuing SHOW wal_keep_segments? I've seen far too many examples of people who accidentally had a second line that overrode the one they thought they changed, and thus still ran with a lower number.

Shell script starts server after pg_basebackup completes automatically:

PGHOST=example.com
PGPASSWORD=mypass
PGUSER=replikaator
export PGHOST PGPASSWORD PGUSER
/etc/init.d/postgresql stop
mv /var/lib/postgresql/12/main /var/lib/postgresql/12/mainennebaasbakuppi
pg_basebackup --verbose --progress --write-recovery-conf -D /var/lib/postgresql/12/main
chmod --recursive --verbose 0700 /var/lib/postgresql/12/main
chown -Rv postgres:postgres /var/lib/postgresql/12/main
/etc/init.d/postgresql start

Do you get any useful output from the -v part of pg_basebackup? It should for example tell you the exact start and stop point in the wal during the basebackup, that can be correlated to the msising file.

Normally the window between end of pg_basebackup and start of the actual service is not big enough to cause a problem (since v12 will do a streaming receive of the logs *during* the backup -- it could be a big problem before that was possible, or if one forgot to enable it before it was the default), and it certainly sounds weird that it should be in your case, unless the chmod and chown commands take a *long* time. But if it is, there is nothing preventing you from creating a slot just during setup and then get rid of it. That is:

1. create slot

2. pg_basebackup with slot

3. start replication with slot

4. restart replication without slot once it's caught up

5. drop slot

However, if you want reliable replication, you really should have a slot. Or at least, you should have either a slot *or* log archiving that's read-accessible from the replica.

Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/

Re: How to start slave after pg_basebackup. Why min_wal_size and wal_keep_segments are duplicated

От

"Andrus"

Дата:

01 июня 2020 г., 13:26:12

Hi!

> How to create replication server ?
>I always do it this way and it work for me:
>$ pg_basebackup -h ${PGHOST} -p ${PGPORT} -U replicator -W -R -D ${PGDATA} -P -v -Fp -Xs
>After that, I edit ${PGDATA}/postgresql.conf and (w/ PostgreSQL 11 and older ${PGDATA}/recovery.conf) to make it do
whatI want and
 
>then I just launch it:
>$ pg_ctl start

My script does the same thing as your comands.

>From that moment onward, it replicates and applies to the replica. Checks in pg_stat_replication on the master and
>pg_stat_wal_receiver on the replica >confirm that. They also show the WAL switches.
>To provoke a WAL switch I always do:
>postgres=# checkpoint; select pg_switch_wal();
>CHECKPOINT
> pg_switch_wal
>I just don't understand what you're trying to achieve here.

I want to create replication server.

>My guess is, you want to stop and backup the old database cluster,

Old cluster is empty, from initdb. Backup is not needed

>then create a new one in its old directory, right?

pg_basebackup creates new main directory.

>In this case, you probably need to change your script to something like this:
>PGHOST=remote.example.com
>PGPASSWORD=mypass
>PGUSER=replikaator
>PGDATA=/var/lib/postgresql/12/main
>export PGHOST PGPASSWORD PGUSER PGDATA
>/etc/init.d/postgresql stop
>mv ${PGDATA} /var/lib/postgresql/12/mainennebaasbakuppi
>pg_basebackup -h ${PGHOST} -p ${PGPORT} -U ${PGUSER} -W -R -D ${PGDATA} -P -v -Fp -Xs
>/etc/init.d/postgresql start

pg_basebackup uses environment varuables if not specified in command line. So my script does the same thing.

>Note that my invocation of pg_basebackup asks for the replicator password. This is intended. You'd probably want to
changethat.
 
>Also, no need to play around with ownership and permissions. Do it as "postgres", not as "root".

I tried

sudo --user=postgres pg_basebackup ....

but got error

could not change directory to "/root": Permission denied

Andrus.

Re: How to start slave after pg_basebackup. Why min_wal_size andwal_keep_segments are duplicated

От

Paul Förster

Дата:

01 июня 2020 г., 13:44:07

Hi Andrus,

> On 01. Jun, 2020, at 12:26, Andrus <kobruleht2@hot.ee> wrote:
> My script does the same thing as your comands.

no, it does not. And as long as you refuse to understand that, I can't help you.

> I want to create replication server.

no, you want to create a streaming replica.

> Old cluster is empty, from initdb. Backup is not needed

then why the mv stuff?

> pg_basebackup uses environment varuables if not specified in command line. So my script does the same thing.

have you looked at the other options?

> I tried
> sudo --user=postgres pg_basebackup ....
> but got error
> could not change directory to "/root": Permission denied

try
sudo su - postgres pg_basebackup ....

Note the "-"! It's essential to get the environment of postgres, which is what you want.

Cheers,
Paul

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: How to start slave after pg_basebackup. Why min_wal_size and wal_keep_segments are duplicated