Re: postgres wal sender replication timeout during pg_basebackup

From: Peter Brunnengräber
Subject: Re: postgres wal sender replication timeout during pg_basebackup
Msg-id: 109968546.289.1460409481048.JavaMail.pbrunnen@Station8.local
In response to: Re: postgres wal sender replication timeout during pg_basebackup  (Albe Laurenz <laurenz.albe@wien.gv.at>)
List: pgsql-admin

Hello Mr. Albe,

> What PostgreSQL version are you running?
  9.2

> The server error message means that the client did not send a status
> update within "wal_sender_timeout" milliseconds

  So if I understand this correctly, the wal sender must receive a message back from the receiver within this preset
time, or else it assumes that the transmission failed...

  9.2 doesn't seem to have the "wal_sender_timeout" parameter, and it appears that "replication_timeout" may be the
name of the parameter prior to v9.3, so this is what I am tweaking.  I originally had "replication_timeout = 5s", and I
verified that "wal_receiver_status_interval = 2s" per the documentation.

  You were correct that setting this value to 0 did allow the pg_basebackup to complete without an error.  I plan to
also try setting this value to 15s to see if the pg_basebackup completes in that time frame.
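
  For reference, the settings discussed above might look roughly like this in postgresql.conf. This is only a sketch
built from the values mentioned in this thread; which file each line belongs in is an assumption about this setup, and
it is not a recommendation.

```
# postgresql.conf on the master (PostgreSQL 9.2)
# "replication_timeout" appears to be the pre-9.3 name of "wal_sender_timeout".

#replication_timeout = 5s     # original value; pg_basebackup hit the timeout
replication_timeout = 0       # 0 disables the timeout; pg_basebackup completed
#replication_timeout = 15s    # value still to be tested

# postgresql.conf on the standby
wal_receiver_status_interval = 2s   # value mentioned above
```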


> Is there a firewall between client and server that could swallow such messages?
  None that I am aware of, but I will check with the Xen Hypervisor admin to make sure there isn't something set up here
which could also cause trouble down the road.


> Avoiding SSL will also greatly speed up pg_basebackup.
  Ok.  I will give this a try as well.



Thank you ever so much for your reply and solution; it was greatly appreciated!

With kind regards. -Peter



----- Original Message -----
From: "Albe Laurenz" <laurenz.albe@wien.gv.at>
To: "Peter Brunnengräber" <pbrunnen@bccglobal.com>, pgsql-admin@postgresql.org
Sent: Friday, April 8, 2016 5:03:29 AM
Subject: Re: [ADMIN] postgres wal sender replication timeout during pg_basebackup

Peter Brunnengräber wrote:
>    I brought the database files over from the current single production postgres server. By this I
> mean I shut down postgres and tar-ed up the data directory and copied it over to the cluster's Master
> node. I put the files in place, set the permissions, and was able to start up postgres on the Master
> via corosync just fine.
>
>    In preparing the slave, I used the pg_basebackup tool to bring the database over from the Master
> and this is where I keep having issues. As it is transferring, at about 57% I see the error:
>
> >  $ pg_basebackup -h db-master -U u_repl -D /db/data/postgresql/9.2/main/ -X stream -P
> >  pg_basebackup: could not receive data from WAL stream: SSL connection has been closed unexpectedly
> >  176472/176472 kB (100%), 1/1 tablespace
> >  pg_basebackup: child process exited with error 1
>
> And on the server, I see:
>
> >  2016-04-06 21:05:31 UTC LOG:  terminating walsender process due to replication timeout
>
>   But the transfer doesn't stop and keeps going to completion.
>
>   I found this [http://dba.stackexchange.com/questions/59916/streaming-replication-log-is-puzzling-me]
> question on stackexchange about setting "ssl_renegotiation_limit" to 0, but this didn't make much
> difference.
>
>   Anyone have any ideas? I didn't find any reference to this problem in the mailing list archives. I
> am completely baffled as to why this would error, but keep on going. Maybe this isn't a problem at
> all?  It is the same procedure I used in the lab setup... the only difference is that the production
> database is much bigger in size.

ssl_renegotiation_limit would also have been my first guess.
What PostgreSQL version are you running?

The server error message means that the client did not send a status update
within "wal_sender_timeout" milliseconds, see
http://www.postgresql.org/docs/current/static/runtime-config-replication.html#GUC-WAL-SENDER-TIMEOUT

Does pg_basebackup succeed if you set "wal_sender_timeout" to zero?

Is there a firewall between client and server that could swallow such messages?

Could you try without SSL (e.g. set the environment variable PGSSLMODE to "disable")
and see if that makes the problem go away?
Avoiding SSL will also greatly speed up pg_basebackup.
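
As a rough sketch of the suggestion above, the variable can be set for a single pg_basebackup run. The host, user and
target directory are copied from the original report; the wrapper function name is made up for illustration. This
assumes the server's pg_hba.conf accepts non-SSL replication connections from the client.

```shell
# Hypothetical helper: run pg_basebackup with SSL disabled for this one call.
# PGSSLMODE is the libpq environment variable mentioned above.
basebackup_nossl() {
    PGSSLMODE=disable pg_basebackup -h db-master -U u_repl \
        -D /db/data/postgresql/9.2/main/ -X stream -P
}
```

Calling basebackup_nossl on the standby would then run the same backup as before, only without SSL.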

Yours,
Laurenz Albe
