Re: streaming replication timeout error

From: Adrian Klaver
Subject: Re: streaming replication timeout error
Msg-id 52560DD2.2010605@gmail.com
In reply to: Re: streaming replication timeout error  (高健 <luckyjackgao@gmail.com>)
List: pgsql-general
On 10/09/2013 05:51 PM, 高健 wrote:
> Hello:
>
> Thanks for replying.
>
> The recovery.conf file on standby(DB2) is like that:
>
> standby_mode             = 'on'
> primary_conninfo         = 'host=DB1 port=5432 application_name=testpg
> user=postgres connect_timeout=10 keepalives_idle=5 keepalives_interval=1'
> recovery_target_timeline = 'latest'
> restore_command          = 'scp -o "ConnectTimeout 5" -i
> /opt/PostgresPlus/9.2AS/.ssh/id_edb
> DB1:/opt/PostgresPlus/9.2AS/data/arch/%f %p'
>
>
> I  am not familiar with the scp command,  I think that here scp is used
> to copy archive wal log files from primary  to standby...
>
> Maybe the ConnectionTimeout is too small, And sometimes when network is
> not very well,
> the restore_command will fail and return FATAL error?
>
> In fact I am a little confused about restore_command, we are using
> streaming replication, but why restore_command is still needed to copy
> archive wal log, isn't it  the old warm standby (file shipping)?

Best explanation is in the docs:

http://www.postgresql.org/docs/9.3/static/warm-standby.html
"
At startup, the standby begins by restoring all WAL available in the
archive location, calling restore_command. Once it reaches the end of
WAL available there and restore_command fails, it tries to restore any
WAL available in the pg_xlog directory. If that fails, and streaming
replication has been configured, the standby tries to connect to the
primary server and start streaming WAL from the last valid record found
in archive or pg_xlog. If that fails or streaming replication is not
configured, or if the connection is later disconnected, the standby goes
back to step 1 and tries to restore the file from the archive again.
This loop of retries from the archive, pg_xlog, and via streaming
replication goes on until the server is stopped or failover is triggered
by a trigger file.
"

Basically by having a restore_command and primary_conninfo you are
telling the standby to do both, following the sequence described above.
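
The sequence quoted above can be sketched as a small state machine. This is only a simplified model of the documented behaviour, not PostgreSQL's actual code; the names WAL_SOURCES, next_wal_source and trace are made up for illustration.

```python
# Simplified model of the retry loop described in the docs quote.
# All names here are hypothetical; this is not PostgreSQL source code.
WAL_SOURCES = ("archive", "pg_xlog", "streaming")


def next_wal_source(current: str, succeeded: bool) -> str:
    """Return the WAL source the standby tries next.

    While a source keeps yielding WAL the standby stays on it. When it
    fails, the standby moves to the next source, and a streaming failure
    or disconnect wraps around to the archive (step 1).
    """
    if succeeded:
        return current
    idx = WAL_SOURCES.index(current)
    return WAL_SOURCES[(idx + 1) % len(WAL_SOURCES)]


def trace(outcomes):
    """Replay a list of success/failure outcomes, starting at the archive."""
    source = WAL_SOURCES[0]
    visited = []
    for ok in outcomes:
        visited.append(source)
        source = next_wal_source(source, ok)
    return visited
```

For example, trace([True, False, False, True, False, True]) visits archive, archive, pg_xlog, streaming, streaming, archive: restore_command eventually fails, pg_xlog has nothing more, streaming runs for a while, and a dropped connection sends the standby back to the archive.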

FYI, ConnectTimeout is an SSH option that scp passes through via -o.

man ssh_config will get you more information.
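
To show where that option sits in the restore_command, here is a small sketch that assembles the same scp command line with a configurable timeout. The helper name build_restore_command is hypothetical, and the 30 second value in the usage note is only an illustration, not a recommendation.

```python
import shlex


def build_restore_command(host, archive_dir, identity_file, connect_timeout=5):
    """Assemble an scp-based restore_command string (illustrative helper).

    ConnectTimeout is an ssh_config option given to scp with -o;
    %f and %p are the placeholders PostgreSQL substitutes at run time.
    """
    parts = [
        "scp",
        "-o", f"ConnectTimeout={connect_timeout}",
        "-i", identity_file,
        f"{host}:{archive_dir}/%f",
        "%p",
    ]
    # Quote each argument so that paths containing spaces survive the shell.
    return " ".join(shlex.quote(part) for part in parts)
```

Calling build_restore_command("DB1", "/opt/PostgresPlus/9.2AS/data/arch", "/opt/PostgresPlus/9.2AS/.ssh/id_edb", 30) gives a string you could paste into recovery.conf as the value of restore_command.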

It would seem that both your streaming and your archive restore use the
same network. Is that correct?

If so, you have a single point of failure: the network.


>
> Best Regards
> jian gao
>
>

--
Adrian Klaver
adrian.klaver@gmail.com

