[streaming replication] 9.1.3 streaming replication bug ?

From	乔志强
Subject	[streaming replication] 9.1.3 streaming replication bug ?
Date	9 April 2012, 07:33:43
Msg-id	E81554BCB8813E49A8916AACC0503A850B4A913E@lc-shmail3.SHANGHAI.LEADCORETECH.COM
In reply to	Re: 9.1.3 Standby catchup mode (Adrian Klaver <adrian.klaver@gmail.com>)
Responses	Re: [streaming replication] 9.1.3 streaming replicationbug ? Re: [streaming replication] 9.1.3 streaming replication bug ? Re: [streaming replication] 9.1.3 streaming replication bug ?
List	pgsql-general

I am running postgresql-9.1.3-1-windows-x64.exe on Windows 2008 R2 x64.

The setup is one master and one standby. The standby is a synchronous standby using streaming replication
(synchronous_standby_names = '*', archive_mode = off). The master outputs:
       standby "walreceiver" is now the synchronous standby with priority 1
and the standby outputs:
       LOG:  streaming replication successfully connected to primary

I then run a test program that writes and commits large blobs (random sizes from 10 to 1000 MB) to the master
server, using 40 threads (40 sessions) in a loop.

The master and the standby run on the same machine; the client runs on another machine over a 100 Mbps network.


After a few minutes, however, the master outputs:
       requested WAL segment XXX has already been removed
and the standby outputs:
       FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment XXX has already been removed


Question:
Why does the master delete a WAL segment before sending it to the standby in synchronous mode? Is this a streaming
replication bug?


I see that if no standby is connected to the master while synchronous_standby_names = '*',
every commit waits until a standby connects to the master. That is good.

Should I use a bigger wal_keep_segments? I think the master should keep every WAL segment that has not yet been sent
to an online standby (sync or async).

wal_keep_segments should only be needed for an offline standby.

When synchronous_standby_names is set for a sync standby and no standby is online, every commit waits until a
standby connects to the master.

So wal_keep_segments is really only needed for an offline async standby.
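
As a rough illustration of how quickly this workload can outrun any fixed wal_keep_segments value, here is a
back-of-the-envelope sketch in Python. It assumes the default 16 MB WAL segment size and that each blob generates at
least its own size in WAL (an approximation, not a measurement). The function names and the value 32 are
hypothetical and only serve the example.

```python
import math

# Assumption: the default 16 MB WAL segment size (not changed at build time).
WAL_SEGMENT_BYTES = 16 * 1024 * 1024
MB = 1024 * 1024


def segments_for(wal_bytes: int) -> int:
    """Number of WAL segments needed to hold wal_bytes of WAL."""
    return math.ceil(wal_bytes / WAL_SEGMENT_BYTES)


def retained_bytes(wal_keep_segments: int) -> int:
    """Minimum WAL volume that wal_keep_segments asks the master to keep."""
    return wal_keep_segments * WAL_SEGMENT_BYTES


if __name__ == "__main__":
    # Assumption: each blob produces at least its own size in WAL.
    print(segments_for(10 * MB))         # smallest blob in the test
    print(segments_for(1000 * MB))       # largest blob in the test
    print(segments_for(40 * 1000 * MB))  # 40 sessions, each writing the largest blob
    print(retained_bytes(32) // MB)      # hypothetical wal_keep_segments = 32, in MB
```

Under these assumptions a single 1000 MB blob already spans more segments than a modest wal_keep_segments retains,
so a standby that falls behind by one large commit can miss segments.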



////////////////////////////////////////

master server output:
LOG:  database system was interrupted; last known up at 2012-03-30 15:37:03 HKT
LOG:  database system was not properly shut down; automatic recovery in progress

LOG:  redo starts at 0/136077B0
LOG:  record with zero length at 0/17DF1E10
LOG:  redo done at 0/17DF1D98
LOG:  last completed transaction was at log time 2012-03-30 15:37:03.148+08
FATAL:  the database system is starting up
LOG:  database system is ready to accept connections
LOG:  autovacuum launcher started
   ///////////////////// the standby is a synchronous standby
     LOG:  standby "walreceiver" is now the synchronous standby with priority 1
   /////////////////////
LOG:  checkpoints are occurring too frequently (16 seconds apart)
HINT:  Consider increasing the configuration parameter "checkpoint_segments".
LOG:  checkpoints are occurring too frequently (23 seconds apart)
HINT:  Consider increasing the configuration parameter "checkpoint_segments".
LOG:  checkpoints are occurring too frequently (24 seconds apart)
HINT:  Consider increasing the configuration parameter "checkpoint_segments".
LOG:  checkpoints are occurring too frequently (20 seconds apart)
HINT:  Consider increasing the configuration parameter "checkpoint_segments".
LOG:  checkpoints are occurring too frequently (22 seconds apart)
HINT:  Consider increasing the configuration parameter "checkpoint_segments".
FATAL:  requested WAL segment 000000010000000000000032 has already been removed
FATAL:  requested WAL segment 000000010000000000000032 has already been removed
FATAL:  requested WAL segment 000000010000000000000032 has already been removed
LOG:  checkpoints are occurring too frequently (8 seconds apart)
HINT:  Consider increasing the configuration parameter "checkpoint_segments".
FATAL:  requested WAL segment 000000010000000000000032 has already been removed 



////////////////////////
standby server output:
LOG:  database system was interrupted while in recovery at log time 2012-03-30 14:44:31 HKT
HINT:  If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.
LOG:  entering standby mode
LOG:  redo starts at 0/16E4760
LOG:  consistent recovery state reached at 0/12D984D8
LOG:  database system is ready to accept read only connections
LOG:  record with zero length at 0/17DF1E68
LOG:  invalid magic number 0000 in log file 0, segment 50, offset 6946816
LOG:  streaming replication successfully connected to primary
FATAL:  could not receive data from WAL stream: FATAL:  requested WAL segment 000000010000000000000032 has already been removed
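
For reading the segment name in these messages, a small Python sketch follows. It assumes the 9.1 file naming
scheme as I understand it: 24 hexadecimal digits, made of 8 for the timeline, 8 for the log file and 8 for the
segment. The class and function names are hypothetical.

```python
import string
from typing import NamedTuple


class WalSegment(NamedTuple):
    timeline: int
    log: int
    seg: int


def parse_wal_segment_name(name: str) -> WalSegment:
    """Split a 24-hex-digit WAL file name into timeline, log file and segment."""
    if len(name) != 24 or any(c not in string.hexdigits for c in name):
        raise ValueError(f"not a WAL segment file name: {name!r}")
    # Assumed layout: three consecutive 8-digit hexadecimal fields.
    return WalSegment(
        timeline=int(name[0:8], 16),
        log=int(name[8:16], 16),
        seg=int(name[16:24], 16),
    )


if __name__ == "__main__":
    print(parse_wal_segment_name("000000010000000000000032"))
```

With this reading, the removed file 000000010000000000000032 is timeline 1, log file 0, segment 50 (hex 32), which
is the same "log file 0, segment 50" that the standby reports just before it connects to the primary.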
