Discussion: Segmentation fault occurs when the standby becomes primary, in SR
Hi,

When I created the trigger file to activate the standby server,
I got the segmentation fault:

    sby [11342]: LOG: trigger file found: ../trigger
    sby [11343]: FATAL: terminating walreceiver process due to administrator command
    sby [11342]: LOG: redo done at 0/10000E0
    sby [11342]: LOG: last completed transaction was at log time 2000-01-01 09:21:04.685861+09
    sby [11341]: LOG: startup process (PID 11342) was terminated by signal 11: Segmentation fault
    sby [11341]: LOG: terminating any other active server processes

This happens in the following scenario:

0. The trigger file is found.

1. The variable StandbyMode is reset to FALSE before re-fetching
   the last applied record.

2. An attempt is made to read that record from the archive.

3. RestoreArchivedFile() falls through the following condition
   expression because StandbyMode is off.

       if (StandbyMode && recoveryRestoreCommand == NULL)
           goto not_available;

4. RestoreArchivedFile() wrongly constructs the command to be
   executed even though restore_command has not been supplied
   (this is possible in standby mode).
   ---> Segmentation fault!

The attached patch would fix the bug.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
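The scenario above can be modelled in a few lines of C. This is only a sketch: restore_outcome(), RestoreOutcome and check_examples() are hypothetical names invented for illustration, not code from xlog.c, and the "fixed" guard (testing only for a missing restore_command) is an assumption about one plausible shape of a fix, not the contents of the attached patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of what happens after the guard; not PostgreSQL code. */
typedef enum
{
    RESTORE_NOT_AVAILABLE,  /* guard fired: "goto not_available" */
    RESTORE_RUN_COMMAND,    /* a usable restore_command would be executed */
    RESTORE_NULL_DEREF      /* command would be built from a NULL string */
} RestoreOutcome;

/*
 * standby_mode and restore_command stand in for StandbyMode and
 * recoveryRestoreCommand. fixed_guard selects the assumed fix.
 */
static RestoreOutcome
restore_outcome(bool standby_mode, const char *restore_command, bool fixed_guard)
{
    bool skip;

    if (fixed_guard)
        skip = (restore_command == NULL);                  /* assumed fix */
    else
        skip = (standby_mode && restore_command == NULL);  /* guard as reported */

    if (skip)
        return RESTORE_NOT_AVAILABLE;
    if (restore_command == NULL)
        return RESTORE_NULL_DEREF;  /* the reported segmentation fault */
    return RESTORE_RUN_COMMAND;
}

/* Walks through the reported scenario step by step. */
static void
check_examples(void)
{
    /* Still in standby mode, no restore_command: the guard protects us. */
    assert(restore_outcome(true, NULL, false) == RESTORE_NOT_AVAILABLE);
    /* After the trigger file, StandbyMode is off: the guard is bypassed. */
    assert(restore_outcome(false, NULL, false) == RESTORE_NULL_DEREF);
    /* With the assumed fix, a missing restore_command is always skipped. */
    assert(restore_outcome(false, NULL, true) == RESTORE_NOT_AVAILABLE);
}
```

Calling check_examples() from a main() exercises the three cases. The point of the sketch is that the guard quoted in step 3 depends on StandbyMode, so resetting that flag in step 1 removes the only protection against a missing restore_command.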
Fujii Masao wrote:
> When I created the trigger file to activate the standby server,
> I got the segmentation fault:
>
> ...
> The attached patch would fix the bug.

Thanks, committed. (I kept the old comment, though, I liked it better)

Now, whether we should even allow setting up a standby without
restore_command is another question. It's *possible*, but you need to
enable archiving in the master anyway to take an on-line backup, and you
need the archive to catch up if the standby ever falls behind too much.

Then again, if the database is small, maybe you don't mind taking a new
base backup if the standby falls behind. And you *can* take a base
backup with a dummy archive_command (ie. archive_command='/bin/true'),
if you trust that the WAL files stay in pg_xlog long enough for standby
to stream them from there.

Perhaps we should require a restore_command. If you know what you're
doing, you can always use '/bin/false' as restore_command to hack around it.

-- 
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
On Thu, Jan 28, 2010 at 2:23 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> Perhaps we should require a restore_command. If you know what you're
> doing, you can always use '/bin/false' as restore_command to hack around it.

That seems kind of needlessly hacky (and it won't work on Windows).
Seems like it doesn't cost anything to let it be omitted altogether.

...Robert
On Fri, Jan 29, 2010 at 4:23 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> Thanks, committed. (I kept the old comment, though, I liked it better)

Thanks!

> Then again, if the database is small, maybe you don't mind taking a new
> base backup if the standby falls behind. And you *can* take a base
> backup with a dummy archive_command (ie. archive_command='/bin/true'),
> if you trust that the WAL files stay in pg_xlog long enough for standby
> to stream them from there.

Yeah, this is one of the cases where restore_command is not required for SR.

> Perhaps we should require a restore_command. If you know what you're
> doing, you can always use '/bin/false' as restore_command to hack around it.

One of the main aims of SR is to be easy to set up. So I don't want to
impose such a hacky setting of restore_command on users.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center