restartpoints stop generating on streaming-replication slave

From: Mathieu Fenniak
Subject: restartpoints stop generating on streaming-replication slave
Date: 22 August 2012, 00:22:21
Msg-id: CAHoiPjxe6DNO9mr6TKb4p-jcRibjzRnPOXoj1M4a-2bSn266PQ@mail.gmail.com
List: pgsql-hackers


Hi all,

I've been investigating an issue with our PostgreSQL 9.1.1 (Linux x86-64,
CentOS 5.8) database where restartpoints suddenly stop being generated on the
streaming-replication slave after working correctly for a week or two. The
symptom of the problem is that the pg_xlog directory on the slave doesn't get
cleaned up, and the log_checkpoints output (e.g. "restartpoint starting: time")
stops appearing.

I was able to extract a core dump of the bgwriter process while it was in
BgWriterNap. I inspected ckpt_start_time and last_checkpoint_time;
ckpt_start_time was 1345578533 (... 19:48:53 GMT) and last_checkpoint_time was
1345578248 (... 19:44:08 GMT). Based upon these values, I concluded that it's
performing checkpoints but missing the "if (ckpt_performed)" condition (i.e.
CreateRestartPoint returns false); it's then setting last_checkpoint_time to
now - 5 minutes + 15 seconds.

There seem to be two causes of a false return value in CreateRestartPoint; the
first is if !RecoveryInProgress(), and the second is if "the last checkpoint
record we've replayed is already our last restartpoint". The first condition
doesn't seem likely; does anyone know how we might be hitting the second
condition? We have continuous traffic on the master server in the range of
1000 txn/s, and the slave seems to be completely up-to-date, so I don't
understand how we could be hitting this condition.

Mathieu
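
The timestamp reasoning in the message can be checked with a little arithmetic. The sketch below is only an illustration, not PostgreSQL code: it assumes checkpoint_timeout is at its default of 5 minutes (the message's "5 minutes" suggests this but does not state the setting) and that "now" in the failure branch equals ckpt_start_time. The constant and function names are hypothetical.

```python
# Values read from the bgwriter core dump, as quoted in the message.
ckpt_start_time = 1345578533
last_checkpoint_time = 1345578248

# Assumptions (not stated as settings in the message): checkpoint_timeout
# is 5 minutes, and the retry offset is the 15 seconds the message mentions.
CHECKPOINT_TIMEOUT_S = 300
RETRY_DELAY_S = 15


def backdated_checkpoint_time(now: int) -> int:
    """Hypothetical model of the failure branch: backdate the last
    checkpoint time so the next attempt is due RETRY_DELAY_S from now."""
    return now - CHECKPOINT_TIMEOUT_S + RETRY_DELAY_S


# Gap between the two observed timestamps, in seconds.
gap = ckpt_start_time - last_checkpoint_time

# True if the observed value is exactly what the failure branch would store.
matches_failure_branch = (
    backdated_checkpoint_time(ckpt_start_time) == last_checkpoint_time
)

print(gap, matches_failure_branch)  # prints: 285 True
```

The 285-second gap is exactly 5 minutes minus 15 seconds, which is consistent with the conclusion that the restartpoint attempt returned false and a retry was scheduled, rather than a restartpoint having completed.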

