PG 9.3.12: Replication appears to have worked, but getting error messages in logs

From	David Caldwell
Subject	PG 9.3.12: Replication appears to have worked, but getting error messages in logs
Date	April 4, 2016, 18:31:54
Msg-id	4db92cca-fd08-b015-eed8-436b1a1421cf@porkrind.org
Responses	Re: PG 9.3.12: Replication appears to have worked, but getting error messages in logs
List	pgsql-general

Hello,

We're using streaming replication. Our technique for spinning up a db
slave is this:

rsync from master (gross copy)
pg_start_backup() on server
rsync from master (correct copy)
pg_stop_backup()
drop recovery.conf into the slave's data directory
enable hot_standby in the slave's conf
start slave

After starting the slave, I'm getting this error every 5 seconds in the log:

ERROR:  requested WAL segment 0000000100000E2200000005 has already been removed

But I can connect to the DB and make queries and new records are
definitely streaming in.

I thought I just didn't have enough WAL segments, so I bumped up the
number on the master and restarted the process. It just finished the
second time and the exact same error message is in the logs again (same
WAL segment number).


When I ran pg_start_backup() and pg_stop_backup(), the output was:

 pg_start_backup
-----------------
 E27/3100A200

 pg_stop_backup
----------------
 E28/7D357950


The backup_label file looked like this:

START WAL LOCATION: E27/3100A200 (file 0000000100000E2700000031)
CHECKPOINT LOCATION: E27/31C9C740
BACKUP METHOD: pg_start_backup
BACKUP FROM: master
START TIME: 2016-04-02 12:34:25 PDT
LABEL: clone
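
The file name on the START WAL LOCATION line follows from the location itself. A minimal sketch of that mapping, assuming the default 16 MB segment size and a file name made of timeline, log id and segment number as 8 hex digits each; the function name `lsn_to_segment_name` is made up for illustration:

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumption: default 16 MB WAL segments


def lsn_to_segment_name(lsn: str, timeline: int = 1) -> str:
    """Map a WAL location such as 'E27/3100A200' to its segment file name."""
    log_id_hex, offset_hex = lsn.split("/")
    log_id = int(log_id_hex, 16)           # high part of the location (E27)
    offset = int(offset_hex, 16)           # byte offset within that log id
    segment = offset // WAL_SEGMENT_SIZE   # which 16 MB segment holds the offset
    return "%08X%08X%08X" % (timeline, log_id, segment)


# Start location reported by pg_start_backup(); matches the backup_label file name.
print(lsn_to_segment_name("E27/3100A200"))  # 0000000100000E2700000031
```

Under the same assumptions, the pg_stop_backup() location E28/7D357950 maps to 0000000100000E280000007D.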


During the rsync, it copied the segments from
  pg_xlog/0000000100000E25000000F1
through
  pg_xlog/0000000100000E2800000071


So I'm confused: why is the E22 WAL segment being requested? It seems to
predate the backup by a lot.
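
To put a number on "a lot", here is a rough sketch that turns the segment file names into a linear sequence number and compares them. It assumes 16 MB segments and 256 segments per log id (the numbering used from 9.3 on); the helper name `segment_number` is made up for illustration:

```python
from typing import Tuple

SEGMENTS_PER_LOG = 0x100  # assumption: 256 segments of 16 MB per log id (9.3+)
SEGMENT_SIZE_MB = 16      # assumption: default segment size


def segment_number(name: str) -> Tuple[int, int]:
    """Split a 24-hex-digit WAL file name into (timeline, linear segment number)."""
    if len(name) != 24:
        raise ValueError("expected a 24-character WAL segment name: %r" % name)
    timeline = int(name[0:8], 16)
    log_id = int(name[8:16], 16)
    seg = int(name[16:24], 16)
    return timeline, log_id * SEGMENTS_PER_LOG + seg


requested = "0000000100000E2200000005"      # from the error message
oldest_copied = "0000000100000E25000000F1"  # oldest segment the rsync copied
backup_start = "0000000100000E2700000031"   # from backup_label

# All three names are on timeline 1, so the linear numbers are comparable.
gap_to_backup = segment_number(backup_start)[1] - segment_number(requested)[1]
gap_to_copied = segment_number(oldest_copied)[1] - segment_number(requested)[1]

print(gap_to_backup, gap_to_backup * SEGMENT_SIZE_MB / 1024.0)  # 1324 segments, about 20.7 GiB
print(gap_to_copied)  # 1004 segments older than anything in the copied pg_xlog
```

So, under those assumptions, the requested segment is 1324 segments (roughly 20.7 GiB of WAL) before the backup start, and 1004 segments older than the oldest file that was copied.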

Does the slave really contain all the data? If not, how can I tell what
is missing (and why is it accepting streaming data if it's missing
something)?


One more piece of the puzzle that may or may not be relevant:

The current master used to be a streaming replication slave. The original
master had a disk failure and so we switched one of the backup slaves
into a master. We've replaced the disk on the original server and we're
now trying to make it a streaming replication slave. This is the part
that's failing. If I do rough estimates of how fast the Exx number is
incrementing and compute backwards, E22 seems like about the time of the
original disk failure, give or take.


Thanks,
  David
