Big UPDATE breaking replication

Поиск
Список
Период
Сортировка
От Kouber Saparev
Тема Big UPDATE breaking replication
Дата
Msg-id 51ADD54F.3030702@saparev.com
обсуждение исходный текст
Ответы Re: Big UPDATE breaking replication  (Albe Laurenz <laurenz.albe@wien.gv.at>)
Re: Big UPDATE breaking replication  (Steve Crawford <scrawford@pinpointresearch.com>)
Список pgsql-admin
Hello,

We are using the 9.1 built-in streaming replication.

Recently our slave nodes fell behind because of an UPDATE statement. It
took about 3 minutes to execute, but it affected half a million records,
hence the replication broke with the "requested WAL segment ... has
already been removed" series of error messages.

The WAL settings we have are:

max_wal_senders = 6
wal_keep_segments = 60
max_standby_archive_delay = 300s


I guess increasing the wal_keep_segments value would prevent it from
happening in the future, but increase it with how much? What value would
be high enough?

Also we noticed some strange error message appearing shortly before and
after this same statement: "LOG:  out of file descriptors: Too many open
files; release and retry".

Could it be related somehow and what does it mean exactly?

Here's an excerpt from the master DB log:


May 30 12:23:09 DB1 postgres[28201]: [13-1] user=www,db=xxx LOG:  out of
file descriptors: Too many open files; release and retry
May 30 12:23:09 DB1 postgres[28201]: [13-2] user=www,db=xxx CONTEXT:
writing block 0 of relation base/2819385/2820788
May 30 12:23:09 DB1 postgres[28201]: [13-3] user=www,db=xxx STATEMENT:
UPDATE
May 30 12:23:09 DB1 postgres[28201]: [13-4] ^I              message
May 30 12:23:09 DB1 postgres[28201]: [13-5] ^I            SET
May 30 12:23:09 DB1 postgres[28201]: [13-6] ^I
sender_has_deleted=TRUE,
May 30 12:23:09 DB1 postgres[28201]: [13-7] ^I
receiver_has_deleted=TRUE
May 30 12:23:09 DB1 postgres[28201]: [13-8] ^I            WHERE
from_profile_sid=870

...

May 30 12:39:47 DB1 postgres[9053]: [2-1] user=postgres,db=[unknown]
FATAL:  requested WAL segment 00000001000002DE000000BD has already been
removed


Regards,
--
Kouber Saparev


В списке pgsql-admin по дате отправления:

Предыдущее
От: prakhar jauhari
Дата:
Сообщение: Re: Steps to switch from Master to standby mode :
Следующее
От: Albe Laurenz
Дата:
Сообщение: Re: Big UPDATE breaking replication