Re: standby not cleaning up WAL files often enough

Поиск

Список

Период

Сортировка

От	Jorge Torralba
Тема	Re: standby not cleaning up WAL files often enough
Дата	23 апреля 2015 г. 03:01:30
Msg-id	CACut7uTdbjrcnyRuhXuSCPWGow+J1XxZC0v4sFefek7oxY578w@mail.gmail.com обсуждение исходный текст
Ответ на	standby not cleaning up WAL files often enough (Jeff Frost <jeff@pgexperts.com>)
Ответы	Re: standby not cleaning up WAL files often enough
Список	pgsql-admin

Дерево обсуждения

Stupid question but it needs to be asked.

Do you have this in your recovery.conf file ?

archive_cleanup_command = 'pg_archivecleanup path to archived files %r'

On Wed, Apr 22, 2015 at 4:22 PM, Jeff Frost <jeff@pgexperts.com> wrote:

I’ve got an interesting situation on a new 9.4.1 streaming replica.

Occasionally it will get disconnected and need to catch up by pulling files from the wal-archive via wal-e. This works fine and most of the time it will catch up and reconnect.

Ocasionally though, it will run the 50G pg_xlog partition out of disk space because it doesn’t seem to clean up the WAL files nearly fast enough. For those who haven’t done the math in their head, that’s a bit over 3,000 WAL files.

checkpoint_segments = 33
checkpoint_completion_target = 0.9
checkpoint_timeout = 10min

Here’s how the logs look:

2015-04-22 20:48:34.331 GMT,,,32645,,553298b3.7f85,2666,,2015-04-18 17:47:31 GMT,,0,LOG,00000,"restartpoint complete: wrote 382002 buffers (36.4%); 0 transaction log file(s) added, 461 removed, 67 recycled; write=197.586 s, sync=0.253 s, total=199.001 s; sync files=397, longest=0.146 s, average=0.000 s",,,,,,,,,""
2015-04-22 20:52:01.176 GMT,,,32645,,553298b3.7f85,2669,,2015-04-18 17:47:31 GMT,,0,LOG,00000,"restartpoint complete: wrote 395868 buffers (37.8%); 0 transaction log file(s) added, 428 removed, 67 recycled; write=205.487 s, sync=0.281 s, total=206.843 s; sync files=510, longest=0.241 s, average=0.000 s",,,,,,,,,""
2015-04-22 20:55:13.082 GMT,,,32645,,553298b3.7f85,2672,,2015-04-18 17:47:31 GMT,,0,LOG,00000,"restartpoint complete: wrote 369073 buffers (35.2%); 0 transaction log file(s) added, 461 removed, 67 recycled; write=190.356 s, sync=0.469 s, total=191.903 s; sync files=481, longest=0.363 s, average=0.000 s",,,,,,,,,""
2015-04-22 20:58:30.610 GMT,,,32645,,553298b3.7f85,2675,,2015-04-18 17:47:31 GMT,,0,LOG,00000,"restartpoint complete: wrote 380614 buffers (36.3%); 0 transaction log file(s) added, 461 removed, 67 recycled; write=196.295 s, sync=0.150 s, total=197.526 s; sync files=478, longest=0.062 s, average=0.000 s",,,,,,,,,""
2015-04-22 21:02:01.334 GMT,,,32645,,553298b3.7f85,2678,,2015-04-18 17:47:31 GMT,,0,LOG,00000,"restartpoint complete: wrote 396343 buffers (37.8%); 0 transaction log file(s) added, 461 removed, 67 recycled; write=209.380 s, sync=0.192 s, total=210.722 s; sync files=444, longest=0.161 s, average=0.000 s",,,,,,,,,""
2015-04-22 21:05:33.490 GMT,,,32645,,553298b3.7f85,2681,,2015-04-18 17:47:31 GMT,,0,LOG,00000,"restartpoint complete: wrote 404200 buffers (38.5%); 0 transaction log file(s) added, 428 removed, 67 recycled; write=210.967 s, sync=0.182 s, total=212.154 s; sync files=505, longest=0.142 s, average=0.000 s",,,,,,,,,""
2015-04-22 21:08:37.649 GMT,,,32645,,553298b3.7f85,2684,,2015-04-18 17:47:31 GMT,,0,LOG,00000,"restartpoint complete: wrote 417375 buffers (39.8%); 0 transaction log file(s) added, 461 removed, 67 recycled; write=182.664 s, sync=0.391 s, total=184.157 s; sync files=448, longest=0.198 s, average=0.000 s",,,,,,,,,""
2015-04-22 21:08:47.816 GMT,,,32645,,553298b3.7f85,2687,,2015-04-18 17:47:31 GMT,,0,LOG,00000,"restartpoint complete: wrote 914929 buffers (87.3%); 0 transaction log file(s) added, 485 removed, 10 recycled; write=5.511 s, sync=3.434 s, total=10.165 s; sync files=459, longest=0.484 s, average=0.007 s",,,,,,,,,""
2015-04-22 21:10:07.464 GMT,,,32645,,553298b3.7f85,2690,,2015-04-18 17:47:31 GMT,,0,LOG,00000,"restartpoint complete: wrote 8421 buffers (0.8%); 0 transaction log file(s) added, 395 removed, 67 recycled; write=70.043 s, sync=0.278 s, total=71.243 s; sync files=182, longest=0.234 s, average=0.001 s",,,,,,,,,”"

All of those are started by "xlog" and not by "time'.

At first I was thinking perhaps wal-e was prefetching way too many WAL files, but it appears to only pre-fetch 8 by default:

https://github.com/wal-e/wal-e/blob/master/wal_e/cmd.py#L316

Also, I verified that there were almost 3k WAL files in the pg_xlog directory itself.

I guess my question is: how can I avoid this behavior other than just continuing to grow that pg_xlog partition?

Thanks,

Jorge Torralba
----------------------------

Note: This communication may contain privileged or other confidential information. If you are not the intended recipient, please do not print, copy, retransmit, disseminate or otherwise use the information. Please indicate to the sender that you have received this email in error and delete the copy you received. Thank You.

В списке pgsql-admin по дате отправления:

Предыдущее

От: Jeff Frost
Дата: 23 апреля 2015 г., 02:23:03
Сообщение: standby not cleaning up WAL files often enough

Следующее

От: Jeff Frost
Дата: 23 апреля 2015 г., 03:02:37
Сообщение: Re: standby not cleaning up WAL files often enough

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: standby not cleaning up WAL files often enough

Предыдущее

Следующее