Re: PostgreSQL 10.5 : Logical replication timeout results in PANIC in pg_wal "No space left on device"

From	Rui DeSousa
Subject	Re: PostgreSQL 10.5 : Logical replication timeout results in PANIC in pg_wal "No space left on device"
Date	November 17, 2018, 23:47:02
Msg-id	291265BC-C620-4112-87E8-CCF0A0BAAA1B@crazybean.net
In reply to	Re: PostgreSQL 10.5 : Logical replication timeout results in PANIC in pg_wal "No space left on device" (Achilleas Mantzios <achill@matrix.gatewaynet.com>)
Responses	Re: PostgreSQL 10.5 : Logical replication timeout results in PANIC in pg_wal "No space left on device" (two replies)
List	pgsql-admin

On Nov 17, 2018, at 6:07 AM, Achilleas Mantzios <achill@matrix.gatewaynet.com> wrote:

You may read the PostgreSQL backend sources (grep for SO_KEEPALIVE), the code supports KEEPALIVE.

Postgres supports it, but the question is whether it is on for the given connection.

I checked on a bare-minimum default installation (after tweaking the kernel tunables to smaller values, of course): keepalive messages are sent and ACKed at the specified intervals, as verified with Wireshark on port 5432. You should test this yourself.

I just configured Postgres with streaming replication using the versions below, and TCP keepalive was enabled by default for both the WAL receiver connection and psql connections.

Linux debian 4.9.0-7-amd64 #1 SMP Debian 4.9.110-3+deb9u2 (2018-08-13) x86_64 GNU/Linux

PostgreSQL 10.6 (Debian 10.6-1.pgdg90+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit

root@debian:~# netstat -anp --timers | grep -e Timer -e EST | grep -e Timer -e 5432
Proto Recv-Q Send-Q Local Address     Foreign Address  State        PID/Program name     Timer
tcp        0      0 10.6.6.101:47546  10.6.6.100:5432  ESTABLISHED  989/telnet           off (0.00/0/0)
tcp        0      0 10.6.6.101:47544  10.6.6.100:5432  ESTABLISHED  953/psql             keepalive (7103.36/0/0)
tcp        0      0 10.6.6.101:47542  10.6.6.100:5432  ESTABLISHED  922/postgres: 10/ma  keepalive (7088.03/0/0)

As you can see above, telnet does not enable keepalive on its connection. I would run the same netstat command on the troubled system to verify that keepalive is in fact enabled on the WAL receiver connection.
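If you want to script that check rather than eyeball it, here is a minimal Python sketch that reads `netstat -anp --timers` output and lists the connections whose timer is not `keepalive`. The function names are my own, and I am assuming the timer column always has the `name (x/y/z)` shape shown above, with the first number being the seconds left on the timer. I have not tested it against other netstat versions.

```python
import re

# Trailing timer column as printed by `netstat --timers`, e.g.
# "keepalive (7103.36/0/0)" or "off (0.00/0/0)".
# Assumption: the first number is the time remaining on the timer.
TIMER_RE = re.compile(
    r"(?P<timer>\w+) \((?P<remaining>[\d.]+)/(?P<second>\d+)/(?P<third>\d+)\)\s*$"
)


def parse_timer_line(line):
    """Return a dict for one tcp line of netstat output, or None if it does not match."""
    fields = line.split()
    match = TIMER_RE.search(line)
    if match is None or len(fields) < 6 or not fields[0].startswith("tcp"):
        return None
    return {
        "local": fields[3],
        "remote": fields[4],
        "state": fields[5],
        "timer": match.group("timer"),
        "seconds_left": float(match.group("remaining")),
    }


def connections_without_keepalive(text):
    """Return parsed rows whose timer column is anything other than 'keepalive'."""
    rows = (parse_timer_line(line) for line in text.splitlines())
    return [row for row in rows if row is not None and row["timer"] != "keepalive"]


# The three connections from the output above, used as sample input.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name Timer
tcp 0 0 10.6.6.101:47546 10.6.6.100:5432 ESTABLISHED 989/telnet off (0.00/0/0)
tcp 0 0 10.6.6.101:47544 10.6.6.100:5432 ESTABLISHED 953/psql keepalive (7103.36/0/0)
tcp 0 0 10.6.6.101:47542 10.6.6.100:5432 ESTABLISHED 922/postgres: 10/ma keepalive (7088.03/0/0)
"""

if __name__ == "__main__":
    for row in connections_without_keepalive(SAMPLE):
        print(row["local"], "->", row["remote"], "timer:", row["timer"])
```

The program name can contain spaces (as in `922/postgres: 10/ma`), so the timer is matched from the end of the line rather than by column position.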

If it is enabled, the connection should have terminated after the 18 hours, and hopefully sooner now with your new setting. I have no idea why it wouldn't terminate and reconnect, unless TCP keepalive is off or there is a bug in Linux or Postgres.
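For reference, keepalive is opt-in per socket, which is why telnet shows `off` while psql and the WAL receiver do not. Below is a rough Python sketch of both halves: turning `SO_KEEPALIVE` on for a socket, and estimating how long a dead peer can go unnoticed. The helper names are hypothetical. The 7200/75/9 values are the stock Linux defaults for `net.ipv4.tcp_keepalive_time`, `tcp_keepalive_intvl` and `tcp_keepalive_probes`, not necessarily what the troubled system uses. The timer values near 7100 in the netstat output above are consistent with the 7200-second default.

```python
import socket

# Stock Linux defaults; a given system may have been tuned differently.
DEFAULT_IDLE = 7200   # seconds of idle time before the first probe
DEFAULT_INTVL = 75    # seconds between unanswered probes
DEFAULT_PROBES = 9    # unanswered probes before the connection is dropped


def worst_case_detection_seconds(idle=DEFAULT_IDLE, intvl=DEFAULT_INTVL,
                                 probes=DEFAULT_PROBES):
    """Approximate time for keepalive to give up on a silent peer."""
    return idle + intvl * probes


def keepalive_enabled(sock):
    """Report whether SO_KEEPALIVE is set on this socket."""
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0


def enable_keepalive(sock, idle=None, intvl=None, probes=None):
    """Turn keepalive on; optionally override the kernel tunables per socket."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The per-socket overrides are platform-specific, so skip any that
    # this Python build does not expose.
    overrides = (("TCP_KEEPIDLE", idle), ("TCP_KEEPINTVL", intvl),
                 ("TCP_KEEPCNT", probes))
    for name, value in overrides:
        if value is not None and hasattr(socket, name):
            sock.setsockopt(socket.IPPROTO_TCP, getattr(socket, name), value)


if __name__ == "__main__":
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    print("enabled on a fresh socket:", keepalive_enabled(sock))
    enable_keepalive(sock, idle=60, intvl=10, probes=3)
    print("enabled after opting in:", keepalive_enabled(sock))
    print("default worst case (s):", worst_case_detection_seconds())
    sock.close()
```

With the defaults, that is roughly 7200 + 9 × 75 = 7875 seconds, a little over two hours. A connection that stays up much longer than that with an unreachable peer suggests keepalive is not active on it.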
