Wrong configuration of tcp_user_timeout can terribly affects tcp_keepalives mechanism

From: PG Doc comments form
Subject: Wrong configuration of tcp_user_timeout can terribly affects tcp_keepalives mechanism
Msg-id: 160741519849.701.13355787096244067178@wrigleys.postgresql.org
List: pgsql-docs
The following documentation comment has been logged on the website:

Page: https://www.postgresql.org/docs/13/bug-reporting.html
Description:

These changes are applicable to PostgreSQL 12 and later versions.

A note with the following content should be added to the description of the
tcp_user_timeout configuration parameter:

WARNING:
If tcp_user_timeout is not 0, the specified tcp_keepalives_count value is
ignored; the kernel instead calculates the effective value according to the
following expression (tcp_user_timeout is in milliseconds, the keepalive
settings are in seconds):

tcp_keepalives_count = (tcp_user_timeout - tcp_keepalives_idle * 1000) /
(tcp_keepalives_interval * 1000).

An idle TCP connection is aborted at the keepalive probe that follows the
calculated tcp_keepalives_count, i.e. the (tcp_keepalives_count + 1)-th
probe fails with the ETIMEDOUT error code.
So, for tcp_keepalives to work properly, tcp_user_timeout (in milliseconds)
has to be between
(tcp_keepalives_idle + tcp_keepalives_interval * (tcp_keepalives_count - 1)) * 1000
and
(tcp_keepalives_idle + tcp_keepalives_interval * tcp_keepalives_count) * 1000.
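
The two expressions in the proposed note can be evaluated with a small C
helper. This is only a sketch: the function names are made up for
illustration, and truncating integer division is an assumption, since the
note does not say how the kernel rounds the result.

```c
#include <assert.h>
#include <stdio.h>

/* Effective probe count from the expression in the note.
 * user_timeout_ms is in milliseconds; idle_s and interval_s are in seconds.
 * Assumption: the division truncates (plain C integer division). */
long effective_keepalives_count(long user_timeout_ms, long idle_s,
                                long interval_s)
{
    assert(interval_s > 0);   /* avoid division by zero */
    return (user_timeout_ms - idle_s * 1000) / (interval_s * 1000);
}

/* Bounds (in milliseconds) between which the note says tcp_user_timeout
 * has to lie for the configured tcp_keepalives_count to take effect. */
void user_timeout_window_ms(long idle_s, long interval_s, long count,
                            long *lower_ms, long *upper_ms)
{
    assert(count >= 1);
    *lower_ms = (idle_s + interval_s * (count - 1)) * 1000;
    *upper_ms = (idle_s + interval_s * count) * 1000;
}

/* Print both values for one configuration. */
void print_keepalive_summary(long user_timeout_ms, long idle_s,
                             long interval_s, long count)
{
    long lower_ms, upper_ms;

    user_timeout_window_ms(idle_s, interval_s, count, &lower_ms, &upper_ms);
    printf("effective count: %ld, window: %ld..%ld ms\n",
           effective_keepalives_count(user_timeout_ms, idle_s, interval_s),
           lower_ms, upper_ms);
}
```

For tcp_keepalives_idle = 20, tcp_keepalives_interval = 3 and
tcp_keepalives_count = 3, the window is 26000..29000 ms.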


Description of the nature of the problem.

This is very important information, because the behaviour can lead to an
'unexpected' abort of a TCP connection established by PostgreSQL. Moreover,
it is almost impossible to detect this problem in a running system.

Two related articles on the internet investigate this problem:
1. The author investigates the impact of the TCP_USER_TIMEOUT socket option
on the keepalive mechanism (idle state of a TCP connection):
https://blog.cloudflare.com/when-tcp-sockets-refuse-to-die/
2. The author investigates how TCP_USER_TIMEOUT itself works (active data
transfer state of a TCP connection):
https://pracucci.com/linux-tcp-rto-min-max-and-tcp-retries2.html

This problem occurs only in an unreliable network (where TCP packet loss
can happen) and is timing-dependent. So, it is hard to detect the problem
in PostgreSQL.
But it is easy to reproduce the problem with a simple TCP server and TCP
client application. The following has to be done:
1. implement a simple TCP client and TCP server application
2. the TCP server has to call recv() immediately after accepting the
connection
3. the TCP client has to print the local port number of the established
connection:
   struct sockaddr_in sin;
   socklen_t len = sizeof(sin);
   if (getsockname(sd, (struct sockaddr *)&sin, &len) == -1)
      perror("getsockname");
   else
      printf("port number %d\n", ntohs(sin.sin_port)); 
4. enable and configure the TCP keepalive mechanism in the TCP server (these
options are inherited by the accepted socket):
int optval = 1;
setsockopt(sd, SOL_SOCKET, SO_KEEPALIVE, &optval, sizeof(optval));
optval = 20;  /* seconds of idle time before the first probe */
setsockopt(sd, IPPROTO_TCP, TCP_KEEPIDLE, &optval, sizeof(optval));
optval = 3;   /* seconds between probes */
setsockopt(sd, IPPROTO_TCP, TCP_KEEPINTVL, &optval, sizeof(optval));
optval = 3;   /* number of probes */
setsockopt(sd, IPPROTO_TCP, TCP_KEEPCNT, &optval, sizeof(optval));
5. run the processes with different TCP_USER_TIMEOUT values:
optval = 20000;
setsockopt(sd, IPPROTO_TCP, TCP_USER_TIMEOUT, &optval, sizeof(optval));
6. to monitor the keepalive traffic, use Wireshark or tcpdump on the TCP
server side (the port is the one assigned to the TCP client, the host is the
server IP address):
tcpdump -i eth1 host 192.168.2.2 and port 60710
7. to simulate TCP packet loss, add a DROP rule in iptables on the TCP
server side:
iptables -A INPUT -p tcp --sport 60710 -j DROP
This rule drops all incoming TCP traffic whose source port is 60710. So, the
server is able to send a TCP packet, but the acknowledgement is 'lost'
(dropped by the firewall).
The rule has to be applied 5-6 seconds after a successful keepalive
exchange, i.e. 5-6 seconds after the last appearance of a KeepAlive packet
in Wireshark or an ACK packet in tcpdump.
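
The server side of steps 1, 2, 4 and 5 could be put together roughly as
follows. This is a minimal sketch for Linux, not the program used for the
tests in this message: the function names configure_keepalive() and
serve_once() are hypothetical, and error handling is reduced to the bare
minimum.

```c
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Apply the options from steps 4 and 5 to socket sd.
 * idle_s and interval_s are seconds, user_timeout_ms is milliseconds.
 * Returns 0 on success, -1 if any setsockopt() call fails. */
int configure_keepalive(int sd, int idle_s, int interval_s, int count,
                        unsigned int user_timeout_ms)
{
    int on = 1;

    assert(sd >= 0);
    if (setsockopt(sd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) == -1 ||
        setsockopt(sd, IPPROTO_TCP, TCP_KEEPIDLE,
                   &idle_s, sizeof(idle_s)) == -1 ||
        setsockopt(sd, IPPROTO_TCP, TCP_KEEPINTVL,
                   &interval_s, sizeof(interval_s)) == -1 ||
        setsockopt(sd, IPPROTO_TCP, TCP_KEEPCNT,
                   &count, sizeof(count)) == -1 ||
        setsockopt(sd, IPPROTO_TCP, TCP_USER_TIMEOUT,
                   &user_timeout_ms, sizeof(user_timeout_ms)) == -1)
        return -1;
    return 0;
}

/* Accept one connection on `port` and block in recv() (step 2) until the
 * peer closes the connection or the kernel aborts it. Returns the errno
 * reported by recv(), 0 on orderly shutdown, or -1 on setup failure. */
int serve_once(unsigned short port, unsigned int user_timeout_ms)
{
    struct sockaddr_in addr;
    char buf[512];
    ssize_t n;
    int err = 0;
    int cd;
    int ld = socket(AF_INET, SOCK_STREAM, 0);

    if (ld == -1)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    /* Options are set on the listening socket so that the accepted
     * socket inherits them, as described in step 4. */
    if (configure_keepalive(ld, 20, 3, 3, user_timeout_ms) == -1 ||
        bind(ld, (struct sockaddr *)&addr, sizeof(addr)) == -1 ||
        listen(ld, 1) == -1 ||
        (cd = accept(ld, NULL, NULL)) == -1) {
        perror("server setup");
        close(ld);
        return -1;
    }

    while ((n = recv(cd, buf, sizeof(buf), 0)) > 0)
        ;                       /* discard any data and keep waiting */
    if (n == -1) {
        err = errno;            /* error code of the aborted connection */
        printf("recv failed: %s\n", strerror(err));
    }
    close(cd);
    close(ld);
    return err;
}
```

Calling serve_once() from main() with a free port and the TCP_USER_TIMEOUT
value under test, then applying the iptables rule from step 7, prints the
error code with which the kernel aborts the connection.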

To simplify the description of the test results, I use the following
timeline notation:
0 s # last data send
+6 s # iptables -A INPUT -p tcp --sport <client_port> -j DROP
+20 s # 1st probe of keepalive
+23 s # 2nd probe of keep alive
+26 s # 3rd probe of keep alive

The connection is aborted when an RST packet appears in tcpdump.

Test results with different TCP_USER_TIMEOUT values:
TCP_USER_TIMEOUT = 0
0 s # last data send
+6 s # iptables -A INPUT -p tcp --sport <client_port> -j DROP
+20 s # 1st probe of keepalive
+23 s # 2nd probe of keep alive
+26 s # 3rd probe of keep alive
+26 s # RST (ECONNREFUSED)

TCP_USER_TIMEOUT = 10000
0 s # last data send
+6 s # iptables -A INPUT -p tcp --sport <client_port> -j DROP
+20 s # 1st probe of keepalive
+23 s # RST (ETIMEDOUT)


TCP_USER_TIMEOUT = 20000
0 s # last data send
+6 s # iptables -A INPUT -p tcp --sport <client_port> -j DROP
+20 s # 1st probe of keepalive
+23 s # RST (ETIMEDOUT)

TCP_USER_TIMEOUT = 21000
0 s # last data send
+6 s # iptables -A INPUT -p tcp --sport <client_port> -j DROP
+20 s # 1st probe of keepalive
+23 s # RST (ETIMEDOUT)

TCP_USER_TIMEOUT = 24000
0 s # last data send
+6 s # iptables -A INPUT -p tcp --sport <client_port> -j DROP
+20 s # 1st probe of keepalive
+23 s # 2nd probe of keep alive
+26 s # RST (ETIMEDOUT)

TCP_USER_TIMEOUT = 27000 
0 s # last data send
+6 s # iptables -A INPUT -p tcp --sport <client_port> -j DROP
+20 s # 1st probe of keepalive
+23 s # 2nd probe of keep alive
+26 s # 3rd probe of keep alive
+29 s # RST (ETIMEDOUT)

TCP_USER_TIMEOUT = 35000 
0 s # last data send
+6 s # iptables -A INPUT -p tcp --sport <client_port> -j DROP
+20 s # 1st probe of keepalive
+23 s # 2nd probe of keep alive
+26 s # 3rd probe of keep alive
+29 s # 4th probe of keep alive
+32 s # 5th probe of keep alive
+35 s # RST (ETIMEDOUT)

TCP_USER_TIMEOUT = 45000 # a little more than twice TCP_KEEPIDLE
0 s # last data send
+6 s # iptables -A INPUT -p tcp --sport <client_port> -j DROP
+20 s # 1st probe of keepalive
+23 s # 2nd probe of keep alive
+26 s # 3rd probe of keep alive
+29 s # 4th probe of keep alive
+32 s # 5th probe of keep alive
+35 s # 6th probe of keep alive
+38 s # 7th probe of keep alive
+41 s # 8th probe of keep alive
+44 s # 9th probe of keep alive
+47 s # RST (ETIMEDOUT)
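
The RST times in the non-zero TCP_USER_TIMEOUT runs above fit a simple rule:
the connection is aborted at the first probe slot after the initial probe
that is not earlier than TCP_USER_TIMEOUT. The sketch below encodes that
rule. It is an empirical model fitted to these timelines, not a description
of the kernel implementation, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Time (in seconds after the last data) at which the RST is expected:
 * the first slot idle_s + k * interval_s, with k >= 1, that is not
 * earlier than user_timeout_ms. Only meaningful for user_timeout_ms > 0. */
long predicted_abort_s(long user_timeout_ms, long idle_s, long interval_s)
{
    long t;

    assert(user_timeout_ms > 0 && interval_s > 0);
    t = idle_s + interval_s;            /* slot following the 1st probe */
    while (t * 1000 < user_timeout_ms)
        t += interval_s;
    return t;
}

/* Number of keepalive probes seen on the wire before the RST. */
long probes_before_abort(long user_timeout_ms, long idle_s, long interval_s)
{
    return (predicted_abort_s(user_timeout_ms, idle_s, interval_s) - idle_s)
           / interval_s;
}

/* Print the predicted timeline summary for one TCP_USER_TIMEOUT value. */
void print_prediction(long user_timeout_ms, long idle_s, long interval_s)
{
    printf("TCP_USER_TIMEOUT = %ld: %ld probe(s), RST at +%ld s\n",
           user_timeout_ms,
           probes_before_abort(user_timeout_ms, idle_s, interval_s),
           predicted_abort_s(user_timeout_ms, idle_s, interval_s));
}
```

With TCP_KEEPIDLE = 20 and TCP_KEEPINTVL = 3, the model gives +23 s for
10000, 20000 and 21000, +26 s for 24000, +29 s for 27000, +35 s for 35000
and +47 s for 45000, matching the measured results.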
