Wrong configuration of tcp_user_timeout can terribly affects tcp_keepalives mechanism
From | PG Doc comments form |
---|---|
Subject | Wrong configuration of tcp_user_timeout can terribly affects tcp_keepalives mechanism |
Date | |
Msg-id | 160741519849.701.13355787096244067178@wrigleys.postgresql.org |
List | pgsql-docs |
The following documentation comment has been logged on the website:

Page: https://www.postgresql.org/docs/13/bug-reporting.html
Description:

These changes apply to PostgreSQL 12 and later versions.

A note with the following content should be added to the description of the tcp_user_timeout configuration parameter:

WARNING: If tcp_user_timeout is not 0, the specified tcp_keepalives_count value is ignored and the effective value is calculated by the UNIX kernel according to the following expression:

    tcp_keepalives_count = (tcp_user_timeout - tcp_keepalives_idle * 1000) / (tcp_keepalives_interval * 1000)

The idle TCP connection is aborted at the keepalive probe that follows the calculated tcp_keepalives_count, i.e. probe number (tcp_keepalives_count + 1) fails with the ETIMEDOUT error code. So, for tcp_keepalives to work properly, tcp_user_timeout has to be between (tcp_keepalives_idle + tcp_keepalives_interval * (tcp_keepalives_count - 1)) and (tcp_keepalives_idle + tcp_keepalives_interval * tcp_keepalives_count), with the keepalive values converted from seconds to milliseconds.

Description of the nature of the problem:

This is very important information, because the issue can lead to an 'unexpected' abort of a TCP connection established by PostgreSQL. Moreover, it is almost impossible to detect this problem in a running system.

There are two related articles on the internet that investigate this problem:

1. The author investigates the impact of the TCP_USER_TIMEOUT socket option on the keepalive mechanism (idle state of a TCP connection):
   https://blog.cloudflare.com/when-tcp-sockets-refuse-to-die/
2. The author investigates how TCP_USER_TIMEOUT itself works (active data transfer state of a TCP connection):
   https://pracucci.com/linux-tcp-rto-min-max-and-tcp-retries2.html

This problem occurs only in an unreliable network (where TCP packet loss can happen) and is timing-dependent. So it is hard to detect the problem in PostgreSQL, but it is easy to reproduce with a simple TCP server and TCP client application. The following has to be done:

1. Implement a simple TCP client and TCP server application.

2. The TCP server has to call recv() immediately after accepting the connection.

3. The TCP client has to print the port number of the established connection:

    struct sockaddr_in sin;
    socklen_t len = sizeof(sin);
    if (getsockname(sd, (struct sockaddr *)&sin, &len) == -1)
        perror("getsockname");
    else
        printf("port number %d\n", ntohs(sin.sin_port));

4. Enable and configure the TCP keepalive mechanism in the TCP server (these options are inherited by the accepted socket):

    int optval = 1;
    setsockopt(sd, SOL_SOCKET, SO_KEEPALIVE, &optval, sizeof(optval));
    optval = 20;  /* idle seconds before the first probe */
    setsockopt(sd, IPPROTO_TCP, TCP_KEEPIDLE, &optval, sizeof(optval));
    optval = 3;   /* seconds between probes */
    setsockopt(sd, IPPROTO_TCP, TCP_KEEPINTVL, &optval, sizeof(optval));
    optval = 3;   /* number of probes */
    setsockopt(sd, IPPROTO_TCP, TCP_KEEPCNT, &optval, sizeof(optval));

5. Run the processes with different TCP_USER_TIMEOUT values:

    optval = 20000;  /* milliseconds */
    setsockopt(sd, IPPROTO_TCP, TCP_USER_TIMEOUT, &optval, sizeof(optval));

6. To monitor the keepalive traffic, use wireshark or tcpdump on the TCP server side (the port is the port assigned to the TCP client, the host is the server IP address):

    tcpdump -i eth1 host 192.168.2.2 and port 60710

7. To simulate TCP packet loss, use a DROP rule in iptables on the TCP server side:

    iptables -A INPUT -p tcp --sport 60710 -j DROP

   This rule drops all incoming traffic from TCP source port 60710. So, we are able to send a TCP packet, but the acknowledgement is 'lost' (dropped by the firewall). The rule has to be applied 5-6 seconds after a successful keepalive exchange, i.e. 5-6 seconds after the last KeepAlive packet appears in wireshark or the last ACK packet appears in tcpdump.

To simplify the description of the test results, I use the following notation:

    0 s    # last data send
    +6 s   # iptables -A INPUT -p tcp --sport <client_port> -j DROP
    +20 s  # 1st probe of keepalive
    +23 s  # 2nd probe of keepalive
    +26 s  # 3rd probe of keepalive

The connection is aborted when an RST packet appears in tcpdump.
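The socket setup from steps 4 and 5 can be collected into one function that also checks each setsockopt() result, so a rejected option does not go unnoticed during the test. This is only a sketch: the names set_int_opt and configure_keepalive are hypothetical helpers that do not appear in the original message, and TCP_USER_TIMEOUT is assumed to be available (it is Linux-specific).

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stdio.h>
#include <sys/socket.h>

/* Hypothetical helper: set one integer socket option and report a failure
 * via perror() instead of silently ignoring it. */
static int set_int_opt(int sd, int level, int name, int value, const char *label)
{
    if (setsockopt(sd, level, name, &value, sizeof(value)) == -1) {
        perror(label);
        return -1;
    }
    return 0;
}

/* Hypothetical helper: apply the keepalive settings from step 4 and the
 * TCP_USER_TIMEOUT value from step 5 to socket sd.
 * idle_s and intvl_s are in seconds, user_timeout_ms is in milliseconds.
 * Returns 0 on success and -1 if any option was rejected. */
int configure_keepalive(int sd, int idle_s, int intvl_s, int cnt,
                        unsigned int user_timeout_ms)
{
    assert(sd >= 0);

    if (set_int_opt(sd, SOL_SOCKET, SO_KEEPALIVE, 1, "SO_KEEPALIVE") == -1)
        return -1;
    if (set_int_opt(sd, IPPROTO_TCP, TCP_KEEPIDLE, idle_s, "TCP_KEEPIDLE") == -1)
        return -1;
    if (set_int_opt(sd, IPPROTO_TCP, TCP_KEEPINTVL, intvl_s, "TCP_KEEPINTVL") == -1)
        return -1;
    if (set_int_opt(sd, IPPROTO_TCP, TCP_KEEPCNT, cnt, "TCP_KEEPCNT") == -1)
        return -1;

    /* TCP_USER_TIMEOUT takes an unsigned int; 0 restores the system default. */
    if (setsockopt(sd, IPPROTO_TCP, TCP_USER_TIMEOUT,
                   &user_timeout_ms, sizeof(user_timeout_ms)) == -1) {
        perror("TCP_USER_TIMEOUT");
        return -1;
    }
    return 0;
}
```

Calling configure_keepalive(sd, 20, 3, 3, 20000) on the listening socket before accept() would correspond to the values used in steps 4 and 5.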
Test results with different TCP_USER_TIMEOUT values:

TCP_USER_TIMEOUT = 0

    0 s    # last data send
    +6 s   # iptables -A INPUT -p tcp --sport <client_port> -j DROP
    +20 s  # 1st probe of keepalive
    +23 s  # 2nd probe of keepalive
    +26 s  # 3rd probe of keepalive
    +26 s  # RST (ECONNREFUSED)

TCP_USER_TIMEOUT = 10000

    +6 s   # iptables -A INPUT -p tcp --sport <client_port> -j DROP
    +20 s  # 1st probe of keepalive
    +23 s  # RST (ETIMEDOUT)

TCP_USER_TIMEOUT = 20000

    +6 s   # iptables -A INPUT -p tcp --sport <client_port> -j DROP
    +20 s  # 1st probe of keepalive
    +23 s  # RST (ETIMEDOUT)

TCP_USER_TIMEOUT = 21000

    0 s    # last data send
    +6 s   # iptables -A INPUT -p tcp --sport <client_port> -j DROP
    +20 s  # 1st probe of keepalive
    +23 s  # RST (ETIMEDOUT)

TCP_USER_TIMEOUT = 24000

    0 s    # last data send
    +6 s   # iptables -A INPUT -p tcp --sport <client_port> -j DROP
    +20 s  # 1st probe of keepalive
    +23 s  # 2nd probe of keepalive
    +26 s  # RST (ETIMEDOUT)

TCP_USER_TIMEOUT = 27000

    0 s    # last data send
    +6 s   # iptables -A INPUT -p tcp --sport <client_port> -j DROP
    +20 s  # 1st probe of keepalive
    +23 s  # 2nd probe of keepalive
    +26 s  # 3rd probe of keepalive
    +29 s  # RST (ETIMEDOUT)

TCP_USER_TIMEOUT = 35000

    0 s    # last data send
    +6 s   # iptables -A INPUT -p tcp --sport <client_port> -j DROP
    +20 s  # 1st probe of keepalive
    +23 s  # 2nd probe of keepalive
    +26 s  # 3rd probe of keepalive
    +29 s  # 4th probe of keepalive
    +32 s  # 5th probe of keepalive
    +35 s  # RST (ETIMEDOUT)

TCP_USER_TIMEOUT = 45000  # twice and a little larger than TCP_KEEPALIVE_IDLE

    0 s    # last data send
    +6 s   # iptables -A INPUT -p tcp --sport <client_port> -j DROP
    +20 s  # 1st probe of keepalive
    +23 s  # 2nd probe of keepalive
    +26 s  # 3rd probe of keepalive
    +29 s  # 4th probe of keepalive
    +32 s  # 5th probe of keepalive
    +35 s  # 6th probe of keepalive
    +38 s  # 7th probe of keepalive
    +41 s  # 8th probe of keepalive
    +44 s  # 9th probe of keepalive
    +47 s  # RST (ETIMEDOUT)
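The non-zero timelines above follow a regular pattern, which the small model below tries to capture. It is a toy model fitted to the results reported in this message, not a description of the kernel code. It assumes that the first probe is always sent after the idle period and that, at every later keepalive timer tick, the connection is aborted as soon as the idle time has reached TCP_USER_TIMEOUT, regardless of the configured probe count. The names keepalive_outcome, model_user_timeout and user_timeout_window_ms are hypothetical. The second function simply evaluates the two bounds from the proposed warning text in milliseconds; whether the endpoints themselves are safe values is not stated in the message.

```c
#include <assert.h>
#include <stdio.h>

/* Outcome predicted by the toy model. */
struct keepalive_outcome {
    int probes_sent;  /* unanswered keepalive probes sent before the abort */
    int abort_at_s;   /* abort time in seconds after the last data was sent */
};

/* Toy model for TCP_USER_TIMEOUT != 0 (assumption, fitted to the listed runs):
 * probe 1 goes out after idle_s seconds; at each following timer tick the
 * connection is aborted once the idle time is >= user_timeout_ms. */
struct keepalive_outcome model_user_timeout(int idle_s, int intvl_s,
                                            int user_timeout_ms)
{
    assert(idle_s > 0 && intvl_s > 0 && user_timeout_ms > 0);

    struct keepalive_outcome out = { 1, idle_s + intvl_s };
    while ((long)out.abort_at_s * 1000 < user_timeout_ms) {
        out.probes_sent++;          /* this tick sends another probe */
        out.abort_at_s += intvl_s;  /* next tick is one interval later */
    }
    return out;
}

/* Bounds from the proposed warning text, converted to milliseconds:
 * idle + interval * (count - 1)  ..  idle + interval * count */
void user_timeout_window_ms(int idle_s, int intvl_s, int count,
                            long *low_ms, long *high_ms)
{
    assert(count > 0);
    *low_ms  = 1000L * (idle_s + intvl_s * (count - 1));
    *high_ms = 1000L * (idle_s + intvl_s * count);
}

/* Print the predicted outcome for one TCP_USER_TIMEOUT value. */
void print_outcome(int idle_s, int intvl_s, int user_timeout_ms)
{
    struct keepalive_outcome o =
        model_user_timeout(idle_s, intvl_s, user_timeout_ms);
    printf("TCP_USER_TIMEOUT = %d: %d probe(s), abort at +%d s\n",
           user_timeout_ms, o.probes_sent, o.abort_at_s);
}
```

With the test settings (idle 20 s, interval 3 s, count 3), user_timeout_window_ms yields 26000 and 29000 ms, and model_user_timeout(20, 3, 27000) predicts three probes and an abort at +29 s, which agrees with the TCP_USER_TIMEOUT = 27000 run listed above.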