Re: Error: server closed the connection unexpectedly

Поиск
Список
Период
Сортировка
От Sachin Kotwal
Тема Re: Error: server closed the connection unexpectedly
Дата
Msg-id CA+N_YAdgsgeah1SPxH6hpPNm8v_0zRYytyMixBtmssbAFdmjTQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Error: server closed the connection unexpectedly  (Laurenz Albe <laurenz.albe@cybertec.at>)
Список pgsql-general
Hi All,

I am also facing a similar issue. In my case I am connecting to AWS Aurora PostgreSQL.

When my psql session is idle for a while ( for 5 - 10 minutes ) , It gets terminated and reconnect but query/transaction started by earlier session remain in `idle in transaction` state and the only option left to clean it is terminate that session with `pg_terminate_backend()`. Issue happens when we are connecting from either psql or DBeaver client but not with the application.

Here is example :
```
test=> begin;
BEGIN
test=*> LOCK TABLE test.dependency_queue IN EXCLUSIVE MODE;
LOCK TABLE
test=*>
test=*> rollback;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset:
Succeeded.
psql (13.4, server 11.9)
SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES256-GCM-SHA384, bits: 256, compression: off)

test=> select now()-query_start, pid,state, usename, application_name, client_addr, wait_event_type, query from pg_stat_activity where state!='idle';
    ?column?     |  pid  |        state        | usename  |    application_name    | client_addr  | wait_event_type |                                                                     query                                                                      
-----------------+-------+---------------------+----------+------------------------+--------------+-----------------+------------------------------------------------------------------------------------------------------------------------------------------------
 00:08:22.762573 | 20233 | active              | postgres | PostgreSQL JDBC Driver | 10.0.174.30  | Lock            | LOCK TABLE test.dependency_queue IN EXCLUSIVE MODE
 00:09:28.101577 | 14830 | idle in transaction | postgres | psql                   | 10.0.173.200 | Client          | LOCK TABLE test.dependency_queue IN EXCLUSIVE MODE;
 00:00:00        | 10198 | active              | postgres | psql                   | 10.0.103.144 |                 | select now()-query_start, pid,state, usename, application_name, client_addr, wait_event_type, query from pg_stat_activity where state!='idle';
(3 rows)

test=> rollback;
WARNING:  there is no transaction in progress
ROLLBACK
```


Nothing in the postgresql logs apart from below statements multiple times:
```
cat postgresql.log.2021-10-22-1200 | grep '10.0.173.200'            
2021-10-22 12:00:09 UTC:10.0.173.200(4082):[unknown]@[unknown]:[729]:LOG:  connection received: host=10.0.173.200 port=4082
2021-10-22 12:00:09 UTC:10.0.173.200(4082):[unknown]@[unknown]:[729]:LOG:  incomplete startup packet
```


Now because the psql session was terminated incorrectly, another session from the application trying to get explicit lock on the same table is waiting for a long time, until we are not terminating the hanging session.

Below are timeout/keepalive settings on my PostgreSQL instance:
```
select name, setting, unit  from pg_settings where name ilike '%timeout%';
                name                 | setting  | unit
-------------------------------------+----------+------
 archive_timeout                     | 300      | s
 authentication_timeout              | 60       | s
 checkpoint_timeout                  | 60       | s
 deadlock_timeout                    | 1000     | ms
 idle_in_transaction_session_timeout | 86400000 | ms
 lock_timeout                        | 0        | ms
 statement_timeout                   | 0        | ms
 wal_receiver_timeout                | 30000    | ms
 wal_sender_timeout                  | 60000    | ms
(9 rows)

select name, setting, unit  from pg_settings where name ilike '%keep%';
          name           | setting | unit
-------------------------+---------+------
 tcp_keepalives_count    | 9       |
 tcp_keepalives_idle     | 7200    | s
 tcp_keepalives_interval | 75      | s
 wal_keep_segments       | 0       |
(4 rows)
```

macOS Catelina 10.15.7 timeout settings :
```
sysctl -A | grep keep              

net.inet.tcp.keepidle: 7200000
net.inet.tcp.keepintvl: 75000
net.inet.tcp.keepinit: 75000
net.inet.tcp.keepcnt: 8
net.inet.tcp.always_keepalive: 0
net.inet.mptcp.keepalive: 840
net.link.ether.inet.keep_announcements: 1
net.key.natt_keepalive_interval: 20
net.inet6.ip6.keepfaith: 0
net.necp.pass_keepalives: 1
```

I can set ` lock_timeout` and ` idle_in_transaction_session_timeout` to lower value so that the hanging session can get terminated automatically by postgres but I would like to know root cause of the issue and fix it there instead of adjusting postgresql timeout setting values.

Please let me know which setting on my client or server is not set properly and because of that the issue is happening.

Regards,
Sachin


On Mon, Oct 11, 2021 at 1:38 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Mon, 2021-10-11 at 00:12 +0530, Anil wrote:
> Please find the server log, if you can find something (due to some reasons not being able to get OS system log, will be shared soon). 

Nothing to see.

Perhaps an ill-configured network component closes idle connection?
In that case, try keepalive.

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com





--

Thanks and Regards,
Sachin Kotwal

В списке pgsql-general по дате отправления:

Предыдущее
От: Abhishek Bhola
Дата:
Сообщение: [Major version upgrade] pg_upgrade fails despite passing check mode
Следующее
От: Ninad Shah
Дата:
Сообщение: Re: spannerdb migration to PostgreSQL