Thread: BUG #17776: Connections are terminated unexpectedly sometimes

BUG #17776: Connections are terminated unexpectedly sometimes

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      17776
Logged by:          Francesco Tagliani
Email address:      fran.tm213@gmail.com
PostgreSQL version: 14.6
Operating system:   Ubuntu 20
Description:

I am running a Ruby on Rails app with PostgreSQL 14.6 on Ubuntu 20.
For some reason, connections are terminated unexpectedly while a
background job is running.
The background job requires around 100 to 200 connections over
6 to 7 hours.

I've installed PgHero and configured PostgreSQL based on
PGTune (https://pgtune.leopard.in.ua/).
This is the current configuration:

# DB Version: 14
# OS Type: linux
# DB Type: web
# Total Memory (RAM): 48 GB
# CPUs num: 12
# Connections num: 1000
# Data Storage: hdd

max_connections = 1000
shared_buffers = 12GB
effective_cache_size = 36GB
maintenance_work_mem = 2GB
checkpoint_completion_target = 0.9
wal_buffers = 16MB
default_statistics_target = 100
random_page_cost = 4
effective_io_concurrency = 2
work_mem = 3145kB
min_wal_size = 1GB
max_wal_size = 4GB
max_worker_processes = 12
max_parallel_workers_per_gather = 4
max_parallel_workers = 12
max_parallel_maintenance_workers = 4

I've checked the PostgreSQL log around that time:

2023-02-06 06:45:41.702 UTC [341438] user@db_production LOG:  could not receive data from client: Connection reset by peer
2023-02-06 08:26:56.134 UTC [352190] user@db_production LOG:  could not receive data from client: Connection reset by peer
2023-02-06 08:26:56.134 UTC [352191] user@db_production LOG:  could not receive data from client: Connection reset by peer

I cannot find any other related log entries.
Could you please advise me how to debug or fix this issue?

Thanks


Re: BUG #17776: Connections are terminated unexpectedly sometimes

From
Tom Lane
Date:
PG Bug reporting form <noreply@postgresql.org> writes:
> I am running a Ruby on Rails app with PostgreSQL 14.6 on Ubuntu 20.
> For some reason, connections are terminated unexpectedly while a
> background job is running.
> The background job requires around 100 to 200 connections over
> 6 to 7 hours.
> I've checked the PostgreSQL log around that time:
> 2023-02-06 06:45:41.702 UTC [341438] user@db_production LOG:  could not
> receive data from client: Connection reset by peer
> 2023-02-06 08:26:56.134 UTC [352190] user@db_production LOG:  could not
> receive data from client: Connection reset by peer
> 2023-02-06 08:26:56.134 UTC [352191] user@db_production LOG:  could not
> receive data from client: Connection reset by peer

Looks like something in your network infrastructure is timing out
and dropping idle connections too quickly.  If you can't fix the
actual problem, Postgres' TCP-timeout settings might provide a
workaround by maintaining an illusion that the connection is
busy.  See

 tcp_keepalives_count
 tcp_keepalives_idle
 tcp_keepalives_interval
 tcp_user_timeout

            regards, tom lane
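
As a rough illustration of how the three keepalive settings named above interact, here is a minimal Python sketch. The function names and the values 60/10/6 are assumptions made for illustration, not recommendations from this thread. The general idea is that `tcp_keepalives_idle` should be shorter than whatever idle timeout the intermediate network device enforces, so that a probe goes out before the connection is dropped. `tcp_user_timeout` is not modelled here.

```python
def keepalive_timeline(idle_s: int, interval_s: int, count: int) -> dict:
    """Estimate when probing starts and when an unreachable peer is given up on.

    idle_s     -> tcp_keepalives_idle     (seconds of inactivity before the first probe)
    interval_s -> tcp_keepalives_interval (seconds between unanswered probes)
    count      -> tcp_keepalives_count    (unanswered probes before the connection is dropped)
    """
    if min(idle_s, interval_s, count) <= 0:
        # In postgresql.conf, 0 means "use the system default", so it cannot be
        # reasoned about here without knowing that default.
        raise ValueError("pass explicit positive values")
    return {
        "first_probe_after_s": idle_s,
        "dead_peer_detected_after_s": idle_s + interval_s * count,
    }


def render_conf(idle_s: int, interval_s: int, count: int) -> str:
    """Format the values as postgresql.conf lines."""
    return "\n".join([
        f"tcp_keepalives_idle = {idle_s}",
        f"tcp_keepalives_interval = {interval_s}",
        f"tcp_keepalives_count = {count}",
    ])


if __name__ == "__main__":
    # Illustrative values only (assumption): probe after 60 s of silence,
    # then every 10 s, giving up after 6 unanswered probes.
    idle, interval, count = 60, 10, 6
    print(render_conf(idle, interval, count))
    print(keepalive_timeline(idle, interval, count))
```

With these example values, an idle connection produces traffic after one minute, and a peer that has really gone away is noticed after roughly two minutes.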



Re: BUG #17776: Connections are terminated unexpectedly sometimes

From
Francesco Tagliani
Date:
Thanks for the tip, Tom! 

I've checked the four settings you suggested, and none of them are configured,
so they are all at their defaults.

# - TCP settings -
# see "man tcp" for details

#tcp_keepalives_idle = 0                # TCP_KEEPIDLE, in seconds;
                                        # 0 selects the system default
#tcp_keepalives_interval = 0            # TCP_KEEPINTVL, in seconds;
                                        # 0 selects the system default
#tcp_keepalives_count = 0               # TCP_KEEPCNT;
                                        # 0 selects the system default
#tcp_user_timeout = 0                   # TCP_USER_TIMEOUT, in milliseconds;
                                        # 0 selects the system default

#client_connection_check_interval = 0   # time between checks for client
                                        # disconnection while running queries;
                                        # 0 for never


Do you have any recommended values for them?

I found some logs I don't understand.

```
2023-02-05 00:37:16.858 UTC [1739876] [unknown]@[unknown] FATAL:  unsupported frontend protocol 65363.19778: server supports 3.0 to 3.0
2023-02-06 12:26:48.844 UTC [467480] [unknown]@[unknown] FATAL:  unsupported frontend protocol 16.0: server supports 3.0 to 3.0
2023-02-06 02:17:08.205 UTC [257863] [unknown]@[unknown] FATAL:  unsupported frontend protocol 65363.19778: server supports 3.0 to 3.0
2023-02-06 06:37:00.411 UTC [345467] [unknown]@[unknown] FATAL:  unsupported frontend protocol 0.0: server supports 3.0 to 3.0
2023-02-06 06:37:00.679 UTC [345468] [unknown]@[unknown] FATAL:  unsupported frontend protocol 255.255: server supports 3.0 to 3.0
```
Do you have any suggestions for those logs?

Thanks
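
Since a value of 0 for the keepalive settings means "use the system default", it can help to see what that default actually is on the database host. The sketch below is an assumption-laden illustration: it reads the kernel-wide values from `/proc/sys/net/ipv4` on Linux, and the function name and the mapping table are made up for this example. A stock Linux kernel typically reports 7200, 75 and 9, which means the first keepalive probe is only sent after two hours of inactivity.

```python
from pathlib import Path

# Linux exposes the kernel-wide keepalive defaults as files in this directory.
PROC_DIR = Path("/proc/sys/net/ipv4")

# PostgreSQL setting -> Linux sysctl file that supplies its default when set to 0.
SYSCTL_FILES = {
    "tcp_keepalives_idle": "tcp_keepalive_time",
    "tcp_keepalives_interval": "tcp_keepalive_intvl",
    "tcp_keepalives_count": "tcp_keepalive_probes",
}


def system_keepalive_defaults(proc_dir: Path = PROC_DIR) -> dict:
    """Return the OS defaults per PostgreSQL setting, or None where unreadable."""
    defaults = {}
    for setting, filename in SYSCTL_FILES.items():
        try:
            defaults[setting] = int((proc_dir / filename).read_text().strip())
        except (OSError, ValueError):
            # Not Linux, or the file is missing or malformed.
            defaults[setting] = None
    return defaults


if __name__ == "__main__":
    for setting, value in system_keepalive_defaults().items():
        print(f"{setting}: system default = {value}")
```

If the idle default turns out to be much longer than the idle timeout of a firewall or load balancer between the application and the database, the defaults will not keep those connections alive.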

Re: BUG #17776: Connections are terminated unexpectedly sometimes

From
Tom Lane
Date:
Francesco Tagliani <fran.tm213@gmail.com> writes:
> I found some logs I don't understand.

> ```
> 2023-02-05 00:37:16.858 UTC [1739876] [unknown]@[unknown] FATAL:
>  unsupported frontend protocol 65363.19778: server supports 3.0 to 3.0
> 2023-02-06 12:26:48.844 UTC [467480] [unknown]@[unknown] FATAL:
>  unsupported frontend protocol 16.0: server supports 3.0 to 3.0
> 2023-02-06 02:17:08.205 UTC [257863] [unknown]@[unknown] FATAL:
>  unsupported frontend protocol 65363.19778: server supports 3.0 to 3.0
> 2023-02-06 06:37:00.411 UTC [345467] [unknown]@[unknown] FATAL:
>  unsupported frontend protocol 0.0: server supports 3.0 to 3.0
> 2023-02-06 06:37:00.679 UTC [345468] [unknown]@[unknown] FATAL:
>  unsupported frontend protocol 255.255: server supports 3.0 to 3.0
> ```

Something port-scanning your server, perhaps?  None of those
protocol numbers match anything that Postgres-related code
would use.

            regards, tom lane
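
One way to look closer at those numbers: the server reports the 32-bit protocol version field of the startup packet as `major.minor`, where major is the upper 16 bits and minor the lower 16 bits. The following Python sketch (helper names are made up for this example) turns the logged values back into the four raw bytes the client sent in that position.

```python
import struct


def version_bytes(major: int, minor: int) -> bytes:
    """Rebuild the 4 bytes that the server parsed as the protocol version."""
    # Two unsigned 16-bit integers in network (big-endian) byte order.
    return struct.pack("!HH", major, minor)


def describe(major: int, minor: int) -> str:
    """Show the raw bytes as hex plus a printable-ASCII rendering."""
    raw = version_bytes(major, minor)
    printable = "".join(chr(b) if 32 <= b < 127 else "." for b in raw)
    return f"{major}.{minor} -> {raw.hex(' ')} ({printable})"


if __name__ == "__main__":
    # The version numbers that appear in the log excerpts in this thread.
    for major, minor in [(65363, 19778), (16, 0), (0, 0), (255, 255)]:
        print(describe(major, minor))
```

For `65363.19778` the bytes are `ff 53 4d 42`, i.e. `\xffSMB`, which is the signature that begins an SMB1 message header. That does not prove where the traffic came from, but it is consistent with a non-PostgreSQL client or scanner probing the port.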



Re: BUG #17776: Connections are terminated unexpectedly sometimes

From
Francesco Tagliani
Date:
Hi Tom,
The Ubuntu server runs on Azure, so is it possible that Azure automatically installs a port-scanning app on Ubuntu?

Regarding the TCP settings, should I keep the current values, or do you have any recommended values?

Thanks
