Discussion: Issue with pg_basebackup v.11
Hello experts,
I am facing an issue with a customer's production server while trying to take backup using pg_basebackup.
Below is the log from pg_basebackup execution.
115338208/1304172127 kB (8%), 0/1 tablespace (...atastaging/base/115868/154220.1)
115355616/1304172127 kB (8%), 0/1 tablespace (...atastaging/base/115868/154220.1)
115372640/1304172127 kB (8%), 0/1 tablespace (...atastaging/base/115868/154220.1)
115389568/1304172127 kB (8%), 0/1 tablespace (...atastaging/base/115868/154220.1)
115405792/1304172127 kB (8%), 0/1 tablespace (...atastaging/base/115868/154220.1)
115423776/1304172127 kB (8%), 0/1 tablespace (...atastaging/base/115868/154220.1)
115440640/1304172127 kB (8%), 0/1 tablespace (...atastaging/base/115868/154220.2)
115454656/1304172127 kB (8%), 0/1 tablespace (...atastaging/base/115868/154220.2)
pg_basebackup: could not read COPY data: could not receive data from server: Connection timed out
pg_basebackup: removing contents of data directory "/u01/PostgreSQL/11/datastaging"
It copied nearly 110 GB of data and then exited. Initially, we suspected a network/OS issue; however, copying a 150 GB file over the network as a test completed successfully.
What I observed is that a couple of hours pass between the two lines below.
115454656/1304172127 kB (8%), 0/1 tablespace (...atastaging/base/115868/154220.2)
pg_basebackup: could not read COPY data: could not receive data from server: Connection timed out
In other words, it ran for about an hour, and then it took another 2 hours before it timed out.
Can someone please help me out here?
Regards,
Ninad Shah
Ninad Shah <nshah.postgres@gmail.com> writes:
> What I observed is that it takes a couple of hours between below 2 lines.
> 115454656/1304172127 kB (8%), 0/1 tablespace
> (...atastaging/base/115868/154220.2)
> pg_basebackup: could not read COPY data: could not receive data from server:
> Connection timed out

We have heard reports of network connections dropping while pg_basebackup
is busy doing something disk-intensive such as fsync'ing. The apparent
2-hour delay here does not mean that pg_basebackup was out to lunch for
2 hours; more likely that reflects the TCP timeout delay before the kernel
realizes that the connection is lost. The actual blame probably resides
with some firewall or router that has a short timeout for idle connections.

I'd try turning on fairly aggressive TCP keepalive settings for the
connection, say keepalives_idle=30 or so.

			regards, tom lane
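[Editor's note: the keepalive parameters Tom mentions are libpq connection parameters, so they can be passed to pg_basebackup through its -d/--dbname connection string. A minimal sketch, assuming a hypothetical host and replication user (none of these names come from this thread):]

```shell
# Sketch only: host, user, and target directory are placeholders.
# keepalives=1 enables TCP keepalives; keepalives_idle=30 starts probing
# after 30 s of idleness, so a stateful firewall with a short idle timeout
# sees traffic even while the server is busy with disk-intensive work.
pg_basebackup \
  -d "host=primary.example.com user=replicator keepalives=1 keepalives_idle=30 keepalives_interval=10 keepalives_count=3" \
  -D /u01/PostgreSQL/11/datastaging \
  -P -v
```

The same parameters can alternatively be set via the PGOPTIONS-independent environment variables (e.g. PGKEEPALIVESIDLE=30) if changing the command line is inconvenient.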
Hey Tom,
Thank you for your response. Actually, when we copy data using scp/rsync, it works without any issue. But, it fails while attempting to transfer using pg_basebackup.
Would the keepalive setting mitigate the issue?
Regards,
Ninad Shah
On Fri, 22 Oct 2021 at 21:39, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Ninad Shah <nshah.postgres@gmail.com> writes:
> What I observed is that it takes a couple of hours between below 2 lines.
> 115454656/1304172127 kB (8%), 0/1 tablespace
> (...atastaging/base/115868/154220.2)
> pg_basebackup: could not read COPY data: could not receive data from server:
> Connection timed out
We have heard reports of network connections dropping while pg_basebackup
is busy doing something disk-intensive such as fsync'ing. The apparent
2-hour delay here does not mean that pg_basebackup was out to lunch for
2 hours; more likely that reflects the TCP timeout delay before the kernel
realizes that the connection is lost. The actual blame probably resides
with some firewall or router that has a short timeout for idle
connections.
I'd try turning on fairly aggressive TCP keepalive settings for the
connection, say keepalives_idle=30 or so.
regards, tom lane
Ninad Shah <nshah.postgres@gmail.com> writes:
> Would keepalive setting address and mitigate the issue?

[ shrug... ] Maybe; nobody else has more information about this
situation than you do. I suggested something to experiment with.

			regards, tom lane
Thanks Tom.
Regards,
Ninad Shah
On Sat, 23 Oct 2021 at 20:12, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Ninad Shah <nshah.postgres@gmail.com> writes:
> Would keepalive setting address and mitigate the issue?
[ shrug... ] Maybe; nobody else has more information about this
situation than you do. I suggested something to experiment with.
regards, tom lane