Thread: pg_basebackup: could not receive data from WAL stream

pg_basebackup: could not receive data from WAL stream

From
greigwise
Date:
Hello.

On PostgreSQL 10.5, my pg_basebackup is failing with this error:

pg_basebackup: could not receive data from WAL stream: server closed the
connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request

In the postgres log files, I'm seeing:

2018-09-02 00:57:32 UTC bkp_user 5b8b278c.11c3f [unknown] LOG:  terminating
walsender process due to replication timeout

I'm running the following command right on the database server itself:

pg_basebackup -U repl -D /var/tmp/pg_basebackup_20180901 -Ft -z

It seems to be an intermittent problem: it fails about half the time.
I even bumped wal_sender_timeout up to 2000.  One notable thing is that
I'm running on an EC2 instance on AWS.

Any advice would be helpful.

Greig Wise



--
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html


Re: pg_basebackup: could not receive data from WAL stream

From
greigwise
Date:
I should also add that when it fails, it's always right at the very end of
the backup when it's very nearly done or maybe even after it's done.

Thanks again.

Greig





Re: pg_basebackup: could not receive data from WAL stream

From
Adrian Klaver
Date:
On 09/01/2018 09:06 PM, greigwise wrote:
> Hello.
> 
> On PostgreSQL 10.5, my pg_basebackup is failing with this error:
> 
> pg_basebackup: could not receive data from WAL stream: server closed the
> connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request
> 
> In the postgres log files, I'm seeing:
> 
> 2018-09-02 00:57:32 UTC bkp_user 5b8b278c.11c3f [unknown] LOG:  terminating
> walsender process due to replication timeout
> 
> I'm running the following command right on the database server itself:
> 
> pg_basebackup -U repl -D /var/tmp/pg_basebackup_20180901 -Ft -z
> 
> It seems to be an intermittent problem: it fails about half the time.
> I even bumped wal_sender_timeout up to 2000.  One notable thing is that
> I'm running on an EC2 instance on AWS.

The unit for wal_sender_timeout is ms, so the above is 2 seconds, whereas
the default value is 60 seconds (60s in the postgresql.conf file).

See the following page for how to specify units in the file:

https://www.postgresql.org/docs/10/static/config-setting.html
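To make the unit point concrete, here is a minimal Python sketch of how a bare number and a value with a unit suffix end up as milliseconds. The helper name timeout_to_ms and the unit table are illustrative assumptions, not PostgreSQL code; the sketch only mirrors the rule that a value without a unit is read in the parameter's base unit, which is ms for wal_sender_timeout.

```python
import re

# Multipliers to milliseconds for the time units accepted in
# postgresql.conf values (ms, s, min, h, d). Illustrative table only.
_UNIT_TO_MS = {"ms": 1, "s": 1000, "min": 60_000, "h": 3_600_000, "d": 86_400_000}


def timeout_to_ms(value: str, default_unit: str = "ms") -> int:
    """Convert a setting such as '2000' or '60s' to milliseconds.

    Hypothetical helper: a bare number is interpreted in default_unit,
    which is 'ms' for wal_sender_timeout.
    """
    match = re.fullmatch(r"\s*(\d+)\s*([a-z]*)\s*", value)
    if match is None:
        raise ValueError(f"cannot parse time setting: {value!r}")
    number, unit = match.groups()
    unit = unit or default_unit
    if unit not in _UNIT_TO_MS:
        raise ValueError(f"unknown time unit: {unit!r}")
    return int(number) * _UNIT_TO_MS[unit]


print(timeout_to_ms("2000"))  # bare number: 2000 ms, i.e. 2 seconds
print(timeout_to_ms("60s"))   # the default: 60000 ms, i.e. 60 seconds
```

So a setting of 2000 is thirty times shorter than the default, not longer.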

Also what is your max_wal_senders setting?

> 
> Any advice would be helpful.
> 
> Greig Wise
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com


Re: pg_basebackup: could not receive data from WAL stream

From
Kaixi Luo
Date:
wal_sender_timeout should be as long as necessary. Each WAL file is 16MB by default, so it should be *at least* as long as the time needed to transfer 16MB*wal_keep_segments. Take a look at the size of your pg_wal directory (named pg_xlog before PostgreSQL 10).
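
As a rough illustration of that rule of thumb, the arithmetic can be sketched in Python. The function name, the 64-segment figure, and the 10 MB/s throughput are made-up assumptions for the example, not values from this thread; measure your own transfer rate before relying on the result.

```python
WAL_SEGMENT_MB = 16  # default WAL segment size


def min_timeout_seconds(wal_keep_segments: int, throughput_mb_per_s: float) -> float:
    """Time needed to transfer 16MB * wal_keep_segments at a given rate.

    Hypothetical helper that only evaluates the rule of thumb above.
    """
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    return WAL_SEGMENT_MB * wal_keep_segments / throughput_mb_per_s


# Assumed example: 64 retained segments at an assumed 10 MB/s.
print(min_timeout_seconds(64, 10.0))  # 1024 MB / 10 MB/s = 102.4 seconds
```

With those assumed numbers the estimate is already above the 60-second default, and far above a 2-second setting.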
