I have already tried experimenting with the Linux dirty_ratio settings etc. You can only tune them up to a point; the
backup process still fills up the buffer cache very quickly. Yes, my database is about 5-6 GB in size and will grow bigger over time.
I wish there was a way to slow down pg_basebackup or force it to use direct I/O.
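
To make the "slow it down" idea concrete, here is a minimal sketch of a rate-limited copy that also calls fsync() at regular intervals, so that dirty pages are flushed in small batches instead of piling up. This is not something pg_basebackup does itself; the function name throttled_copy and its parameters are hypothetical, and it assumes the backup data is available as a readable stream.

```python
import os
import time


def throttled_copy(src, dst, max_bytes_per_sec,
                   chunk_size=64 * 1024, sync_every=8 * 1024 * 1024):
    """Copy src to dst at roughly max_bytes_per_sec.

    Hypothetical helper: dst must be a real file object (it needs a
    file descriptor for fsync). fsync is called every sync_every bytes
    so the amount of unflushed (dirty) data stays bounded.
    """
    if max_bytes_per_sec <= 0:
        raise ValueError("max_bytes_per_sec must be positive")

    start = time.monotonic()
    copied = 0
    since_sync = 0

    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        copied += len(chunk)
        since_sync += len(chunk)

        # Flush in small batches instead of one large burst at the end.
        if since_sync >= sync_every:
            dst.flush()
            os.fsync(dst.fileno())
            since_sync = 0

        # If we are ahead of the byte budget, sleep until we are back on it.
        budget_time = copied / max_bytes_per_sec
        elapsed = time.monotonic() - start
        if budget_time > elapsed:
            time.sleep(budget_time - elapsed)

    # Final flush so everything written is on disk before returning.
    dst.flush()
    os.fsync(dst.fileno())
    return copied
```

The rate cap and the sync interval are separate knobs: the first limits how fast dirty pages are produced, the second limits how many can accumulate before a flush is forced.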
________________________________________
From: Haribabu Kommi [kommi.haribabu@gmail.com]
Sent: Monday, March 10, 2014 8:31 PM
To: Aggarwal, Ajay
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] replication timeout in pg_basebackup
On Tue, Mar 11, 2014 at 7:07 AM, Aggarwal, Ajay <aaggarwal@verizon.com> wrote:
> Thanks Hari Babu.
>
> I think what is happening is that my dirty cache builds up quickly for the
> volume where I am backing up. This would trigger flush of these dirty pages
> to the disk. While this flush is going on pg_basebackup tries to do fsync()
> on a received WAL file and gets blocked.
But the sync is executed each time a WAL file is finished. Is your
database big in size?
Is your setup doing write-heavy operations?
In Linux, when it tries to write a bunch of buffers at once, the fsync
call might block for some time.
In the following link there are some "Tuning Recommendations for
write-heavy operations" which might be useful to you.
http://www.westnet.com/~gsmith/content/linux-pdflush.htm
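
Before changing anything, it can help to see the current writeback settings. The following is a minimal sketch that reads the standard Linux vm.dirty_* values from /proc/sys/vm; the function name read_vm_dirty_settings and the proc_dir parameter are hypothetical, and the set of files listed is an assumption about which knobs are relevant here.

```python
import os

# Standard Linux writeback tunables exposed under /proc/sys/vm.
DIRTY_SETTINGS = (
    "dirty_ratio",
    "dirty_background_ratio",
    "dirty_expire_centisecs",
    "dirty_writeback_centisecs",
)


def read_vm_dirty_settings(proc_dir="/proc/sys/vm"):
    """Return a dict of the vm.dirty_* settings found in proc_dir.

    Hypothetical helper: settings whose file is missing or unreadable
    are skipped rather than reported as errors.
    """
    settings = {}
    for name in DIRTY_SETTINGS:
        path = os.path.join(proc_dir, name)
        try:
            with open(path) as f:
                settings[name] = int(f.read().strip())
        except (OSError, ValueError):
            continue
    return settings
```

Calling read_vm_dirty_settings() on a Linux host returns the values currently in effect; on other systems it simply returns an empty dict.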
Any other ideas to handle these kinds of problems?
Regards,
Hari Babu
Fujitsu Australia