On Wed, Aug 21, 2013 at 5:09 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
>> Thu Aug 15 18:33:09 MSK 2013 ERROR archive_wal.sh: Problem occured
>> during WAL archiving: pg_receivexlog: could not send feedback packet:
>> server closed the connection unexpectedly
>>
>> At the same time postgres reported this error in log:
>>
>> 2013-08-15 18:32:51 MSK 30945 postgres@[unknown] from [local]
>> [vxid:53/0 txid:0] [streaming 2A97/6FA48000] LOG: terminating
>> walsender process due to replication timeout
>
> Is the value of replication_timeout sufficiently-larger than the status-interval
> of pg_receivexlog?
The replication_timeout is 60s.
The archive_wal.sh (script-wrapper around pg_receivexlog) reports its
status straight away as it falls with an error. Below is the
explanation of how it works.
This is the core of archive_wal.sh:
(
flock -xn 543 || exit 0
result=$($PGRECEIVEXLOG -n -D $WAL_ARCHIVE_DIR 2>&1) || \
die "Problem occured during WAL archiving: $result."
) 543>$WAL_LOCK_FILE
And it is set to run by cron once a minute reporting me by email on occasions:
MAILTO=gray.ru@gmail.com
* * * * * /bin/bash /var/lib/pgsql/tmsdb/archive_wal.sh
>>/var/log/tmsdb/archive_wal.log
--
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA
http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray.ru@gmail.com