On 2014-12-16 18:37:48 +0200, Heikki Linnakangas wrote:
> On 12/11/2014 04:21 PM, Marco Nenciarini wrote:
> >On 11/12/14 12:38, Andres Freund wrote:
> >>On December 11, 2014 9:56:09 AM CET, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
> >>>On 12/11/2014 05:45 AM, Andres Freund wrote:
> >>>
> >>>Yeah. I was not able to reproduce this, but I'm clearly missing
> >>>something, since both you and Sergey have seen this happening. Can you
> >>>write a script to reproduce?
> >>
> >>Not right now, I only have my mobile... It's quite easy though. Create a pg_basebackup from a standby. Create a
> >>recovery.conf with a broken primary conninfo. Start. Shutdown. Fix conninfo. Start.
> >>
> >
> >Just tested it. These steps are not sufficient to reproduce the issue on
> >a test installation. I suppose that is because, on a small test datadir,
> >the checkpoint location and the redo location in pg_control are the same
> >as those present in the backup_label.
> >
> >To trigger this bug, at least one restartpoint must have happened on the
> >standby between the start and the end of the backup.
> >
> >You could simulate it by issuing a checkpoint on the master, then a
> >checkpoint on the standby (to force a restartpoint), and then copying
> >the pg_control from the standby.
> >
> >This way I've been able to reproduce it.
>
> Ok, got it. I was able to reproduce this by using pg_basebackup
> --max-rate=1024, and issuing "CHECKPOINT" in the standby while the backup
> was running.

FWIW, I can reproduce it without any such hangups. I've just tested it
using my local scripts:

# create primary
$ reinit-pg-dev-master
$ run-pg-dev-master
# create first standby
$ reinit-pg-dev-master-standby
$ run-pg-dev-master-standby
# create 2nd standby
$ pg_basebackup -h /tmp/ -p 5441 -D /tmp/tree --write-recovery-conf
$ PGHOST=frakbar run-pg-dev-master-standby -D /tmp/tree
LOG: creating missing WAL directory "pg_xlog/archive_status"
LOG: entering standby mode
FATAL: could not connect to the primary server: could not translate host name "frakbar" to address: Name or service not known
$ PGHOST=/tmp run-pg-dev-master-standby -D /tmp/tree
LOG: started streaming WAL from primary at 0/2000000 on timeline 1
FATAL: backup_label contains data inconsistent with control file
HINT: This means that the backup is corrupted and you will have to use another backup for recovery.
After the fix I just pushed, that sequence works.
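
For anyone without those local wrapper scripts, the sequence above could be approximated with stock tools. This is only a rough sketch under assumptions, not what the wrappers actually do: the port for the second standby (5442), the hand-written recovery.conf, the sleep, and the DRY_RUN switch are all made up for illustration. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch only: ports, paths and the recovery.conf contents are assumptions.
STANDBY_PORT=${STANDBY_PORT:-5441}   # first standby, as in the transcript
NEW_PORT=${NEW_PORT:-5442}           # hypothetical port for the 2nd standby
NEW_DATADIR=${NEW_DATADIR:-/tmp/tree}
DRY_RUN=${DRY_RUN:-1}                # 1 = print commands instead of running them

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Write a recovery.conf whose primary_conninfo points at host $1.
write_recovery_conf() {
    conf="standby_mode = 'on'
primary_conninfo = 'host=$1 port=$STANDBY_PORT'"
    if [ "$DRY_RUN" = 1 ]; then
        printf '+ write %s/recovery.conf (host=%s)\n' "$NEW_DATADIR" "$1"
    else
        printf '%s\n' "$conf" > "$NEW_DATADIR/recovery.conf"
    fi
}

repro() {
    # Base backup taken from the first standby.
    run pg_basebackup -h /tmp -p "$STANDBY_PORT" -D "$NEW_DATADIR"
    # First start with a host name that does not resolve, then shut down.
    write_recovery_conf frakbar
    run pg_ctl -D "$NEW_DATADIR" -o "-p $NEW_PORT" start
    run sleep 5
    run pg_ctl -D "$NEW_DATADIR" -m fast -w stop
    # Fix the conninfo and start again; before the fix, this start is
    # where the backup_label FATAL appeared.
    write_recovery_conf /tmp
    run pg_ctl -D "$NEW_DATADIR" -o "-p $NEW_PORT" start
}

repro
```

Set DRY_RUN=0 to actually execute the steps against a running primary and standby. On a small test datadir this may still need a restartpoint on the standby during the backup, as discussed upthread.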

Greetings,

Andres Freund

-- 
 Andres Freund	                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services