Discussion: missing history file

missing history file

From
"George Wilk"
Date:

My warm standby server looks for a history file when booting up; to be exact, it looks for 00000001.history. Since my *live* server doesn’t produce such a file, I create an empty 00000001.history file in the archive directory, and that seems to satisfy the requirement, allowing the standby server to come up in recovery mode.

It would be nice to know, though, how such a file is created. I know that the *live* server creates a *.backup file as a result of the pg_stop_backup() command, and that this file is correctly archived on the standby server. Should I be renaming this file, or is there some other mechanism to tell my standby server what the history file is?

Thanks in advance,

~george

Re: missing history file

From
"Simon Riggs"
Date:
On Fri, 2007-06-29 at 07:55 -0400, George Wilk wrote:

> My warm standby server is looking for a history file when booting up.
> It is looking for 00000001.history file to be exact.

Just ignore 00000001. Recovery will work fine even if absent. Don't
ignore all history files though, just that one. Hmmm, come to think of
it, why is it requesting it at all? We should just skip that request.

>  Since my *live* server doesn’t produce such file, I create an empty
> 00000001.history file in the archive directory and that seems to
> satisfy this requirement allowing the standby server to come up in the
> recovery mode.

Well, that should at least generate a message that says "history file
empty, using targetTLI".

> It would be nice to know though how such file is created.  I know that
> the *live* server creates *.backup file as a result of
> pg_stop_backup() command and this file is accurately archived on the
> standby server.  Should I be renaming this file, or is there some
> other mechanism to tell my standby server what the history file is?

The timeline history file is only required when you do a recovery of a
system that has itself already undergone a PITR.

Have a look at pg_standby, accessible via CVS in contrib/pg_standby.

--
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com



Re: missing history file

From
"George Wilk"
Date:
Thanks, Simon.  I will ignore the request for the history file in my
restore_command from now on.

Is the timeline history file needed when trying to put the standby server
back into the recovery mode, after it assumed the primary role?  (i.e.
standby server goes *live*, and is subsequently restarted in the recovery
mode).  Is this a valid scenario at all, or should I be taking a new base
backup and starting over?  I am running into some problems when attempting
this.

~george

-----Original Message-----
From: Simon Riggs [mailto:simon@2ndquadrant.com]
Sent: Friday, June 29, 2007 9:42 AM
To: George Wilk
Cc: pgsql-admin@postgresql.org
Subject: Re: [ADMIN] missing history file


Re: missing history file

From
"Simon Riggs"
Date:
On Fri, 2007-06-29 at 09:57 -0400, George Wilk wrote:
> Thanks, Simon.  I will ignore the request for the history file in my
> restore_command from now on.
>
> Is the timeline history file needed when trying to put the standby server
> back into the recovery mode, after it assumed the primary role?  (i.e.
> standby server goes *live*, and is subsequently restarted in the recovery
> mode).

No, not needed.

--
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com



Re: missing history file

From
Tom Lane
Date:
"Simon Riggs" <simon@2ndquadrant.com> writes:
> Just ignore 00000001. Recovery will work fine even if absent. Don't
> ignore all history files though, just that one. Hmmm, come to think of
> it, why is it requesting it at all? We should just skip that request.

No, because then people would misdesign their recovery scripts to not
be able to deal with not finding a history file.  As things are, they
will certainly be exposed to that case in any testing they do.  If we
optimize this call away, then they won't see the case until they're in
very deep doo-doo.

            regards, tom lane

Re: missing history file

From
"George Wilk"
Date:
Tom,

When and by what process is the history file being created?  My standby
server seems to be looking for it when put back in the recovery mode, after
functioning as primary for a while.
How should I handle missing history file in my script?

Cheers,
~george

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Friday, June 29, 2007 10:53 AM
To: Simon Riggs
Cc: George Wilk; pgsql-admin@postgresql.org
Subject: Re: [ADMIN] missing history file

Re: missing history file

From
"Kevin Grittner"
Date:
>>> On Fri, Jun 29, 2007 at  9:52 AM, in message <12838.1183128750@sss.pgh.pa.us>,
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Simon Riggs" <simon@2ndquadrant.com> writes:
>> Just ignore 00000001. Recovery will work fine even if absent. Don't
>> ignore all history files though, just that one. Hmmm, come to think of
>> it, why is it requesting it at all? We should just skip that request.
>
> No, because then people would misdesign their recovery scripts to not
> be able to deal with not finding a history file.  As things are, they
> will certainly be exposed to that case in any testing they do.  If we
> optimize this call away, then they won't see the case until they're in
> very deep doo-doo.

We certainly were exposed to the case.  We weren't able to turn up any
documentation on it, so we added these lines to our recovery script:

if [[ $1 == *.history ]] ; then
  exit 1
fi

Our warm standbys have apparently been working fine since.

Is there documentation of this that we missed?

Are our warm standby databases useful at this point, or have we wandered
into very deep doo-doo already?

Based on Simon's email, I went and modified one line of our script.
I'll paste the current form below my "signature".  Please let me know
if we're off base.

-Kevin


#! /bin/bash

# Pick out county name from the back of the path.
# The value of $PWD will be: /var/pgsql/data/county/<countyName>/data
countyName=$(basename "$(dirname "$PWD")")
countyDir=/var/pgsql/data/county/$countyName

# Keep waiting while the requested file is absent and no DONE marker exists,
# or while an rsync into wal-files is still in progress.
while [[ ( ! -f $countyDir/wal-files/$1.gz && ! -f $countyDir/DONE ) \
         || -f $countyDir/wal-files/rsync-in-progress ]]
do
  # Never wait for the first timeline's history file; report "not found".
  if [[ $1 == 00000001.history ]] ; then
    exit 1
  fi
  sleep 10   # wait for ~10 sec
done

# Fails with a non-zero status if the file never arrived (DONE case).
gunzip < "$countyDir/wal-files/$1.gz" > "$2"


Re: missing history file

From
Tom Lane
Date:
"George Wilk" <gwilk@ellacoya.com> writes:
> When and by what process is the history file being created?  My standby
> server seems to be looking for it when put back in the recovery mode, after
> functioning as primary for a while.
> How should I handle missing history file in my script?

History files are only created when you do a PITR recovery that stops
short of the end of WAL (ie, you gave it an explicit stopping point
criterion).  So basically they never appear except by manual
intervention on the primary server.  A standby script should probably
handle requests for them by looking to see if they're available, and
returning 'em if so, but not waiting if they are not.

Offhand I would recommend the same strategy for any requested filename
that's not a plain WAL segment file (ie, all hex digits).

            regards, tom lane
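
A minimal sketch of the strategy described above, assuming a flat archive directory of uncompressed files. ARCHIVE_DIR, POLL_SECONDS and both function names are hypothetical, and none of this comes from the scripts posted in the thread. Plain WAL segments (24 hex digits) are waited for; any other request is served if the file is already there and fails at once if it is not.

```shell
# Sketch only: ARCHIVE_DIR, POLL_SECONDS and the function names are
# assumptions for illustration, not taken from the thread.
ARCHIVE_DIR=${ARCHIVE_DIR:-/var/lib/pgsql/wal-archive}
POLL_SECONDS=${POLL_SECONDS:-10}

# True only for a plain WAL segment name: exactly 24 hex digits.
is_wal_segment() {
  case $1 in
    *[!0-9A-Fa-f]*) return 1 ;;   # contains a non-hex character
  esac
  [ "${#1}" -eq 24 ]
}

# restore_file <requested-name> <destination-path>
# Waits for plain WAL segments. For anything else (.history, .backup, ...)
# it returns the file if already archived and fails immediately otherwise.
restore_file() {
  if is_wal_segment "$1"; then
    until [ -f "$ARCHIVE_DIR/$1" ]; do
      sleep "$POLL_SECONDS"
    done
  elif [ ! -f "$ARCHIVE_DIR/$1" ]; then
    return 1
  fi
  cp "$ARCHIVE_DIR/$1" "$2"
}
```

A wrapper script would call restore_file "$1" "$2" and be named in restore_command with %f and %p as its two arguments. The sketch leaves out any failover trigger, such as the DONE marker in Kevin's script, so the wait for a segment never ends on its own.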

Re: missing history file

From
"Kevin Grittner"
Date:
>>> On Fri, Jun 29, 2007 at 11:47 AM, in message <5332.1183135679@sss.pgh.pa.us>,
Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> History files are only created when you do a PITR recovery that stops
> short of the end of WAL (ie, you gave it an explicit stopping point
> criterion).  So basically they never appear except by manual
> intervention on the primary server.  A standby script should probably
> handle requests for them by looking to see if they're available, and
> returning 'em if so, but not waiting if they are not.
>
> Offhand I would recommend the same strategy for any requested filename
> that's not a plain WAL segment file (ie, all hex digits).

I suspect that it's worth waiting for something like this, too?:

000000010000000A000000CF.0000E744.backup




Re: missing history file

From
Tom Lane
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Offhand I would recommend the same strategy for any requested filename
>> that's not a plain WAL segment file (ie, all hex digits).
>
> I suspect that it's worth waiting for something like this, too?:
> 000000010000000A000000CF.0000E744.backup

No, I don't think so.  AFAICS the slave server would only ask for one
of those during its initial cold start from a base backup, and it'll be
looking for the one that should have been generated at completion of
that base backup.  If it ain't there, it's unlikely to appear later.

            regards, tom lane
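
The three kinds of file name that have come up in this thread can be told apart by shape alone. The helper below is a hedged illustration: classify_request and should_wait are hypothetical names, and the rule "poll only for plain segments" is simply the advice given above turned into code.

```shell
# Sketch only: classify_request and should_wait are hypothetical helpers.
# Prints "segment", "history", "backup" or "other" for a requested file name.
classify_request() {
  case $1 in
    *.history) echo history ;;          # e.g. 00000001.history
    *.backup)  echo backup ;;           # e.g. 000000010000000A000000CF.0000E744.backup
    *[!0-9A-Fa-f]*|"") echo other ;;    # empty, or contains a non-hex character
    *) if [ "${#1}" -eq 24 ]; then echo segment; else echo other; fi ;;
  esac
}

# Only plain segments are worth polling for. History and backup files are
# either already in the archive or unlikely to appear later.
should_wait() {
  [ "$(classify_request "$1")" = segment ]
}
```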

Re: missing history file

From
"Kevin Grittner"
Date:
>>> On Fri, Jun 29, 2007 at 12:29 PM, in message <5750.1183138146@sss.pgh.pa.us>,
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
>> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Offhand I would recommend the same strategy for any requested filename
>>> that's not a plain WAL segment file (ie, all hex digits).
>>
>> I suspect that it's worth waiting for something like this, too?:
>> 000000010000000A000000CF.0000E744.backup
>
> No, I don't think so.  AFAICS the slave server would only ask for one
> of those during its initial cold start from a base backup, and it'll be
> looking for the one that should have been generated at completion of
> that base backup.  If it ain't there, it's unlikely to appear later.

Fair enough.  It would have saved us some time if this was mentioned
in the warm standby documentation.  I'll try to put a doc patch together.

-Kevin




Re: missing history file

From
"Simon Riggs"
Date:
On Fri, 2007-06-29 at 10:52 -0400, Tom Lane wrote:
> "Simon Riggs" <simon@2ndquadrant.com> writes:
> > Just ignore 00000001. Recovery will work fine even if absent. Don't
> > ignore all history files though, just that one. Hmmm, come to think of
> > it, why is it requesting it at all? We should just skip that request.
>
> No, because then people would misdesign their recovery scripts to not
> be able to deal with not finding a history file.  As things are, they
> will certainly be exposed to that case in any testing they do.  If we
> optimize this call away, then they won't see the case until they're in
> very deep doo-doo.

Main reason for suggesting this is that there is already code to
optimize the call away in one place, but not in another, which seems
strange either way.

--
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com