Обсуждение: problem with pg_standby

Поиск
Список
Период
Сортировка

problem with pg_standby

От
Gerd Koenig
Дата:
Hello,

we currently setup a standby database with archive_command sending the WALs
from master to standby.
This works as expected, but the standby database doesn't restore the WALs from
the given directory in recovery.conf and I have no idea why...

recovery.conf:
----------------
restore_command = '/usr/bin/pg_standby -d -s 20 -t /tmp/pgsql.trigger.5432
/var/lib/pgsql/wal_exchange %f %p %r 2>>
/var/lib/pgsql/data/pg_log/standby.log'

The file standby.log contains:
--------------------------------
Trigger file            : /tmp/pgsql.trigger.5432
Waiting for WAL file    : 00000001.history
WAL file path           : /var/lib/pgsql/wal_exchange/00000001.history
Restoring to...         : pg_xlog/RECOVERYHISTORY
Sleep interval          : 20 seconds
Max wait interval       : 0 forever
Command for restore     : cp "/var/lib/pgsql/wal_exchange/00000001.history"
"pg_xlog/RECOVERYHISTORY"
Keep archive history    : 000000000000000000000000 and later
running restore         :cp: cannot stat
`/var/lib/pgsql/wal_exchange/00000001.history': No such file or directory
cp: cannot stat `/var/lib/pgsql/wal_exchange/00000001.history': No such file or
directory
cp: cannot stat `/var/lib/pgsql/wal_exchange/00000001.history': No such file or
directory
cp: cannot stat `/var/lib/pgsql/wal_exchange/00000001.history': No such file or
directory
not restored            : history file not found

Trigger file            : /tmp/pgsql.trigger.5432
Waiting for WAL file    : 000000010000001E0000001D.00000020.backup
WAL file path           :
/var/lib/pgsql/wal_exchange/000000010000001E0000001D.00000020.backup
Restoring to...         : pg_xlog/RECOVERYHISTORY
Sleep interval          : 20 seconds
Max wait interval       : 0 forever
Command for restore     : cp
"/var/lib/pgsql/wal_exchange/000000010000001E0000001D.00000020.backup"
"pg_xlog/RECOVERYHISTORY"
Keep archive history    : 000000000000000000000000 and later
WAL file not present yet. Checking for trigger file...
WAL file not present yet. Checking for trigger file...
WAL file not present yet. Checking for trigger file...
WAL file not present yet. Checking for trigger file...
WAL file not present yet. Checking for trigger file...
-------------------------------

And in the directory "/var/lib/pgsql/wal_exchange" I have dozens of WALs in
the meantime, but they didn't get restored, and in the logs I receive "WAL file
not present...."
The file "pg_xlog/RECOVERYHISTORY" doesn't exist, btw..., and the file
xx00000020.backup contains:
""
START WAL LOCATION: 1E/1D000020 (file 000000010000001E0000001D)
STOP WAL LOCATION: 1E/1F000000 (file 000000010000001E0000001E)
CHECKPOINT LOCATION: 1E/1D000020
START TIME: 2010-08-03 21:11:23 CEST
LABEL: initial_backup
STOP TIME: 2010-08-03 21:44:06 CEST
""

I can stop and restart the standby db as often as I want ending up in a
working sync-mechanism, as long as the WALs being copied from master to
standby, right ?

Any help appreciated, many thanks .....GERD.....

Re: problem with pg_standby

От
"Joshua D. Drake"
Дата:
On Tue, 2010-08-03 at 22:37 +0200, Gerd Koenig wrote:
> Hello,
>
> we currently setup a standby database with archive_command sending the WALs
> from master to standby.
> This works as expected, but the standby database doesn't restore the WALs from
> the given directory in recovery.conf and I have no idea why...

What are you using to manage this? I would strongly suggest Walmgr or
PITRTools. They take a lot of the guess work out of all this stuff.

Joshua D. Drake

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt

Re: problem with pg_standby

От
Gerd Koenig
Дата:
Hi again,

I just want to drop you an additional note. After several attempts of
debugging pg_standby is restoring the WAL files as expected, but I cannot
explain why...

First startup of postgres was with init-script provided by the rpm
installation (/etc/init.d/postgresql start as user root). The pg processes
have been started thereby as user postgres.
An additional stop/start by init-script ended up in the same problem
situation, where pg_standby obviously wasn't able to get the file
000000010000001E0000001D.00000020.backup, even if it was there (and readable
by user postgres ;-) ).

Afterwards (3rd attempt) I started postgres not by init-script, but by pg_ctl
as user postgres, and......hej....heureka......pg_standby finds all needed files
and restores them.

Since even the init-script starts pg as user postgres I have no idea what
differs from init-script to direct call of pg_ctl as user postgres...?!?!

Somebody an explanation ?

kind regards....GERD....


On Tuesday, August 03, 2010 10:37:12 pm Gerd Koenig wrote:
> Hello,
>
> we currently setup a standby database with archive_command sending the WALs
> from master to standby.
> This works as expected, but the standby database doesn't restore the WALs
> from the given directory in recovery.conf and I have no idea why...
>
> recovery.conf:
> ----------------
> restore_command = '/usr/bin/pg_standby -d -s 20 -t /tmp/pgsql.trigger.5432
> /var/lib/pgsql/wal_exchange %f %p %r 2>>
> /var/lib/pgsql/data/pg_log/standby.log'
>
> The file standby.log contains:
> --------------------------------
> Trigger file            : /tmp/pgsql.trigger.5432
> Waiting for WAL file    : 00000001.history
> WAL file path           : /var/lib/pgsql/wal_exchange/00000001.history
> Restoring to...         : pg_xlog/RECOVERYHISTORY
> Sleep interval          : 20 seconds
> Max wait interval       : 0 forever
> Command for restore     : cp "/var/lib/pgsql/wal_exchange/00000001.history"
> "pg_xlog/RECOVERYHISTORY"
> Keep archive history    : 000000000000000000000000 and later
> running restore         :cp: cannot stat
> `/var/lib/pgsql/wal_exchange/00000001.history': No such file or directory
> cp: cannot stat `/var/lib/pgsql/wal_exchange/00000001.history': No such
> file or directory
> cp: cannot stat `/var/lib/pgsql/wal_exchange/00000001.history': No such
> file or directory
> cp: cannot stat `/var/lib/pgsql/wal_exchange/00000001.history': No such
> file or directory
> not restored            : history file not found
>
> Trigger file            : /tmp/pgsql.trigger.5432
> Waiting for WAL file    : 000000010000001E0000001D.00000020.backup
> WAL file path           :
> /var/lib/pgsql/wal_exchange/000000010000001E0000001D.00000020.backup
> Restoring to...         : pg_xlog/RECOVERYHISTORY
> Sleep interval          : 20 seconds
> Max wait interval       : 0 forever
> Command for restore     : cp
> "/var/lib/pgsql/wal_exchange/000000010000001E0000001D.00000020.backup"
> "pg_xlog/RECOVERYHISTORY"
> Keep archive history    : 000000000000000000000000 and later
> WAL file not present yet. Checking for trigger file...
> WAL file not present yet. Checking for trigger file...
> WAL file not present yet. Checking for trigger file...
> WAL file not present yet. Checking for trigger file...
> WAL file not present yet. Checking for trigger file...
> -------------------------------
>
> And in the directory "/var/lib/pgsql/wal_exchange" I have dozens of WALs in
> the meantime, but they didn't get restored, and in the logs I receive "WAL
> file not present...."
> The file "pg_xlog/RECOVERYHISTORY" doesn't exist, btw..., and the file
> xx00000020.backup contains:
> ""
> START WAL LOCATION: 1E/1D000020 (file 000000010000001E0000001D)
> STOP WAL LOCATION: 1E/1F000000 (file 000000010000001E0000001E)
> CHECKPOINT LOCATION: 1E/1D000020
> START TIME: 2010-08-03 21:11:23 CEST
> LABEL: initial_backup
> STOP TIME: 2010-08-03 21:44:06 CEST
> ""
>
> I can stop and restart the standby db as often as I want ending up in a
> working sync-mechanism, as long as the WALs being copied from master to
> standby, right ?
>
> Any help appreciated, many thanks .....GERD.....

Re: problem with pg_standby

От
Greg Smith
Дата:
Gerd Koenig wrote:
> Since even the init-script starts pg as user postgres I have no idea what
> differs from init-script to direct call of pg_ctl as user postgres...?!?!
>

Do you have SELinux turned on?  That can do weird stuff like this--the
init script will be running with restrictions the manual pg_ctl launch
will not have.  Should be information in /var/log/ somewhere if that's
the case.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us


Re: problem with pg_standby

От
"Joshua D. Drake"
Дата:
On Tue, 2010-08-03 at 22:37 +0200, Gerd Koenig wrote:
> Hello,
>
> we currently setup a standby database with archive_command sending the WALs
> from master to standby.
> This works as expected, but the standby database doesn't restore the WALs from
> the given directory in recovery.conf and I have no idea why...

What are you using to manage this? I would strongly suggest Walmgr or
PITRTools. They take a lot of the guess work out of all this stuff.

Joshua D. Drake

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt


Re: problem with pg_standby

От
Gerd Koenig
Дата:
Hello Greg,

thanks for the hint, yes, SELinux caused the troubles. It complained about
wrong filecontext while starting postgres via init-script.

Filecontext was: var_lib_t and it should be: postgresql_t

regards...GERD....

On Tuesday, August 03, 2010 11:54:45 pm Greg Smith wrote:
> Gerd Koenig wrote:
> > Since even the init-script starts pg as user postgres I have no idea what
> > differs from init-script to direct call of pg_ctl as user postgres...?!?!
>
> Do you have SELinux turned on?  That can do weird stuff like this--the
> init script will be running with restrictions the manual pg_ctl launch
> will not have.  Should be information in /var/log/ somewhere if that's
> the case.

Re: problem with pg_standby

От
Greg Smith
Дата:
Gerd Koenig wrote:
> thanks for the hint, yes, SELinux caused the troubles. It complained about
> wrong filecontext while starting postgres via init-script.
>
> Filecontext was: var_lib_t and it should be: postgresql_t
>
If you want to keep SELinux on, basically you have to relabel the
directory you are putting those into so it can access them.  You might
find some useful hints on that topic at
http://serverfault.com/questions/32333/how-does-selinux-affect-the-home-directory
(including the observation that /home is a bad place for them).  You
might think you can just run the chcon command to reset the labels, but
it doesn't quite work like that; you have to change the policy and then
use restorecon to correct them.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us


Re: problem with pg_standby

От
Gerd Koenig
Дата:
thanks Greg,
yes, we want SELinux in enforcing mode.
Thereby (and to ensure persistence) just chcon is the wrong way and
* semanage fcontext -a -t postgresql_t '<dir>'
* restorecon -vvFR <dir>
is much better ;-)

regards--GERD--

On Wednesday, August 04, 2010 07:56:56 am Greg Smith wrote:
> Gerd Koenig wrote:
> > thanks for the hint, yes, SELinux caused the troubles. It complained
> > about wrong filecontext while starting postgres via init-script.
> >
> > Filecontext was: var_lib_t and it should be: postgresql_t
>
> If you want to keep SELinux on, basically you have to relabel the
> directory you are putting those into so it can access them.  You might
> find some useful hints on that topic at
> http://serverfault.com/questions/32333/how-does-selinux-affect-the-home-dir
> ectory (including the observation that /home is a bad place for them).  You
> might think you can just run the chcon command to reset the labels, but it
> doesn't quite work like that; you have to change the policy and then use
> restorecon to correct them.