Thread: [ADMIN] Bad recovery: no pg_xlog/RECOVERYXLOG

[ADMIN] Bad recovery: no pg_xlog/RECOVERYXLOG

From: Marcin Koziej
Date:
Hi!

I am trying to set up continuous archiving with PG 9.6 according to this
documentation:
https://www.postgresql.org/docs/9.6/static/continuous-archiving.html

I have wal_level set to replica, archive_mode is on, and
archive_command is properly copying WAL segments to backup storage.

Having this running, I make a successful tar base backup using
pg_basebackup.

I then stop the DB, remove the data directory, unpack base backup to it,
create recovery.conf with a proper restore_command, run the server, and get:

LOG:  database system was interrupted; last known up at 2017-10-25 15:47:37 UTC
LOG:  starting archive recovery
Object 'pg_small3/pg_xlog/RECOVERYXLOG.lzo' not found
Cannot download pg_xlog/RECOVERYXLOG.lzo
LOG:  invalid checkpoint record
FATAL:  could not locate required checkpoint record
HINT:  If you are not restoring from a backup, try removing the file "/var/lib/postgresql/data/backup_label".
LOG:  startup process (PID 20) exited with exit code 1
LOG:  aborting startup due to startup process failure
LOG:  database system is shut down

The message about "pg_xlog/RECOVERYXLOG.lzo" is written out by
restore_command. Indeed, the file is not in the backup storage, and
pg_xlog/RECOVERYXLOG was NEVER sent there by archive_command (which
compresses and adds .lzo extension)!

What could I be doing wrong?

--
Marcin Koziej

GPG key: http://go.cahoots.pl/gpg/  Ϟ  Twitter: @movonw




--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

Re: [ADMIN] Bad recovery: no pg_xlog/RECOVERYXLOG

From: Marcin Koziej
Date:
I got it.

The semantics of archive_command suggest that, for restore_command, %f
is the basename of %p. That is not the case: %p is the local destination
path in the data directory (here pg_xlog/RECOVERYXLOG), and %f is the
name of the archived (backed-up) WAL file to fetch.
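
To make the distinction concrete, here is a minimal Python sketch of how the two placeholders end up in a restore_command. The `expand` helper, the `fetch-wal` command and the segment name are hypothetical and only for illustration; the meaning given to %f and %p follows the PostgreSQL documentation, not the scripts attached to this thread.

```python
def expand(template: str, wal_name: str, dest_path: str) -> str:
    """Fill in %f, %p and %% roughly the way the server does for restore_command.

    %f -> name of the WAL file to fetch from the archive
    %p -> path (relative to the data directory) to copy it to
    %% -> a literal percent sign
    """
    replacements = {"f": wal_name, "p": dest_path, "%": "%"}
    out = []
    i = 0
    while i < len(template):
        ch = template[i]
        nxt = template[i + 1] if i + 1 < len(template) else ""
        if ch == "%" and nxt in replacements:
            out.append(replacements[nxt])
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# Hypothetical values: during recovery the server asks for a named segment
# and wants it written to a temporary file under pg_xlog.
wal_name = "000000010000000000000003"
dest_path = "pg_xlog/RECOVERYXLOG"

# Wrong: derives the archive object name from %p, so it looks for RECOVERYXLOG.lzo.
wrong = expand("fetch-wal %p.lzo %p", wal_name, dest_path)
# Right: the archive object is named after %f; %p is only the local destination.
right = expand("fetch-wal %f.lzo %p", wal_name, dest_path)

print(wrong)
print(right)
```

The first command reproduces the failing lookup from the log above (`pg_xlog/RECOVERYXLOG.lzo`); the second asks the archive for the segment by its real name.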

Marcin Koziej 

GPG key: http://go.cahoots.pl/gpg/  Ϟ  Twitter: @movonw 

On 25.10.2017 18:50, Marcin Koziej wrote:
> What could I be doing wrong?

Re: [ADMIN] Bad recovery: no pg_xlog/RECOVERYXLOG

From: Samed YILDIRIM
Date:
Hi Marcin,
 
Could you please share the archive_command and restore_command you used? If you are using scripts inside the restore or archive command, please share them as well. It looks like the problem is related to them.
 
Best regards.
Samed YILDIRIM
 
 
25.10.2017, 19:52, "Marcin Koziej" <marcin@cahoots.pl>:
> What could I be doing wrong?

Re: [ADMIN] Bad recovery: no pg_xlog/RECOVERYXLOG

From: Marcin Koziej
Date:

It's fixed now, but in case anyone needs them, I'm attaching all the scripts to 1) back up and restore WAL files and 2) back up and restore a base backup with OpenStack Swift.


Marcin Koziej 

GPG key: http://go.cahoots.pl/gpg/  Ϟ  Twitter: @movonw 
On 25.10.2017 19:30, Samed YILDIRIM wrote:
> Could you please share archive_command and restore_command you used?


Attachments

Re: [ADMIN] Bad recovery: no pg_xlog/RECOVERYXLOG

From: Stephen Frost
Date:
Greetings,

* Marcin Koziej (marcin@cahoots.pl) wrote:
> Now it's fixed, but if anyone needs I'm attaching all scripts to 1)
> backup and restore wal's and 2) backup and restore base backup from
> OpenStack SWIFT

Interesting, but these scripts seem to be seriously lacking in error
checking (what happens if the copy to swift fails..?  or pg_basebackup
fails?) and it's unclear how you can be sure that the WAL file has been
sync'd to disk which is important or you might end up having holes in
your WAL stream if the swift system fails.  There's also no checking to
make sure that the WAL needed for a given pg_basebackup ever actually
made it to the swift system, which is required to ensure you have a
consistent backup.
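
As a rough illustration of the error checking and syncing described above, here is a minimal Python sketch of an archive step that reports failure through its return value and fsyncs both the copied file and its directory. It assumes a local POSIX filesystem as the archive target, not Swift, and the function name and layout are hypothetical; it is not the script attached to this thread.

```python
import os
import shutil
import sys


def archive_segment(src: str, archive_dir: str) -> int:
    """Copy one WAL segment into archive_dir durably; return a shell-style exit code."""
    dest = os.path.join(archive_dir, os.path.basename(src))
    tmp = dest + ".tmp"
    try:
        if os.path.exists(dest):
            # Never overwrite an already-archived segment.
            print(f"archive failed: {dest} already exists", file=sys.stderr)
            return 1
        with open(src, "rb") as fin, open(tmp, "wb") as fout:
            shutil.copyfileobj(fin, fout)
            fout.flush()
            os.fsync(fout.fileno())  # file contents reach stable storage
        os.rename(tmp, dest)  # atomic within one filesystem
        dir_fd = os.open(archive_dir, os.O_RDONLY)
        try:
            os.fsync(dir_fd)  # make the new directory entry durable
        finally:
            os.close(dir_fd)
        return 0
    except OSError as exc:
        # Any I/O error must surface as a non-zero status so the server retries.
        print(f"archive failed: {exc}", file=sys.stderr)
        return 1
```

A wrapper script would pass the function's return value to `sys.exit()`, so that archive_command sees a non-zero status on any failure and the server keeps the segment and retries later.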

Generally speaking, these kinds of scripts really aren't a good choice
for doing backups of PG.  I'd strongly suggest you look at one of the
existing tools which are developed specifically for doing backups of PG
and are well tested, supported, and maintained.  If you'd like support
for a new storage system, I know that at least pgBackRest's storage
layer is pluggable and adding a new storage option is pretty
straightforward.

Thanks!

Stephen

Re: [ADMIN] Bad recovery: no pg_xlog/RECOVERYXLOG

From: Mark Kirkwood
Date:

On 31/10/17 04:47, Stephen Frost wrote:
> Generally speaking, these kinds of scripts really aren't a good choice
> for doing backups of PG.
>
I'm not convinced that his approach is bad.

The script checks the result of the 'swift upload' for the base backup, 
it is the wal backup one that does not explicitly check the 'swift 
upload' result (this should really be added). To be fair, anything wrong 
with the swift system will likely be discovered immediately beforehand 
where he does a 'swift stat'!

I'd guess his original problem was an improperly set up recovery.conf, 
rather than the overall design.

regards

Mark



Re: [ADMIN] Bad recovery: no pg_xlog/RECOVERYXLOG

From: Stephen Frost
Date:
Mark,

* Mark Kirkwood (mark.kirkwood@catalyst.net.nz) wrote:
> I'm not convinced that his approach is bad.

I was the same way for a long time, thinking that shell scripts could
reasonably be used with certain caveats, but the devil really is in the
details and it's far too easy to miss things in shell scripts (such as
not checking return codes, or not doing so properly, or various other
issues).  Also, you didn't address things like verifying that you
actually have all the WAL needed for a valid backup, and how to handle
retention?

> The script checks the result of the 'swift upload' for the base
> backup, it is the wal backup one that does not explicitly check the
> 'swift upload' result (this should really be added). To be fair,
> anything wrong with the swift system will likely be discovered
> immediately beforehand where he does a 'swift stat'!

Things could certainly break between those two calls to swift, in a
variety of ways.

> I'd guess his original problem was an improperly setup
> recovery.conf, rather than the overall design.

I agree that the original issue is unlikely to be related to these
scripts.  That doesn't mean that using them is a good idea.

Thanks!

Stephen

Re: [ADMIN] Bad recovery: no pg_xlog/RECOVERYXLOG

From: Mark Kirkwood
Date:

On 01/11/17 00:47, Stephen Frost wrote:
> I agree that the original issue is unlikely to be related to these
> scripts.  That doesn't mean that using them is a good idea.

Exactly - the original issue is unlikely to be related to these scripts!

While I agree that the scripts concerned would benefit from some 
development/qa etc, I'm really disagreeing with your point that folk 
'should not do this'. I would like to say that folk 'should do this' - 
but try to do it well - and we should help them (isn't that idea kinda 
tied up with the whole point of these lists)? I'm not keen on us being 
seen to stifle development just because we know of another existing 
product that *might* be better. To me that seems like a slippery slope 
that gets us endorsing only certain vendors' solutions (whether they be 
open source or not). I'm not a fan of that at all.

regards

Mark



Re: [ADMIN] Bad recovery: no pg_xlog/RECOVERYXLOG

From: Stephen Frost
Date:
Mark,

* Mark Kirkwood (mark.kirkwood@catalyst.net.nz) wrote:
> While I agree that the scripts concerned would benefit from some
> development/qa etc, I'm really disagreeing with your point that folk
> 'should not do this'. I would like to say that folk 'should do this'
> - but try to do it well - and we should help them (isn't that idea
> kinda tied up with the whole point of these lists)?

I'm all for someone else starting up a new project to improve the
situation around backups for PG.  That's not going to be three shell
scripts amounting to maybe 100 lines of code and what I really am
concerned about is people seeing these simple shell scripts thinking
"oh, I'll just use these simple things" without realizing that they're
going to end up in a bad spot because those simple shell scripts aren't
sufficient to do backups with PG properly and reliably.

We could store data in a CSV file and access it through shell scripts
too and call it a database.  If someone posted those as an alternative
to PG, I don't doubt that they'd get shot down pretty hard too.

These aren't just perfectionism complaints about shell scripts being
used to do backups of PG either.  I've seen people using them and doing
so in ways that result in not having reliable backups, which has then
led to literally days of work being lost.

Put these shell scripts out on a github website with a big "in
development, not for production use, do not use" readme and continue to
hack on them as much as you'd like.  Don't post them to these lists with
a "this is how you do backups in PG".

Thanks!

Stephen

Re: [ADMIN] Bad recovery: no pg_xlog/RECOVERYXLOG

From: Mark Kirkwood
Date:

On 02/11/17 01:28, Stephen Frost wrote:
> Put these shell scripts out on a github website with a big "in
> development, not for production use, do not use" readme and continue to
> hack on them as much as you'd like.  Don't post them to these lists with
> a "this is how you do backups in PG".

I don't think either the original script author - or myself - are 
attempting to suggest a few shell scripts were the next, complete 
coverage backup solution (ahem - it is only yourself that is pushing 
this extreme interpretation)!

However there is the use case for people that just want a minimal backup 
solution that works for their specific environment, and don't want to 
bring along a lot of extra machinery that a full coverage 
all-singing-and-dancing product includes - this *can* be accomplished by 
a few shell scripts. Yes, it does mean that you spend extra time testing 
and debugging [1]. Err - I think that is all the original author (who is 
probably scared off now), was wanting a bit of help with.

regards

Mark

[1] which is where a pre-existing more complex solution is likely to be 
better - it has had more testing in the field, and of course it is fine 
to point that out.



Re: [ADMIN] Bad recovery: no pg_xlog/RECOVERYXLOG

From: Stephen Frost
Date:
Mark,

* Mark Kirkwood (mark.kirkwood@catalyst.net.nz) wrote:
> However there is the use case for people that just want a minimal
> backup solution that works for their specific environment, and don't
> want to bring along a lot of extra machinery that a full coverage
> all-singing-and-dancing product includes - this *can* be
> accomplished by a few shell scripts. Yes, it does mean that you
> spend extra time testing and debugging [1]. Err - I think that is
> all the original author (who is probably scared off now), was
> wanting a bit of help with.

This is exactly the issue that concerns me.  I'm not suggesting that
these scripts are, or need to be, the end-all, be-all of PG backup
solutions.

What I'm pointing out is that shell-script based solutions are *broken*,
not that they are lacking in features.  Many, many years ago I also used
to think it was possible to perform a PG backup using just shell scripts
and have it be successful and reliable, but since then I've seen too
many cases where exactly that has led to incomplete and invalid
backups to be able to agree that they're reasonable to use.  Not having
a way to reliably sync the WAL files copied by archive command to disk,
in particular, really is an issue, it's not some feature, it's a
requirement of a functional PG backup system.  The other requirement for
a functional PG backup system is a check to verify that all of the WAL
for a given backup has been archived safely to disk, otherwise the
backup is incomplete and can't be used.
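
The second requirement, checking that every WAL segment a backup needs is in the archive, can be sketched as follows. This is a minimal Python sketch under stated assumptions: the default 16 MB segment size, no timeline switch during the backup, and start and stop segment names that you already know (for example from the backup history file). The function names are hypothetical, and the archive listing is passed in as a plain set of names.

```python
SEGMENTS_PER_LOG = 0x100  # assumes the default 16 MB WAL segment size


def parse_segment(name: str):
    """Split a 24-hex-digit WAL file name into (timeline, absolute segment number)."""
    if len(name) != 24:
        raise ValueError(f"not a WAL segment name: {name!r}")
    timeline = int(name[0:8], 16)
    log = int(name[8:16], 16)
    seg = int(name[16:24], 16)
    return timeline, log * SEGMENTS_PER_LOG + seg


def segment_name(timeline: int, segno: int) -> str:
    """Inverse of parse_segment."""
    return "%08X%08X%08X" % (
        timeline,
        segno // SEGMENTS_PER_LOG,
        segno % SEGMENTS_PER_LOG,
    )


def missing_segments(start: str, stop: str, archived: set) -> list:
    """Return the segments in [start, stop] that are absent from the archive listing."""
    tli_start, first = parse_segment(start)
    tli_stop, last = parse_segment(stop)
    if tli_start != tli_stop:
        raise ValueError("this sketch does not handle timeline switches")
    if last < first:
        raise ValueError("stop segment precedes start segment")
    wanted = [segment_name(tli_start, n) for n in range(first, last + 1)]
    return [name for name in wanted if name not in archived]
```

A backup would only be treated as complete (and older backups only rotated away) once `missing_segments` returns an empty list for it.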

Both of those basic requirements are, at best, extremely difficult to
do in a shell script.  Maybe it's possible to do, but I've certainly yet
to see it and I'm not going to agree that such "simple" shell scripts
should be posted to our mailing lists without someone pointing out that
they're broken because, otherwise, people will take and use them and end
up with backups that are broken (often right when they end up actually
needing it).

If you'd like to develop a shell script that addresses these basic
requirements of file-based PG backups and ask for critique on it while
making it clear that it's in development, I'd be happy to provide
comments on it.  I won't agree that any shell-based solution that
doesn't have these basic requirements met is an acceptable option.

Thanks!

Stephen

Re: [ADMIN] Bad recovery: no pg_xlog/RECOVERYXLOG

From: Mark Kirkwood
Date:

On 02/11/17 11:18, Stephen Frost wrote:
> What I'm pointing out is that shell-script based solutions are *broken*,
> not that they are lacking in features.  Many, many years ago I also used
> to think it was possible to perform a PG backup using just shell scripts
> and have it be successful and reliable, but since then I've seen too
> many cases where exactly that has lead to incomplete and invalid
> backups to be able to agree that they're reasonable to use.  Not having
> a way to reliably sync the WAL files copied by archive command to disk,
> in particular, really is an issue, it's not some feature, it's a
> requirement of a functional PG backup system.  The other requirement for
> a functional PG backup system is a check to verify that all of the WAL
> for a given backup has been archived safely to disk, otherwise the
> backup is incomplete and can't be used.
>
> If you'd like to develop a shell script that addresses these basic
> requirements of file-based PG backups and ask for critique on it while
> making it clear that it's in development, I'd be happy to provide
> comments on it.  I won't agree that any shell-based solution that
> doesn't have these basic requirements met is an acceptable option.

Ok, that is interesting. In my experience, provided that a) the 
construction you are using in archive_command correctly reports 
success/failure, and b) you have some monitoring that checks for archive 
failure, then the requirement of having the required logs will be met. 
Finally, provided that c) your pg_basebackup concoction properly checks 
return codes, you are fine.

All these are reasonably straightforward to implement via shell.
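
For point b), a monitoring check might look like this minimal Python sketch. It assumes the two timestamps have already been read from the `last_archived_time` and `last_failed_time` columns of the `pg_stat_archiver` view; the function name and the `max_age` threshold are hypothetical, and the right threshold depends on how busy the server is and on archive_timeout.

```python
from datetime import datetime, timedelta
from typing import Optional


def archiver_problem(last_archived_time: Optional[datetime],
                     last_failed_time: Optional[datetime],
                     now: datetime,
                     max_age: timedelta) -> Optional[str]:
    """Return a description of the problem, or None if archiving looks healthy."""
    if last_archived_time is None:
        return "no WAL segment has been archived yet"
    if last_failed_time is not None and last_failed_time > last_archived_time:
        # The latest attempt failed and nothing has succeeded since.
        return "most recent archive attempt failed"
    if now - last_archived_time > max_age:
        return "last successful archive is too old"
    return None
```

A cron job or monitoring agent would alert whenever the function returns a non-None message.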

Also, if what you are suggesting were actually the case, almost 
everyone's streaming replication (and/or log shipping) would be broken 
all the time.

With respect to 'If I would like to develop etc etc..' - err, all I was 
doing in this thread was helping the original poster make his stuff a 
bit better - I'll continue to do that.

Best wishes

Mark



Re: [ADMIN] Bad recovery: no pg_xlog/RECOVERYXLOG

From: Mark Kirkwood
Date:
On 02/11/17 11:18, Stephen Frost wrote:

> Not having
> a way to reliably sync the WAL files copied by archive command to disk,
> in particular, really is an issue, it's not some feature, it's a
> requirement of a functional PG backup system.  The other requirement for
> a functional PG backup system is a check to verify that all of the WAL
> for a given backup has been archived safely to disk, otherwise the
> backup is incomplete and can't be used.
>
>

Funnily enough, the original poster's scripts were attempting to address 
(at least some) of this: he was sending stuff to swift, so if he got an 
OK return code then it is *there* - that being the whole point of a 
distributed, fault tolerant object store (I do swift support BTW).

I wonder if you are seeing this discussion in the light of folk doing 
backups to unreliable storage locations (e.g. the same server, NFS etc.). 
If so, then sure, I completely agree with what you are saying (these issues 
impact backup designs no matter what tool is used to write them).

Best wishes

Mark



Re: [ADMIN] Bad recovery: no pg_xlog/RECOVERYXLOG

From: Stephen Frost
Date:
Mark,

* Mark Kirkwood (mark.kirkwood@catalyst.net.nz) wrote:
> Ok, that is interesting. In my experience, provided the a)
> construction you are using in archive_command correctly reports
> success/failure, and b) you have some monitoring that checks for
> archive failure then that requirement of you having the required
> logs will be fine. Finally that c) your pg_basebackup concoction
> properly checks return codes then you are fine.
>
> All these are reasonably straightforward to implement via shell.

Sure, that'll work much of the time, but that's about like saying that
PG could run without fsync being enabled much of the time and everything
will be ok.  Both are accurate, but hopefully you'll agree that PG
really should always be run with fsync enabled.

> Also, if what you are suggesting were actually the case, almost
> everyone's streaming replication (and/or log shipping) would be
> broken all the time.

No, again, this isn't an argument about if it'll work most of the time
or not, it's about if it's correct.  PG without fsync will work most of
the time too, but that doesn't mean it's actually correct.

> With respect to 'If I would like to develop etc etc..' - err, all I
> was doing in this thread was helping the original poster make his
> stuff a bit better - I'll continue to do that.

Ignoring the basic requirements which I outlined isn't helping him get
to a reliable backup system.

Thanks!

Stephen

Re: [ADMIN] Bad recovery: no pg_xlog/RECOVERYXLOG

From: Stephen Frost
Date:
Mark,

* Mark Kirkwood (mark.kirkwood@catalyst.net.nz) wrote:
> On 02/11/17 11:18, Stephen Frost wrote:
>
> >Not having
> >a way to reliably sync the WAL files copied by archive command to disk,
> >in particular, really is an issue, it's not some feature, it's a
> >requirement of a functional PG backup system.  The other requirement for
> >a functional PG backup system is a check to verify that all of the WAL
> >for a given backup has been archived safely to disk, otherwise the
> >backup is incomplete and can't be used.
>
> Funnily enough, the original poster's scripts were attempting to
> address (at least some) of this: he was sending stuff to swift, so
> if he got a ok return code then it is *there* - that being the whole
> point of a distributed, fault tolerant object store (I do swift
> support BTW).

There are different levels of storage reliability even in swift and that
doesn't do anything to address the issue that you don't know if all of
the WAL for a given backup has actually made it to swift.

Perhaps it might be useful to also point out here that pg_basebackup is
going to exit just as soon as it's done copying the files- it's not
going to wait for the WAL to finish getting to swift before returning
'success' because you didn't ask pg_basebackup to pull the WAL in these
scripts.
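
For illustration, asking pg_basebackup to pull the WAL itself and checking its exit status could look like this minimal Python sketch. The helper names are hypothetical; `-D`, `-F t` and `-X fetch` are standard pg_basebackup options, and connection settings are assumed to come from the environment (PGHOST, PGUSER and so on). With `-X fetch` the server must still have the needed WAL when the copy finishes, so WAL retention has to be set high enough.

```python
import subprocess


def basebackup_argv(target_dir: str) -> list:
    """Build a pg_basebackup command line that includes the required WAL in the backup."""
    return [
        "pg_basebackup",
        "-D", target_dir,  # output directory
        "-F", "t",         # tar format, as in the original post
        "-X", "fetch",     # collect the needed WAL at the end of the backup
    ]


def run_basebackup(target_dir: str) -> None:
    """Run pg_basebackup and raise if it exits with a non-zero status."""
    # check=True raises subprocess.CalledProcessError on failure, so a failed
    # backup cannot be mistaken for a successful one.
    subprocess.run(basebackup_argv(target_dir), check=True)
```

With the WAL included, the backup no longer depends on archive_command having finished shipping those segments by the time pg_basebackup returns.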

What that means is that you could have everything be successful, per
your definitions, and still not have a valid backup, and then you decide
to rotate off your older backup and then there's a crash.  Guess what?
You don't have a valid backup anymore because you haven't got all of the
necessary WAL for the pg_basebackup that you did do, so you can't use
that, and you nuked your prior backup, so that's gone too.  Hopefully
you have more backups than that, but if not, because you trusted in
these scripts and the guarantees of swift, then you've just lost
everything.
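
The check described above (do not treat a backup as valid, and do not rotate the previous one, until every WAL segment it needs is in the archive) can be sketched as follows. This is a minimal illustration, not part of the original poster's scripts. It assumes the default 16 MB WAL segment size, a single timeline, and that the caller already has the archived file names with any compression suffix such as .lzo stripped. All function names are hypothetical.

```python
# Assumption: default 16 MB segments, so each 4 GB "log" holds 256 segments.
SEGMENTS_PER_LOG = 0x100000000 // (16 * 1024 * 1024)


def parse_wal_name(name):
    """Split a 24-hex-digit WAL file name into (timeline, segment number)."""
    if len(name) != 24:
        raise ValueError("not a WAL segment name: %r" % name)
    timeline = int(name[0:8], 16)
    log = int(name[8:16], 16)
    seg = int(name[16:24], 16)
    return timeline, log * SEGMENTS_PER_LOG + seg


def format_wal_name(timeline, segno):
    """Inverse of parse_wal_name."""
    return "%08X%08X%08X" % (
        timeline, segno // SEGMENTS_PER_LOG, segno % SEGMENTS_PER_LOG)


def missing_segments(start_name, stop_name, archived_names):
    """Return the segments in [start, stop] that are not in the archive."""
    timeline, first = parse_wal_name(start_name)
    stop_timeline, last = parse_wal_name(stop_name)
    if timeline != stop_timeline:
        raise ValueError("this sketch only handles a single timeline")
    if last < first:
        raise ValueError("stop segment precedes start segment")
    archived = set(archived_names)
    wanted = (format_wal_name(timeline, n) for n in range(first, last + 1))
    return [name for name in wanted if name not in archived]
```

A backup script would only mark the backup as good, and only expire an older one, when missing_segments() returns an empty list.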

> I wonder if you are seeing this discussion in the light of folk
> doing backups to unreliable storage locations (e.g: the same server,
> NFS etc etc), then sure I completely agree with what you are saying
> (these issues impact backup designs no matter what tool is used to
> write them).

That you're arguing so hard about this one specific shell script which
happens to be based on swift really doesn't convince me that
recommending shell-script based backup solutions on PG is a good idea.
Doing backups locally may not be ideal for various reasons, but at
least if you're making sure to properly fsync the data out to the RAID'd
disks, and verifying that your backups are fully fsync'd and that you've
checked to make sure you have all of the WAL for a given backup (and
that it's all fsync'd) then I'd argue that it's at least conceptually
correct.  The same goes for NFS, or sending the data to another server,
assuming they're set up properly to respect fsync.

Simply skipping the requirements to verify that you've got all of the
WAL for the backup and that you've made sure that it's all stored on
reliable storage isn't correct.

Doing proper backups of PG is *hard*.  There's a lot of things you have
to do correctly to get them to actually be consistently reliable in the
face of even single-point failures.  Having swift provide reliability
guarantees for the archived WAL, provided the shell script is perfectly
written to catch all errors and report them back to PG correctly, is
great, but it still doesn't address the other requirement of ensuring
that all WAL has actually been archived before considering a given
backup as complete, and you have to decide what level of guarantees you
want from swift and configure it appropriately.

If you want simple script-based backups, then use pg_basebackup and make
it do the WAL handling as well and then make sure that you've got your
script set up to check error codes from pg_basebackup and that you're
actually monitoring your backups.  Even then there's risks of issues
which boil down to cases where even we didn't fsync things out properly
leading to cases where WAL or files could be lost due to a crash after
pg_basebackup finishing.  Hopefully those have all been addressed now,
but it's a testament to the difficulty of doing these things correctly.
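
A minimal sketch of such a wrapper, assuming pg_basebackup is on PATH and that connection settings come from the environment. -D, -F p and -X stream are existing pg_basebackup options; the function names and the injectable runner argument are illustrative only and make the error handling easy to exercise without a server.

```python
import subprocess


def build_basebackup_command(target_dir, wal_method="stream"):
    """Command line for a plain-format backup that also collects its WAL."""
    return ["pg_basebackup", "-D", target_dir, "-F", "p", "-X", wal_method]


def run_basebackup(target_dir, runner=subprocess.run):
    """Run pg_basebackup and fail loudly on a non-zero exit code."""
    cmd = build_basebackup_command(target_dir)
    result = runner(cmd)
    if result.returncode != 0:
        # Surface the failure so that monitoring sees it; never report
        # success for a backup whose exit status was not checked.
        raise RuntimeError(
            "pg_basebackup exited with code %d" % result.returncode)
    return cmd
```

The exception is the important part: whatever calls run_basebackup() should alert on it rather than carry on to rotate older backups.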

Thanks!

Stephen

Re: [ADMIN] Bad recovery: no pg_xlog/RECOVERYXLOG

From
Mark Kirkwood
Date:
Stephen,

On 03/11/17 00:11, Stephen Frost wrote:

>
> Sure, that'll work much of the time, but that's about like saying that
> PG could run without fsync being enabled much of the time and everything
> will be ok.  Both are accurate, but hopefully you'll agree that PG
> really should always be run with fsync enabled.

It is completely different - this is a 'straw man' argument, and just
serves to confuse this discussion.

>
>> Also, if what you are suggesting were actually the case, almost
>> everyone's streaming replication (and/or log shipping) would be
>> broken all the time.
> No, again, this isn't an argument about if it'll work most of the time
> or not, it's about if it's correct.  PG without fsync will work most of
> the time too, but that doesn't mean it's actually correct.

No, it is pointing out that if your argument were correct, then there 
should be the above side effects - there are not, which is significant.

The crux of your argument seems to be concerning the synchronization 
between pg_basebackup finishing and being sure you have the required
archive logs. Now just so we are all clear, when pg_basebackup ends it 
essentially calls do_pg_stop_backup (from xlog.c) which ensures that all 
required WAL files are archived, or to be precise here makes sure 
archive_command has been run successfully for each required WAL file.

Your entire argument seems to be about whether said WAL is fsync'ed to disk,
and how this is impossible to ensure in a shell script. Actually it is
possible quite simply: e.g. suppose your archive command is:

rsync ... targetserver:/disk

There are several ways to get that to sync:

rsync ... targetserver:/disk && ssh targetserver sync

Alternatively, amend vm.dirty_bytes on targetserver to be < 16M, or
mount /disk with the sync option!

So it is clearly *possible*.
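
For an archive directory that is local or mounted (rather than the rsync-over-ssh case above), the same idea can be written out explicitly. This is a minimal sketch, assuming a POSIX system where a directory can be opened and fsync'd; archive_segment is a hypothetical helper, not something from the original poster's scripts. It also refuses to overwrite an existing file, which the PostgreSQL documentation advises for archive commands.

```python
import os
import shutil


def archive_segment(src_path, archive_dir, name):
    """Copy one WAL segment into archive_dir and force it to stable storage."""
    final_path = os.path.join(archive_dir, name)
    if os.path.exists(final_path):
        raise FileExistsError(final_path)
    tmp_path = final_path + ".tmp"
    with open(src_path, "rb") as src, open(tmp_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
        dst.flush()
        os.fsync(dst.fileno())        # file contents reach the disk
    os.rename(tmp_path, final_path)   # the final name appears only when complete
    dir_fd = os.open(archive_dir, os.O_RDONLY)
    try:
        os.fsync(dir_fd)              # the rename itself reaches the disk
    finally:
        os.close(dir_fd)
    return final_path
```

Any exception must turn into a non-zero exit status of the archive command, so that the server keeps the segment and retries.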

However, I think you are obsessing over the minutiae of fsync to a single
server/disk when there are much more important (read: more likely to happen)
problems to consider. For me, the critical consideration is not 'are
the WAL files there *right now*' but 'will they be there tomorrow when
I need them for a restore'? Next is 'will they be the same/undamaged
when I read them tomorrow'?

This is why I'm *not* obsessing about fsyncing... make where you store
these WAL files *reliable*: either via proxying/IP splitting so you
send stuff to more than one server (if we are still thinking server +
disk = backup solution), or by using a distributed object store
(Swift, S3 etc.) that handles that for you, and in addition checksums
and heals any individual node data corruption for you as well.
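
Where the archive is a plain server and disk rather than an object store, the 'same/undamaged tomorrow' question can be approximated with a checksum manifest recorded at archive time. This is a minimal sketch under that assumption; the manifest layout (file name mapped to SHA-256 hex digest) and the function names are hypothetical.

```python
import hashlib
import os


def file_sha256(path, chunk_size=1024 * 1024):
    """Hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def damaged_or_missing(archive_dir, manifest):
    """Names from the manifest that are absent or whose digest has changed."""
    problems = []
    for name, expected in sorted(manifest.items()):
        path = os.path.join(archive_dir, name)
        if not os.path.isfile(path) or file_sha256(path) != expected:
            problems.append(name)
    return problems
```

Running this periodically from monitoring gives an early warning while the missing segments may still exist on the database server.
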
>> With respect to 'If I would like to develop etc etc..' - err, all I
>> was doing in this thread was helping the original poster make his
>> stuff a bit better - I'll continue to do that.
> Ignoring the basic requirements which I outlined isn't helping him get
> to a reliable backup system.

Actually I was helping him get a *reliable* backup system, I think you 
misunderstood how swift changes the picture compared to a single 
server/single disk design.

regards

Mark



Re: [ADMIN] Bad recovery: no pg_xlog/RECOVERYXLOG

From
Stephen Frost
Date:
Mark,

* Mark Kirkwood (mark.kirkwood@catalyst.net.nz) wrote:
> On 03/11/17 00:11, Stephen Frost wrote:
> >Sure, that'll work much of the time, but that's about like saying that
> >PG could run without fsync being enabled much of the time and everything
> >will be ok.  Both are accurate, but hopefully you'll agree that PG
> >really should always be run with fsync enabled.
>
> It is completely different - this is a 'straw man' argument, and
> just serves to confuse this discussion.

I don't see it as any different at all.  The point I was trying to make
there is that there's a minimum requirement for backups, just as there
is for ACID compliance, and any solution needs to meet that minimum to
be considered.

> The crux of your argument seems to be concerning the synchronization
> between pg_basebackup finishing and being sure you have the required
> archive logs. Now just so we are all clear, when pg_basebackup ends
> it essentially calls do_pg_stop_backup (from xlog.c) which ensures
> that all required WAL files are archived, or to be precise here
> makes sure archive_command has been run successfully for each
> required WAL file.

pg_basebackup talks the replication protocol, to be clear, and sends a
BASE_BACKUP message, of which one of the options is 'NOWAIT' to indicate
if the server should wait until all of the WAL has been archived.
Typically, pg_basebackup does send a 'NOWAIT' to tell the server to not
hold up the final message until all of the WAL has been archived,
because it's handling the verification of the WAL having been archived.
In the unusual case that WAL isn't included with the pg_basebackup it
looks like it would wait for the archive_command to complete, which is
better than I had thought (and hadn't noticed on my first glance through
the code), though that does depend on a functional and perfect
archive_command, and there's no shortage of reasons for why that might
not be the case at the time the backup is happening.

That's an awful lot of action-at-a-distance hope for me to be
comfortable with, however.  A backup solution really does need to verify
that the WAL has been completely and reliably stored, as discussed in
the documentation, before claiming a backup is valid, and there's
basically no reason not to unless the tool you've chosen to use makes
that particularly difficult (even if not *technically* impossible, given
enough effort).  If your solution is built on the assumption that WAL
archiving is always working and there's no check happening during backup
to verify that you've got all the WAL then I have serious doubts about
it being reliable.  If you're independently monitoring that all WAL has
been archived, that's certainly helpful, but I don't consider that to be
a complete substitute for making sure that you've got all of the WAL for
a given backup.
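
One way to learn which WAL a given base backup needs is the backup history file (the .backup file the server places in the archive when a backup finishes), which records the first and last segment. A minimal sketch of reading it, assuming the usual 'START WAL LOCATION: ... (file ...)' and 'STOP WAL LOCATION: ... (file ...)' lines; the helper name is hypothetical.

```python
import re

# Matches e.g. "START WAL LOCATION: 0/2000028 (file 000000010000000000000002)"
_LOCATION_LINE = re.compile(
    r"^(START|STOP) WAL LOCATION: \S+ \(file ([0-9A-F]{24})\)$",
    re.MULTILINE,
)


def wal_range_from_history(text):
    """Return (first_segment, last_segment) named in a backup history file."""
    found = {kind: name for kind, name in _LOCATION_LINE.findall(text)}
    if "START" not in found or "STOP" not in found:
        raise ValueError("history file lacks a START or STOP WAL LOCATION line")
    return found["START"], found["STOP"]
```

The two names bound the range of segments that must all be present in the archive before the backup is treated as usable.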

> Your entire argument seems about whether said WAL is fsync'ed to
> disk, and how this is impossible to ensure in a shell script.
[...]
> So it is clearly *possible*.

Yes, it's possible, but it's not something I'd recommend doing and none
of your arguments have made me any more likely to recommend trying to
ensure a proper backup has completed using shell scripts.  What I fail
to understand is your insistence on it being a good idea.  I've seen
lots and lots of attempts at it, even made some myself, and have come to
the generally agreed upon conclusion that it's both a bad idea to hack
together your own backup solution for PG and that, even if you do want
to try, using shell scripts to attempt to accomplish it is a bad idea.
There's much better solutions out there which are really what folks
should be using.  I'm not against using pg_basebackup either, but if
you're using it, let it handle the archiving because it does verify that
all of the WAL has been archived properly.

> Actually I was helping him get a *reliable* backup system, I think
> you misunderstood how swift changes the picture compared to a single
> server/single disk design.

I do understand the goals of things like swift and s3 and the intent
behind them to provide a better store than local disks, and I'm not
against using them, to be clear, but they only address one of the
requirements that I outlined for a reliable backup solution.  I mention
both requirements consistently to, hopefully, ensure that those coming
along later to read these threads remember that it's more than just
making sure that you verify all the WAL has been archived during a
backup- but that they've been archived and actually fsync'd or written
out to reliable storage.

Thanks!

Stephen

Re: [ADMIN] Bad recovery: no pg_xlog/RECOVERYXLOG

From
Mark Kirkwood
Date:

On 07/11/17 02:37, Stephen Frost wrote:
> Mark,
>
> * Mark Kirkwood (mark.kirkwood@catalyst.net.nz) wrote:
>> On 03/11/17 00:11, Stephen Frost wrote:
>>> Sure, that'll work much of the time, but that's about like saying that
>>> PG could run without fsync being enabled much of the time and everything
>>> will be ok.  Both are accurate, but hopefully you'll agree that PG
>>> really should always be run with fsync enabled.
>> It is completely different - this is a 'straw man' argument, and
>> just serves to confuse this discussion.
> I don't see it as any different at all.  The point I was trying to make
> there is that there's a minimum requirement for backups, just as there
> is for ACID compliance, and any solution needs to meet that minimum to
> be considered.

Ok and apologies - I thought you were going all 'schoolboy debating' on 
me :-) . I'll discuss how I'm seeing this:

In the case of a db server running with fsync off, one crash and it may 
not be able to be restarted - ever, so pretty severe loss of service.

In the case of a backup server crashing immediately after a backup
(assuming archive logs and backup going to same host for simplicity),
then *if undetected* it could mean that later you cannot restore this
backup - very bad...so in that case I agree with you. However detection
(i.e. monitoring) is essential, otherwise a meticulously fsync'd set of
WAL can be lost or corrupted by the various usual suspects too (bad
RAM/HBA/disk...) - with the same result. So assuming we have monitoring
doing its thing, after the backup server crashes then missing or damaged
WAL can be retrieved from our still running db server - or if they have
been recycled, then we need to do another backup. No loss of service.

>> The crux of your argument seems to be concerning the synchronization
>> between pg_basebackup finishing and being sure you have the required
>> archive logs. Now just so we are all clear, when pg_basebackup ends
>> it essentially calls do_pg_stop_backup (from xlog.c) which ensures
>> that all required WAL files are archived, or to be precise here
>> makes sure archive_command has been run successfully for each
>> required WAL file.
> pg_basebackup talks the replication protocol, to be clear, and sends a
> BASE_BACKUP message, of which one of the options is 'NOWAIT' to indicate
> if the server should wait until all of the WAL has been archived.
> Typically, pg_basebackup does send a 'NOWAIT' to tell the server to not
> hold up the final message until all of the WAL has been archived,
> because it's handling the verification of the WAL having been archived.
> In the unusual case that WAL isn't included with the pg_basebackup it
> looks like it would wait for the archive_command to complete, which is
> better than I had thought (and hadn't noticed on my first glance through
> the code), though that does depend on a functional and perfect
> archive_command, and there's no shortage of reasons for why that might
> not be the case at the time the backup is happening.
>
> That's an awful lot of action-at-a-distance hope for me to be
> comfortable with, however.  A backup solution really does need to verify
> that the WAL has been completely and reliably stored, as discussed in
> the documentation, before claiming a backup is valid, and there's
> basically no reason not to unless the tool you've chosen to use makes
> that particularly difficult (even if not *technically* impossible, given
> enough effort).  If your solution is built on the assumption that WAL
> archiving is always working and there's no check happening during backup
> to verify that you've got all the WAL then I have serious doubts about
> it being reliable.  If you're independently monitoring that all WAL has
> been archived, that's certainly helpful, but I don't consider that to be
> a complete substitute for making sure that you've got all of the WAL for
> a given backup.
>
>> Your entire argument seems about whether said WAL is fsync'ed to
>> disk, and how this is impossible to ensure in a shell script.
> [...]
>> So it is clearly *possible*.
> Yes, it's possible, but it's not something I'd recommend doing and none
> of your arguments have made me any more likely to recommend trying to
> ensure a proper backup has completed using shell scripts.  What I fail
> to understand is your insistence on it being a good idea.  I've seen
> lots and lots of attempts at it, even made some myself, and have come to
> the generally agreed upon conclusion that it's both a bad idea to hack
> together your own backup solution for PG and that, even if you do want
> to try, using shell scripts to attempt to accomplish it is a bad idea.
> There's much better solutions out there which are really what folks
> should be using.  I'm not against using pg_basebackup either, but if
> you're using it, let it handle the archiving because it does verify that
> all of the WAL has been archived properly.
>
>> Actually I was helping him get a *reliable* backup system, I think
>> you misunderstood how swift changes the picture compared to a single
>> server/single disk design.

Ok, so I think we have moved closer to seeing each other's point of view 
- been an interesting discussion so far!

> I do understand the goals of things like swift and s3 and the intent
> behind them to provide a better store than local disks, and I'm not
> against using them, to be clear, but they only address one of the
> requirements that I outlined for a reliable backup solution.  I mention
> both requirements consistently to, hopefully, ensure that those coming
> along later to read these threads remember that it's more than just
> making sure that you verify all the WAL has been archived during a
> backup- but that they've been archived and actually fsync'd or written
> out to reliable storage.
>
>

Here I think you have still not grasped that (e.g.) swift achieves *both*
of these - without you attempting to call fsync after your uploads. For
instance, in our swift cluster you would have to have all three data
centers down to lose access to your uploaded WAL, and we run with the
various storage mounted with barrier=on so the files will be there when
the centers return. Note that the swift PUT operation (which is what
upload is doing) does fsync at the end.

regards

Mark

