Re: Consistent file-level backup of pg data directory

Поиск
Список
Период
Сортировка
От gator_ml@yahoo.de
Тема Re: Consistent file-level backup of pg data directory
Дата
Msg-id lajpmd$nor$1@ger.gmane.org
обсуждение исходный текст
Ответ на Re: Consistent file-level backup of pg data directory  (Jeff Janes <jeff.janes@gmail.com>)
Ответы Re: Consistent file-level backup of pg data directory
Re: Consistent file-level backup of pg data directory
Список pgsql-general
On 2014-01-07 21:54, Jeff Janes wrote:
> On Tue, Jan 7, 2014 at 10:58 AM, <gator_ml@yahoo.de
> <mailto:gator_ml@yahoo.de>> wrote:
>     - keep a copy "backup_label" under a different name
[...[
> Why under a different name?  That sounds dangerous to me.
... otherwise it would be useless, because "pg_stop_backup"
will delete the "backup_label" file

>     - call pg_stop_backup()
>     - the run the regular backup procedure
>
>
> You have this order backwards.  You can only tell postgres that you are
> done doing the backup once you are done doing the backup.
this is what I tried 1st - unfortunately, this won't work at all,
because postgres creates another checkpoint _after_ pg_stop and will
refuse to run the restore procedure if this last wal file is missing
(which in my scenario inevitably will be the case)

> It is pretty much a disaster.  Do you really need rsync?  Otherwise, use
> pg_basebackup, it is easier to get right.

... it looks like I didn't explain clear enough, what I am trying to
accomplish. So let me try it again:

We have a general backup procedure, that runs every night. For
some machines, there may additionally be more
specialized/fine-grained measures, but in any case, it should be
possible to restore any given server without any machine-specific
knowledge from the last nightly backup following a
generic "restore-recipe". Our current rsync-based solution is a
pretty good compromise between many different requirements.

For machines running database systems, this means, this means,
that I need some way to get a consistent state of some point in
time. It does not particularly matter, which time exactly (in
particular, I do not care, if transactions committed during the
backup are included or not) For this purpose, I can define
machine-specific hook to run before/ after the actual backup
procedure.

Unfortunately, it does not look like there is any direct way to
accomplish this with postgres except shutting down the whole
database system while the backup is running. The systems will
be almost idle while the backup runs, and write transactions
will be very rare, but a total shutdown for every backup still
would be too radical.

Actually, with some tests copying the database directory while
writing data to the database and then restarting postgres with
such a copy, I did not manage to produce any visible
inconsistencies. If postgres was able to restore the last
consistent state in such a situation, then no additional measures
would be needed (like I said, it is not particularly important
which state this is - losing everything written during the last
15 minutes including committed transactions would be good enough)
.But at least as far as I can tell from the documentation, this
can not safely be assumed.

Otherwise, the procedures for "Continuous Archiving" are clearly
made for some a diffent purpose, it seems like this is about as
close as it gets to what I need. As far as I understand:

- between pg_start_backup() and pg_stop_backup() it should be safe
to copy the data directory any time
- all committed transactions during this time will go to the wal files
(actually, I noticed that files in other directories than pg_xlog are
modified, too ...)
- normally, it is intended to copy every single newly written wal file
somewhere else by whatever is defined as "archive_command", but
if this works, than any other way to copy all wal files written
after pg_start_backup should work just as well.

My original idea was, to call pg_start_backup() 1st, then run the
actual backup followed by pg_stop_backup. Because the procedure
for " Point-in-Time Recovery" is the same and no wal file may be
modified after it was written and "archive_command" called, it
follows, that any wal file plus the preceding wal files written
after pg_start_backup must contain everything that is needed to
restore any state up to the time the wal file was
written (otherwise the whole "PITR)" mechanism would not work).
Unfortunately, postgres refuses to run the restore procedure if
the last wal file (which is written after pg_stop_backup, so in
my case it will not be available) even if "recovery_target_time"
is explicitly set to some earlier time. I didn't try it, but
assuming that the last wal file is not really needed in this
case, it would be enough to just create an empty file with the
name postgres is looking for?

If I can't find, a better way, I probably could arrange to
separately copy this last wal file into the
backup (unfortunately, there is no easy way to do this ...)-:

Calling pg_stop_backup() _before_ the backup was of course sheer
desperation but this way postgres seemed to correctly run it's
restore procedure in my tests, but I don't know about the
postgres internals (actually, I personally am not even
responsible for any postgres server), that's why I asked here.

So again my question: is there any reasonable way to get a
consistent backup of a postgres server in my
scenario (pg_basebackup won't help much here - I could of course
use it to create a local copy before the backup, but this would
waste lots of space)

Regards,
                  Peter


В списке pgsql-general по дате отправления:

Предыдущее
От: Merlin Moncure
Дата:
Сообщение: Re: returning json object with subset of keys
Следующее
От: Nelson Green
Дата:
Сообщение: Last inserted row id with complex PK