a few questions on backup

Поиск
Список
Период
Сортировка
От Marco Colombo
Тема a few questions on backup
Дата
Msg-id 46487830.7000400@esiway.net
обсуждение исходный текст
Ответы Re: a few questions on backup  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: a few questions on backup  (Hannes Dorbath <light@theendofthetunnel.de>)
Re: a few questions on backup  ("Prashant Ranjalkar" <prashant.ranjalkar@gmail.com>)
Список pgsql-general
Hello,
I have a few questions on backuping a PostgreSQL server (lets say
anything 8.x.x). I've read "Continuous Archiving and Point-In-Time
Recovery (PITR)" in the manual I'm still missing something...well
actually I think I don't but I've been debating on this with a friend
for a while, and there's something we don't seem to agree on, so I need
some bits of clarification. :)

Ok, let's say what I really need is "poor man's" PITR. That is, I
perform only one daily full backup, and I need to be able to restore the
database contents as they were at backup time. I don't even need to know
the exact time (yes, that means I don't really care about which
transactions turn out to be committed and which don't). Mmm, ok that
can't be rightfully called PITR at all, maybe.

Am I right in assuming that the following procedure is ok?

1) issue pg_start_backup();
2) copy (or tar or cpio) the data dir, w/o pg_xlog/
3) issue pg_stop_backup();
4) copy (or tar or cpio) pg_xlog/ contents.

That's all. Please note that I'm doing NO WAL archiving.

Whether this is going to work or not is based on what exactly
pg_start_backup() does. According to him (my friend), it may simply
prevent PostgreSQL from writing to data files until pg_stop_backup() is
issued, thus kind of enforcing a snapshot on the data files themselves.
Changes go to WAL only, and they are "played" on data files only after
the pg_stop_backup(). Hmm, I don't really like that. But if so, I don't
even need step 4), since the tar backup would be consistent. I'm able to
restore from that only, and data are those at the time of pg_start_backup().

Or, (that's me) pg_start_backup() only ensures that full_page_writes is
enabled during the backup phase, so that any change to data files can be
undone/redone based on the contents of WAL segments. The tar archive
contains potentially inconsistant data, that's why I need WAL segments
too. At restore time, PG eventually performs a crash recovery, and data
are those at the time of step 4) (which is an interval really, some time
between the start and the end of the copy, but I don't need to be able
to tell the exact time anyway).

BTW, I see a vulnerability in the above procedure... if there's enough
write activity so that PostgreSQL recycles WAL segments between 1) and
4), some changes may be lost. At step 4) I need to save _all_ WAL
segments that have been produced during the backup time. I assume that
activity at backup time is low enough that the event is "extremely
unlikely". Another doubt I have is about WAL checkpointing... is it
possible that a checkpoint happens during step 2), and the tar archive
gets data files both from before and from after the checkpoint (which I
think is bad) or does pg_start_backup() prevent WAL checkpointing, too?

Finally, if I'm missing something and the above is wrong, I think that
the only way to perform a full backup on a live database at filesystem
level, is to enable WAL archiving as the first step of the backup
procedure (assuming it's not usually on, of course), and later save all
the WAL segments that were _archived_ during that time, including the
one made at pg_stop_backup(), on the same backup medium.

If I understand the documentation right, nothing breaks if the
archive_command just returns OK w/o saving anything during regular
operation, and starts saving segments only during the backup time... I
mean, that prevents me from doing arbitrary PITR after the backup, but
all I want to do is performing a full backup, w/o being able to do any
partial backup after that. If so, I may write a script that does nothing
most of the time, and archives WAL segments only to be included in the
full backup.

TIA, (and sorry if the above sounds a bit messy)
.TM.

В списке pgsql-general по дате отправления:

Предыдущее
От: Paula Ta-Shma
Дата:
Сообщение: migrating a tablespace between postgres instances in shared disk environment
Следующее
От: Tom Lane
Дата:
Сообщение: Re: a few questions on backup