Re: a few questions on backup

Поиск
Список
Период
Сортировка
От Hannes Dorbath
Тема Re: a few questions on backup
Дата
Msg-id 4648820F.9030209@theendofthetunnel.de
обсуждение исходный текст
Ответ на a few questions on backup  (Marco Colombo <pgsql@esiway.net>)
Список pgsql-general
On 14.05.2007 16:54, Marco Colombo wrote:
> I have a few questions on backuping a PostgreSQL server (lets say
> anything 8.x.x). I've read "Continuous Archiving and Point-In-Time
> Recovery (PITR)" in the manual I'm still missing something...well
> actually I think I don't but I've been debating on this with a friend
> for a while, and there's something we don't seem to agree on, so I need
> some bits of clarification. :)
>
> Ok, let's say what I really need is "poor man's" PITR. That is, I
> perform only one daily full backup, and I need to be able to restore the
> database contents as they were at backup time. I don't even need to know
> the exact time (yes, that means I don't really care about which
> transactions turn out to be committed and which don't). Mmm, ok that
> can't be rightfully called PITR at all, maybe.
>
> Am I right in assuming that the following procedure is ok?
>
> 1) issue pg_start_backup();
> 2) copy (or tar or cpio) the data dir, w/o pg_xlog/
> 3) issue pg_stop_backup();
> 4) copy (or tar or cpio) pg_xlog/ contents.
>
> That's all. Please note that I'm doing NO WAL archiving.
>
> Whether this is going to work or not is based on what exactly
> pg_start_backup() does. According to him (my friend), it may simply
> prevent PostgreSQL from writing to data files until pg_stop_backup() is
> issued, thus kind of enforcing a snapshot on the data files themselves.
> Changes go to WAL only, and they are "played" on data files only after
> the pg_stop_backup(). Hmm, I don't really like that. But if so, I don't
> even need step 4), since the tar backup would be consistent. I'm able to
> restore from that only, and data are those at the time of pg_start_backup().
>
> Or, (that's me) pg_start_backup() only ensures that full_page_writes is
> enabled during the backup phase, so that any change to data files can be
> undone/redone based on the contents of WAL segments. The tar archive
> contains potentially inconsistant data, that's why I need WAL segments
> too. At restore time, PG eventually performs a crash recovery, and data
> are those at the time of step 4) (which is an interval really, some time
> between the start and the end of the copy, but I don't need to be able
> to tell the exact time anyway).
>
> BTW, I see a vulnerability in the above procedure... if there's enough
> write activity so that PostgreSQL recycles WAL segments between 1) and
> 4), some changes may be lost. At step 4) I need to save _all_ WAL
> segments that have been produced during the backup time. I assume that
> activity at backup time is low enough that the event is "extremely
> unlikely". Another doubt I have is about WAL checkpointing... is it
> possible that a checkpoint happens during step 2), and the tar archive
> gets data files both from before and from after the checkpoint (which I
> think is bad) or does pg_start_backup() prevent WAL checkpointing, too?
>
> Finally, if I'm missing something and the above is wrong, I think that
> the only way to perform a full backup on a live database at filesystem
> level, is to enable WAL archiving as the first step of the backup
> procedure (assuming it's not usually on, of course), and later save all
> the WAL segments that were _archived_ during that time, including the
> one made at pg_stop_backup(), on the same backup medium.
>
> If I understand the documentation right, nothing breaks if the
> archive_command just returns OK w/o saving anything during regular
> operation, and starts saving segments only during the backup time... I
> mean, that prevents me from doing arbitrary PITR after the backup, but
> all I want to do is performing a full backup, w/o being able to do any
> partial backup after that. If so, I may write a script that does nothing
> most of the time, and archives WAL segments only to be included in the
> full backup.

Are file system snapshots not the best approach for your requirements?

lvcreate -s -L5G -nbackup /dev/foo/postgresql
mount /dev/foo/backup /mnt/backup-snap
tar jcpvf pg-backup-<time_stamp>.bz2 /mnt/backup-snap

You can't do much wrong with that, it's fast and easy to use.

Just my 2cent..


--
Regards,
Hannes Dorbath

В списке pgsql-general по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: a few questions on backup
Следующее
От: John Gateley
Дата:
Сообщение: Re: Fault Tolerant Postgresql (two machines, two postmasters, one disk array)