Re: PITR Functional Design v2 for 7.5

From Andreas Pflug
Subject Re: PITR Functional Design v2 for 7.5
Date
Msg-id 404EF917.2040200@pse-consulting.de
In response to Re: PITR Functional Design v2 for 7.5  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: PITR Functional Design v2 for 7.5  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Tom Lane wrote:

>Andreas Pflug <pgadmin@pse-consulting.de> writes:
>  
>
>>When I'm doing a file level hot backup, I can't be sure about the backup 
>>order. To be sure the cluster is in a consistent state regarding 
>>checkpoints, pg_clog must be the first directory backed up.
>>    
>>
>
>You are going off in the wrong direction entirely.
>
>Any hot-backup design that thinks safety can be ensured by "back up file A
>before file B" considerations is wrong.  That's because Postgres doesn't
>necessarily dump dirty blocks into the data area (or clog area) at any
>particular time.  Therefore, a filesystem-level backup taken while the
>postmaster is running is basically certain to be inconsistent.  You can
>*not* avoid this by being careful about the order you dump files in.
>Heck, you can't even be certain that a file you dump is internally
>consistent.
>  
>

Maybe my wording was misleading; it seems Simon understood me as it was meant.
With "consistent state regarding checkpoints" I meant that all 
transactions that are marked as committed as of the checkpoint are really 
present in the data files. Of course, there might be even more 
transactions which haven't been checkpointed so far; they'll need WAL 
replay.
To clarify:
I'd expect a cluster to be workable if I
- disable VACUUM until the backup has completed
- issue CHECKPOINT
- back up clog (CHECKPOINT and the clog backup together are the "backup checkpoint")
- back up all datafiles (which include at least all transaction 
data completed at checkpoint time)
and then
- restore datafiles and clog
- bring up pgsql.
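
The backup half of the steps above could be sketched roughly as follows. This is only an illustration of the proposed ordering (clog first, then everything else), not a recommended procedure: the quoted reply argues that file ordering alone cannot make a hot file-level backup consistent. The function names `backup_plan` and `run_backup` are hypothetical, and the sketch assumes VACUUM has already been disabled and CHECKPOINT already issued by some other means.

```python
import shutil
from pathlib import Path


def backup_plan(data_dir: Path) -> list[Path]:
    """Return the top-level entries of data_dir with pg_clog first.

    The remaining entries follow in name order; the proposal only
    requires that clog is copied before the data files.
    """
    entries = sorted(data_dir.iterdir())
    clog = [p for p in entries if p.name == "pg_clog"]
    rest = [p for p in entries if p.name != "pg_clog"]
    return clog + rest


def run_backup(data_dir: Path, dest: Path) -> list[str]:
    """Copy every entry into dest in plan order and return the names copied.

    Assumption: the caller has disabled VACUUM and issued CHECKPOINT
    before calling this; nothing here talks to the server.
    """
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for entry in backup_plan(data_dir):
        target = dest / entry.name
        if entry.is_dir():
            shutil.copytree(entry, target)
        else:
            shutil.copy2(entry, target)
        copied.append(entry.name)
    return copied
```

Restoring would be the reverse copy into an empty data directory before starting the postmaster.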
Certainly, all transactions after the backup checkpoint are lost. There 
might be fragments of newer transactions in data files, but they were 
never committed according to clog and thus rolled back.
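
The assumption in the last sentence can be pictured with a toy model. It is not PostgreSQL's actual on-disk format or visibility logic, just the reasoning stated above: a row counts only if the restored clog marks its transaction as committed. The names `visible_rows`, `restored_clog` and `restored_rows` are made up for the illustration.

```python
def visible_rows(rows, clog):
    """Keep only the values whose transaction id is committed per clog.

    rows: list of (xid, value) pairs found in the restored data files.
    clog: dict mapping xid -> status string, as of the clog backup.
    A transaction missing from clog is treated as never committed.
    """
    return [value for xid, value in rows if clog.get(xid) == "committed"]


# clog was copied first, so it knows nothing about transaction 102.
restored_clog = {100: "committed", 101: "committed"}

# The data files were copied later and picked up a fragment of 102.
restored_rows = [(100, "a"), (101, "b"), (102, "fragment")]

print(visible_rows(restored_rows, restored_clog))
```

Whether a file-level copy really behaves like this model is exactly the point disputed in the quoted reply.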
WAL replay would add more completed transactions, making the cluster 
more up-to-date, but omitting this would be sufficient in many disaster 
recovery scenarios.
Did I miss something? If so, an API is needed to get not only the WAL 
data out of pgsql in order, but the whole cluster.

Regards,
Andreas



