Re: [RFC] Incremental backup v2: add backup profile to base backup

From Robert Haas
Subject Re: [RFC] Incremental backup v2: add backup profile to base backup
Date
Msg-id CA+TgmoYdG1JvymERkGozpfazJBHTNbxSAvWMHGmK7dRioP8bAQ@mail.gmail.com
In reply to Re: [RFC] Incremental backup v2: add backup profile to base backup  (Marco Nenciarini <marco.nenciarini@2ndquadrant.it>)
Responses Re: [RFC] Incremental backup v2: add backup profile to base backup
List pgsql-hackers
On Mon, Oct 6, 2014 at 11:33 AM, Marco Nenciarini
<marco.nenciarini@2ndquadrant.it> wrote:
>> 1. Take a full backup.  Basically, we already have this.  In the
>> backup label file, make sure to note the newest LSN guaranteed to be
>> present in the backup.
>
> Don't we already have it in "START WAL LOCATION"?

Yeah, probably.  I was too lazy to go look for it, but that sounds
like the right thing.
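
A minimal Python sketch of reading that value, assuming the backup label contains a line of the form "START WAL LOCATION: <hi>/<lo> (file ...)" with the LSN written as two hexadecimal halves. The function names and the sample label contents are made up for illustration.

```python
import re


def parse_lsn(text):
    """Convert an LSN in 'hi/lo' hexadecimal notation to a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def start_wal_location(backup_label):
    """Return the START WAL LOCATION found in backup label contents."""
    match = re.search(
        r"^START WAL LOCATION: ([0-9A-Fa-f]+/[0-9A-Fa-f]+)",
        backup_label,
        re.MULTILINE,
    )
    if match is None:
        raise ValueError("no START WAL LOCATION line found")
    return parse_lsn(match.group(1))


# Hypothetical label contents; the values are invented for this example.
sample_label = (
    "START WAL LOCATION: 0/2000028 (file 000000010000000000000002)\n"
    "CHECKPOINT LOCATION: 0/2000060\n"
)
base_lsn = start_wal_location(sample_label)
```

Turning the LSN into a plain integer makes the later "newer than" comparisons a simple numeric test.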

>> 2. Take a differential backup.  In the backup label file, note the LSN
>> of the full backup to which the differential backup is relative, and the
>> newest LSN guaranteed to be present in the differential backup.  The
>> actual backup can consist of a series of 20-byte buffer tags, those
>> being the exact set of blocks newer than the base-backup's
>> latest-guaranteed-to-be-present LSN.  Each buffer tag is followed by
>> an 8kB block of data.  If a relfilenode is truncated or removed, you
>> need some way to indicate that in the backup; e.g. include a buffertag
>> with forknum = -(forknum + 1) and blocknum = the new number of blocks,
>> or InvalidBlockNumber if removed entirely.
>
> To have a working backup you need to ship each block which is newer than
> latest-guaranteed-to-be-present in full backup and not newer than
> latest-guaranteed-to-be-present in the current backup. Also, as a
> further optimization, you can think about not sending the empty space in
> the middle of each page.

Right.  Or compressing the data.
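
To make the selection rule and the empty-space idea concrete, here is a rough Python sketch. It assumes the standard page header layout (pd_lsn in the first 8 bytes as two 32-bit halves, pd_lower and pd_upper as 16-bit fields at offsets 12 and 14), a little-endian machine, and that the bytes between pd_lower and pd_upper carry nothing worth keeping. The function names are hypothetical.

```python
import struct

BLCKSZ = 8192  # default PostgreSQL block size


def page_lsn(page):
    """Read pd_lsn from the page header (little-endian layout assumed)."""
    hi, lo = struct.unpack_from("<II", page, 0)
    return (hi << 32) | lo


def needs_shipping(page, base_lsn, current_lsn):
    """True if the page is newer than the full backup's LSN but not newer
    than the LSN guaranteed by the backup being taken."""
    return base_lsn < page_lsn(page) <= current_lsn


def strip_hole(page):
    """Drop the free space between pd_lower and pd_upper.

    Returns (lower, upper, payload). If the header looks implausible the
    page is returned unchanged with lower == upper == 0.
    """
    lower, upper = struct.unpack_from("<HH", page, 12)
    if not (0 < lower <= upper <= len(page)):
        return 0, 0, bytes(page)
    return lower, upper, bytes(page[:lower]) + bytes(page[upper:])


def restore_hole(lower, upper, payload):
    """Rebuild the page, refilling the removed hole with zero bytes."""
    return payload[:lower] + bytes(upper - lower) + payload[lower:]
```

The hole is refilled with zeros on restore, so the round trip is byte-identical only when the free space was already zeroed. Running a general-purpose compressor over the shipped data would be the alternative.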

> My main concern here is about how postgres can remember that a
> relfilenode has been deleted, in order to send the appropriate "deletion
> tag".

You also need to handle truncation.

> IMHO the easiest way is to send the full list of files along with the backup
> and leave to the client the task of deleting unneeded files. The backup
> profile has this purpose.
>
> Moreover, I do not like the idea of using only a stream of blocks as the
> actual differential backup, for the following reasons:
>
> * AFAIK, with the current infrastructure, you cannot do a backup with a
> block stream only. To have a valid backup you need many files for which
> the concept of LSN doesn't apply.
>
> * I don't like to have all the data from the various
> tablespace/db/whatever all mixed in the same stream. I'd prefer to have
> the blocks saved on a per file basis.

OK, that makes sense.  But you still only need the file list when
sending a differential backup, not when sending a full backup.  So
maybe a differential backup looks like this:

- Ship a table-of-contents file with a list of the relation files currently
present and the length of each in blocks.
- For each relation file with blocks modified since the original backup,
ship a file called delta_<original file name> which is of the form <block
number><changed block contents> [...].
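
A rough Python sketch of how such a backup could be produced and applied. The on-disk encoding is not specified above, so this assumes a 4-byte little-endian block number in front of each block and represents the table of contents as a dict from file name to length in blocks. All names here are hypothetical.

```python
import struct

BLCKSZ = 8192
ENTRY_HEADER = struct.Struct("<I")  # assumed encoding of <block number>


def build_delta(changed_blocks, blcksz=BLCKSZ):
    """Serialize {block number: block contents} as a delta file."""
    out = bytearray()
    for blkno in sorted(changed_blocks):
        data = changed_blocks[blkno]
        if len(data) != blcksz:
            raise ValueError("block %d has the wrong size" % blkno)
        out += ENTRY_HEADER.pack(blkno)
        out += data
    return bytes(out)


def apply_delta(base, delta, nblocks, blcksz=BLCKSZ):
    """Rebuild a relation file from its full-backup copy, its delta file,
    and its length in blocks taken from the table of contents."""
    new = bytearray(base[: nblocks * blcksz])       # handles truncation
    new.extend(bytes(nblocks * blcksz - len(new)))  # handles extension
    entry_size = ENTRY_HEADER.size + blcksz
    if len(delta) % entry_size:
        raise ValueError("delta file has a partial entry")
    for offset in range(0, len(delta), entry_size):
        (blkno,) = ENTRY_HEADER.unpack_from(delta, offset)
        if blkno >= nblocks:
            raise ValueError("block %d is past the end of the file" % blkno)
        start = offset + ENTRY_HEADER.size
        new[blkno * blcksz : (blkno + 1) * blcksz] = delta[start : start + blcksz]
    return bytes(new)


def files_to_delete(restored_files, toc):
    """Files present in the restored full backup but absent from the
    table of contents were removed and should be deleted."""
    return sorted(set(restored_files) - set(toc))
```

With this shape, truncation and removal need no special marker: the length in the table of contents covers truncation, and a file missing from the list covers removal.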

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


