Re: block-level incremental backup

From: Gary M
Subject: Re: block-level incremental backup
Date:
Msg-id: CAGwOJnxsK1vQdT94BKGHY+z-gx7C1BQfMfYhH9uXwuSB_KQMgA@mail.gmail.com
In reply to: Re: block-level incremental backup  (Andres Freund <andres@anarazel.de>)
List: pgsql-hackers
Having worked in the data storage industry since the '80s, I consider backup an important capability. That said, these ideas should be expanded into an overall data management strategy combining local and remote storage, including cloud.

From my experience, record and transaction consistency is critical to any replication action, including backup.  The approach commonly includes a starting baseline (a snapshot, if you prefer) and a set of incremental changes to that baseline.  I always used the transaction logs for both backup and remote replication to other DBMSs. In the ECMA-208 standard (1994), you will note a file object with a transaction property. Although the language specifies files, a file may be any set of records.

SAN-based snapshots usually occur on the SAN storage device, meaning cached data (not yet written to disk) will not be snapshotted, or will be inconsistently referenced, and will likely result in a corrupted database on restore.

Snapshots are point-in-time states of storage objects. Between snapshot periods, any number of changes may occur.  If a record of "all changes" is required, snapshot methods must be augmented with a historical record: the transaction log.
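
A minimal sketch of that combination (illustrative only; the record layout, block size, and names below are invented, not PostgreSQL's): restore the snapshot, then replay logged block changes up to the desired point.

    #include <stdio.h>
    #include <stdint.h>
    #include <sys/types.h>

    #define BLOCK_SIZE 8192

    /* Hypothetical change-log record: new contents of one block. */
    typedef struct {
        uint64_t lsn;               /* position of this change in the log */
        uint64_t blockno;           /* block the change applies to */
        char     data[BLOCK_SIZE];  /* new contents of that block */
    } LogRecord;

    /* Apply log records to the restored data file until target_lsn. */
    static void replay_log(FILE *log, FILE *datafile, uint64_t target_lsn)
    {
        LogRecord rec;

        while (fread(&rec, sizeof(rec), 1, log) == 1)
        {
            if (rec.lsn > target_lsn)
                break;              /* reached the requested point in time */
            fseeko(datafile, (off_t) rec.blockno * BLOCK_SIZE, SEEK_SET);
            fwrite(rec.data, BLOCK_SIZE, 1, datafile);
        }
    }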

Delta block methods for backups have been in practice for many years; ZFS adopted the practice for block management. The viability of incremental backups, whether block-level, transaction-based, or otherwise, depends on prior data. Like primary storage, backup media can fail, be lost, or be inadvertently corrupted. If incremental backup data is lost, the restored data after the point of loss is likely corrupted.
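
One way to make that dependency explicit (a sketch; the label fields are invented) is to have each increment record the backup it was taken against, and to verify the whole chain before attempting a restore:

    #include <stdint.h>

    /* Hypothetical per-backup label: each increment names its parent. */
    typedef struct {
        uint64_t backup_id;   /* unique id of this backup */
        uint64_t parent_id;   /* backup it was taken against; 0 = full */
    } BackupLabel;

    /* A gap anywhere makes every later increment unusable. */
    static int chain_is_valid(const BackupLabel *chain, int n)
    {
        if (n == 0 || chain[0].parent_id != 0)
            return 0;                 /* must start at a full backup */
        for (int i = 1; i < n; i++)
            if (chain[i].parent_id != chain[i - 1].backup_id)
                return 0;             /* missing or out-of-order increment */
        return 1;
    }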

cheers, 
garym

On Tue, Apr 9, 2019 at 10:35 AM Andres Freund <andres@anarazel.de> wrote:
Hi,

On 2019-04-09 11:48:38 -0400, Robert Haas wrote:
> 2. When you use pg_basebackup in this way, each relation file that is
> not sent in its entirety is replaced by a file with a different name.
> For example, instead of base/16384/16417, you might get
> base/16384/partial.16417 or however we decide to name them.

Hm. But that means that files that are shipped nearly in their entirety
need to be fully rewritten. Wonder if it's better to ship them as files
with holes, and have the metadata in a separate file. That'd then allow
just filling in the holes with data from the older version.  I'd assume
there are a lot of workloads where some significantly sized relations
will get updated in nearly their entirety between backups.
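
A sketch of the holes idea under POSIX sparse-file semantics (names invented): set the file to its full length up front, write only the changed blocks at their natural offsets, and let unwritten ranges remain holes for restore to fill from the older version.

    #include <fcntl.h>
    #include <unistd.h>
    #include <stdint.h>

    #define BLOCK_SIZE 8192

    static int write_partial_relation(const char *path,
                                      const uint64_t *changed_blocks,
                                      const char (*blocks)[BLOCK_SIZE],
                                      int nchanged, off_t full_size)
    {
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);

        if (fd < 0)
            return -1;

        /* Final length up front; unwritten ranges stay as holes. */
        if (ftruncate(fd, full_size) < 0)
        {
            close(fd);
            return -1;
        }

        for (int i = 0; i < nchanged; i++)
            if (pwrite(fd, blocks[i], BLOCK_SIZE,
                       (off_t) changed_blocks[i] * BLOCK_SIZE) != BLOCK_SIZE)
            {
                close(fd);
                return -1;
            }

        return close(fd);
    }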


> Each such file will store near the beginning of the file a list of all the
> blocks contained in that file, and the blocks themselves will follow
> at offsets that can be predicted from the metadata at the beginning of
> the file.  The idea is that you shouldn't have to read the whole file
> to figure out which blocks it contains, and if you know specifically
> what blocks you want, you should be able to reasonably efficiently
> read just those blocks.  A backup taken in this manner should also
> probably create some kind of metadata file in the root directory that
> stops the server from starting and lists other salient details of the
> backup.  In particular, you need the threshold LSN for the backup
> (i.e. contains blocks newer than this) and the start LSN for the
> backup (i.e. the LSN that would have been returned from
> pg_start_backup).
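
A sketch of that layout (struct and field names invented): a fixed header, then the block-number list, then the blocks themselves, so the offset of any stored block follows from the header alone without reading the data.

    #include <stdint.h>

    #define BLOCK_SIZE 8192

    /* Hypothetical on-disk layout of a partial relation file:
     * header, then nblocks uint32_t block numbers, then the blocks. */
    typedef struct {
        uint32_t magic;          /* identifies a partial relation file */
        uint32_t nblocks;        /* number of blocks stored in this file */
        uint64_t threshold_lsn;  /* only blocks newer than this are here */
    } PartialFileHeader;

    /* Offset of the i-th stored block, computable from the header alone. */
    static inline uint64_t
    partial_block_offset(const PartialFileHeader *hdr, uint32_t i)
    {
        uint64_t data_start = sizeof(PartialFileHeader)
                            + (uint64_t) hdr->nblocks * sizeof(uint32_t);

        return data_start + (uint64_t) i * BLOCK_SIZE;
    }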

I wonder if we shouldn't just integrate that into pg_control or such. So
that:

> 3. There should be a new tool that knows how to merge a full backup
> with any number of incremental backups and produce a complete data
> directory with no remaining partial files.

Could just be part of server startup?
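
Whether done by a standalone tool or at server startup, the merge itself could be as simple as this sketch (reusing the hypothetical PartialFileHeader layout above): copy the full backup's file, then overwrite the blocks the increment carries.

    #include <stdio.h>
    #include <stdint.h>
    #include <stdlib.h>
    #include <sys/types.h>

    #define BLOCK_SIZE 8192

    typedef struct {               /* as in the earlier layout sketch */
        uint32_t magic;
        uint32_t nblocks;
        uint64_t threshold_lsn;
    } PartialFileHeader;

    /* Rebuild a complete relation file from a full copy plus one increment. */
    static int merge_partial(FILE *full, FILE *partial, FILE *out)
    {
        PartialFileHeader hdr;
        char      block[BLOCK_SIZE];
        uint32_t *blocknos;
        size_t    n;

        /* 1. Start from the full backup's copy of the file. */
        while ((n = fread(block, 1, BLOCK_SIZE, full)) > 0)
            fwrite(block, 1, n, out);

        /* 2. Read the block list from the partial file's header. */
        if (fread(&hdr, sizeof(hdr), 1, partial) != 1)
            return -1;
        blocknos = malloc(hdr.nblocks * sizeof(uint32_t));
        if (blocknos == NULL ||
            fread(blocknos, sizeof(uint32_t), hdr.nblocks, partial)
                != hdr.nblocks)
        {
            free(blocknos);
            return -1;
        }

        /* 3. Overwrite just the changed blocks at their natural offsets. */
        for (uint32_t i = 0; i < hdr.nblocks; i++)
        {
            if (fread(block, BLOCK_SIZE, 1, partial) != 1)
            {
                free(blocknos);
                return -1;
            }
            fseeko(out, (off_t) blocknos[i] * BLOCK_SIZE, SEEK_SET);
            fwrite(block, BLOCK_SIZE, 1, out);
        }

        free(blocknos);
        return 0;
    }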


> - I imagine that the server would offer this functionality through a
> new replication command or a syntax extension to an existing command,
> so it could also be used by tools other than pg_basebackup if they
> wished.

Would this logic somehow be usable from tools that don't want to copy
the data directory via pg_basebackup (e.g. for parallelism, to directly
send to some backup service / SAN / whatnot)?


> - It would also be nice if pg_basebackup could write backups to places
> other than the local disk, like an object store, a tape drive, etc.
> But that also sounds like a separate effort.

Indeed seems separate. But worthwhile.

Greetings,

Andres Freund

