Re: Implementing incremental backup

From: Magnus Hagander
Subject: Re: Implementing incremental backup
Msg-id: CABUevEw+pnxV7zJ_oZe97v3SMr_0e_YD6aGbe-tJYVqxfyWKCQ@mail.gmail.com
In reply to: Re: Implementing incremental backup (Alvaro Herrera <alvherre@2ndquadrant.com>)
List: pgsql-hackers

On Thu, Jun 20, 2013 at 12:18 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> Claudio Freire wrote:
>> On Wed, Jun 19, 2013 at 6:20 PM, Stephen Frost <sfrost@snowman.net> wrote:
>> > * Claudio Freire (klaussfreire@gmail.com) wrote:
>> >> I don't see how this is better than snapshotting at the filesystem
>> >> level. I have no experience with TB scale databases (I've been limited
>> >> to only hundreds of GB), but from my limited mid-size db experience,
>> >> filesystem snapshotting is pretty much the same thing you propose
>> >> there (xfs_freeze), and it works pretty well. There's even automated
>> >> tools to do that, like bacula, and they can handle incremental
>> >> snapshots.
>> >
>> > Large databases tend to have multiple filesystems and getting a single,
>> > consistent, snapshot across all of them while under load is..
>> > 'challenging'.  It's fine if you use pg_start/stop_backup() and you're
>> > saving the XLOGs off, but if you can't do that..
>>
>> Good point there.
>>
>> I still don't like the idea of having to mark each modified page. The
>> WAL compressor idea sounds a lot more workable. As in scalable.
>
> There was a project that removed "useless" WAL records from the stream,
> to make it smaller and useful for long-term archiving.  It only removed
> FPIs as far as I recall.  It's dead now, and didn't compile on recent
> (9.1?) Postgres because of changes in the WAL structs, IIRC.
>
> This doesn't help if you have a large lot of UPDATEs that touch the same
> set of rows over and over, though.  Tatsuo-san's proposal would allow
> this use-case to work nicely because you only keep one copy of such
> data, not one for each modification.
>
> If you have the two technologies, you could teach them to work in
> conjunction: you set up WAL replication, and tell the WAL compressor to
> prune updates for high-update tables (avoid useless traffic), then use
> incremental backup to back these up.  This seems like it would have a
> lot of moving parts and be rather bug-prone, though.

Just as a datapoint, I think this is basically what at least one
other database engine (SQL Server) calls "incremental" vs
"differential" backup.

"Differential" backup keeps track of which blocks have changed (by one
way or another - maybe as simple as the LSN, but it doesn't matter
how, really) and backs up just those blocks (diffed back to the base
backup).
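
To make the block-level idea concrete, here is a minimal sketch, assuming
the simplest tracking scheme mentioned above: every block carries the LSN
of its last change, and the backup remembers the LSN at which the base
backup was taken. The names (Page, differential_backup, restore) and the
in-memory dict of blocks are hypothetical, for illustration only; this is
not how any particular engine or tool implements it.

```python
from typing import Dict, NamedTuple


class Page(NamedTuple):
    lsn: int     # log position of the last change to this block (assumed model)
    data: bytes  # block contents


def differential_backup(pages: Dict[int, Page], base_lsn: int) -> Dict[int, Page]:
    """Return only the blocks changed after the base backup was taken."""
    return {blkno: page for blkno, page in pages.items() if page.lsn > base_lsn}


def restore(base: Dict[int, Page], diff: Dict[int, Page]) -> Dict[int, Page]:
    """Overlay the differential on top of the base backup."""
    restored = dict(base)
    restored.update(diff)
    return restored


# Base backup taken at LSN 20; afterwards block 1 is rewritten and block 2 added.
base = {0: Page(10, b"a"), 1: Page(20, b"b")}
current = {0: Page(10, b"a"), 1: Page(35, b"b2"), 2: Page(40, b"c")}

diff = differential_backup(current, base_lsn=20)
restored = restore(base, diff)
```

However many times block 1 was rewritten since the base backup, the
differential holds a single copy of it, which is the property Alvaro
describes for the high-update case.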

"Incremental" does the transaction log, which is basically what we do
with log archiving except it's not done in realtime - it's all saved
on the master until the backup command runs.

Of course, it's been quite a few years since I set up one of those
in anger, so disclaimer for that info being out of date :)

Didn't pg_rman try to do something based on the page LSN to achieve
something similar to this?

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


