Re: block-level incremental backup

From Robert Haas
Subject Re: block-level incremental backup
Date
Msg-id CA+TgmobZfdfdmL2m-QcY++LERZYtgqaomeEAapbdHiooJycKWQ@mail.gmail.com
In reply to Re: block-level incremental backup  (Stephen Frost <sfrost@snowman.net>)
Responses Re: block-level incremental backup  (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>)
Re: block-level incremental backup  (Stephen Frost <sfrost@snowman.net>)
Re: block-level incremental backup  (Anastasia Lubennikova <a.lubennikova@postgrespro.ru>)
List pgsql-hackers
On Sat, Apr 20, 2019 at 4:32 PM Stephen Frost <sfrost@snowman.net> wrote:
> Having been around for a while working on backup-related things, if I
> was to implement the protocol for pg_basebackup today, I'd definitely
> implement "give me a list" and "give me this file" rather than the
> tar-based approach, because I've learned that people want to be
> able to do parallel backups and that's a decent way to do that.  I
> wouldn't set out and implement something new that there's just no hope
> of making parallel.  Maybe the first write of pg_basebackup would still
> be simple and serial since it's certainly more work to make a frontend
> tool like that work in parallel, but at least the protocol would be
> ready to support a parallel option being added later without being
> rewritten.
>
> And that's really what I was trying to get at here- if we've got the
> choice now to decide what this is going to look like from a protocol
> level, it'd be great if we could make it able to support being used in a
> parallel fashion, even if pg_basebackup is still single-threaded.

I think we're getting closer to a meeting of the minds here, but I
don't think it's intrinsically necessary to rewrite the whole method
of operation of pg_basebackup to implement incremental backup in a
sensible way.  One could instead just do a straightforward extension
to the existing BASE_BACKUP command to enable incremental backup.
Then, to enable parallel full backup and all sorts of out-of-core
hacking, one could expand the command language to allow tools to
access individual steps: START_BACKUP, SEND_FILE_LIST,
SEND_FILE_CONTENTS, STOP_BACKUP, or whatever.  The second thing makes
for an appealing project, but I do not think there is a technical
reason why it has to be done first.  Or for that matter why it has to
be done second.  As I keep saying, incremental backup and full backup
are separate projects and I believe it's completely reasonable for
whoever is doing the work to decide on the order in which they would
like to do the work.
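
To make the step-wise command idea concrete, here is a minimal sketch of how an out-of-core tool might drive such commands to fetch files in parallel.  Everything in it is an assumption for illustration: the command names are only the placeholders suggested above ("or whatever"), MockBackupServer is a hypothetical in-memory stand-in, and nothing here talks to PostgreSQL or reflects an existing replication protocol.

```python
from concurrent.futures import ThreadPoolExecutor


class MockBackupServer:
    """In-memory stand-in for a server exposing step-wise backup commands.

    Hypothetical: the method names mirror the placeholder commands from the
    message (START_BACKUP, SEND_FILE_LIST, SEND_FILE_CONTENTS, STOP_BACKUP).
    """

    def __init__(self, files):
        self._files = dict(files)   # path -> bytes
        self._active = False

    def start_backup(self):
        self._active = True

    def send_file_list(self):
        self._require_active()
        return sorted(self._files)

    def send_file_contents(self, path):
        self._require_active()
        return self._files[path]

    def stop_backup(self):
        self._active = False

    def _require_active(self):
        if not self._active:
            raise RuntimeError("no backup in progress")


def parallel_backup(server, workers=4):
    """Fetch every listed file, using up to `workers` concurrent requests."""
    server.start_backup()
    try:
        paths = server.send_file_list()
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # pool.map returns results in the same order as `paths`.
            contents = list(pool.map(server.send_file_contents, paths))
    finally:
        # Always end the backup, even if a fetch failed.
        server.stop_backup()
    return dict(zip(paths, contents))


if __name__ == "__main__":
    server = MockBackupServer({
        "base/1/1259": b"pg_class data",
        "global/pg_control": b"control data",
        "PG_VERSION": b"12\n",
    })
    backup = parallel_backup(server, workers=2)
    for path, data in sorted(backup.items()):
        print(path, len(data))
```

The point of the sketch is only that once "list" and "fetch one file" are separate steps, parallelism becomes a client-side decision.  A real tool would presumably use one connection per worker rather than a shared in-process object, and would need some way to tie those connections to the same backup; the sketch does not address that.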

Having said that, I'm curious what people other than Stephen (and
other pgbackrest hackers) think about the relative value of parallel
backup vs. incremental backup.  Stephen appears quite convinced that
parallel backup is full of win and incremental backup is a bit of a
yawn by comparison, and while I certainly would not want to discount
the value of his experience in this area, it sometimes happens on this
mailing list that [ drum roll please ] not everybody agrees about
everything.  So, what do other people think?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


