Re: block-level incremental backup

From Stephen Frost
Subject Re: block-level incremental backup
Msg-id 20190422182640.GM6197@tamriel.snowman.net
In response to Re: block-level incremental backup  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: block-level incremental backup  (Andres Freund <andres@anarazel.de>)
Re: block-level incremental backup  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Mon, Apr 22, 2019 at 1:08 PM Stephen Frost <sfrost@snowman.net> wrote:
> > > One could instead just do a straightforward extension
> > > to the existing BASE_BACKUP command to enable incremental backup.
> >
> > Ok, how do you envision that?  As I mentioned up-thread, I am concerned
> > that we're talking too high-level here and it's making the discussion
> > more difficult than it would be if we were to put together specific
> > ideas and then discuss them.
> >
> > One way I can imagine to extend BASE_BACKUP is by adding LSN as an
> > optional parameter and then having the database server scan the entire
> > cluster and send a tarball which contains essentially a 'diff' file of
> > some kind for each file where we can construct a diff based on the LSN,
> > and then the complete contents of the file for everything else that
> > needs to be in the backup.
>
> /me scratches head.  Isn't that pretty much what I described in my
> original post?  I even described what that "'diff' file of some kind"
> would look like in some detail in the paragraph of that email
> numbered "2.", and I described the reasons for that choice at length
> in http://postgr.es/m/CA+TgmoZrqdV-tB8nY9P+1pQLqKXp5f1afghuoHh5QT6ewdkJ6g@mail.gmail.com
>
> I can't figure out how I'm managing to be so unclear about things
> about which I thought I'd been rather explicit.

There was basically zero discussion about what things would look like at
a protocol level (I went back and skimmed over the thread before sending
my last email to specifically see if I was going to get this response
back).  I get the idea behind the diff file, the contents of which I
wasn't getting into above.
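
For illustration only, here is a minimal Python sketch of the "diff based
on the LSN" idea for a single relation file: keep the blocks whose page
LSN is at or after a given LSN.  It relies on each page starting with the
8-byte pd_lsn (two 32-bit halves, high then low) and on the default 8kB
block size.  The function names, the ">=" boundary, and holding the file
contents in memory are assumptions made for the example; none of this is
an existing BASE_BACKUP option or the format proposed up-thread.

```python
import struct

BLOCK_SIZE = 8192  # assumption: the default PostgreSQL block size


def page_lsn(page: bytes) -> int:
    """Return the 64-bit LSN stored in the first 8 bytes of a page."""
    # pd_lsn is two native-endian 32-bit words: high half, then low half.
    hi, lo = struct.unpack_from("=II", page, 0)
    return (hi << 32) | lo


def changed_blocks(data: bytes, since_lsn: int) -> list:
    """List block numbers whose page LSN is at or after since_lsn."""
    changed = []
    for blkno in range(len(data) // BLOCK_SIZE):
        page = data[blkno * BLOCK_SIZE:(blkno + 1) * BLOCK_SIZE]
        # Assumption: ">=" is the right boundary for "changed since".
        if page_lsn(page) >= since_lsn:
            changed.append(blkno)
    return changed
```

Files that carry no per-page LSN cannot be filtered this way, which is
the "complete contents of the file for everything else" case above.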

> > So, sure, that would work, but it wouldn't be able to be parallelized
> > and I don't think it'd end up being very exciting for the external tools
> > because of that, but it would be fine for pg_basebackup.
>
> Stop being such a pessimist.  Yes, if we only add the option to the
> BASE_BACKUP command, it won't directly be very exciting for external
> tools, but a lot of the work that is needed to do things that ARE
> exciting for external tools will have been done.  For instance, if the
> work to figure out which blocks have been modified via WAL-scanning
> gets done, and initially that's only exposed via BASE_BACKUP, it won't
> be much work for somebody to write code that exposes that
> information directly through some new replication command.
> There's a difference between something that's going in the wrong
> direction and something that's going in the right direction but not as
> far or as fast as you'd like.  And I'm 99% sure that everything I'm
> proposing here falls in the latter category rather than the former.

I didn't mean to imply that you're going in the wrong direction here and
I thought I said somewhere in my last email more-or-less exactly the
same, that a great deal of the work needed for block-level incremental
backup would be done, but specifically that this proposal wouldn't allow
external tools to leverage that.  It sounds like what you're suggesting
now is that you're happy to implement the backend code, expose it in a
way that works just for pg_basebackup, and that if someone else wants to
add things to the protocol to make it easier for external tools to
leverage, great.  All I can say is that that's basically how we ended up
in the situation we're in today where pg_basebackup doesn't support
parallel backup but a bunch of external tools do and they don't go
through the backend to get there, even though they'd probably prefer to.

> > On the other hand, if you added new commands for 'list of files changed
> > since this LSN' and 'give me this file' and 'give me this file with the
> > changes in it since this LSN', then pg_basebackup could work with that
> > pretty easily in a single-threaded model (maybe with two connections to
> > the backend, but still in a single process, or maybe just by slurping up
> > the file list and then asking for each one) and the external tools could
> > leverage those new capabilities too for their backups, both full backups
> > and incremental ones.  This also wouldn't have to change how
> > pg_basebackup does full backups today one bit, so what we're really
> > talking about here is the direction to take the new code that's being
> > written, not about rewriting existing code.  I agree that it'd be a bit
> > more work...  but hopefully not *that* much more, and it would mean we
> > could later add parallel backup to pg_basebackup more easily too, if we
> > wanted to.
>
> For purposes of implementing parallel pg_basebackup, it would probably
> be better if the server rather than the client decided which files to
> send via which connection.  If the client decides, then every time the
> server finishes sending a file, the client has to request another
> file, and that introduces some latency: after the server finishes
> sending each file, it has to wait for the client to finish receiving
> the data, and it has to wait for the client to tell it what file to
> send next.  If the server decides, then it can just send data at top
> speed without a break.  So the ideal interface for pg_basebackup would
> really be something like:
>
> START_PARALLEL_BACKUP blah blah PARTICIPANTS 4;
>
> ...returning a cookie that can then be used by each participant as
> an argument to a new command:
>
> JOIN_PARALLEL_BACKUP 'cookie';
>
> However, that is obviously extremely inconvenient for third-party
> tools.  It's possible we need both an interface like this -- for use
> by parallel pg_basebackup -- and a
> START_BACKUP/SEND_FILE_LIST/SEND_FILE_CONTENTS/STOP_BACKUP type
> interface for use by external tools.  On the other hand, maybe the
> additional overhead caused by managing the list of files to be fetched
> on the client side is negligible.  It'd be interesting to see, though,
> how busy the server is when running an incremental backup managed by
> an external tool like BART or pgbackrest on a cluster with a gazillion
> little-tiny relations.  I wonder if we'd find that it spends most of
> its time waiting for the client.

Thanks for sharing your thoughts on that.  Certainly, having the backend
able to be more intelligent about streaming files to avoid latency is
good and possibly the best approach.  Another way to reduce the latency
would be to let the client request a set of files at once, but I don't
know that it'd be better.
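
As a back-of-the-envelope sketch of that trade-off, the Python below
estimates how long the server sits idle when the client has to ask for
each file (or each batch of files).  The model is an assumption: it
charges exactly one network round trip per request and ignores transfer
time, pipelining, and everything else.  The function name and parameters
are made up for the example.

```python
import math


def client_driven_wait(n_files: int, rtt_seconds: float,
                       batch_size: int = 1) -> float:
    """Estimate idle time spent waiting for the client's next request."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # One request, and so one round trip, per batch of files.
    requests = math.ceil(n_files / batch_size)
    return requests * rtt_seconds
```

With a million small relations and a 1ms round trip, asking for one file
at a time costs on the order of 1000 seconds of waiting under this model,
while asking for 1000 files per request brings that down to about one
second.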

I'm not really sure why the above is extremely inconvenient for
third-party tools, beyond just that they've already been written to work
with an assumption that the server-side of things isn't as intelligent
as PG is.
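
To show roughly what a tool would have to do with the cookie-style
interface quoted above, here is a toy in-process model in Python.  It is
purely hypothetical: the class, its methods, and the shared-queue
behaviour are assumptions for illustration, and no such replication
commands exist.  The server decides which participant gets which file;
each participant just joins with the cookie and consumes whatever it is
handed.

```python
import secrets
from collections import deque


class ToyBackupServer:
    """Toy stand-in for a server that hands out files to participants."""

    def __init__(self, files):
        self._files = list(files)
        self._sessions = {}

    def start_parallel_backup(self, participants: int) -> str:
        """Open a session and return a cookie for participants to join."""
        cookie = secrets.token_hex(8)
        self._sessions[cookie] = {
            "queue": deque(self._files),
            "slots": participants,
        }
        return cookie

    def join_parallel_backup(self, cookie: str):
        """Join a session; returns an iterator over server-chosen files."""
        session = self._sessions[cookie]
        if session["slots"] == 0:
            raise RuntimeError("all participant slots are taken")
        session["slots"] -= 1
        queue = session["queue"]

        def stream():
            # Each participant takes the next unsent file from the shared queue.
            while queue:
                yield queue.popleft()

        return stream()
```

Each file is delivered to exactly one participant, and the client never
has to manage a file list of its own.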

> > What I'm afraid will be lackluster is adding block-level incremental
> > backup support to pg_basebackup without any support for managing
> > backups or anything else.  I'm also concerned that it's going to mean
> > that people who want to use incremental backup with pg_basebackup are
> > going to have to write a lot of their own management code (probably in
> > shell scripts and such...) around that and if they get anything wrong
> > there then people are going to end up with bad backups that they can't
> > restore from, or they'll have corrupted clusters if they do manage to
> > get them restored.
>
> I think that this is another complaint that basically falls into the
> category of saying that this proposal might not fix everything for
> everybody, but that complaint could be levied against any reasonable
> development proposal.

I'm disappointed that the concerns about the trouble that end users are
likely to have with this didn't garner more discussion.

Thanks,

Stephen
