Re: Remove Deprecated Exclusive Backup Mode

From Stephen Frost
Subject Re: Remove Deprecated Exclusive Backup Mode
Msg-id 20190225142525.GO6197@tamriel.snowman.net
In reply to Re: Remove Deprecated Exclusive Backup Mode  (Laurenz Albe <laurenz.albe@cybertec.at>)
List pgsql-hackers
Greetings,

* Laurenz Albe (laurenz.albe@cybertec.at) wrote:
> Stephen Frost wrote:
> > > It will be annoying if after this removal, companies must change their
> > > backup strategy by using specific postgres tools (pgbackrest, barman).
> >
> > You don't write your own database system using CSV files and shell
> > magic, do you?  I have to say that it continues to boggle my mind that
> > people insist that *this* part of the system has to be able to be
> > implementable using shell scripts.
> >
> > Folks, these are your backups we're talking about, your last resort if
> > everything else goes up in flames, why do you want to risk that by
> > implementing your own one-off solution, particularly when there's known
> > serious issues using that interface, and you want to just use shell
> > scripts to do it...?
>
> If we didn't think that simplicity in backup has some value, why does
> PostgreSQL provide pg_basebackup?

I'm all for people using pg_basebackup and encourage them to do so, but,
really, that tool *is* too simple by itself: you need to combine it with
something else that handles data retention, backup validation, etc.

This could be some independent backup solution though, since
pg_basebackup gives you a simple set of files for the backup that can
then be backed up using those other backup tools.  (The same is true of
pgbackrest, btw, and people certainly do use pgbackrest to back up to a
repository and then back that repo up using other tools, but in this
case pgbackrest does have some of those data retention and backup
validation capabilities itself, so it isn't necessary to do that.)
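
As a rough illustration of the retention handling that has to live
somewhere outside pg_basebackup, here is a minimal sketch.  The function
name, the label-to-timestamp mapping and the default policy (keep the two
newest backups, expire anything else older than 14 days) are all
assumptions made up for this example, not anything pg_basebackup or
pgbackrest provides.  A real policy would also have to expire the WAL that
only the removed backups needed.

```python
from datetime import datetime, timedelta


def select_expired(backups, now, keep_last=2, max_age=timedelta(days=14)):
    """Return the labels of backups that a simple retention policy would drop.

    backups maps a (hypothetical) backup label to its completion time.
    The keep_last newest backups are always retained, whatever their age.
    """
    newest_first = sorted(backups, key=backups.get, reverse=True)
    protected = set(newest_first[:keep_last])
    return sorted(
        label
        for label in newest_first
        if label not in protected and now - backups[label] > max_age
    )
```

The caller would then delete the returned backup directories, ideally
only after a newer backup has been validated.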

> Not everybody wants to use tools like pgbackrest and barman because it
> takes some effort to set them up properly.  It seems that you think that
> people who want something simpler are irresponsible.

I don't mind simpler (we actually try pretty hard to make pgbackrest
simple to use, in fact...  the simplest config only requires a 4-line
config file), provided that it actually does everything that you
need a backup tool to do.  I do think it's an unfortunate reality that
not enough people take the time to think about what they really need
from a backup solution and spend the time to either find or build a
proper solution.

> Incidentally, both barman and pgbackrest stream the backup to a backup server.

I'm pretty sure both can back up to a local repository too (I know
pgbackrest can), and not just stream to a backup server, so I'm not
really sure what you're getting at here.

> I think the most important use case for the low level backup API is when
> your database is so large that you want to leverage storage techniques
> like snapshots for backup, because you can't or don't want to stream all
> those terabytes across the network.

Snapshots place an awful lot of faith in your storage layer,
particularly when you aren't doing any kind of validation that the
snapshot is in good shape and that there hasn't been any in-place latent
corruption, ever.  If you think about it, a snapshot is actually very
similar to just always doing incremental backups with pgbackrest and
never doing a new full backup or even a check of those old files that
haven't changed to see if they're still valid.  It's not an approach I'd
recommend depending on exclusively.

There are ways to address that, of course, such as building a manifest of
checksums of the files in the data directory, such that you can then
validate that the snapshot is correct and detect if any corruption ends
up happening (and checksum the manifest as well).  You also need to
track the LSN of the pg_start_backup, so you can know that changes
which happened after that LSN are in the WAL that's captured during the
checksum, and you need a place to put all of this for when you want to
go back and re-validate an old snapshot...  and likely other things that
I'm forgetting at the moment.  It's definitely something we've
thought about, and we have considered how we might add support to
pgbackrest for working with snapshots (it wouldn't surprise me if that
got implemented before v13, even).
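
To make the manifest idea a bit more concrete, here is a minimal sketch
of it.  This is not how pgbackrest does it; the function names, the JSON
layout and the choice of SHA-256 are assumptions for illustration only.
It is meant to be run against a mounted snapshot, not a live data
directory, and it leaves out the WAL handling entirely: the start LSN is
only recorded so that a later restore knows which WAL it needs.

```python
import hashlib
import json
import os


def sha256_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root, start_lsn):
    """Checksum every file under root and record the backup start LSN."""
    files = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            files[os.path.relpath(full, root)] = sha256_file(full)
    body = {"start_lsn": start_lsn, "files": files}
    # Checksum the manifest itself so damage to the manifest is detectable.
    encoded = json.dumps(body, sort_keys=True).encode("utf-8")
    return {"body": body, "manifest_sha256": hashlib.sha256(encoded).hexdigest()}


def verify_manifest(root, manifest):
    """Return the sorted relative paths that are missing or have changed."""
    encoded = json.dumps(manifest["body"], sort_keys=True).encode("utf-8")
    if hashlib.sha256(encoded).hexdigest() != manifest["manifest_sha256"]:
        raise ValueError("manifest checksum mismatch")
    bad = []
    for rel, expected in manifest["body"]["files"].items():
        full = os.path.join(root, rel)
        if not os.path.isfile(full) or sha256_file(full) != expected:
            bad.append(rel)
    return sorted(bad)
```

The manifest would have to be stored somewhere other than the snapshot
itself, so that an old snapshot can be re-validated later with
verify_manifest().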

If you are happy to place complete faith in snapshots and trust that
they're entirely atomic, you can always forgo dealing with any of this
and just snapshot PGDATA and let PG go through crash recovery when you
restore.

> I'm not playing devil's advocate here to annoy you.  I see the problems
> with the exclusive backup, and I see how it can hurt people.
> I just think that removing exclusive backup without some kind of help
> like Andres sketched above will make people unhappy.

Keeping it will definitely make me unhappy. :)

I'm not against something simple being added to help with the new API,
or with keeping the old API if someone can figure out how to solve for
the known issues with it, just to be clear.  I do worry that the
'simple' thing we add either won't help in a lot of cases or will end up
being much more complex and difficult to use than we're thinking, but we
can discuss that if someone actually implements it, I suppose.

Thanks!

Stephen
