Re: Add notes to pg_combinebackup docs

From: David Steele
Subject: Re: Add notes to pg_combinebackup docs
Date:
Msg-id: 23f0723c-11ef-4825-bb9f-bd7c28d6b994@pgmasters.net
In response to: Re: Add notes to pg_combinebackup docs  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses: Re: Add notes to pg_combinebackup docs  (Magnus Hagander <magnus@hagander.net>)
List: pgsql-hackers

On 4/11/24 20:51, Tomas Vondra wrote:
> On 4/11/24 02:01, David Steele wrote:
>>
>> I have a hard time seeing this feature as being very useful, especially
>> for large databases, until pg_combinebackup works on tar (and compressed
>> tar). Right now restoring an incremental requires at least twice the
>> space of the original cluster, which is going to take a lot of users by
>> surprise.
> 
> I do agree it'd be nice if pg_combinebackup worked with .tar directly,
> without having to extract the directories first. No argument there, but
> as I said in the other thread, I believe that's something we can add
> later. That's simply how incremental development works.

OK, sure, but if the plan is to make it practical later, doesn't that 
make the feature something to be avoided for now?

>> I know you have made some improvements here for COW filesystems, but my
>> experience is that Postgres is generally not run on such filesystems,
>> though that is changing a bit.
> 
> I'd say XFS is a pretty common choice, for example. And it's one of the
> filesystems that work great with pg_combinebackup.

XFS has certainly advanced more than I was aware.

> However, who says this has to be the filesystem the Postgres instance
> runs on? Who in their right mind puts backups on the same volume as the
> instance anyway? At which point it can be a different filesystem, even
> if it's not ideal for running the database.

My experience is that these days backups are generally placed in object 
stores. Sure, people are still using NFS, but admins rarely have much 
control over those volumes. They may or may not be CoW filesystems.

> FWIW I think it's fine to tell users that to minimize the disk space
> requirements, they should use a CoW filesystem and --copy-file-range.
> The docs don't say that currently, that's true.

That would probably be a good addition to the docs.
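
For example, something along these lines might be enough (the paths and
backup directory names below are just made up for illustration):

    # Combine the backup chain, oldest first, onto a CoW filesystem
    # (e.g. XFS with reflink, or btrfs); --copy-file-range lets blocks
    # be shared with the inputs instead of being physically copied.
    pg_combinebackup --copy-file-range \
        -o /mnt/cow-volume/restored \
        /backups/full /backups/incr1 /backups/incr2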

> All of this also depends on how people do the restore. With the CoW
> stuff they can do a quick (and small) copy on the backup server, and
> then copy the result to the actual instance. Or they can do restore on
> the target directly (e.g. by mounting a r/o volume with backups), in
> which case the CoW won't really help.

And again, this all requires a significant amount of setup and tooling. 
Obviously I believe good backups require effort, but doing this right 
gets very complicated due to the limitations of the tool.
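
To illustrate what I mean by tooling, the "combine on the backup server,
then ship it" approach you describe roughly looks like this (hosts and
paths are hypothetical):

    # Cheap, block-sharing combine on the backup server's CoW volume...
    pg_combinebackup --copy-file-range \
        -o /backup/staging/restored /backup/full /backup/incr1

    # ...followed by a full physical copy to the target instance, which
    # is where the real time and space cost shows up.
    rsync -a /backup/staging/restored/ postgres@db1:/var/lib/postgresql/data/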

> But yeah, having to keep the backups as expanded directories is not
> great, I'd love to have .tar. Not necessarily because of the disk space
> (in my experience the compression in filesystems works quite well for
> this purpose), but mostly because it's more compact and allows working
> with backups as a single piece of data (e.g. it's much clearer what the
> checksum of a single .tar is, compared to a directory).

But again, object stores are commonly used for backup these days, and 
billing is based on the amount of data stored rather than on any 
compression that could be applied to it. Of course, you'd want to store 
the compressed tars in the object store, but that does mean storing an 
expanded copy somewhere to run pg_combinebackup.
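
Concretely (bucket name and paths made up), the round trip today looks
something like this, because pg_combinebackup only understands expanded
directories:

    # Pull the compressed tars out of the object store and expand them...
    mkdir -p /restore/full /restore/incr1
    aws s3 cp s3://pg-backups/full.tar.gz - | tar -xzf - -C /restore/full
    aws s3 cp s3://pg-backups/incr1.tar.gz - | tar -xzf - -C /restore/incr1

    # ...and only then run the combine step, so the expanded inputs sit
    # on local disk alongside the combined output.
    pg_combinebackup -o /restore/combined /restore/full /restore/incr1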

But if the argument is that all this can/will be fixed in the future, I 
guess the smart thing for users to do is wait a few releases for 
incremental backups to become a practical feature.

Regards,
-David


