Re: Add recovery to pg_control and remove backup_label

Поиск
Список
Период
Сортировка
От David Steele
Тема Re: Add recovery to pg_control and remove backup_label
Дата
Msg-id 63fc7f6a-297a-4d21-908e-9cbcda451013@pgmasters.net
обсуждение исходный текст
Ответ на Re: Add recovery to pg_control and remove backup_label  (Andres Freund <andres@anarazel.de>)
Список pgsql-hackers
On 11/20/23 16:37, Andres Freund wrote:
> 
> On 2023-11-20 11:11:13 -0500, Robert Haas wrote:
>> I think we need more votes to make a change this big. I have a
>> concern, which I think I've expressed before, that we keep whacking
>> around the backup APIs, and that has a cost which is potentially
>> larger than the benefits.
> 
> +1.  The amount of whacking around in this area has been substantial, and it's
> hard for operators to keep up. And realistically, with data sizes today, the
> pressure to do basebackups with disk snapshots etc is not going to shrink.

True enough, but disk snapshots aren't really backups in themselves, in 
most scenarios, because they reside on the same storage as the cluster. 
Of course, snapshots can be exported, but that's also expensive.

I see snapshots as an adjunct to backups -- a safe backup offsite 
somewhere for DR and snapshots for day to day operations. Even so, 
managing snapshots as backups is harder than people think. It is easy to 
get wrong and end up with silent corruption.

> Leaving that concern aside, I am still on the fence about this proposal. I
> think it does decrease the chance of getting things wrong in the
> streaming-basebackup case. But for external backups, it seems almost
> universally worse (with the exception of the torn pg_control issue, that we
> also can address otherwise):

Why universally worse? The software stores pg_control instead of backup 
label. The changes to pg_basebackup were pretty trivial and the changes 
to external backup are pretty much the same, at least in my limited 
sample of one.

And I don't believe we have a satisfactory solution to the torn 
pg_control issue yet. Certainly it has not been committed and Thomas has 
shown enthusiasm for this approach, to the point of hoping it could be 
back patched (it can't).

> It doesn't reduce the risk of getting things wrong, you can still omit placing
> a file into the data directory and get silent corruption as a consequence. In
> addition, it's harder to see when looking at a base backup whether the process
> was right or not, because now the good and bad state look the same if you just
> look on the filesystem level!

This is one of the reasons I thought writing just the first 512 bytes of 
pg_control would be valuable. It would give an easy indicator that 
pg_control came from a backup. Michael was not in favor of conflating 
that change with this patch -- but I still think it's a good idea.

> Then there's the issue of making ad-hoc debugging harder by not having a
> human readable file with information anymore, including when looking at the
> history, via backup_label.old.

Yeah, you'd need to use pg_controldata instead. But as Michael has 
suggested, we could also write backup_label as backup_info so there is 
human-readable information available.

> Given that, I wonder if what we should do is to just add a new field to
> pg_control that says "error out if backup_label does not exist", that we set
> when creating a streaming base backup

I'm not in favor of a change only accessible to pg_basebackup or 
external software that can manipulate pg_control.

Regards,
-David



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: PSQL error: total cell count of XXX exceeded
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Locks on unlogged tables are locked?!