Re: trying again to get incremental backup

From: Robert Haas
Subject: Re: trying again to get incremental backup
Date:
Msg-id: CA+Tgmoad7igbt46K+JHG6UEZ56Y5SVBYq5e7OjcKkgD3SStNmg@mail.gmail.com
In reply to: Re: trying again to get incremental backup  (Jakub Wartak <jakub.wartak@enterprisedb.com>)
List: pgsql-hackers

On Fri, Oct 20, 2023 at 9:20 AM Jakub Wartak
<jakub.wartak@enterprisedb.com> wrote:
> Okay, so more good news - related to patch version #4.
> A not-so-tiny stress test consisting of a pgbench run for 24h straight
> (with incremental backups every 2h, based on an initial full backup),
> followed by two PITRs (one not using incremental backups and one using
> them, to illustrate the performance point - and potentially spot any
> errors in between). In both cases it worked fine.

This is great testing, thanks. What might be even better is to test
whether the resulting backups are correct, somehow.

> I've just noticed one thing while recovery is in progress: is
> summarization working during recovery - in the background -
> expected behaviour? I'm wondering about that, because after a freshly
> restored and recovered DB, one would need to create a *new* full
> backup and only from that point new summaries would have any use?

Actually, I think you could take an incremental backup relative to a
full backup from a previous timeline.

But the question of what summarization ought to do (or not do) during
recovery, and whether it ought to be enabled by default, and what the
retention policy ought to be are very much live ones. Right now, it's
enabled by default and keeps summaries for a week, assuming you don't
reset your local clock and that it advances at the same speed as the
universe's own clock. But that's all debatable. Any views?
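
To make the clock dependence concrete, here is a minimal sketch in Python of age-based retention. It is not the patch's code (which is C); the file names, the RETENTION constant, and the expired_summaries helper are all hypothetical, and the only fact taken from above is the one-week window.

```python
from datetime import datetime, timedelta

# Hypothetical retention window; the text above only says "a week".
RETENTION = timedelta(days=7)


def expired_summaries(summaries, now):
    """Return the names of summaries whose timestamp is older than RETENTION.

    `summaries` maps a (hypothetical) summary file name to its
    modification time; `now` is whatever the local clock reports.
    """
    cutoff = now - RETENTION
    return sorted(name for name, mtime in summaries.items() if mtime < cutoff)


now = datetime(2023, 10, 20, 12, 0, 0)
summaries = {
    "summary_a": now - timedelta(days=8),
    "summary_b": now - timedelta(days=6),
}

# With a correct clock, only the 8-day-old summary is past the window.
print(expired_summaries(summaries, now))

# If the local clock jumps forward two days, the 6-day-old summary
# is treated as expired as well.
print(expired_summaries(summaries, now + timedelta(days=2)))
```

The decision depends only on the difference between "now" and the stored timestamp, which is why a reset or drifting clock changes what gets removed.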

Meanwhile, here's a new patch set. I went ahead and committed the
first two preparatory patches, as I said earlier that I intended to
do. And here I've adjusted the main patch, which is now 0003, for the
addition of XLOG_CHECKPOINT_REDO, which permitted me to simplify a few
things. wal_summarize_mb now feels like a bit of a silly GUC --
presumably you'd never care, unless you had an absolutely gigantic
inter-checkpoint WAL distance. And if you have that, maybe you should
also have enough memory to summarize all that WAL. Or maybe not:
perhaps it's better to write WAL summaries more than once per
checkpoint when checkpoints are really big. But I'm worried that the
GUC will become a source of needless confusion for users. For most
people, it seems like emitting one summary per checkpoint should be
totally fine, and they might prefer a simple Boolean GUC,
summarize_wal = true | false, over this. I'm just not quite sure about
the corner cases.
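
As a rough illustration of the trade-off, the sketch below (Python, not the patch's C code) contrasts one summary per checkpoint interval with a size cap that splits very large intervals. The summary_ranges function, its max_summary_bytes parameter, and the plain-integer WAL positions are assumptions made for the example; they show one possible reading of what a size-based setting could do, not how wal_summarize_mb is actually implemented.

```python
MB = 1024 * 1024


def summary_ranges(checkpoint_lsns, max_summary_bytes=None):
    """Split WAL into (start, end) byte ranges, one summary per range.

    checkpoint_lsns: increasing WAL positions (plain integers here) of
        successive checkpoint redo points.
    max_summary_bytes: hypothetical cap on WAL per summary; None means
        exactly one summary per checkpoint interval.
    """
    ranges = []
    for start, end in zip(checkpoint_lsns, checkpoint_lsns[1:]):
        if max_summary_bytes is None:
            ranges.append((start, end))
            continue
        pos = start
        while pos < end:
            # Never cross a checkpoint boundary within a single summary.
            nxt = min(pos + max_summary_bytes, end)
            ranges.append((pos, nxt))
            pos = nxt
    return ranges


# One small interval (100 MB) and one very large one (1000 MB).
redo_points = [0, 100 * MB, 1100 * MB]

# Boolean-style behaviour: one summary per checkpoint interval.
print(len(summary_ranges(redo_points)))

# Size-capped behaviour: the large interval is split into several summaries.
print(len(summary_ranges(redo_points, 256 * MB)))
```

With ordinary checkpoint distances the two modes produce the same ranges; they differ only when an interval exceeds the cap, which is the corner case in question.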

--
Robert Haas
EDB: http://www.enterprisedb.com
