Re: block-level incremental backup

From	Robert Haas
Subject	Re: block-level incremental backup
Date	September 16, 2019 13:30:06
Msg-id	CA+TgmobLLRin67vH-vJtVP4AudJXDKYD+zOwkpThVzc0uo+ujw@mail.gmail.com
In reply to	Re: block-level incremental backup (Amit Kapila <amit.kapila16@gmail.com>)
Responses	Re: block-level incremental backup Re: block-level incremental backup Re: block-level incremental backup
List	pgsql-hackers

On Mon, Sep 16, 2019 at 4:31 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> This seems to be a blocking problem for the LSN based design.

Well, only the simplest version of it, I think.

> Can we think of using the file's creation time?  Basically, if the
> file creation time is later than the backup_label's "START TIME:",
> then include that file entirely.  I think one big point against this
> is clock skew, like what if somebody tinkers with the clock.  And
> also, this can cover cases like what Jeevan has pointed out but might
> not cover other cases which we found problematic.

Well, that would mean, for example, that if you copied the data
directory from one machine to another, the next "incremental" backup
would turn into a full backup. That sucks. And in other situations,
like resetting the clock, it could mean that you end up with a corrupt
backup without any real ability for PostgreSQL to detect it. I'm not
saying that it is impossible to create a practically useful system
based on file time stamps, but I really don't like it.
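
To make the two failure modes concrete, here is a minimal sketch of
the timestamp rule from the quoted proposal. It is illustrative only:
FileInfo and files_copied_in_full are hypothetical names, not anything
in PostgreSQL or in the patch under discussion, and the creation time
is assumed to be whatever the filesystem reports.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FileInfo:
    name: str
    creation_time: float  # seconds since the epoch, as reported by the filesystem


def files_copied_in_full(files: List[FileInfo], backup_start_time: float) -> List[str]:
    """Apply the timestamp rule: a file is copied whole if it appears to
    have been created after the previous backup's START TIME."""
    return [f.name for f in files if f.creation_time > backup_start_time]
```

If every file in the data directory was just copied to a new machine,
every creation time is later than the old START TIME, so the function
returns every file and the "incremental" backup is really a full one.
If the clock was set back, a genuinely new file can carry a creation
time earlier than START TIME, so it is not selected, and nothing in
the rule itself can notice that.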

> I think the operations covered by WAL flag XLR_SPECIAL_REL_UPDATE will
> have similar problems.

I'm not sure quite what you mean by that.  Can you elaborate? It
appears to me that the XLR_SPECIAL_REL_UPDATE operations are all
things that create files, remove files, or truncate files, and the
sketch in my previous email would handle the first two of those cases
correctly.  See below for the third.

> One related point is how do incremental backups handle the case where
> vacuum truncates the relation partially?  Basically, with current
> patch/design, it doesn't appear that such information can be passed
> via incremental backup.  I am not sure if this is a problem, but it
> would be good if we can somehow handle this.

As to this, if you're taking a full backup of a particular file,
there's no problem.  If you're taking a partial backup of a particular
file, you need to include the current length of the file and the
identity and contents of each modified block.  Then you're fine.
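
As a rough illustration of why recording the length is enough to
handle truncation, the sketch below models a partial backup of one
file as "current length plus modified blocks" and applies it to an
older copy. This is a sketch under stated assumptions, not the format
used by any patch: IncrementalFile and apply_incremental are
hypothetical names, the 8192-byte block size is PostgreSQL's default,
and zero-filling blocks that exist in neither input is an assumption
made only to keep the example total.

```python
from dataclasses import dataclass, field
from typing import Dict

BLOCK_SIZE = 8192  # PostgreSQL's default block size (assumed here)


@dataclass
class IncrementalFile:
    length_in_blocks: int  # current length of the file at backup time
    blocks: Dict[int, bytes] = field(default_factory=dict)  # block number -> contents


def apply_incremental(base: bytes, inc: IncrementalFile) -> bytes:
    """Rebuild a file from an older copy plus a partial backup of it."""
    size = inc.length_in_blocks * BLOCK_SIZE
    # Cut the old copy down to the recorded length (this is what carries
    # a truncation across), or zero-extend it if the file has grown.
    out = bytearray(base[:size].ljust(size, b"\0"))
    for blkno, data in inc.blocks.items():
        if not 0 <= blkno < inc.length_in_blocks:
            raise ValueError(f"block {blkno} lies beyond the recorded length")
        if len(data) != BLOCK_SIZE:
            raise ValueError(f"block {blkno} has the wrong size")
        # Overwrite the old block with the modified contents.
        out[blkno * BLOCK_SIZE:(blkno + 1) * BLOCK_SIZE] = data
    return bytes(out)
```

With a three-block base and an IncrementalFile of length 2, the result
is two blocks long even if no block was modified, which is the vacuum
truncation case raised above.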

> Won't some operations where at the end we directly call heap_sync
> without writing WAL have a similar problem as well?

Maybe.  Can you give an example?

> Similarly,
> it is not very clear if unlogged relations are handled in some way; if
> not, the same could be documented.

I think that we don't need to back up the contents of unlogged
relations at all, right? Restoration from an online backup always
involves running recovery, and so unlogged relations will anyway get
zapped.
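
One way such a filter could look, as a sketch only: unlogged relations
can be recognized on disk by their init fork, a file named
<relfilenode>_init, so a backup could drop the other forks of any
relation that has one. The function name files_to_back_up is
hypothetical, and keeping the init fork itself is an assumption here,
made because recovery uses it to reset the relation to empty.

```python
import re
from typing import Iterable, List

# Relation file names look like "16384", "16384.1", "16384_fsm",
# "16384_vm" or "16384_init" (with an optional ".N" segment suffix).
_REL_FILE = re.compile(r"^(\d+)(?:_(fsm|vm|init))?(?:\.\d+)?$")


def files_to_back_up(filenames: Iterable[str]) -> List[str]:
    """Drop every fork of an unlogged relation except its init fork."""
    parsed = {name: _REL_FILE.match(name) for name in filenames}
    # A relfilenode with an init fork belongs to an unlogged relation.
    unlogged = {m.group(1) for m in parsed.values() if m and m.group(2) == "init"}
    keep = []
    for name, m in parsed.items():
        if m and m.group(1) in unlogged and m.group(2) != "init":
            continue  # main, fsm or vm fork of an unlogged relation
        keep.append(name)
    return keep
```

Files that do not look like relation files, such as PG_VERSION, pass
through unchanged.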

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
