Re: block-level incremental backup

Поиск
Список
Период
Сортировка
От Robert Haas
Тема Re: block-level incremental backup
Дата
Msg-id CA+TgmobLLRin67vH-vJtVP4AudJXDKYD+zOwkpThVzc0uo+ujw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: block-level incremental backup  (Amit Kapila <amit.kapila16@gmail.com>)
Ответы Re: block-level incremental backup  (Stephen Frost <sfrost@snowman.net>)
Re: block-level incremental backup  (Robert Haas <robertmhaas@gmail.com>)
Re: block-level incremental backup  (Amit Kapila <amit.kapila16@gmail.com>)
Список pgsql-hackers
On Mon, Sep 16, 2019 at 4:31 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> This seems to be a blocking problem for the LSN based design.

Well, only the simplest version of it, I think.

> Can we think of using creation time for file?  Basically, if the file
> creation time is later than backup-labels "START TIME:", then include
> that file entirely.  I think one big point against this is clock skew
> like what if somebody tinkers with the clock.  And also, this can
> cover cases like
> what Jeevan has pointed but might not cover other cases which we found
> problematic.

Well that would mean, for example, that if you copied the data
directory from one machine to another, the next "incremental" backup
would turn into a full backup. That sucks. And in other situations,
like resetting the clock, it could mean that you end up with a corrupt
backup without any real ability for PostgreSQL to detect it. I'm not
saying that it is impossible to create a practically useful system
based on file time stamps, but I really don't like it.

> I think the operations covered by WAL flag XLR_SPECIAL_REL_UPDATE will
> have similar problems.

I'm not sure quite what you mean by that.  Can you elaborate? It
appears to me that the XLR_SPECIAL_REL_UPDATE operations are all
things that create files, remove files, or truncate files, and the
sketch in my previous email would handle the first two of those cases
correctly.  See below for the third.

> One related point is how do incremental backups handle the case where
> vacuum truncates the relation partially?  Basically, with current
> patch/design, it doesn't appear that such information can be passed
> via incremental backup.  I am not sure if this is a problem, but it
> would be good if we can somehow handle this.

As to this, if you're taking a full backup of a particular file,
there's no problem.  If you're taking a partial backup of a particular
file, you need to include the current length of the file and the
identity and contents of each modified block.  Then you're fine.

> Isn't some operations where at the end we directly call heap_sync
> without writing WAL will have a similar problem as well?

Maybe.  Can you give an example?

> Similarly,
> it is not very clear if unlogged relations are handled in some way if
> not, the same could be documented.

I think that we don't need to back up the contents of unlogged
relations at all, right? Restoration from an online backup always
involves running recovery, and so unlogged relations will anyway get
zapped.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Arseny Sher
Дата:
Сообщение: Re: (Re)building index using itself or another index of the same table
Следующее
От: Tom Lane
Дата:
Сообщение: Re: [PATCH] Improve performance of NOTIFY over many databases (v2)