Re: finding changed blocks using WAL scanning

From: Peter Eisentraut
Subject: Re: finding changed blocks using WAL scanning
Date:
Msg-id 12f979b7-33d2-5025-de7f-4b633846a691@2ndquadrant.com
In reply to: finding changed blocks using WAL scanning  (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: finding changed blocks using WAL scanning  (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers

On 2019-04-10 23:49, Robert Haas wrote:
> It seems to me that there are basically two ways of storing this kind
> of information, plus a bunch of variants.  One way is to store files
> that cover a range of LSNs, and basically contain a synopsis of the
> WAL for those LSNs.  You omit all the actual data and just mention
> which blocks were changed by some record in that part of the WAL.

That seems better than the other variant.
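
To make the quoted idea concrete, here is a rough sketch of what such a per-LSN-range synopsis could hold.  All names here (BlockRef, BlockSummary, summarize) are made up for illustration and do not correspond to anything in PostgreSQL; LSNs are simplified to plain integers, and the input is assumed to be already-decoded (lsn, block references) pairs.

```python
from dataclasses import dataclass, field


# Hypothetical types for illustration only; not PostgreSQL structures.
@dataclass(frozen=True)
class BlockRef:
    relfilenode: int
    fork: str
    block: int


@dataclass
class BlockSummary:
    start_lsn: int  # inclusive
    end_lsn: int    # exclusive
    changed: set = field(default_factory=set)


def summarize(records, start_lsn, end_lsn):
    """Build a synopsis of the WAL in [start_lsn, end_lsn).

    records is an iterable of (lsn, [BlockRef, ...]) pairs.  The record
    payloads are dropped; only the identity of each touched block is kept.
    """
    summary = BlockSummary(start_lsn, end_lsn)
    for lsn, refs in records:
        if start_lsn <= lsn < end_lsn:
            summary.changed.update(refs)
    return summary
```

Because the summary is a set, a block that is modified many times within the range is recorded only once, which is what keeps the file small relative to the WAL it describes.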

> Yet another question is how to make sure WAL doesn't get removed
> before we finish scanning it.  Peter mentioned on the other thread
> that we could use a variant replication slot, which immediately made
> me wonder why we'd need a variant.  Actually, the biggest problem I
> see here is that if we use a replication slot, somebody might try to
> drop it or use it for some other purpose, and that would mess things
> up.  I guess we could pull the usual trick of reserving names that
> start with 'pg_' for internal purposes.  Or we could just hard-code
> the LSN that was last scanned for this purpose as a bespoke constraint
> on WAL discard.  Not sure what is best.

The word "variant" was used as a hedge ;-), but now that I think about
it ...

I had in mind that you could have different overlapping incremental
backup jobs in existence at the same time.  Maybe a daily one to a
nearby disk and a weekly one to a faraway cloud.  Each one of these
would need a separate replication slot, so that the information that is
required for *that* incremental backup series is preserved between runs.
So just one reserved replication slot that feeds the block summaries
wouldn't work.  Perhaps what would work is a flag on the replication
slot itself "keep block summaries for this slot".  Then when all the
slots with the block summary flag are past an LSN, you can clean up the
summaries before that LSN.
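
A minimal sketch of that cleanup rule, under stated assumptions: the Slot class, the keep_block_summaries attribute, and both function names are hypothetical, LSNs are plain integers, and each summary is represented only by its (start_lsn, end_lsn) range.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Slot:
    name: str
    restart_lsn: int                    # oldest LSN this slot still needs
    keep_block_summaries: bool = False  # the hypothetical per-slot flag


def summary_cleanup_horizon(slots) -> Optional[int]:
    """Oldest LSN still needed by any flagged slot, or None if none is flagged."""
    flagged = [s.restart_lsn for s in slots if s.keep_block_summaries]
    return min(flagged) if flagged else None


def removable(summaries, slots):
    """Return the (start_lsn, end_lsn) summaries that no flagged slot needs.

    A summary can go once every flagged slot is at or past its end LSN.
    Assumption: with no flagged slots, nothing needs summaries at all.
    """
    horizon = summary_cleanup_horizon(slots)
    if horizon is None:
        return list(summaries)
    return [s for s in summaries if s[1] <= horizon]
```

With a daily slot at LSN 500 and a weekly slot at LSN 200, both flagged, the horizon is 200, so only summaries ending at or before 200 are removed; the slower backup series holds back cleanup for everyone, just as a lagging slot holds back WAL removal.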

> I think all of this should be optional functionality.  It's going to
> be really useful for people with large databases, I think, but people
> with small databases may not care, and it won't be entirely free.  If
> it's not enabled, then the functionality that would otherwise exploit
> it can fall back to doing things in a less efficient way; nothing
> needs to break hard.

With the flag-on-the-slot scheme you wouldn't need a separate knob to
turn this on, because it's enabled as soon as some backup software
has created an appropriate slot.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services