Re: finding changed blocks using WAL scanning

From Kyotaro HORIGUCHI
Subject Re: finding changed blocks using WAL scanning
Date
Msg-id 20190412.095933.31382486.horiguchi.kyotaro@lab.ntt.co.jp
In response to Re: finding changed blocks using WAL scanning  (Ashwin Agrawal <aagrawal@pivotal.io>)
List pgsql-hackers
At Thu, 11 Apr 2019 10:00:35 -0700, Ashwin Agrawal <aagrawal@pivotal.io> wrote in
<CALfoeis0qOyGk+KQ3AbkfRVv=XbsSecqHfKSag=i_SLWMT+B0A@mail.gmail.com>
> On Thu, Apr 11, 2019 at 6:27 AM Robert Haas <robertmhaas@gmail.com> wrote:
> 
> > On Thu, Apr 11, 2019 at 3:52 AM Peter Eisentraut
> > <peter.eisentraut@2ndquadrant.com> wrote:
> > > I had in mind that you could have different overlapping incremental
> > > backup jobs in existence at the same time.  Maybe a daily one to a
> > > nearby disk and a weekly one to a faraway cloud.  Each one of these
> > > would need a separate replication slot, so that the information that is
> > > required for *that* incremental backup series is preserved between runs.
> > >  So just one reserved replication slot that feeds the block summaries
> > > wouldn't work.  Perhaps what would work is a flag on the replication
> > > slot itself "keep block summaries for this slot".  Then when all the
> > > slots with the block summary flag are past an LSN, you can clean up the
> > > summaries before that LSN.
> >
> > I don't think that quite works.  There are two different LSNs.  One is
> > the LSN of the oldest WAL archive that we need to keep around so that
> > it can be summarized, and the other is the LSN of the oldest summary
> > we need to keep around so it can be used for incremental backup
> > purposes.  You can't keep both of those LSNs in the same slot.
> > Furthermore, the LSN stored in the slot is defined as the amount of
> > WAL we need to keep, not the amount of something else (summaries) that
> > we need to keep.  Reusing that same field to mean something different
> > sounds inadvisable.
> >
> > In other words, I think there are two problems which we need to
> > clearly separate: one is retaining WAL so we can generate summaries,
> > and the other is retaining summaries so we can generate incremental
> > backups.  Even if we solve the second problem by using some kind of
> > replication slot, we still need to solve the first problem somehow.
> 
> Just a thought for the first problem, which may not be simpler: can a
> replication slot be enhanced to define X amount of WAL to retain and,
> after reaching that limit, collect the summary and let the WAL be deleted?

I think Peter is saying that a slot for block summaries doesn't
retain the WAL segments themselves, but instead retains block
summaries, perhaps segmented ones.  n block-summary slots maintain n
block summaries, and the newest block summary is "active"; in other
words, it is continuously updated as WAL records pass by.  When a
backup tool requests the block summary for, say, the oldest slot, the
active summary is closed and a new summary is opened starting at the
current LSN, which becomes the new LSN of the slot.  Then the
concatenated block summary is sent.  Finally, the oldest summary is
removed.
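
To make the sequence above concrete, here is a minimal sketch in
Python.  It is only an illustration of the idea, not a proposed
implementation: the names (Summary, BlockSummaryStore, record_wal,
create_slot, request_summary) are all hypothetical, LSNs are plain
integers, and a "block" is just a (relation, block number) tuple.  It
also assumes that creating a slot closes the active summary, so that
summary boundaries always line up with slot LSNs.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Summary:
    """Blocks touched by WAL in (start_lsn, end_lsn]; end_lsn is None while active."""
    start_lsn: int
    end_lsn: Optional[int] = None
    blocks: set = field(default_factory=set)


class BlockSummaryStore:
    """Hypothetical model of block-summary slots; not PostgreSQL code."""

    def __init__(self, start_lsn=0):
        self.current_lsn = start_lsn
        self.summaries = [Summary(start_lsn)]  # the last entry is the active one
        self.slots = {}  # slot name -> LSN up to which the slot has been served

    def record_wal(self, end_lsn, block):
        """A WAL record passes by: note the block it modifies in the active summary."""
        if end_lsn <= self.current_lsn:
            raise ValueError("WAL position must advance")
        self.summaries[-1].blocks.add(block)
        self.current_lsn = end_lsn

    def _rotate(self):
        """Close the active summary at the current LSN and open a new one."""
        active = self.summaries[-1]
        if active.start_lsn == self.current_lsn:
            return  # no WAL since the last rotation, nothing to close
        active.end_lsn = self.current_lsn
        self.summaries.append(Summary(self.current_lsn))

    def create_slot(self, name):
        self._rotate()
        self.slots[name] = self.current_lsn

    def request_summary(self, name):
        """Return (from_lsn, to_lsn, changed_blocks) for the slot and advance it."""
        since = self.slots[name]
        self._rotate()
        changed = set()
        for s in self.summaries:
            # Concatenate every closed summary that starts at or after the slot's LSN.
            if s.end_lsn is not None and s.start_lsn >= since:
                changed |= s.blocks
        self.slots[name] = self.current_lsn
        # Drop closed summaries that no remaining slot still needs.
        oldest = min(self.slots.values())
        self.summaries = [
            s for s in self.summaries if s.end_lsn is None or s.end_lsn > oldest
        ]
        return since, self.current_lsn, changed
```

With a "daily" and a "weekly" slot, each daily request closes the
active summary and advances only the daily slot; the closed summaries
stay around until the weekly slot has also been served.  This sketch
never merges adjacent closed summaries, so it may hold more than n
summaries between cleanups; a real design would probably merge them.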

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



