Re: finding changed blocks using WAL scanning

From: Stephen Frost
Subject: Re: finding changed blocks using WAL scanning
Date:
Msg-id: 20190420003951.GQ6197@tamriel.snowman.net
In response to: Re: finding changed blocks using WAL scanning  (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: finding changed blocks using WAL scanning  (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers
Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Mon, Apr 15, 2019 at 11:45 PM Michael Paquier <michael@paquier.xyz> wrote:
> > Any caller of XLogWrite() could switch to a new segment once the
> > current one is done, and I am not sure that we would want some random
> > backend to potentially slow down to do that kind of operation.
> >
> > Or would a separate background worker do this work by itself?  An
> > external tool can do that easily already:
> > https://github.com/michaelpq/pg_plugins/tree/master/pg_wal_blocks
>
> I was thinking that a dedicated background worker would be a good
> option, but Stephen Frost seems concerned (over on the other thread)
> about how much load that would generate.  That never really occurred
> to me as a serious issue and I suspect for many people it wouldn't be,
> but there might be some.

While I do think we should at least be thinking about the load caused
by scanning the WAL to generate a list of changed blocks, the load I
was more concerned with in the other thread is the effort required to
actually merge all of those changes together over a large amount of
WAL.  I'm also not saying that we couldn't have either of
those pieces done as a background worker, just that it'd be really nice
to have an external tool (or library) that can be used on an independent
system to do that work.

> It's cool that you have a command-line tool that does this as well.
> Over there, it was also discussed that we might want to have both a
> command-line tool and a background worker.  I think, though, that we
> would want to get the output in some kind of compressed binary format,
> rather than text.  e.g.
>
> 4-byte database OID
> 4-byte tablespace OID
> any number of relation OID/block OID pairings for that
> database/tablespace combination
> 4-byte zero to mark the end of the relation OID/block OID list
> and then repeat all of the above any number of times

I agree that we'd like to get the data in a binary format of some kind.
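
Just so we're picturing the same thing, the stream you describe would
read back roughly like this (purely illustrative C, every name here is
made up):

    #include <stdint.h>

    typedef uint32_t Oid;
    typedef uint32_t BlockNumber;

    /*
     * One group in the stream: a database/tablespace header followed by
     * relation OID / block number pairs, with a 4-byte zero where the
     * next relation OID would be to end the group.
     */
    typedef struct ChangedBlockGroupHeader
    {
        Oid dboid;          /* 4-byte database OID */
        Oid tsoid;          /* 4-byte tablespace OID */
    } ChangedBlockGroupHeader;

    typedef struct ChangedBlockEntry
    {
        Oid         reloid; /* relation OID */
        BlockNumber blkno;  /* changed block in that relation */
    } ChangedBlockEntry;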

> That might be too dumb and I suspect we want some headers and a
> checksum, but we should try to somehow exploit the fact that there
> aren't likely to be many distinct databases or many distinct
> tablespaces mentioned -- whereas relation OID and block number will
> probably have a lot more entropy.

I don't remember exactly where this idea came from, but I don't believe
it's my own (I think there's some tool which already does this..  maybe
it's rsync?).  In any case, I certainly don't think we want to repeat
the relation OID for every block, and I don't think we really want to
store a block number for every block either.  Instead, something like:

4-byte database OID
4-byte tablespace OID
relation OID

starting-ending block numbers
bitmap covering range of blocks
starting-ending block numbers
bitmap covering range of blocks
4-byte zero to mark the end of the relation
...
4-byte database OID
4-byte tablespace OID
relation OID

starting-ending block numbers
bitmap covering range of blocks
4-byte zero to mark the end of the relation
...

Only for relations which actually have changes though, of course.

Haven't implemented it, so it's entirely possible there are reasons why
it wouldn't work, but I do like the bitmap idea.  I definitely think we
need a checksum, as you mentioned.
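
To sketch what I mean (untested, and every name here is made up purely
for illustration), serializing one relation's changed blocks as block
ranges plus bitmaps might look something like:

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    typedef uint32_t Oid;
    typedef uint32_t BlockNumber;

    /* Append a little-endian uint32 to the output buffer. */
    static size_t
    put_u32(uint8_t *buf, size_t off, uint32_t v)
    {
        buf[off + 0] = v & 0xff;
        buf[off + 1] = (v >> 8) & 0xff;
        buf[off + 2] = (v >> 16) & 0xff;
        buf[off + 3] = (v >> 24) & 0xff;
        return off + 4;
    }

    /*
     * Serialize one relation's changed blocks (given sorted and distinct).
     * Layout: dboid, tsoid, reloid, then repeated (start, end, bitmap)
     * groups, terminated by a 4-byte zero.  The bitmap has one bit per
     * block in start..end, so a dense range costs far less than a 4-byte
     * block number per changed block.
     */
    static size_t
    serialize_relation(uint8_t *buf, Oid dboid, Oid tsoid, Oid reloid,
                       const BlockNumber *blocks, int nblocks)
    {
        size_t off = 0;
        int    i = 0;

        off = put_u32(buf, off, dboid);
        off = put_u32(buf, off, tsoid);
        off = put_u32(buf, off, reloid);

        while (i < nblocks)
        {
            /* Extend the range while the next block stays reasonably close. */
            int j = i;

            while (j + 1 < nblocks && blocks[j + 1] - blocks[i] < 4096)
                j++;

            BlockNumber start = blocks[i];
            BlockNumber end = blocks[j];
            uint32_t    nbytes = (end - start + 1 + 7) / 8;

            off = put_u32(buf, off, start);
            off = put_u32(buf, off, end);

            memset(buf + off, 0, nbytes);
            for (; i <= j; i++)
            {
                uint32_t bit = blocks[i] - start;

                buf[off + bit / 8] |= 1 << (bit % 8);
            }
            off += nbytes;
        }

        return put_u32(buf, off, 0);    /* end-of-relation marker */
    }

    int
    main(void)
    {
        BlockNumber changed[] = {10, 11, 12, 200, 201, 5000};
        uint8_t     buf[8192];

        printf("serialized %zu bytes\n",
               serialize_relation(buf, 16384, 1663, 24576, changed, 6));
        return 0;
    }

With something like that, a run of consecutive dirty blocks costs one
bit per block instead of four bytes, and the relation OID appears only
once per relation.  A checksum and a file header would still need to go
on top, of course.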

Thanks!

Stephen
