Re: block-level incremental backup

From: Konstantin Knizhnik
Subject: Re: block-level incremental backup
Date:
Msg-id: 86244a3f-689a-a15b-bac4-f3afe9b6523b@postgrespro.ru
In reply to: Re: block-level incremental backup  (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers

On 10.04.2019 19:51, Robert Haas wrote:
> On Wed, Apr 10, 2019 at 10:22 AM Konstantin Knizhnik
> <k.knizhnik@postgrespro.ru> wrote:
>> Some times ago I have implemented alternative version of ptrack utility
>> (not one used in pg_probackup)
>> which detects updated block at file level. It is very simple and may be
>> it can be sometimes integrated in master.
> I don't think this is completely crash-safe.  It looks like it
> arranges to msync() the ptrack file at appropriate times (although I
> haven't exhaustively verified the logic), but it uses MS_ASYNC, so
> it's possible that the ptrack file could get updated on disk either
> before or after the relation file itself.  I think before is probably
> OK -- it just risks having some blocks look modified when they aren't
> really -- but after seems like it is very much not OK.  And changing
> this to use MS_SYNC would probably be really expensive.  Likely a
> better approach would be to hook into the new fsync queue machinery
> that Thomas Munro added to PostgreSQL 12.

I do not think that MS_SYNC or the fsync queue is needed here.
If a power failure or OS crash causes the loss of some writes to the
ptrack map, then Postgres will in any case perform recovery, and
replaying the pages from the WAL will mark them in the ptrack map once
again. So, as with CLOG and many other Postgres files, it is not
critical to lose some writes, because they will be restored from the
WAL. And before truncating the WAL, Postgres performs a checkpoint,
which flushes all changes to disk, including the ptrack map updates.
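To illustrate why the lost writes are harmless, here is a minimal
standalone sketch of the idea, assuming an mmap()-ed array of per-slot
LSNs; the hash function and all names here (ptrack_map, ptrack_hash,
ptrack_mark_block) are illustrative assumptions, not the actual patch:

/*
 * Sketch: mark a modified block in an mmap()-ed ptrack map of LSNs.
 */
#include <stdint.h>
#include <sys/mman.h>

#define PTRACK_MAP_SIZE 1000003u

typedef uint64_t XLogRecPtr;        /* WAL position, as in PostgreSQL */

static XLogRecPtr *ptrack_map;      /* backed by an mmap()-ed file */

/* Illustrative hash from (relation oid, block number) to a map slot. */
static uint32_t
ptrack_hash(uint32_t reloid, uint32_t blkno)
{
    return (reloid * 1000193u + blkno) % PTRACK_MAP_SIZE;
}

/*
 * Called on ordinary page modification and repeated by WAL redo, so
 * map updates lost in a crash are reconstructed during recovery.
 * MS_ASYNC therefore suffices when the map is periodically flushed,
 * e.g. at checkpoint:
 *
 *     msync(ptrack_map, PTRACK_MAP_SIZE * sizeof(XLogRecPtr), MS_ASYNC);
 */
void
ptrack_mark_block(uint32_t reloid, uint32_t blkno, XLogRecPtr lsn)
{
    uint32_t slot = ptrack_hash(reloid, blkno);

    if (ptrack_map[slot] < lsn)
        ptrack_map[slot] = lsn;
}

The asynchronous msync() is just a write-back hint; correctness relies
on redo repeating the marking, not on the hint reaching disk in time.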


> It looks like your system maps all the blocks in the system into a
> fixed-size map using hashing.  If the number of modified blocks
> between the full backup and the incremental backup is large compared
> to the size of the ptrack map, you'll start to get a lot of
> false-positives.  It will look as if much of the database needs to be
> backed up.  For example, in your sample configuration, you have
> ptrack_map_size = 1000003. If you've got a 100GB database with 20%
> daily turnover, that's about 2.6 million blocks.  If you bump a
> random entry ~2.6 million times in a map with 1000003 entries, on the
> average ~92% of the entries end up getting bumped, so you will get
> very little benefit from incremental backup.  This problem drops off
> pretty fast if you raise the size of the map, but it's pretty critical
> that your map is large enough for the database you've got, or you may
> as well not bother.
This is why the ptrack block size should be larger than the page size.
Assume that it is 1MB. 1MB is considered an optimal unit of disk IO, at
which frequent seeks do not yet degrade read speed (this is most
critical for HDDs). In other words, reading 20% of the pages of this
1MB block at random (about 26 of its 128 8KB pages) takes almost the
same amount of time as, or even longer than, reading the whole 1MB in
one operation, as the arithmetic below illustrates.

There will be just 100000 used entries in the ptrack map, with a very
small probability of collision.
Actually, I have chosen this size (1000003) for the ptrack map because,
with a 1MB block size, it allows mapping a 1TB database without a
noticeable number of collisions, which seems to be enough for most
Postgres installations. But increasing the ptrack map size 10 or even
100 times should not cause problems either, given modern RAM sizes.
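For reference, the expected fraction of slots touched after $k$ random
marks into a map of $N$ slots (the same occupancy formula behind
Robert's ~92% figure) is

$1 - \left(1 - \frac{1}{N}\right)^{k} \approx 1 - e^{-k/N}.$

With $N = 1000003$: $k \approx 2.6$ million 8KB blocks gives
$1 - e^{-2.6} \approx 92\%$, while $k = 100000$ 1MB blocks gives
$1 - e^{-0.1} \approx 9.5\%$ of slots touched, i.e. fewer than 5% of
the 100000 marks land on an already-used slot.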

>
> It also appears that your system can't really handle resizing of the
> map in any friendly way.  So if your data size grows, you may be faced
> with either letting the map become progressively less effective, or
> throwing it out and losing all the data you have.
>
> None of that is to say that what you're presenting here has no value,
> but I think it's possible to do better (and I think we should try).
>
I certainly did not consider the proposed patch a perfect solution, and
it requires improvements (and maybe a complete redesign).
I just wanted to present this approach: maintaining a hash of block LSNs
in mapped memory, and keeping track of modified blocks at the file level
(unlike the current ptrack implementation, which logs changes in all the
places in the Postgres code where data is updated).

Also, despite the fact that this patch may be considered a raw
prototype, I have spent some time thinking about all aspects of this
approach, including fault tolerance and false positives.



