Re: block-level incremental backup

From Konstantin Knizhnik
Subject Re: block-level incremental backup
Msg-id 1148d018-ff98-3857-20b8-45179c0742a3@postgrespro.ru
In response to block-level incremental backup  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: block-level incremental backup  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers

On 09.04.2019 18:48, Robert Haas wrote:
> 1. There should be a way to tell pg_basebackup to request from the
> server only those blocks where LSN >= threshold_value.

Some time ago I implemented an alternative version of the ptrack utility 
(not the one used in pg_probackup)
which detects updated blocks at the file level. It is very simple, and 
maybe it can someday be integrated into master.
I have attached a patch against vanilla to this mail.
Right now it contains just two GUCs:

ptrack_map_size: Size of the ptrack map (number of elements) used for 
incremental backup; 0 disables it.
ptrack_block_log: Base-2 logarithm of the ptrack block size (in pages)

and one function:

pg_ptrack_get_changeset(startlsn pg_lsn) returns 
{relid,relfilenode,reltablespace,forknum,blocknum,segsize,updlsn,path}

The idea is very simple: it creates a hash map of fixed size 
(ptrack_map_size) and stores the LSNs of written pages in this map.
Since the default Postgres page size seems to be too small for a ptrack 
block (it would require too large a hash map or increase the number of 
conflicts, as well as increase the number of random reads), it is 
possible to configure a ptrack block to consist of multiple pages (a 
power of 2).
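
To make the idea concrete, here is a minimal sketch in C of such a 
fixed-size map. It is not the code from the attached patch: the names 
PtrackMap, ptrack_mark_written and ptrack_changed_since, the Lsn typedef 
and the mixing function are all assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a WAL position (an LSN as a 64-bit integer). */
typedef uint64_t Lsn;

typedef struct PtrackMap
{
    size_t      size;       /* number of slots, cf. ptrack_map_size */
    unsigned    block_log;  /* log2(pages per ptrack block), cf. ptrack_block_log */
    Lsn        *slots;      /* newest LSN recorded per slot; 0 = nothing recorded */
} PtrackMap;

PtrackMap *
ptrack_map_create(size_t size, unsigned block_log)
{
    PtrackMap  *map;

    assert(size > 0 && block_log < 32);
    map = malloc(sizeof(PtrackMap));
    if (map == NULL)
        return NULL;
    map->slots = calloc(size, sizeof(Lsn));   /* all slots start at LSN 0 */
    if (map->slots == NULL)
    {
        free(map);
        return NULL;
    }
    map->size = size;
    map->block_log = block_log;
    return map;
}

void
ptrack_map_free(PtrackMap *map)
{
    if (map != NULL)
    {
        free(map->slots);
        free(map);
    }
}

/* Map (file, fork, ptrack block) to a slot. The mixing function is arbitrary. */
size_t
ptrack_slot(const PtrackMap *map, uint32_t relfilenode,
            uint32_t forknum, uint32_t blocknum)
{
    uint32_t    key[3];
    uint64_t    h = UINT64_C(0xcbf29ce484222325);
    int         i;

    key[0] = relfilenode;
    key[1] = forknum;
    key[2] = blocknum >> map->block_log;    /* page number -> ptrack block number */
    for (i = 0; i < 3; i++)
        h = (h ^ key[i]) * UINT64_C(0x100000001b3);
    return (size_t) (h % map->size);
}

/* Record that a page was written at the given LSN; keep the newest LSN per slot. */
void
ptrack_mark_written(PtrackMap *map, uint32_t relfilenode,
                    uint32_t forknum, uint32_t blocknum, Lsn lsn)
{
    size_t      slot = ptrack_slot(map, relfilenode, forknum, blocknum);

    if (map->slots[slot] < lsn)
        map->slots[slot] = lsn;
}

/* True if the ptrack block containing this page may have changed at or after start_lsn. */
bool
ptrack_changed_since(const PtrackMap *map, uint32_t relfilenode,
                     uint32_t forknum, uint32_t blocknum, Lsn start_lsn)
{
    return map->slots[ptrack_slot(map, relfilenode, forknum, blocknum)] >= start_lsn;
}
```

In this sketch two different blocks may hash to the same slot. Such a 
conflict can only make an unchanged block look changed, never the 
reverse, so a backup tool would copy some extra blocks but would not 
miss an update.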

This patch uses a memory-mapping mechanism. Unfortunately there is no 
portable wrapper for it in Postgres, so I had to provide my own 
implementations for Unix/Windows. Certainly this is not good and should 
be rewritten.
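
For illustration, a portable wrapper of this kind might look roughly 
like the sketch below. It is not taken from the patch: the function 
names ptrack_map_file and ptrack_unmap_file are hypothetical, and the 
sketch only assumes the standard POSIX calls (open, ftruncate, mmap, 
munmap) and Win32 calls (CreateFileA, CreateFileMappingA, MapViewOfFile, 
UnmapViewOfFile).

```c
#ifndef _WIN32
#define _POSIX_C_SOURCE 200809L     /* declares ftruncate() under strict -std modes */
#endif

#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#ifdef _WIN32
#include <windows.h>
#else
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#endif

/*
 * Map a file of the given size read/write into memory, creating or
 * extending it as needed. Returns NULL on failure.
 */
void *
ptrack_map_file(const char *path, size_t size)
{
    assert(path != NULL && size > 0);
#ifdef _WIN32
    {
        HANDLE      file;
        HANDLE      mapping;
        void       *addr;
        unsigned long long sz = size;

        file = CreateFileA(path, GENERIC_READ | GENERIC_WRITE,
                           FILE_SHARE_READ | FILE_SHARE_WRITE, NULL,
                           OPEN_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
        if (file == INVALID_HANDLE_VALUE)
            return NULL;
        /* The mapping object extends the file to the requested size. */
        mapping = CreateFileMappingA(file, NULL, PAGE_READWRITE,
                                     (DWORD) (sz >> 32),
                                     (DWORD) (sz & 0xFFFFFFFF), NULL);
        CloseHandle(file);
        if (mapping == NULL)
            return NULL;
        addr = MapViewOfFile(mapping, FILE_MAP_ALL_ACCESS, 0, 0, size);
        CloseHandle(mapping);       /* the view keeps the mapping alive */
        return addr;                /* NULL on failure */
    }
#else
    {
        int         fd;
        void       *addr;

        fd = open(path, O_RDWR | O_CREAT, 0600);
        if (fd < 0)
            return NULL;
        if (ftruncate(fd, (off_t) size) != 0)
        {
            close(fd);
            return NULL;
        }
        addr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        close(fd);                  /* the mapping stays valid after close */
        return addr == MAP_FAILED ? NULL : addr;
    }
#endif
}

/* Unmap a region returned by ptrack_map_file(). Returns 0 on success. */
int
ptrack_unmap_file(void *addr, size_t size)
{
#ifdef _WIN32
    (void) size;                    /* Windows unmaps the whole view */
    return UnmapViewOfFile(addr) ? 0 : -1;
#else
    return munmap(addr, size);
#endif
}
```

Because the mapping is shared and file-backed, values stored through the 
returned pointer are visible again when the same file is mapped later, 
which is what lets a map like this survive a restart.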

How to use?

1. Define ptrack_map_size in postgresql.conf, for example (use a prime 
number for more uniform hashing):

ptrack_map_size = 1000003

2. Remember the current LSN.

psql postgres -c "select pg_current_wal_lsn()"
  pg_current_wal_lsn
--------------------
  0/224A268
(1 row)

3. Do some updates.

$ pgbench -T 10 postgres

4. Select changed blocks.

select * from pg_ptrack_get_changeset('0/224A268');
 relid | relfilenode | reltablespace | forknum | blocknum | segsize |  updlsn   |         path
-------+-------------+---------------+---------+----------+---------+-----------+----------------------
 16390 |       16396 |          1663 |       0 |     1640 |       1 | 0/224FD88 | base/12710/16396
 16390 |       16396 |          1663 |       0 |     1641 |       1 | 0/2258680 | base/12710/16396
 16390 |       16396 |          1663 |       0 |     1642 |       1 | 0/22615A0 | base/12710/16396
...

Certainly ptrack should be used as part of some backup tool (such as 
pg_basebackup or pg_probackup).


-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

