WIP/PoC for parallel backup

From: Asif Rehman
Subject: WIP/PoC for parallel backup
Date:
Msg-id: CADM=JehKgobEknb+_nab9179HzGj=9EiTzWMOd2mpqr_rifm0Q@mail.gmail.com
Replies: Re: WIP/PoC for parallel backup  (Asim R P <apraveen@pivotal.io>)
         Re: WIP/PoC for parallel backup  (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers
Hi Hackers,

I have been looking into adding a parallel backup feature to pg_basebackup. Currently, pg_basebackup sends the BASE_BACKUP command to take a full backup; the server scans PGDATA and sends the files to pg_basebackup. In general, the server takes the following steps on the BASE_BACKUP command:

- does pg_start_backup
- scans PGDATA, creates and sends a header containing information about the tablespaces
- sends each tablespace to pg_basebackup
- and then does pg_stop_backup

All these steps are executed sequentially by a single process. The idea I am working on is to split these steps into multiple commands in the replication grammar, and to add worker processes to pg_basebackup so that they can copy the contents of PGDATA in parallel.

The command line interface syntax would be like:
pg_basebackup --jobs=WORKERS


Replication commands:

- BASE_BACKUP [PARALLEL] - returns a list of files in PGDATA
If the PARALLEL option is given, the server only does pg_start_backup, scans PGDATA, and sends back a list of file names.

- SEND_FILES_CONTENTS (file1, file2, ...) - returns the contents of the files in the given list.
pg_basebackup sends a list of filenames with this command. Each worker issues this command, and that worker receives the requested files.

- STOP_BACKUP
When all workers have finished, pg_basebackup sends the STOP_BACKUP command.
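
Put together, the exchange between pg_basebackup and the server would look roughly like this (a sketch of the proposed flow, not final syntax):

```
leader   -> BASE_BACKUP PARALLEL
server   <- tablespace header + list of file names   (pg_start_backup done)

worker 1 -> SEND_FILES_CONTENTS (file1, file2, ...)
server   <- contents of file1, file2, ...
   ...                                               (workers 2..N likewise)

leader   -> STOP_BACKUP                              (after all workers finish;
server   <- backup end position                       pg_stop_backup done)
```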

pg_basebackup can start by sending the "BASE_BACKUP PARALLEL" command and getting a list of filenames from the server in response. It should then divide this list according to the --jobs parameter (this division can be based on file sizes). Each worker process will issue a SEND_FILES_CONTENTS (file1, file2, ...) command, and in response the server will send the requested files back to that worker.
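
The PoC currently divides the list into equal parts. One way to take file sizes into account, as suggested above, is a greedy longest-file-first assignment. A Python sketch, just to illustrate the idea (the function name and data shapes are made up for the example):

```python
# Sketch: distribute (name, size) pairs across N workers so that the
# total bytes per worker stay roughly balanced. Greedy heuristic:
# always hand the next-largest file to the currently least-loaded worker.

def divide_files(files, jobs):
    """files: list of (name, size); returns one filename list per worker."""
    buckets = [[] for _ in range(jobs)]
    totals = [0] * jobs
    for name, size in sorted(files, key=lambda f: f[1], reverse=True):
        i = totals.index(min(totals))   # least-loaded worker so far
        buckets[i].append(name)
        totals[i] += size
    return buckets

# Each worker would then issue SEND_FILES_CONTENTS with its own bucket.
```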

Once all the files are copied, pg_basebackup will send the STOP_BACKUP command. A similar idea has been discussed by Robert on the incremental backup thread a while ago. This proposal is similar, but instead of separate START_BACKUP and SEND_FILE_LIST commands, I have combined them into BASE_BACKUP PARALLEL.

I have done a basic proof of concept (PoC), which is attached. I would appreciate some input on this. So far, I am simply dividing the list equally and assigning the parts to the worker processes; I intend to fine-tune this by taking file sizes into consideration. To add tar format support, I am considering having each worker process handle all files belonging to one tablespace in its list (i.e. create and fill a tar file) before it processes the next tablespace. As a result, this will create tar files that are disjoint with respect to tablespace data. For example:
Say tablespace t1 has 20 files, tablespace t2 has 10, and we have 5 worker processes. Ignoring all other factors for the sake of this example, each worker process will get a group of 4 files from t1 and 2 files from t2. Each process will then create 2 tar files: one for t1 containing 4 files and another for t2 containing 2 files.
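
The per-worker grouping described above could look like this (a Python sketch under the simplifying assumption that each file's tablespace is known, e.g. from its path; the actual tar creation is elided):

```python
from collections import defaultdict

def group_by_tablespace(worker_files):
    """worker_files: list of (tablespace, filename) assigned to one worker.
    Returns a dict mapping tablespace -> its files, i.e. the contents of
    one tar archive per tablespace for this worker."""
    groups = defaultdict(list)
    for tblspc, name in worker_files:
        groups[tblspc].append(name)
    return dict(groups)

# With the example above, one worker's share is 4 files of t1 and
# 2 files of t2, yielding two tar archives for that worker:
share = [("t1", "t1_file%d" % i) for i in range(4)] + \
        [("t2", "t2_file%d" % i) for i in range(2)]
archives = group_by_tablespace(share)
```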

Regards,
Asif
Attachments
