Re: [HACKERS] Hooks to track changed pages for backup purposes

From Andrey Borodin
Subject Re: [HACKERS] Hooks to track changed pages for backup purposes
Date
Msg-id DD60016B-D2AA-4ACB-8A0B-7AFDBF7C2F69@yandex-team.ru
In reply to Re: [HACKERS] Hooks to track changed pages for backup purposes  (Michael Paquier <michael.paquier@gmail.com>)
Responses Re: [HACKERS] Hooks to track changed pages for backup purposes  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-hackers
Thank you for your reply, Michael! Your comments are valuable, especially in the world of backups.

> On 31 Aug 2017, at 19:44, Michael Paquier <michael.paquier@gmail.com> wrote:
> Such things are not Postgres-C like.
Will be fixed.

> I don't understand what xlog_begin_insert_hook() is good for.
It is there to memset the control structures and the array of blocknos and relfilenodes of size XLR_MAX_BLOCK_ID.
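
To make the reply above concrete, here is a minimal sketch of what such a reset hook might look like in an extension. It is not code from the patch: `TRACK_MAX_BLOCK_ID` is only a local stand-in for `XLR_MAX_BLOCK_ID`, and `TrackedBlock`, `tracker_begin_insert` and `tracker_register_block` are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Local stand-in for XLR_MAX_BLOCK_ID; the value here is illustrative. */
#define TRACK_MAX_BLOCK_ID 32

/* Hypothetical tracking slot: which block of which relation file. */
typedef struct TrackedBlock
{
    bool     in_use;
    uint32_t relfilenode;
    uint32_t blockno;
} TrackedBlock;

/* One slot per possible block id, 0..TRACK_MAX_BLOCK_ID inclusive. */
static TrackedBlock tracked_blocks[TRACK_MAX_BLOCK_ID + 1];
static int          tracked_count = 0;

/* Hypothetical begin-insert hook: takes no arguments, only resets state. */
void
tracker_begin_insert(void)
{
    memset(tracked_blocks, 0, sizeof(tracked_blocks));
    tracked_count = 0;
}

/* Hypothetical companion, called as blocks are attached to the record. */
void
tracker_register_block(int block_id, uint32_t relfilenode, uint32_t blockno)
{
    assert(block_id >= 0 && block_id <= TRACK_MAX_BLOCK_ID);

    /* Count each slot only once, even if it is registered again. */
    if (!tracked_blocks[block_id].in_use)
        tracked_count++;

    tracked_blocks[block_id].in_use = true;
    tracked_blocks[block_id].relfilenode = relfilenode;
    tracked_blocks[block_id].blockno = blockno;
}
```

The hook needs no arguments in this sketch because its only job is to clear process-local state before the next record is assembled.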

> There
> are no arguments fed to this hook, so modules would not be able to
> analyze things in this context, except shared memory and process
> state?

>
> Those hooks are put in hot code paths, and could impact performance of
> WAL insertion itself.
I do not think writing a few bytes to a cached array is comparable to the disk write of an XLog record. Checking the
function pointer is even cheaper with correct branch prediction.
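
The function-pointer check mentioned above can be sketched as follows. All names (`demo_begin_insert_hook`, `demo_install_hook`, `demo_begin_insert`) are hypothetical and only illustrate the usual pattern of a hook variable that is NULL unless a module sets it; this is not the code from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hook type: no arguments, no return value. */
typedef void (*demo_insert_hook_type) (void);

/* NULL means that no module has installed a hook. */
demo_insert_hook_type demo_begin_insert_hook = NULL;

/* Counts hook invocations so the effect is observable in this sketch. */
int demo_hook_calls = 0;

void
demo_counting_hook(void)
{
    demo_hook_calls++;
}

/* Install a hook; a real module would usually also save the previous value. */
void
demo_install_hook(demo_insert_hook_type hook)
{
    assert(hook != NULL);
    demo_begin_insert_hook = hook;
}

/*
 * Stand-in for the hot path. When no hook is installed, the only added
 * work is a single pointer comparison.
 */
void
demo_begin_insert(void)
{
    if (demo_begin_insert_hook != NULL)
        demo_begin_insert_hook();

    /* ... the actual record construction would continue here ... */
}
```

With no module loaded the branch is never taken, which is the case where branch prediction makes the check nearly free.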

> So you basically move the cost of scanning WAL
> segments for those blocks from any backup solution to the WAL
> insertion itself. Really, wouldn't it be more simple to let for
> example the archiver process to create this meta-data if you just want
> to take faster backups with a set of segments? Even better, you could
> do a scan after archiving N segments, and then use M jobs to do this
> work more quickly. (A set of background workers could do this job as
> well).
I like the idea of doing this during archiving. It is a different trade-off between OLTP performance and backup
performance. Essentially, it means parsing WAL some time before taking the backup. The best thing about it is that it
uses CPUs that are usually spinning in an idle loop on backup machines.
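
A rough sketch of the archive-time variant: the changed-page map is built outside the backend, from block references that have already been extracted from an archived segment. The sketch does not parse real WAL; `BlockRef`, `ChangedMap` and the `changed_map_*` functions are hypothetical, and `MAP_BLOCKS` is an arbitrary illustrative limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical: a block reference already decoded from a WAL record. */
typedef struct BlockRef
{
    uint32_t relfilenode;
    uint32_t blockno;
} BlockRef;

/* Illustrative limit on how many blocks one map covers. */
#define MAP_BLOCKS 1024

/* Hypothetical changed-page bitmap for a single relation file. */
typedef struct ChangedMap
{
    uint32_t relfilenode;
    uint8_t  bits[MAP_BLOCKS / 8];
} ChangedMap;

void
changed_map_init(ChangedMap *map, uint32_t relfilenode)
{
    assert(map != NULL);
    memset(map->bits, 0, sizeof(map->bits));
    map->relfilenode = relfilenode;
}

/* Fold the block references found in one archived segment into the map. */
void
changed_map_add_segment(ChangedMap *map, const BlockRef *refs, size_t nrefs)
{
    assert(map != NULL);

    for (size_t i = 0; i < nrefs; i++)
    {
        uint32_t b = refs[i].blockno;

        /* Skip other relation files and blocks outside the map. */
        if (refs[i].relfilenode != map->relfilenode || b >= MAP_BLOCKS)
            continue;

        map->bits[b / 8] |= (uint8_t) (1u << (b % 8));
    }
}

bool
changed_map_test(const ChangedMap *map, uint32_t blockno)
{
    assert(map != NULL);

    if (blockno >= MAP_BLOCKS)
        return false;
    return (map->bits[blockno / 8] & (1u << (blockno % 8))) != 0;
}
```

Because each segment is folded in independently, this work could be spread over several jobs after N segments have been archived, as suggested in the quoted text.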

> In the backup/restore world, backups can be allowed to be taken at a
> slow pace, what matters is to be able to restore them quickly.
Backups are taken much more often than restored.

> In short, anything moving performance from an external backup code path
> to a critical backend code path looks like a bad design to begin with.
> So I am dubious that what you are proposing here is a good idea.
I will think about it more. This proposal costs a vanishingly small part of backend performance, but, indeed, a
nonzero one.

Again, thank you for your time and comments.

Best regards, Andrey Borodin.

