Re: [HACKERS] Hooks to track changed pages for backup purposes

From: Andrey Borodin
Subject: Re: [HACKERS] Hooks to track changed pages for backup purposes
Date:
Msg-id: 153BCA6C-2B9B-44EA-9C63-73DD7DBD77B7@yandex-team.ru
In reply to: Re: [HACKERS] Hooks to track changed pages for backup purposes  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses: Re: [HACKERS] Hooks to track changed pages for backup purposes  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List: pgsql-hackers
Hi Tomas! Thank you for looking into that patch.

On 8 Sep 2017, at 1:53, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:

A few more comments:

* The patch defines wal_switch_hook, but it's never called.
That call was missing; that's a bug, thanks for pointing it out.

* I see there are conditions like this:

   if(xlogreader->blocks[nblock].forknum == MAIN_FORKNUM)

Why is it enough to restrict the block-tracking code to main fork?
Aren't we interested in all relation forks?
The fsm, vm and other forks are small enough to just take them whole.
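
To make this concrete, here is a minimal sketch of what the block-collecting hook body could look like on the extension side, assuming the hook is handed the XLogReaderState; the names collect_changed_blocks and changed_block_log_add are illustrative only, not part of the posted patch:

    /*
     * Hypothetical hook body: record which main-fork blocks the decoded
     * WAL record touches. changed_block_log_add() is an imaginary helper
     * that appends to an in-memory, per-segment list.
     */
    #include "postgres.h"
    #include "access/xlogreader.h"

    static void
    collect_changed_blocks(XLogReaderState *xlogreader)
    {
        int     nblock;

        for (nblock = 0; nblock <= xlogreader->max_block_id; nblock++)
        {
            DecodedBkpBlock *blk = &xlogreader->blocks[nblock];

            if (!blk->in_use)
                continue;

            /* main fork only; fsm/vm are copied whole by the backup */
            if (blk->forknum != MAIN_FORKNUM)
                continue;

            changed_block_log_add(blk->rnode, blk->blkno);
        }
    }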

I guess you'll have to explain
what the implementation of the hooks is supposed to do, and why these
locations for hook calls are the right ones. It's damn impossible to
validate the patch without that information.

Assuming you still plan to use the hook approach ...
Yes, I still think hooking is a good idea, but you are right - I need a prototype first. I'll mark the patch as Returned with feedback until the prototype is implemented.


There
are no arguments fed to this hook, so modules would not be able to
analyze things in this context, except shared memory and process
state?


Those hooks are put in hot code paths, and could impact performance of
WAL insertion itself.
I do not think sending a few bytes to a cached array is comparable to the disk
write of an XLog record. Checking the function pointer is even cheaper, given
correct branch prediction.
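
For reference, the cost being discussed is the usual PostgreSQL hook pattern, sketched below; the (void) signature matches the "no arguments" description quoted above, but the exact declaration in the patch may differ:

    /* hook variable, NULL unless an extension installs a callback */
    typedef void (*wal_switch_hook_type) (void);
    wal_switch_hook_type wal_switch_hook = NULL;

    /* at the call site in the WAL code: one well-predicted NULL check */
    if (wal_switch_hook)
        wal_switch_hook();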


That seems somewhat suspicious, for two reasons. Firstly, I believe we
only insert the XLOG records into WAL buffer here, so why should there
be any disk write related? Or do you mean the final commit?
Yes, I mean that eventually we will be waiting for the disk. A hundred empty pointer checks are negligible in comparison with a disk write.

But more importantly, doesn't this kind of information require some
durability guarantees? I mean, if it gets lost during server crashes or
restarts, doesn't that mean the incremental backups might miss some
buffers? I'd guess the hooks will have to do some sort of I/O, to
achieve that, no?
We need durability only at the level of one segment. If we do not have the info for a segment, we can just rescan it.
If we send a segment to S3 as one file, we are sure of its integrity. But this I/O can be async.
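
A sketch of that per-segment fallback on the backup side; BlockList and both helpers are hypothetical backup-tool pieces, not part of the patch:

    #include <stddef.h>

    typedef struct BlockList BlockList;     /* hypothetical changed-block list */

    /* hypothetical helpers provided by the backup tool */
    extern BlockList *load_block_list_for_segment(const char *segpath);
    extern BlockList *scan_segment_for_changed_blocks(const char *segpath);

    BlockList *
    changed_blocks_for_segment(const char *segpath)
    {
        /* first, try the block list the hook produced for this WAL segment */
        BlockList  *blocks = load_block_list_for_segment(segpath);

        if (blocks == NULL)
        {
            /*
             * The hook output was lost (crash, restart, async write not yet
             * durable): fall back to rescanning the segment itself.
             */
            blocks = scan_segment_for_changed_blocks(segpath);
        }

        return blocks;
    }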

PTRACK, in its turn, flips bits in its fork's buffers, which are written out by the checkpointer and... well... restored during recovery, by the usual WAL replay.


From this POV, the idea to collect this information on the backup system
(WAL archive) by pre-processing the arriving WAL segments seems like the
most promising. It moves the work to another system, the backup system
can make it as durable as the WAL segments, etc.

Well, in some not-so-rare cases users encrypt backups and send them to S3, and there is no system with spare CPUs that could handle that WAL parsing. Currently, I'm considering mocking up a prototype for wal-g, which works exactly this way.

Your comments were very valuable, thank you for looking into the patch and joining the discussion.

Best regards, Andrey Borodin.

