Re: [HACKERS] Hooks to track changed pages for backup purposes

From: Tomas Vondra
Subject: Re: [HACKERS] Hooks to track changed pages for backup purposes
Msg-id: bb489d5f-4f4e-c161-26c2-18685d720c56@2ndquadrant.com
In reply to: Re: [HACKERS] Hooks to track changed pages for backup purposes  (Andrey Borodin <x4mmm@yandex-team.ru>)
Responses: Re: [HACKERS] Hooks to track changed pages for backup purposes  (Daniel Gustafsson <daniel@yesql.se>)
List: pgsql-hackers

On 09/13/2017 07:53 AM, Andrey Borodin wrote:
>> * I see there are conditions like this:
>>
>>    if(xlogreader->blocks[nblock].forknum == MAIN_FORKNUM)
>>
>> Why is it enough to restrict the block-tracking code to main fork?
>> Aren't we interested in all relation forks?
> fsm, vm and other forks are small enough to just take them whole.
> 

That seems like an optimization specific to your backup solution, not
necessarily valid for other solutions or other possible use cases.

>> I guess you'll have to explain
>> what the implementation of the hooks is supposed to do, and why these
>> locations for hook calls are the right ones. It's damn impossible to
>> validate the patch without that information.
>>
>> Assuming you still plan to use the hook approach ...
> Yes, I still think hooking is good idea, but you are right - I need
> prototype first. I'll mark patch as Returned with feedback before
> prototype implementation.
> 

OK

>>
>>>> There
>>>> are no arguments fed to this hook, so modules would not be able to
>>>> analyze things in this context, except shared memory and process
>>>> state?
>>>
>>>>
>>>> Those hooks are put in hot code paths, and could impact performance of
>>>> WAL insertion itself.
>>> I do not think sending a few bytes to a cached array is comparable
>>> to a disk write of an XLog record. Checking the func ptr is even
>>> cheaper with correct branch prediction.
>>>
>>
>> That seems somewhat suspicious, for two reasons. Firstly, I believe we
>> only insert the XLOG records into WAL buffer here, so why should there
>> be any disk write related? Or do you mean the final commit?
> Yes, I mean that finally we will be waiting for disk. A hundred empty
> ptr checks are negligible in comparison with a disk write.

Aren't we doing these calls while holding XLog locks? IIRC there was
quite a significant performance improvement after Heikki reduced the
amount of code executed while holding the locks.

>>
>> But more importantly, doesn't this kind of information require some
>> durability guarantees? I mean, if it gets lost during server crashes or
>> restarts, doesn't that mean the incremental backups might miss some
>> buffers? I'd guess the hooks will have to do some sort of I/O, to
>> achieve that, no?
> We need durability only on the level of one segment. If we do not have
> info from a segment, we can just rescan it.
> If we send a segment to S3 as one file, we are sure of its integrity. But
> this IO can be async.
> 
> PTRACK, in its turn, switches bits in the fork's buffers, which are
> written by the checkpointer and... well... recovered during recovery. By
> the usual WAL replay of recovery.
> 

But how do you do that from the hooks, if they only store the data into
a buffer in memory? Let's say you insert ~8MB of WAL into a segment, and
then the system crashes and reboots. How do you know you have incomplete
information from the WAL segment?

Although, that's probably what wal_switch_hook() might do - sync the
data whenever the WAL segment is switched. Right?

> 
>> From this POV, the idea to collect this information on the backup system
>> (WAL archive) by pre-processing the arriving WAL segments seems like the
>> most promising. It moves the work to another system, the backup system
>> can make it as durable as the WAL segments, etc.
> 
> Well, in some not so rare cases users encrypt backups and send them to
> S3. And there is no system with CPUs that can handle that WAL parsing.
> Currently, I'm considering mocking up a prototype for wal-g, which works
> exactly this way.
> 

Why couldn't there be a system with enough CPU power? Sure, if you want
to do this, you'll need a more powerful system, but regular CPUs can do
more than 1 GB/s of AES-256-GCM thanks to AES-NI. Or you could do it on
the database host as part of archive_command, before the encryption, of
course.
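The archive_command variant might look something like this (a hypothetical
postgresql.conf fragment; `wal_block_scan` is a made-up stand-in for a WAL
parsing tool, and the gpg step is just one example of the encryption):

```
# postgresql.conf (hypothetical): parse the segment for changed blocks
# on the database host, before it is encrypted and shipped.
archive_mode = on
archive_command = 'wal_block_scan %p >> /backup/changed_blocks.map && gpg --encrypt -r backup@example.com -o /backup/wal/%f.gpg %p'
```

The point being that the parsing happens where the plaintext WAL is
available anyway, so the backup store never needs to decrypt anything.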

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
