Re: [HACKERS] Adding hook in BufferSync for backup purposes

From Andrey Borodin
Subject Re: [HACKERS] Adding hook in BufferSync for backup purposes
Msg-id A0E68ABA-AA62-4B56-879F-C61EECB34F23@yandex-team.ru
In reply to Re: [HACKERS] Adding hook in BufferSync for backup purposes  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [HACKERS] Adding hook in BufferSync for backup purposes  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Alvaro, Tom, thank you for your valuable comments.


> Alvaro:
> I remember discussing the topic of differential base-backups with
> somebody (probably Marco and Gabriele).  The idea we had was to have a
> new relation fork which stores an LSN for each group of pages,
> indicating the LSN of the newest change to those pages.  The backup tool
> "scans" the whole LSN fork, and grabs images of all pages that have LSNs
> newer than the one used for the previous base backup.
Thanks for the pointer. I've found those discussions and am now working through them.
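
To make the quoted idea concrete, here is a minimal stand-alone sketch of the scan step: walk a per-group LSN map and pick the groups changed since the previous base backup. It is not PostgreSQL code. The `Lsn` typedef, `PAGES_PER_GROUP`, and both function names are assumptions made up for illustration, and the map is a plain array rather than a relation fork.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a WAL position (PostgreSQL uses a 64-bit LSN). */
typedef uint64_t Lsn;

/* Assumed grouping factor: one LSN-map entry covers this many pages. */
#define PAGES_PER_GROUP 32

/*
 * Scan the per-group LSN map and collect the indexes of the groups whose
 * newest change is later than the LSN of the previous base backup.
 * Returns the number of group indexes written to out_groups.
 */
static size_t
collect_changed_groups(const Lsn *lsn_map, size_t ngroups,
                       Lsn prev_backup_lsn,
                       size_t *out_groups, size_t out_cap)
{
    size_t nfound = 0;

    assert(lsn_map != NULL && out_groups != NULL);

    for (size_t g = 0; g < ngroups && nfound < out_cap; g++)
    {
        if (lsn_map[g] > prev_backup_lsn)
            out_groups[nfound++] = g;
    }
    return nfound;
}

/* First page number covered by a given group. */
static size_t
group_first_page(size_t group)
{
    return group * PAGES_PER_GROUP;
}
```

The backup tool would then copy pages `group_first_page(g)` through `group_first_page(g) + PAGES_PER_GROUP - 1` for each selected group. A coarser grouping makes the map smaller but copies more unchanged pages.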

> I think it should be at the point where the buffer is
> modified (i.e. when WAL is written) rather than when it's checkpointed
> out.
WAL is flushed before the actual pages are written to disk (sent to the kernel). I'd like to notify extensions right
after we are sure the pages were flushed.
But you are right, BufferSync is not a good place for this:
1. It lacks LSNs
2. It's not the only place to flush: bgwriter goes through the nearby function FlushBuffer(), and many AMs write
directly to smgr (for example, when a metapage is created)

BufferSync() seemed like such a comfortable and efficient place for flushing info about dirty pages, already sorted and
grouped by tablespace, but it is absolutely incorrect to do it there. I'll look for a better place.

>
> On 7 Aug 2017, at 18:37, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Yeah.  Keep in mind that if the extension does anything at all that could
> possibly throw an error, and if that error condition persists across
> multiple tries, you will have broken the database completely: it will
> be impossible to complete a checkpoint, and your WAL segment pool will
> grow until it exhausts disk.  So the idea of doing something that involves
> unspecified extension behavior, especially possible interaction with
> an external backup agent, right there is pretty terrifying.
I think that an API for extensions should tend to protect the developer from breaking everything, but may allow it with
precautionary warnings in docs and comments. Please let me know if this assumption is incorrect.
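
One way to read "protect the developer from breaking everything" is to contain a hook's failure so that the caller can still finish its work. The stand-alone sketch below imitates that with plain `setjmp`/`longjmp`. It is only an illustration: every name in it is hypothetical, and real backend code would have to go through PostgreSQL's own error machinery (`PG_TRY`/`PG_CATCH`), where safely swallowing an error involves considerably more cleanup than is shown here.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical hook type: called once a page is known to be flushed. */
typedef void (*page_flushed_hook_type)(unsigned block_no);

static page_flushed_hook_type page_flushed_hook = NULL;
static jmp_buf *hook_error_ctx = NULL;  /* target of hook_raise_error() */
static unsigned hook_failures = 0;

/* Stand-in for raising an error: unwind to the nearest guarded call. */
static void
hook_raise_error(void)
{
    assert(hook_error_ctx != NULL);
    longjmp(*hook_error_ctx, 1);
}

/*
 * Call the hook so that its failure cannot abort the caller.
 * Returns true if the hook ran to completion or no hook is installed.
 */
static bool
call_page_flushed_hook_guarded(unsigned block_no)
{
    jmp_buf  ctx;
    jmp_buf *saved = hook_error_ctx;
    bool     ok = true;

    if (page_flushed_hook == NULL)
        return true;

    hook_error_ctx = &ctx;
    if (setjmp(ctx) == 0)
        page_flushed_hook(block_no);
    else
    {
        /* The hook failed: count it and let the caller carry on. */
        ok = false;
        hook_failures++;
    }
    hook_error_ctx = saved;
    return ok;
}

/* Example hooks, for demonstration only. */
static unsigned pages_seen = 0;

static void
counting_hook(unsigned block_no)
{
    (void) block_no;
    pages_seen++;
}

static void
failing_hook(unsigned block_no)
{
    (void) block_no;
    hook_raise_error();
}
```

With `failing_hook` installed, `call_page_flushed_hook_guarded()` returns false and increments `hook_failures` instead of propagating the error. The trade-off is that the extension silently misses a notification, so it would need some way to detect that and resynchronize.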

>
> Other problems with the proposed patch: it misses coverage of
> BgBufferSync, and I don't like exposing an ad-hoc structure like
> CkptTsStatus as part of an extension API.  The algorithm used by
> BufferSync to schedule buffer writes has changed multiple times
> before and doubtless will again; if we're going to have a hook
> here it should depend as little as possible on those details.
OK, now I see that "buf_internals.h" has the word "internals" in its name for a reason. Thanks for pointing that out,
I didn't know about the changes in these algorithms.
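
As a rough illustration of a hook that depends as little as possible on scheduling details, the sketch below passes only facts about the written page itself, including its LSN, through a single entry point that any write path could call. This is a stand-alone mock-up under stated assumptions: `PageWrittenEvent`, `page_written_hook`, `notify_page_written()`, and the field names are all hypothetical and are not part of PostgreSQL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical event passed to extensions. It describes the page only and
 * carries no checkpoint scheduling state.
 */
typedef struct PageWrittenEvent
{
    uint32_t tablespace_oid;
    uint32_t relation_number;   /* hypothetical relation file identifier */
    int      fork_number;
    uint32_t block_number;
    uint64_t page_lsn;          /* LSN of the newest change on the page */
} PageWrittenEvent;

typedef void (*page_written_hook_type)(const PageWrittenEvent *event);

static page_written_hook_type page_written_hook = NULL;

/* Single entry point that every write path would call after a write. */
static void
notify_page_written(uint32_t tablespace_oid, uint32_t relation_number,
                    int fork_number, uint32_t block_number,
                    uint64_t page_lsn)
{
    PageWrittenEvent event;

    if (page_written_hook == NULL)
        return;                 /* no extension is listening */

    event.tablespace_oid = tablespace_oid;
    event.relation_number = relation_number;
    event.fork_number = fork_number;
    event.block_number = block_number;
    event.page_lsn = page_lsn;
    page_written_hook(&event);
}

/* Example extension: remember the highest page LSN reported so far. */
static uint64_t max_lsn_seen = 0;
static unsigned events_seen = 0;

static void
track_max_lsn(const PageWrittenEvent *event)
{
    events_seen++;
    if (event->page_lsn > max_lsn_seen)
        max_lsn_seen = event->page_lsn;
}
```

Because the event is built by the caller from values it already has, the checkpointer, the background writer, and direct storage-manager writes could all report through the same function without exposing their internal bookkeeping.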

Best regards, Andrey Borodin, Yandex.



