Re: [HACKERS] Unlogged tables cleanup

From	Andres Freund
Subject	Re: [HACKERS] Unlogged tables cleanup
Date	14 May 2019 04:33:52
Msg-id	20190514043352.jtbki3f4ifegk6g3@alap3.anarazel.de
In response to	Re: [HACKERS] Unlogged tables cleanup (Michael Paquier <michael@paquier.xyz>)
Responses	Re: [HACKERS] Unlogged tables cleanup
List	pgsql-hackers

Hi,

On 2019-05-14 13:23:28 +0900, Michael Paquier wrote:
> On Mon, May 13, 2019 at 10:37:35AM -0700, Andres Freund wrote:
> > Ugh, this is all such a mess. But, isn't this broken independently of
> > the smgrimmedsync() issue? In a basebackup case, the basebackup could
> > have included the main fork, but not the init fork, and the reverse. WAL
> > replay *solely* needs to be able to recover from that.  At the very
> > least we'd have to do the cleanup step after becoming consistent, not
> > just before recovery even started.
> 
> Yes, the logic using smgrimmedsync() is race-prone and weaker than the
> index AMs in my opinion, even if the failure window is limited (I
> think that this is mentioned upthread a bit).

How's it limited? On a large database a base backup can easily take
*days*. And e.g. the VM and FSM forks can easily have inodes that are
much newer than the main/init forks, so typical base backups (via
OS/glibc readdir) will sort them at a later point (or the directory
will be hashed, in which case the order is entirely random), so the
window between when the different forks are copied is large.
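
To make the ordering point concrete, here is a minimal sketch (not
anything from the tree): given file names in the order a directory scan
happened to return them, it reports how many entries get copied between
two forks of the same relation. The helper names and the sample scan
order are made up for illustration; only the "<relfilenode>_<fork>"
naming of the fsm/vm/init forks is taken from the on-disk layout.

```python
import re
from collections import defaultdict

# Relation file names look like "<relfilenode>[_<fork>][.<segment>]",
# e.g. "16384", "16384_init", "16384_vm", "16384.1".
_RELFILE = re.compile(r"^(\d+)(?:_(fsm|vm|init))?(?:\.(\d+))?$")


def fork_positions(entries):
    """Map relfilenode -> {fork: index of that fork's first file in scan order}."""
    positions = defaultdict(dict)
    for index, name in enumerate(entries):
        match = _RELFILE.match(name)
        if match is None:
            continue  # not a relation file (e.g. PG_VERSION)
        relfilenode, fork, _segment = match.groups()
        positions[relfilenode].setdefault(fork or "main", index)
    return dict(positions)


def copy_gap(entries, relfilenode, fork_a="main", fork_b="init"):
    """Number of directory positions separating two forks of one relation."""
    forks = fork_positions(entries)[relfilenode]
    return abs(forks[fork_a] - forks[fork_b])


# Hypothetical scan order, as an unordered readdir() might return it.
scan_order = ["16384_init", "PG_VERSION", "24576", "24576_fsm",
              "16390", "16384", "16384_vm"]
print(copy_gap(scan_order, "16384"))
```

In a real data directory the entries in between can be many gigabytes
of other relations, which is where the long window comes from.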

> What's actually the reason preventing us from delaying the
> checkpointer like the index AMs for the logging of heap init fork?

I'm not following. What do you mean by "delaying the checkpointer"?

Greetings,

Andres Freund
