Re: [HACKERS] Unlogged tables cleanup

From: Andres Freund
Subject: Re: [HACKERS] Unlogged tables cleanup
Msg-id: 20190514043352.jtbki3f4ifegk6g3@alap3.anarazel.de
In response to: Re: [HACKERS] Unlogged tables cleanup  (Michael Paquier <michael@paquier.xyz>)
Responses: Re: [HACKERS] Unlogged tables cleanup  (Michael Paquier <michael@paquier.xyz>)
List: pgsql-hackers

Hi,

On 2019-05-14 13:23:28 +0900, Michael Paquier wrote:
> On Mon, May 13, 2019 at 10:37:35AM -0700, Andres Freund wrote:
> > Ugh, this is all such a mess. But, isn't this broken independently of
> > the smgrimmedsync() issue? In a basebackup case, the basebackup could
> > have included the main fork, but not the init fork, and the reverse. WAL
> > replay *solely* needs to be able to recover from that.  At the very
> > least we'd have to do the cleanup step after becoming consistent, not
> > just before recovery even started.
> 
> Yes, the logic using smgrimmedsync() is race-prone and weaker than the
> index AMs in my opinion, even if the failure window is limited (I
> think that this is mentioned upthread a bit).

How's it limited? On a large database a base backup can easily take
*days*. And e.g. the VM and FSM can easily have inodes that are much newer
than those of the main/init forks, so typical base backups (via OS/glibc
readdir) will sort them at a later point (or the order will be hashed, in
which case it's entirely random), so the window between when the different
forks are copied is large.
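
To make the ordering point concrete, here is a minimal sketch (not PostgreSQL
code) that lists the fork files of one relation in the order the OS returns
directory entries. The relfilenode 16384, the filler file names and the helper
names are hypothetical, chosen only for illustration; the resulting order
depends on the filesystem, so no particular output is implied.

```python
import os
import tempfile


def forks_in_scan_order(directory, relfilenode):
    """Return (position, name) for each fork file of relfilenode, in the
    order os.scandir() yields entries (i.e. the underlying readdir order).
    Segment files such as "16384.1" are deliberately ignored here."""
    prefix = str(relfilenode)
    result = []
    for position, entry in enumerate(os.scandir(directory)):
        name = entry.name
        if name == prefix or name.startswith(prefix + "_"):
            result.append((position, name))
    return result


def demo():
    """Build a throwaway directory that mimics the scenario: main and init
    forks created early, FSM and VM forks created much later."""
    with tempfile.TemporaryDirectory() as d:
        # Main and init forks exist from the start.
        for name in ("16384", "16384_init"):
            open(os.path.join(d, name), "w").close()
        # Many unrelated relation files are created in between.
        for i in range(20000, 20200):
            open(os.path.join(d, str(i)), "w").close()
        # The FSM and VM forks only show up afterwards.
        for name in ("16384_fsm", "16384_vm"):
            open(os.path.join(d, name), "w").close()
        return forks_in_scan_order(d, 16384)


if __name__ == "__main__":
    for position, name in demo():
        print(position, name)
```

The gap between the printed positions is a rough proxy for how far apart the
forks of a single relation can be visited by a copy that simply walks the
directory in readdir order.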


> What's actually the reason preventing us from delaying the
> checkpointer like the index AMs for the logging of heap init fork?

I'm not following. What do you mean by "delaying the checkpointer"?

Greetings,

Andres Freund


