Re: [BUG] non archived WAL removed during production crash recovery
От | Jehan-Guillaume de Rorthais |
---|---|
Тема | Re: [BUG] non archived WAL removed during production crash recovery |
Дата | |
Msg-id | 20200422000022.7aa6a62b@firost обсуждение исходный текст |
Ответ на | Re: [BUG] non archived WAL removed during production crash recovery (Kyotaro Horiguchi <horikyota.ntt@gmail.com>) |
Ответы |
Re: [BUG] non archived WAL removed during production crash recovery
|
Список | pgsql-bugs |
On Tue, 21 Apr 2020 15:08:17 +0900 (JST) Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > At Tue, 21 Apr 2020 13:57:39 +0900, Michael Paquier <michael@paquier.xyz> > wrote in > > On Tue, Apr 21, 2020 at 12:09:25PM +0900, Kyotaro Horiguchi wrote: > > > + if (!XLogArchivingAlways() && > > > + GetRecoveryState() == RECOVERY_STATE_ARCHIVE) > > > > > > Is rewritten as > > > > > > + if (!XLogArchivingAlways() && > > > + GetDBState() > DB_IN_CRASH_RECOVERY) > > > > > > FWIW, what annoyed me is there are three variables that are quite > > > similar but has different domains, ControlFile->state, > > > XLogCtl->SharedRecoveryState, and LocalRecoveryInProgress. I didn't > > > mind there were two, but three seems a bit too many to me. In fact, my original goal [1] was to avoid adding another shared boolean related to the same topic. So I understand your feeling. I played with this idea again, based on your argument that there's no problem if the update to DB_IN_ARCHIVE_RECOVERY reaches checkpointer with some delay. The other point I feel bad with is to open and check the controlfile again and again for each segment to archive, even on running production instance. It's not that it would be heavy, but it feels overkill to fetch this information that should be available more easily. That leads me an idea where we would keep the ControlFile data up-to-date in shared memory. There's a few duplicates between ControlFile and XLogCtl, so maybe it could make the code a little simpler at some other places than just fixing $SUBJECT using DBState? I'm not sure of the implications and impacts though. This seems way bigger than the current fix and with many traps on the way. Maybe we could discuss this in another thread if you think it deserves it? Regards, [1] https://www.postgresql.org/message-id/flat/20200403182625.0fccc6fd%40firost#28a756094a4b1f3dd24927fb6311927d
В списке pgsql-bugs по дате отправления: