Re: [BUG] non archived WAL removed during production crash recovery

Поиск
Список
Период
Сортировка
От Jehan-Guillaume de Rorthais
Тема Re: [BUG] non archived WAL removed during production crash recovery
Дата
Msg-id 20200422000022.7aa6a62b@firost
обсуждение исходный текст
Ответ на Re: [BUG] non archived WAL removed during production crash recovery  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Ответы Re: [BUG] non archived WAL removed during production crash recovery  (Michael Paquier <michael@paquier.xyz>)
Список pgsql-bugs
On Tue, 21 Apr 2020 15:08:17 +0900 (JST)
Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:

> At Tue, 21 Apr 2020 13:57:39 +0900, Michael Paquier <michael@paquier.xyz>
> wrote in 
> > On Tue, Apr 21, 2020 at 12:09:25PM +0900, Kyotaro Horiguchi wrote:  
> > > +    if (!XLogArchivingAlways() &&
> > > +        GetRecoveryState() == RECOVERY_STATE_ARCHIVE)
> > > 
> > > Is rewritten as 
> > > 
> > > +    if (!XLogArchivingAlways() &&
> > > +        GetDBState() > DB_IN_CRASH_RECOVERY)
> > > 
> > > FWIW, what annoyed me is there are three variables that are quite
> > > similar but has different domains, ControlFile->state,
> > > XLogCtl->SharedRecoveryState, and LocalRecoveryInProgress. I didn't
> > > mind there were two, but three seems a bit too many to me.  

In fact, my original goal [1] was to avoid adding another shared boolean related
to the same topic. So I understand your feeling.

I played with this idea again, based on your argument that there's no problem
if the update to DB_IN_ARCHIVE_RECOVERY reaches checkpointer with some delay.

The other point I feel bad with is to open and check the controlfile again and
again for each segment to archive, even on running production instance.
It's not that it would be heavy, but it feels overkill to fetch this information
that should be available more easily.

That leads me an idea where we would keep the ControlFile data up-to-date in
shared memory. There's a few duplicates between ControlFile and XLogCtl, so
maybe it could make the code a little simpler at some other places than just
fixing $SUBJECT using DBState? I'm not sure of the implications and impacts
though. This seems way bigger than the current fix and with many traps on the
way. Maybe we could discuss this in another thread if you think it deserves it?

Regards,

[1]
https://www.postgresql.org/message-id/flat/20200403182625.0fccc6fd%40firost#28a756094a4b1f3dd24927fb6311927d



В списке pgsql-bugs по дате отправления:

Предыдущее
От: Euler Taveira
Дата:
Сообщение: Re: BUG #16354: No geos 3.8.1 package for RHEL 8
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: Bug with memory leak on cert validation in libpq