Re: RecoveryInProgress() has critical side effects

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: RecoveryInProgress() has critical side effects
Дата
Msg-id 20211116185116.g4qjbvgnurow3bio@alap3.anarazel.de
обсуждение исходный текст
Ответ на RecoveryInProgress() has critical side effects  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-hackers
Hi,

On 2021-11-11 12:15:03 -0500, Robert Haas wrote:
> The reason why RecoveryInProgress() has critical side effects is that
> it calls InitXLOGAccess(). It turns out that the only critical thing
> that InitXLOGAccess() does is call InitXLogInsert().[1] If a backend
> doesn't call InitXLogInsert(), max_rdatas won't be initialized, and
> the first attempt to insert WAL will probably fail with something like
> "ERROR: too much WAL data". But there's no real reason why
> InitXLogInsert() needs to happen only after recovery is finished.

I think we should consider allocating that memory in postmaster, on !windows
at least. Or, perhaps even better, to initially use some static variable, and
only use a separate memory context when XLogEnsureRecordSpace().  Reserving
all that memory in every process just needlessly increases our per-connection
memory usage, for no good reason.


> The
> work that it does is just setting up data structures in backend-local
> memory, and not much is lost if they are set up and not used. So, in
> 0001, I propose to move the call to InitXLogInsert() from
> InitXLOGAccess() to BaseInit(). That means every backend will do it
> the first time it starts up. It also means that CreateCheckPoint()
> will no longer need a special case call to InitXLogInsert(), which
> seems like a good thing.

Yes. Seems making BaseInit() being executed at a halfway consistent point is
starting to have some benefits at least ;)


> Nevertheless, what I did in 0002 is remove InitXLOGAccess() but move
> the initialization of RedoRecPtr and doPageWrites into
> GetFullPageWritesInfo(), only in the case where RedoRecPtr hasn't been
> set yet. This has one property that I really really like, which is
> that makes the code more understandable. There is no suggestion that
> the correctness of WAL insertion somehow depends on a
> RecoveryInProgress() call which may or may not have occurred at some
> arbitrarily distant time in the past. Initializing the value at the
> point of first use is just way clearer than initializing it as a
> side-effect of some other function that's not even that tightly
> connected. However, it does have the effect of introducing a branch
> into GetFullPageWritesInfo(), which I'm not 100% sure is cheap enough
> not to matter.

Hm. Compared to the other costs of the XLogInsert do/while loop it probably
doesn't matter. One nearly-always-false branch is vastly cheaper than going
through the loop unnecessarily. Sure, the unnecessarily iteration saved will
only be the first (for now, it might be worth to refresh the values more
often), but there's enough branches in the body of the loop that I can't
really see it mattering.

Maybe kinda conceivably the cost of that branch could be an argument if
GetFullPageWritesInfo() were inline in XLogInsert(). But it isn't.


> I think the obvious alternative is to just not
> initialize RedoRecPtr and doPageWrites at all and document in the
> comments that the first time each backend does something with them
> it's going to end up retrying once; perhaps that is preferable. Going
> the other way, we could get rid of the global variables altogether and
> have GetFullPageWrites() read the data from shared memory. However,
> unless 8-byte atomicity is guaranteed, that requires a spinlock, so it
> seems likely unappealing.

I think it'd be fine to just make them 8byte atomics, which'd be lock-free on
relevant platforms (at least once the aarch64 thing is fixed, there's a
patch).

XLogCtlInsert already takes care to ensure that RedoRecPtr/fullPageWrites are
on a cacheline that's not constantly modified.


If we really wanted to optimize the no-8-byte-single-copy-atomicity path, we
could introduce a counter counting how many times RedoRecPtr/fullPageWrites
have changed. But it seems unlikely to be worth it.


Greetings,

Andres Freund



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Joshua Brindle
Дата:
Сообщение: Re: Support for NSS as a libpq TLS backend
Следующее
От: Stephen Frost
Дата:
Сообщение: Re: pgsql: Fix headerscheck failure in replication/worker_internal.h