On 2018-10-14 13:26:24 +0000, Michael Paquier wrote:
> Avoid duplicate XIDs at recovery when building initial snapshot
>
> On a primary, sets of XLOG_RUNNING_XACTS records are generated on a
> periodic basis to allow recovery to build the initial state of
> transactions for a hot standby. The set of transaction IDs is created
> by scanning all the entries in ProcArray. However, its logic never
> accounted for the fact that a two-phase transaction finishing its
> prepare can put ProcArray in a state where there are two entries with
> the same transaction ID, one for the initial transaction which gets
> cleared when prepare finishes, and a second, dummy, entry to track that
> the transaction is still running after prepare finishes. This
> ensures a continuous presence of the transaction so that callers of,
> for example, TransactionIdIsInProgress() always see it as alive.
>
> So, if an XLOG_RUNNING_XACTS record takes a standby snapshot while a
> two-phase transaction finishes preparing, the record can end up with
> duplicated XIDs, which is a state expected by design. If this record
> gets applied on a standby to initialize its recovery state, then it
> would simply fail, although the odds of facing this failure are very
> low in practice. It would
> be tempting to change the generation of XLOG_RUNNING_XACTS so that
> duplicates are removed at the source, but this would require holding
> ProcArrayLock for longer, which would impact all workloads,
> particularly those making heavy use of two-phase transactions.
>
> XLOG_RUNNING_XACTS is also actually used only to initialize the standby
> state at recovery, so the solution taken instead is to discard
> duplicates when applying the initial snapshot.
>
> Diagnosed-by: Konstantin Knizhnik
> Author: Michael Paquier
> Discussion: https://postgr.es/m/0c96b653-4696-d4b4-6b5d-78143175d113@postgrespro.ru
> Backpatch-through: 9.3

I'm unhappy this approach was taken over objections, and without a real
warning. Even leaving the crumminess aside, did you check other users
of XLOG_RUNNING_XACTS, e.g. logical decoding?

Greetings,
Andres Freund
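
For readers following the thread, the "discard duplicates when applying
the initial snapshot" idea from the quoted commit message can be sketched
roughly as below. This is only an illustration, not the committed
PostgreSQL code: the names xid_cmp and sort_and_dedupe_xids are
hypothetical, TransactionId is assumed to be a plain 32-bit unsigned
integer, and the comparison is a simple numeric one that ignores XID
wraparound.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: an XID is modelled as a plain 32-bit unsigned integer. */
typedef uint32_t TransactionId;

/* Hypothetical comparator: plain numeric order, no wraparound handling. */
static int
xid_cmp(const void *a, const void *b)
{
    TransactionId xa = *(const TransactionId *) a;
    TransactionId xb = *(const TransactionId *) b;

    return (xa > xb) - (xa < xb);
}

/*
 * Hypothetical helper: sort the XIDs taken from a running-xacts record
 * and drop adjacent duplicates in place.  Returns the number of unique
 * XIDs left at the front of the array.
 */
static int
sort_and_dedupe_xids(TransactionId *xids, int n)
{
    int kept;

    assert(n >= 0);
    if (n == 0)
        return 0;

    qsort(xids, (size_t) n, sizeof(TransactionId), xid_cmp);

    kept = 1;
    for (int i = 1; i < n; i++)
    {
        /* After sorting, duplicates are adjacent; keep only the first. */
        if (xids[i] != xids[kept - 1])
            xids[kept++] = xids[i];
    }
    return kept;
}
```

The point of doing this on the applying side is that the cost is paid once,
when the standby initializes its state, rather than on the primary while
ProcArrayLock is held.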