Andres Freund <andres@2ndquadrant.com> writes:
> Any idea how to cheat our way out of that one given the current way
> heap_freeze_tuple() works (running on both primary and standby)? My only
> idea was to MultiXactIdWait() if !InRecovery but that's extremely grotty.
> We can't even realistically create a new multixact with fewer members
> with the current format of xl_heap_freeze.
Maybe we should just bite the bullet and change the WAL format for heap_freeze (inventing an all-new record type, not repurposing the old one, and allowing WAL replay to continue to accept the old one). The implication for users would be that they'd have to update slave servers before the master when installing the update; which is unpleasant, but better than living with a known data corruption case.
Agreed. It may suck, but it sucks less.
How badly will it break if they do the upgrade in the wrong order, though? Will the slaves just stop (I assume so), or is there a risk of a wrong-order upgrade causing extra breakage? And if they do shut down, would just upgrading the slave fix it, or would they then have to rebuild the slave? (Actually, don't we recommend they always rebuild the slave *anyway*? In which case the problem is even smaller.)
I think we've always told people to upgrade the slave first, and it's the logical thing that AFAIK most other systems require as well, so that's not an unreasonable requirement at all.
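To make the ordering constraint concrete, here is a minimal sketch of which primary/standby combinations can replay the freeze record. It is a toy model, not PostgreSQL code: the names RecType, primary_emits() and standby_can_replay() are hypothetical. It assumes a patched primary emits only the new record type, and that a patched standby accepts both the old and the new one.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical record type codes; these are not PostgreSQL's real values. */
typedef enum
{
    REC_FREEZE_OLD = 1,     /* existing heap_freeze record format */
    REC_FREEZE_NEW = 2      /* all-new record type added by the fix */
} RecType;

/* Assumption: a patched primary emits only the new record type. */
RecType
primary_emits(bool primary_patched)
{
    return primary_patched ? REC_FREEZE_NEW : REC_FREEZE_OLD;
}

/*
 * Returns true if the standby binary knows how to replay the record.
 * A patched standby keeps accepting the old type; an unpatched standby
 * has never heard of the new one.
 */
bool
standby_can_replay(bool standby_patched, RecType rec)
{
    assert(rec == REC_FREEZE_OLD || rec == REC_FREEZE_NEW);

    switch (rec)
    {
        case REC_FREEZE_OLD:
            return true;            /* both binaries know the old format */
        case REC_FREEZE_NEW:
            return standby_patched; /* only the patched binary knows it */
    }
    return false;
}
```

In this model, the only failing combination is a patched primary with an unpatched standby, which is why the standby has to be updated first. What an unpatched standby actually does on hitting the unknown record (stop, or something worse) is the open question above and is not modelled here.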
I assume we'd then get rid of the old record type completely in 9.4, right?