It seems the replica did not replay the corresponding WAL records. Any thoughts?
heap_xlog_freeze_page() is a pretty simple function. It's not impossible that it could have a bug that causes it to incorrectly skip records, but it's not clear why that wouldn't affect many other replay routines equally, since the pattern of using the return value of XLogReadBufferForRedo() to decide what to do is widespread.
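To make that pattern concrete, here is a minimal toy model of it in C. Only the XLogRedoAction enum values mirror PostgreSQL's names; ToyPage, ToyRecord, toy_read_buffer_for_redo() and toy_freeze_redo() are hypothetical stand-ins invented for illustration, not the actual source of heap_xlog_freeze_page() or XLogReadBufferForRedo().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Value names match PostgreSQL's XLogRedoAction; the rest is a toy model. */
typedef enum
{
    BLK_NEEDS_REDO,   /* page is older than the record: apply the change */
    BLK_DONE,         /* page LSN already covers the record: skip it */
    BLK_RESTORED,     /* page was overwritten from a full-page image */
    BLK_NOTFOUND      /* block no longer exists: nothing to do */
} XLogRedoAction;

/* Hypothetical stand-in for a heap page: just an LSN and one flag. */
typedef struct
{
    uint64_t lsn;
    bool     frozen;
} ToyPage;

/* Hypothetical stand-in for a freeze WAL record. */
typedef struct
{
    uint64_t end_lsn;
    bool     has_full_page_image;
} ToyRecord;

/* Decide what replay must do with this page (assumed, simplified logic). */
static XLogRedoAction
toy_read_buffer_for_redo(const ToyRecord *rec, ToyPage *page)
{
    if (page == NULL)
        return BLK_NOTFOUND;
    if (rec->has_full_page_image)
    {
        /* In this toy, the image always carries the frozen state. */
        page->frozen = true;
        page->lsn = rec->end_lsn;
        return BLK_RESTORED;
    }
    if (page->lsn >= rec->end_lsn)
        return BLK_DONE;
    return BLK_NEEDS_REDO;
}

/* The widespread shape: modify the page only when redo is needed. */
static XLogRedoAction
toy_freeze_redo(const ToyRecord *rec, ToyPage *page)
{
    XLogRedoAction action = toy_read_buffer_for_redo(rec, page);

    if (action == BLK_NEEDS_REDO)
    {
        page->frozen = true;        /* apply the record's change */
        page->lsn = rec->end_lsn;   /* stamp the page with the record's LSN */
    }
    return action;
}
```

In this sketch, a page whose LSN is already at or beyond the record's LSN is left untouched, so a page that somehow carried a too-new LSN would look exactly like a record that was never replayed. Whether anything like that is happening here is only a guess.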
Can you prove that other WAL records generated around the same time as the freeze record *were* replayed on the standby? If so, that proves that this isn't just a case of the WAL never reaching the standby.
Can you look at the segment that contains the relevant freeze record with pg_xlogdump? Maybe that record is messed up somehow.
Not yet. Most such cases occurred long before our recovery window, so the corresponding WAL segments have already been deleted. We have since tuned our retention policy and are now looking for a fresh case.