On Mon, Aug 21, 2023 at 08:32:39AM +0900, Michael Paquier wrote:
> I am not sure that I'll be able to do more on this topic this week, at
> least that's some progress.
Some time later..
I have spent more time and thoughts on the whole test suite, finishing
with the attached 0003 that applies on top of your own patches. I am
really looking forward to making this whole logic more robust, so as
WAL replay can be made itself more robust for the OOM/end-of-wal
detection on HEAD for standbys and crash recovery.
While looking at the whole, I have considered a few things that may
make the test cleaner, like:
- Calculating the segment name and its offset from the end_lsn of a
record directly from the backend, but it felt inelegant to pass
through more subroutine layers the couple ($segment, offset) rather
than just a LSN, so guessing the segment number and the offset while
the cluster is offline if OK by me.
- The TLI can be queried from the server rather than hardcoded.
- I've been thinking about bundling the tests of each sub-section in
their own subroutine, but that felt a bit awkward, particularly for
the part where we need a correct $prev_lsn in the record header
written to enforce other code paths.
- The test needs better documentation. One of the things I kept
staring at is cross-checking pack() and the dependency to the C
structures, so I have added more details there, explaining more the
whys and the hows.
I have also looked again at the C code for a few hours, and still got
the impression that this is rather solid. There are two things that
may be better:
- Document at the top of allocate_recordbuf() that this should never
be called with a length coming from a header until it is validated.
- Removing AllocSizeIsValid() for the non-FRONTEND path should be OK.
What do you think?
--
Michael