On Fri, Nov 12, 2021 at 3:31 PM Andres Freund <andres@anarazel.de> wrote:
> I wonder if we should try to go for something considerably simpler for 14. How
> about having a new array that just stores the HTSV state for every
> ItemIdIsNormal(). For simplicity, we could populate that array eagerly in a
> separate loop.
Why is that simpler than a boolean array, which represents whether or
not the item has had its heap_prune_record_unused() call yet (if it's
a tuple with storage)?
> That'd fix the known bugs, and yield better efficiency (because we'd not
> re-compute HTSV all the time). Then for HEAD go for something that fixes
> pruning more fundamentally.
I don't know what you mean about the patch recomputing HTSV all the
time. The patch doesn't do that.
It's true that we'll call HTSV (heap_prune_record_unused(), actually)
more often when following HOT chains, because now we follow them until
the end. However, the heap_prune_record_unused() calls only happen
after we've already validated that we found a heap-only tuple that's
part of the same HOT chain. That just leaves disconnected tuples. We
do call heap_prune_record_unused() there too (which is theoretically
unnecessary), but only once.
Overall, we are *guaranteed* to only call heap_prune_record_unused()
at most once per tuple with storage. I believe that this is a small
reduction, since HEAD will do the maybe-aborted precheck call to
heap_prune_record_unused() before anything else.
I guess you might have meant something about the more-conservative
behavior with DEAD vs RECENTLY_DEAD during HOT chain traversal. But
the cases where that makes any difference at all ought to be very rare
indeed.
--
Peter Geoghegan