On Fri, Jul 22, 2022 at 03:33:20PM -0700, Peter Geoghegan wrote:
> On Fri, Jul 22, 2022 at 2:11 PM Bruce Momjian <bruce@momjian.us> wrote:
> > I have improved the wording of the last paragraph in this patch.
>
> I think that it would be worth prominently explaining where heap-only
> tuples get their name from: it comes from the fact that there are (by
> definition) no entries for a heap-only tuple in any index, ever.
> Indexes are nevertheless capable of locating heap-only tuples during
> index scans, by dealing with a little additional indirection: they
> must traverse groups of related tuple versions, all for the same
> logical row that was HOT updated one or more times -- this group of
> related tuples is called a HOT chain.
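
To make the indirection concrete, here is a toy Python model of following
a HOT chain from the one index entry. It is only a sketch, not PostgreSQL
code: HeapTuple, fetch_via_index, and the boolean "visible" flag are names
invented for this example, and the real visibility test is an MVCC snapshot
check rather than a flag.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class HeapTuple:
    value: str
    visible: bool                      # stand-in for a real snapshot check
    heap_only: bool                    # True: no index entry points here
    next_offset: Optional[int] = None  # next version in the chain, if any


def fetch_via_index(page: Dict[int, HeapTuple],
                    root_offset: int) -> Optional[HeapTuple]:
    """Walk the chain from its root and return the first visible version."""
    offset = root_offset
    while offset is not None:
        tup = page[offset]
        if tup.visible:
            return tup
        offset = tup.next_offset
    return None


# One logical row, HOT updated twice. Only offset 1 (the root) is indexed.
page = {
    1: HeapTuple("v1", visible=False, heap_only=False, next_offset=2),
    2: HeapTuple("v2", visible=False, heap_only=True, next_offset=3),
    3: HeapTuple("v3", visible=True, heap_only=True),
}
index = {"key": 1}  # the index knows the root offset and nothing else

result = fetch_via_index(page, index["key"])
```

The index never learns about offsets 2 and 3; the scan reaches them only by
following the chain inside the heap page.
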
>
> This seems worth emphasizing because it focuses on what *doesn't*
> happen, and mostly on what doesn't happen in indexes.
>
> New item identifiers actually *are* needed for heap-only tuples
> (perhaps we could get away without them, but we don't). However, that
> doesn't really matter too much in practice. Heap-only tuples can still
> have their line pointers set to LP_UNUSED directly during pruning,
> without having to be set to LP_DEAD for a time first (a situation
> which VACUUM alone can correct by setting the LP_DEAD items to
> LP_UNUSED during its second heap pass).
>
> So heap-only tuples "skip the step" where they have to become LP_DEAD
> stubs/tombstones, which is possible precisely because indexes don't
> need to be considered (they're "heap-only").
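
The line pointer transitions described above can be sketched with another
toy Python model. Again, this is an illustration under assumptions, not the
actual pruning code: prune_chain and vacuum_second_pass are invented names,
and the sketch assumes the dead versions are always the oldest members of
the chain.

```python
from enum import Enum


class LP(Enum):
    NORMAL = "LP_NORMAL"
    REDIRECT = "LP_REDIRECT"
    DEAD = "LP_DEAD"
    UNUSED = "LP_UNUSED"


def prune_chain(items, chain, dead):
    """Prune one HOT chain; chain[0] is the root, the only indexed member."""
    root, members = chain[0], chain[1:]
    for off in members:
        if off in dead:
            # Heap-only: no index entry exists, so reclaim immediately.
            items[off] = LP.UNUSED
    if root in dead:
        live = [off for off in members if off not in dead]
        # The index still points at the root, so it must stay allocated:
        # as a redirect if a version survives, else as an LP_DEAD stub.
        items[root] = LP.REDIRECT if live else LP.DEAD


def vacuum_second_pass(items):
    """After index entries are removed, LP_DEAD stubs become reusable."""
    for off, state in items.items():
        if state is LP.DEAD:
            items[off] = LP.UNUSED


# Chain 1 -> 2 -> 3 where every version is dead.
items = {1: LP.NORMAL, 2: LP.NORMAL, 3: LP.NORMAL}
prune_chain(items, chain=[1, 2, 3], dead={1, 2, 3})
after_prune = dict(items)
vacuum_second_pass(items)
after_vacuum = dict(items)
```

In this model, pruning alone leaves offsets 2 and 3 as LP_UNUSED while the
root waits as LP_DEAD until the second heap pass.
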

Good points. I have updated the attached patch and URL to mention that
HOT rows are _completely_ removed and why that is possible, and I have
clarified the mention of page item identifiers.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
Indecision is a decision. Inaction is an action. Mark Batterson