Обсуждение: Is heap_page_prune() stats collector accounting wrong?

Поиск
Список
Период
Сортировка

Is heap_page_prune() stats collector accounting wrong?

От
Peter Geoghegan
Дата:
The following code appears at the end of heap_page_prune():

    /*
     * If requested, report the number of tuples reclaimed to pgstats. This is
     * ndeleted minus ndead, because we don't want to count a now-DEAD root
     * item as a deletion for this purpose.
     */
    if (report_stats && ndeleted > prstate.ndead)
        pgstat_update_heap_dead_tuples(relation, ndeleted - prstate.ndead);

In general, heap_page_prune() has to know the exact definition of
ndeleted (which is its return value) -- and the definition is kind of
subtle. The definition is really "the number of tuples that had
storage before pruning, but not after pruning, because pruning
considered them DEAD and removed the tuple itself (though not
necessarily its line pointer, which could have just been marked
LP_DEAD rather than LP_UNUSED)". As far as I know, the handling for
all that at the end of heap_prune_chain() (at the point that we decide
some tuples from a HOT chain are considered DEAD) is correct. It
handles all the subtleties in a way that's consistent.

I can't see why we should need to compensate like this (by subtracting
prstate.ndead) at all, because, as I said, heap_page_prune() *already*
does everything for us. In particular, if the now-DEAD root item
started out as an LP_NORMAL tuple with storage (a root item that is a
plain heap tuple, or just a simple tuple on its own), then we already
make sure to count it in ndeleted here, at the end of
heap_prune_chain():

        /*
         * If the root entry had been a normal tuple, we are deleting it, so
         * count it in the result.  But changing a redirect (even to DEAD
         * state) doesn't count.
         */
        if (ItemIdIsNormal(rootlp))
            ndeleted++;

So we're always going to count tuples that had storage but then got
pruned away in ndeleted -- we *do* want to count root items when
they're LP_NORMAL. We also avoid counting root items when that is
inappropriate: when they started out as an LP_REDIRECT, but became
LP_DEAD. It would be inappropriate in the case of an LP_REDIRECT root
item specifically, because it doesn't start out with tuple storage
anyway (heap-only tuples from the same HOT chain are another matter).
So why compensate when calling pgstat_update_heap_dead_tuples() at
all?

Let's assume that somehow I have it wrong. Even then, why should we
compensate like this for the stats collector, but not for VACUUM?
There's certainly no corresponding code in vacuumlazy.c that does a
similar transformation with ndeleted.

-- 
Peter Geoghegan



Re: Is heap_page_prune() stats collector accounting wrong?

От
Peter Geoghegan
Дата:
On Fri, Nov 12, 2021 at 10:45 AM Peter Geoghegan <pg@bowt.ie> wrote:
> Let's assume that somehow I have it wrong. Even then, why should we
> compensate like this for the stats collector, but not for VACUUM?
> There's certainly no corresponding code in vacuumlazy.c that does a
> similar transformation with ndeleted.

I think that I figured it out myself, but it's very confusing. Could
definitely do with a better explanatory comment. Here's what I think
is actually going on here:

We compensate here precisely because we are not running in VACUUM (it
has to be an opportunistic prune in practice).

If we're running in VACUUM, then we are justified in ignoring
newly-pruned LP_DEAD items, because 1.) They're going to be turned
into LP_UNUSED items in the same VACUUM anyway (barring edge-cases),
and 2.) Newly-pruned LP_DEAD items are really no different to existing
LP_DEAD items that VACUUM found on the page before its own pruning
operation even began -- if VACUUM wants to count them or report on
them at all, then it had better count them itself, after pruning
(which it only began to do in Postgres 14).

If we're not running in VACUUM, and have to make a statistics
collector call, then we don't want to forget about DEAD tuples that
were pruned-away (i.e. no longer have tuple storage) when they still
have an LP_DEAD stub item. There is obviously no justification for
just ignoring LP_DEAD items there, because we don't know when VACUUM
is going to run next (since we are not VACUUM).

Does that sound correct?

-- 
Peter Geoghegan



Re: Is heap_page_prune() stats collector accounting wrong?

От
Peter Geoghegan
Дата:
On Fri, Nov 12, 2021 at 11:29 AM Peter Geoghegan <pg@bowt.ie> wrote:
> We compensate here precisely because we are not running in VACUUM (it
> has to be an opportunistic prune in practice).

> If we're not running in VACUUM, and have to make a statistics
> collector call, then we don't want to forget about DEAD tuples that
> were pruned-away (i.e. no longer have tuple storage) when they still
> have an LP_DEAD stub item. There is obviously no justification for
> just ignoring LP_DEAD items there, because we don't know when VACUUM
> is going to run next (since we are not VACUUM).

Attached patch clears this up by adding some comments. It also moves
the call to pgstat_update_heap_dead_tuples() from heap_page_prune() to
heap_page_prune_opt(), which feels like a better place for it to me.

-- 
Peter Geoghegan

Вложения