Is heap_page_prune() stats collector accounting wrong?

Поиск
Список
Период
Сортировка
От Peter Geoghegan
Тема Is heap_page_prune() stats collector accounting wrong?
Дата
Msg-id CAH2-WzmyeDBMN2XSmpBV05iOv=fNnUPe4C8sUQNVEBGrF0_NUQ@mail.gmail.com
обсуждение исходный текст
Ответы Re: Is heap_page_prune() stats collector accounting wrong?  (Peter Geoghegan <pg@bowt.ie>)
Список pgsql-hackers
The following code appears at the end of heap_page_prune():

    /*
     * If requested, report the number of tuples reclaimed to pgstats. This is
     * ndeleted minus ndead, because we don't want to count a now-DEAD root
     * item as a deletion for this purpose.
     */
    if (report_stats && ndeleted > prstate.ndead)
        pgstat_update_heap_dead_tuples(relation, ndeleted - prstate.ndead);

In general, heap_page_prune() has to know the exact definition of
ndeleted (which is its return value) -- and the definition is kind of
subtle. The definition is really "the number of tuples that had
storage before pruning, but not after pruning, because pruning
considered them DEAD and removed the tuple itself (though not
necessarily its line pointer, which could have just been marked
LP_DEAD rather than LP_UNUSED)". As far as I know, the handling for
all that at the end of heap_prune_chain() (at the point that we decide
some tuples from a HOT chain are considered DEAD) is correct. It
handles all the subtleties in a way that's consistent.

I can't see why we should need to compensate like this (by subtracting
prstate.ndead) at all, because, as I said, heap_page_prune() *already*
does everything for us. In particular, if the now-DEAD root item
started out as an LP_NORMAL tuple with storage (a root item that is a
plain heap tuple, or just a simple tuple on its own), then we already
make sure to count it in ndeleted here, at the end of
heap_prune_chain():

        /*
         * If the root entry had been a normal tuple, we are deleting it, so
         * count it in the result.  But changing a redirect (even to DEAD
         * state) doesn't count.
         */
        if (ItemIdIsNormal(rootlp))
            ndeleted++;

So we're always going to count tuples that had storage but then got
pruned away in ndeleted -- we *do* want to count root items when
they're LP_NORMAL. We also avoid counting root items when that is
inappropriate: when they started out as an LP_REDIRECT, but became
LP_DEAD. It would be inappropriate in the case of an LP_REDIRECT root
item specifically, because it doesn't start out with tuple storage
anyway (heap-only tuples from the same HOT chain are another matter).
So why compensate when calling pgstat_update_heap_dead_tuples() at
all?

Let's assume that somehow I have it wrong. Even then, why should we
compensate like this for the stats collector, but not for VACUUM?
There's certainly no corresponding code in vacuumlazy.c that does a
similar transformation with ndeleted.

-- 
Peter Geoghegan



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Justin Pryzby
Дата:
Сообщение: Re: RFC: Logging plan of the running query
Следующее
От: "Euler Taveira"
Дата:
Сообщение: Re: [BUG]Invalidate relcache when setting REPLICA IDENTITY