Обсуждение: Clarify vacuum verbose message

Поиск
Список
Период
Сортировка

Clarify vacuum verbose message

От
Jim Nasby
Дата:
VACUUM VERBOSE spits out two different messages for the heap, one of 
which is rather confusing:

INFO:  "trades": removed 625664 row versions in 20967 pages
...
INFO:  "trades": found 3282 removable, 56891627 nonremovable row 
versions in 1986034 out of 1986034 pages

After discussion with RhodiumToad I think I now understand how this can 
happen:

20:00 < RhodiumToad> the LP_DEAD slot is where the index entries for the 
deleted row point to, so that has to stay
20:01 < RhodiumToad> so for example, if you delete a lot of rows, then 
try and do a lot of updates (which will hint the                     pages as needing pruning),
20:01 < RhodiumToad> then do more updates or a seqscan (to let prune 
look at the pages),
20:02 < RhodiumToad> then do a vacuum, the vacuum will see a lot of 
LP_DEAD slots to remove index entries for, but not                     actual tuples

This example is from a table that was VACUUM FULL'd this weekend and had 
a nightly batch process run last night. That process INSERTs a bunch of 
rows and then does a bunch of UPDATEs on different subsets of those 
rows. I don't believe there would have been a large amount of deletes; 
I'll check with them tomorrow.

IMHO we need to change the messages so they are explicit about line 
pointers vs actual tuples. Trying to obfuscate that just leads to 
confusion. heap_page_prune needs to report only non-rootlp tuples that 
were pruned. (None of the other callers care about the return value.)
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: Clarify vacuum verbose message

От
Robert Haas
Дата:
On Tue, Dec 15, 2015 at 9:19 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> VACUUM VERBOSE spits out two different messages for the heap, one of which
> is rather confusing:
>
> INFO:  "trades": removed 625664 row versions in 20967 pages
> ...
> INFO:  "trades": found 3282 removable, 56891627 nonremovable row versions in
> 1986034 out of 1986034 pages
>
> After discussion with RhodiumToad I think I now understand how this can
> happen:
>
> 20:00 < RhodiumToad> the LP_DEAD slot is where the index entries for the
> deleted row point to, so that has to stay
> 20:01 < RhodiumToad> so for example, if you delete a lot of rows, then try
> and do a lot of updates (which will hint the
>                      pages as needing pruning),
> 20:01 < RhodiumToad> then do more updates or a seqscan (to let prune look at
> the pages),
> 20:02 < RhodiumToad> then do a vacuum, the vacuum will see a lot of LP_DEAD
> slots to remove index entries for, but not
>                      actual tuples
>
> This example is from a table that was VACUUM FULL'd this weekend and had a
> nightly batch process run last night. That process INSERTs a bunch of rows
> and then does a bunch of UPDATEs on different subsets of those rows. I don't
> believe there would have been a large amount of deletes; I'll check with
> them tomorrow.
>
> IMHO we need to change the messages so they are explicit about line pointers
> vs actual tuples. Trying to obfuscate that just leads to confusion.
> heap_page_prune needs to report only non-rootlp tuples that were pruned.
> (None of the other callers care about the return value.)

Yeah, I've had the the thought before that this reporting could be
more clear.  I think it never really got revised when 8.3 invented
HOT.  It might be about time for that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company