Re: Combine Prune and Freeze records emitted by vacuum

Поиск

Список

Период

Сортировка

От	Heikki Linnakangas
Тема	Re: Combine Prune and Freeze records emitted by vacuum
Дата	1 апреля 2024 г. 17:37:46
Msg-id	a065b0a9-dabf-4f99-8ba4-9271995b7c3e@iki.fi обсуждение исходный текст
Ответ на	Re: Combine Prune and Freeze records emitted by vacuum (Melanie Plageman <melanieplageman@gmail.com>)
Ответы	Re: Combine Prune and Freeze records emitted by vacuum
Список	pgsql-hackers

Дерево обсуждения

On 01/04/2024 19:08, Melanie Plageman wrote:
> On Mon, Apr 01, 2024 at 05:17:51PM +0300, Heikki Linnakangas wrote:
>> diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
>> @@ -256,15 +270,16 @@ heap_page_prune(Relation relation, Buffer buffer,
>>       tup.t_tableOid = RelationGetRelid(relation);
>>   
>>       /*
>> -     * Determine HTSV for all tuples.
>> +     * Determine HTSV for all tuples, and queue them up for processing as HOT
>> +     * chain roots or as a heap-only items.
> 
> Reading this comment now as a whole, I would add something like
> "Determining HTSV for all tuples once is required for correctness" to
> the start of the second paragraph. The new conjunction on the first
> paragraph sentence followed by the next paragraph is a bit confusing
> because it sounds like queuing them up for processing is required for
> correctness (which, I suppose it is on some level). Basically, I'm just
> saying that it is now less clear what is required for correctness.

Fixed.

> While I recognize this is a matter of style and not important, I
> personally prefer this for reverse looping:
> 
>     for (int i = prstate.nheaponly_items; i --> 0;)

I don't think we use that style anywhere in the Postgres source tree 
currently. (And I don't like it ;-) )

> I do think a comment about the reverse order would be nice. I know it
> says something above the first loop to this effect:
> 
> * Processing the items in reverse order (and thus the tuples in
> * increasing order) increases prefetching efficiency significantly /
> * decreases the number of cache misses.
> 
> So perhaps we could just say "as above, process the items in reverse
> order"

I'm actually not sure why it makes a difference. I would assume all the 
data to already be in CPU cache at this point, since the first loop 
already accessed it, so I think there's something else going on. But I 
didn't investigate it deeper. Anyway, added a comment.

Committed the first of the remaining patches with those changes. And 
also this, which is worth calling out:

>         if (prstate.htsv[offnum] == HEAPTUPLE_DEAD)
>         {
>             ItemId        itemid = PageGetItemId(page, offnum);
>             HeapTupleHeader htup = (HeapTupleHeader) PageGetItem(page, itemid);
> 
>             if (likely(!HeapTupleHeaderIsHotUpdated(htup)))
>             {
>                 HeapTupleHeaderAdvanceConflictHorizon(htup,
>                                                       &prstate.latest_xid_removed);
>                 heap_prune_record_unused(&prstate, offnum, true);
>             }
>             else
>             {
>                 /*
>                  * This tuple should've been processed and removed as part of
>                  * a HOT chain, so something's wrong.  To preserve evidence,
>                  * we don't dare to remove it.  We cannot leave behind a DEAD
>                  * tuple either, because that will cause VACUUM to error out.
>                  * Throwing an error with a distinct error message seems like
>                  * the least bad option.
>                  */
>                 elog(ERROR, "dead heap-only tuple (%u, %d) is not linked to from any HOT chain",
>                      blockno, offnum);
>             }
>         }
>         else
>             heap_prune_record_unchanged_lp_normal(page, &prstate, offnum);

As you can see, I turned that into a hard error. Previously, that code 
was at the top of heap_prune_chain(), and it was normal to see DEAD 
heap-only tuples there, because they would presumably get processed 
later as part of a HOT chain. But now this is done after all the HOT 
chains have already been processed.

Previously if there was a dead heap-only tuple like that on the page for 
some reason, it was silently not processed by heap_prune_chain() 
(because it was assumed that it would be processed later as part of a 
HOT chain), and was left behind as a HEAPTUPLE_DEAD tuple. If the 
pruning was done as part of VACUUM, VACUUM would fail with "ERROR: 
unexpected HeapTupleSatisfiesVacuum result". Or am I missing something?

Now you get that above error also on on-access pruning, which is not 
ideal. But I don't remember hearing about corruption like that, and 
you'd get the error on VACUUM anyway.

With the next patches, heap_prune_record_unchanged() will do more, and 
will also throw an error on a HEAPTUPLE_LIVE tuple, so even though in 
the first patch we could print just a WARNING and move on, it gets more 
awkward with the rest of the patches.

(Continuing with the remaining patches..)

-- 
Heikki Linnakangas
Neon (https://neon.tech)

В списке pgsql-hackers по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Combine Prune and Freeze records emitted by vacuum