Re: HOT chain validation in verify_heapam()

Поиск
Список
Период
Сортировка
От Himanshu Upadhyaya
Тема Re: HOT chain validation in verify_heapam()
Дата
Msg-id CAPF61jD7uAQa4BQ2k3B5xKF_wSgJHeSTmm9U6ZJAU53+LPLH+Q@mail.gmail.com
обсуждение исходный текст
Ответ на Re: HOT chain validation in verify_heapam()  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: HOT chain validation in verify_heapam()  (Aleksander Alekseev <aleksander@timescale.com>)
Re: HOT chain validation in verify_heapam()  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-hackers

Hi Robert,

Thanks for sharing the feedback.

On Sat, Aug 27, 2022 at 1:47 AM Robert Haas <robertmhaas@gmail.com> wrote:
+                htup = (HeapTupleHeader) PageGetItem(ctx.page, rditem);
+                if (!(HeapTupleHeaderIsHeapOnly(htup) &&
htup->t_infomask & HEAP_UPDATED))
+                    report_corruption(&ctx,
+                                      psprintf("Redirected Tuple at
line pointer offset %u is not HEAP_ONLY_TUPLE or HEAP_UPDATED tuple",
+                                               (unsigned) rdoffnum));

This isn't safe because the line pointer referenced by rditem may not
have been sanity-checked yet. Refer to the comment just below where it
says "Sanity-check the line pointer's offset and length values".

handled by creating a new function check_lp and calling it before accessing the redirected tuple.


+            /*
+             * Add line pointer offset to predecessor array.
+             * 1) If xmax is matching with xmin of next
Tuple(reaching via its t_ctid).
+             * 2) If next tuple is in the same page.
+             * Raise corruption if:
+             * We have two tuples having same predecessor.
+             *
+             * We add offset to predecessor irrespective of
transaction(t_xmin) status. We will
+             * do validation related to transaction status(and also
all other validations)
+             * when we loop over predecessor array.
+             */

The formatting of this comment will, I think, be mangled if pgindent
is run against the file. You can use ----- markers to prevent that, I
believe, or (better) write this as a paragraph without relying on the
lines ending up uneven in length.

 
Done, reformatted using pg_indent. 

+                if (predecessor[nextTupOffnum] != 0)
+                {
+                    report_corruption(&ctx,
+                                psprintf("Tuple at offset %u is
reachable from two or more updated tuple",
+                                    (unsigned) nextTupOffnum));
+                    continue;
+                }

You need to do this test after xmin/xmax matching. Otherwise you might
get false positives. Also, the message should be more specific and
match the style of the existing messages. ctx.offnum is already going
to be reported in another column, but we can report both nextTupOffnum
and predecessor[nextTupOffnum] here e.g. "updated version at offset %u
is also the updated version of tuple at offset %u".

 
agree, done.

+                currTupXmax = HeapTupleHeaderGetUpdateXid(ctx.tuphdr);
+                lp   = PageGetItemId(ctx.page, nextTupOffnum);
+
+                htup = (HeapTupleHeader) PageGetItem(ctx.page, lp);

This has the same problem I mentioned in my first comment above,
namely, we haven't necessarily sanity-checked the length and offset
values for nextTupOffnum yet. Saying that another way, if the contents
of lp are corrupt and point off the page, we want that to be reported
as corruption (which the current code will already do) and we want
this check to be skipped so that we don't crash or access random
memory addresses. You need to think about how to rearrange the code so
that we only perform checks that are known to be safe.

 
Moved logic of sanity checked to a new function check_lp() and called before accessing the next tuple while populating the predecessor array.
 
Please take the time to format your code according to the PostgeSQL
standard practice. If you don't know what that looks like, use
pgindent.

+        {
+            HeapTupleHeader pred_htup, curr_htup;
+            TransactionId   pred_xmin, curr_xmin, curr_xmax;
+            ItemId      pred_lp, curr_lp;

Same here.
 
Done. 

I think it would actually be a good idea to set ctx.offnum to the
predecessor's offset number, and use a separate variable for the
current offset number. The reason why I think this is that I believe
it will make it easier to phrase the messages appropriately. For
example, if ctx.offnum is the predecessor tuple, then we can issue
complaints like this:

tuple with uncommitted xmin %u was updated to produce a tuple at
offset %u with differing xmin %u
unfrozen tuple was updated to produce a tuple at offset %u which is not frozen
tuple with uncommitted xmin %u has cmin %u, but was updated to produce
a tuple with cmin %u
non-heap-only update produced a heap-only tuple at offset %u
heap-only update produced a non-heap only tuple at offset %u


Agree, Done.
 
+            if (!TransactionIdIsValid(curr_xmax) &&
+                HeapTupleHeaderIsHotUpdated(curr_htup))
+            {
+                report_corruption(&ctx,
+                            psprintf("Current tuple at offset %u is
HOT but is last tuple in the HOT chain.",
+                            (unsigned) ctx.offnum));
+            }

This check has nothing to do with the predecessor[] array, so it seems
like it belongs in check_tuple() rather than here. Also, the message
is rather confused, because the test is checking whether the tuple has
been HOT-updated, while the message is talking about whether the tuple
was *itself* created by a HOT update. Also, when we're dealing with
corruption, statements like "is last tuple in the HOT chain" are
pretty ambiguous. Also, isn't this an issue for both HOT-updated
tuples and also just regular updated tuples? i.e. maybe what we should
be complaining about here is something like "tuple has been updated,
but xmax is 0" and then make the test check exactly that.
 
Moved to check_tuple_header. This should be applicable for both HOT and normal updates but even the last updated tuple in the normal update is HEAP_UPDATED so not sure how we can apply this check for a normal update?

+            if (!HeapTupleHeaderIsHotUpdated(pred_htup) &&
+                HeapTupleHeaderIsHeapOnly(pred_htup) &&
+                HeapTupleHeaderIsHeapOnly(curr_htup))
+            {
+                report_corruption(&ctx,
+                            psprintf("Current tuple at offset %u is
HOT but it is next updated tuple of last Tuple in HOT chain.",
+                            (unsigned) ctx.offnum));
+            }

Three if-statements up, you tested two out of these three conditions
and complained if they were met. So any time this fires, that will
have also fired.

Yes, the above condition is not required. Now removed.

--
Regards,
Himanshu Upadhyaya
EnterpriseDB: http://www.enterprisedb.com
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Shinya Kato
Дата:
Сообщение: Re: [PATCH] Tab completion for SET COMPRESSION
Следующее
От: "Anton A. Melnikov"
Дата:
Сообщение: May be BUG. Periodic burst growth of the checkpoint_req counter on replica.