Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access)

Поиск
Список
Период
Сортировка
От Melanie Plageman
Тема Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access)
Дата
Msg-id CAAKRu_ZcK+ez71W7j+QzMo0gSafTsbGA3TU-fVSptw7yh41BEQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access)  (Xuneng Zhou <xunengzhou@gmail.com>)
Список pgsql-hackers
Attached v29 addresses some feedback and also corrects a small error
with the assertion I had added in the previous version's 0009.

On Thu, Dec 18, 2025 at 10:38 PM Xuneng Zhou <xunengzhou@gmail.com> wrote:
>
> I’ve done a basic review of patches 1 and 2. Here are some comments
> which may be somewhat immature, as this is a fairly large change set
> and I’m new to some parts of the code.
>
> 1) Potential stale old_vmbits after VM repair n v2

Good catch! I've fixed this in attached v29.

> 2) Add Assert(BufferIsDirty(buf))
>
> Since the patch's core claim is "buffer must be dirty before WAL
> registration", an assertion encodes this invariant. Should we add:
>
> Assert(BufferIsValid(buf));
> Assert(BufferIsDirty(buf));
>
> right before the visibilitymap_set() call?

There are already assertions that will trip in various places -- most
importantly in XLogRegisterBuffer(), which is the one that inspired
this refactor.

> The comment at lines:
> > "The only scenario where it is not already dirty is if the VM was removed…"
>
> This phrasing could become misleading after future refactors. Can we
> make it more direct like:
>
> > "We must mark the heap buffer dirty before calling visibilitymap_set(), because it may WAL-log the buffer and
XLogRegisterBuffer()requires it." 

I see your point about future refactors missing updating comments like
this. But, I don't think we are going to refactor the code such that
we can have PD_ALL_VISIBLE set without the VM bits set more often.
Also, it is common practice in Postgres to describe very specific edge
cases or odd scenarios in order to explain code that may seem
confusing without the comment. It does risk that comment later
becoming stale, but it is better that future developers understand why
the code is there.

That being said, I take your point that the comment is confusing. I
have updated it in a different way.

> > "Even if PD_ALL_VISIBLE is already set, we don't need to worry about unnecessarily dirtying the heap buffer, as it
mustbe marked dirty before adding it to the WAL chain. The only scenario where it is not already dirty is if the VM was
removed..."
>
> In this test we now call MarkBufferDirty() on the heap page even when
> only setting the VM, so the comments claiming “does not need to modify
> the heap buffer”/“no heap page modification” might be misleading. It
> might be better to say the test doesn’t need to modify heap
> tuples/page contents or doesn’t need to prune/freeze.

The point I'm trying to make is that we have to dirty the buffer even
if we don't modify the page because of the XLOG sub-system
requirements. And, it may seem like a waste to do that if not
modifying the page, but the page will rarely be clean anyway. I've
tried to make this more clear in attached v29.

- Melanie

Вложения

В списке pgsql-hackers по дате отправления: