Re: Emit fewer vacuum records by reaping removable tuples during pruning

Поиск
Список
Период
Сортировка
От Melanie Plageman
Тема Re: Emit fewer vacuum records by reaping removable tuples during pruning
Дата
Msg-id CAAKRu_b_9N2i9fn02P9t_oeLqcEj4=E-EK20bZ69nThYWna-Xg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Emit fewer vacuum records by reaping removable tuples during pruning  (Michael Paquier <michael@paquier.xyz>)
Ответы Re: Emit fewer vacuum records by reaping removable tuples during pruning  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-hackers
On Tue, Jan 9, 2024 at 12:56 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Mon, Jan 08, 2024 at 03:50:47PM -0500, Robert Haas wrote:
> > Hmm, interesting. I haven't had time to study this fully today, but I
> > think 0001 looks fine and could just be committed. Hooray for killing
> > useless variables with dumb names.
>
> I've been looking at 0001 a couple of weeks ago and thought that it
> was fine because there's only one caller of lazy_scan_prune() and one
> caller of lazy_scan_noprune() so all the code paths were covered.
>
> +   /* rel truncation is unsafe */
> +   if (hastup)
> +       vacrel->nonempty_pages = blkno + 1;

Andres had actually said that he didn't like pushing the update of
nonempty_pages into lazy_scan_[no]prune(). So, my v4 patch set
eliminates this.

I can see an argument for doing both the update of
vacrel->nonempty_pages and the FSM updates in lazy_scan_[no]prune()
because it eliminates some of the back-and-forth between the
block-specific functions and lazy_scan_heap().
lazy_scan_new_or_empty() has special logic for deciding how to update
the FSM -- so that remains in lazy_scan_new_or_empty() either way.

On the other hand, the comment above lazy_scan_new_or_empty() says we
can get rid of this special handling if we make relation extension
crash safe. Then it would make more sense to have a consolidated FSM
update in lazy_scan_heap(). However it does still mean that we repeat
the "UnlockReleaseBuffer()" and FSM update code in even more places.

Ultimately I can see arguments for and against. Is it better to avoid
having the same few lines of code in two places or avoid unneeded
communication between page-level functions and lazy_scan_heap()?

> Except for this comment that I found misleading because this is not
> about the fact that truncation is unsafe, it's about correctly
> tracking the the last block where we have tuples to ensure a correct
> truncation.  Perhaps this could just reuse "Remember the location of
> the last page with nonremovable tuples"?  If people object to that,
> feel free.

I agree the comment could be better. But, simply saying that it tracks
the last page with non-removable tuples makes it less clear how
important this is. It makes it sound like it could be simply for stats
purposes. I'll update the comment to something that includes that
sentiment but is more exact than "rel truncation is unsafe".

- Melanie



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Robert Haas
Дата:
Сообщение: Re: the s_lock_stuck on perform_spin_delay
Следующее
От: Nathan Bossart
Дата:
Сообщение: Re: add AVX2 support to simd.h