Re: Confine vacuum skip logic to lazy_scan_skip

Поиск
Список
Период
Сортировка
От Heikki Linnakangas
Тема Re: Confine vacuum skip logic to lazy_scan_skip
Дата
Msg-id 3df2b582-dc1c-46b6-99b6-38eddd1b2784@iki.fi
обсуждение исходный текст
Ответ на Re: Confine vacuum skip logic to lazy_scan_skip  (Melanie Plageman <melanieplageman@gmail.com>)
Ответы Re: Confine vacuum skip logic to lazy_scan_skip
Re: Confine vacuum skip logic to lazy_scan_skip
Список pgsql-hackers
On 08/03/2024 02:46, Melanie Plageman wrote:
> On Wed, Mar 06, 2024 at 10:00:23PM -0500, Melanie Plageman wrote:
>>> I feel heap_vac_scan_get_next_block() function could use some love. Maybe
>>> just some rewording of the comments, or maybe some other refactoring; not
>>> sure. But I'm pretty happy with the function signature and how it's called.
> 
> I've cleaned up the comments on heap_vac_scan_next_block() in the first
> couple patches (not so much in the streaming read user). Let me know if
> it addresses your feelings or if I should look for other things I could
> change.

Thanks, that is better. I think I now finally understand how the 
function works, and now I can see some more issues and refactoring 
opportunities :-).

Looking at current lazy_scan_skip() code in 'master', one thing now 
caught my eye (and it's the same with your patches):

>     *next_unskippable_allvis = true;
>     while (next_unskippable_block < rel_pages)
>     {
>         uint8        mapbits = visibilitymap_get_status(vacrel->rel,
>                                                        next_unskippable_block,
>                                                        vmbuffer);
> 
>         if ((mapbits & VISIBILITYMAP_ALL_VISIBLE) == 0)
>         {
>             Assert((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0);
>             *next_unskippable_allvis = false;
>             break;
>         }
> 
>         /*
>          * Caller must scan the last page to determine whether it has tuples
>          * (caller must have the opportunity to set vacrel->nonempty_pages).
>          * This rule avoids having lazy_truncate_heap() take access-exclusive
>          * lock on rel to attempt a truncation that fails anyway, just because
>          * there are tuples on the last page (it is likely that there will be
>          * tuples on other nearby pages as well, but those can be skipped).
>          *
>          * Implement this by always treating the last block as unsafe to skip.
>          */
>         if (next_unskippable_block == rel_pages - 1)
>             break;
> 
>         /* DISABLE_PAGE_SKIPPING makes all skipping unsafe */
>         if (!vacrel->skipwithvm)
>         {
>             /* Caller shouldn't rely on all_visible_according_to_vm */
>             *next_unskippable_allvis = false;
>             break;
>         }
> 
>         /*
>          * Aggressive VACUUM caller can't skip pages just because they are
>          * all-visible.  They may still skip all-frozen pages, which can't
>          * contain XIDs < OldestXmin (XIDs that aren't already frozen by now).
>          */
>         if ((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0)
>         {
>             if (vacrel->aggressive)
>                 break;
> 
>             /*
>              * All-visible block is safe to skip in non-aggressive case.  But
>              * remember that the final range contains such a block for later.
>              */
>             skipsallvis = true;
>         }
> 
>         /* XXX: is it OK to remove this? */
>         vacuum_delay_point();
>         next_unskippable_block++;
>         nskippable_blocks++;
>     }

Firstly, it seems silly to check DISABLE_PAGE_SKIPPING within the loop. 
When DISABLE_PAGE_SKIPPING is set, we always return the next block and 
set *next_unskippable_allvis = false regardless of the visibility map, 
so why bother checking the visibility map at all?

Except at the very last block of the relation! If you look carefully, 
at the last block we do return *next_unskippable_allvis = true, if the 
VM says so, even if DISABLE_PAGE_SKIPPING is set. I think that's wrong. 
Surely the intention was to pretend that none of the VM bits were set if 
DISABLE_PAGE_SKIPPING is used, also for the last block.

This was changed in commit 980ae17310:

> @@ -1311,7 +1327,11 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
>  
>                 /* DISABLE_PAGE_SKIPPING makes all skipping unsafe */
>                 if (!vacrel->skipwithvm)
> +               {
> +                       /* Caller shouldn't rely on all_visible_according_to_vm */
> +                       *next_unskippable_allvis = false;
>                         break;
> +               }

Before that, *next_unskippable_allvis was set correctly according to the 
VM, even when DISABLE_PAGE_SKIPPING was used. It's not clear to me why 
that was changed. And I think setting it to 'true' would be a more 
failsafe value than 'false'. When *next_unskippable_allvis is set to 
true, the caller cannot rely on it because a concurrent modification 
could immediately clear the VM bit. But because VACUUM is the only 
process that sets VM bits, if it's set to false, the caller can assume 
that it's still not set later on.

One consequence of that is that with DISABLE_PAGE_SKIPPING, 
lazy_scan_heap() dirties all pages, even if there are no changes. The 
attached test script demonstrates that.

ISTM we should revert the above hunk, and backpatch it to v16. I'm a 
little wary because I don't understand why that change was made in the 
first place, though. I think it was just an ill-advised attempt at 
tidying up the code as part of the larger commit, but I'm not sure. 
Peter, do you remember?

I wonder if we should give up trying to set all_visible_according_to_vm 
correctly when we decide what to skip, and always do 
"all_visible_according_to_vm = visibilitymap_get_status(...)" in 
lazy_scan_prune(). It would be more expensive, but maybe it doesn't 
matter in practice. It would get rid of this tricky bookkeeping in 
heap_vac_scan_next_block().

-- 
Heikki Linnakangas
Neon (https://neon.tech)

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Daniel Gustafsson
Дата:
Сообщение: Re: Support a wildcard in backtrace_functions
Следующее
От: Matthias van de Meent
Дата:
Сообщение: Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan