Re: Using read_stream in index vacuum

From Melanie Plageman
Subject Re: Using read_stream in index vacuum
Date
Msg-id CAAKRu_bCZJT6yLkExmTmgOa9VK+41jjpi5bVEgsEO2BiNQXZ+w@mail.gmail.com
In response to Re: Using read_stream in index vacuum  (Junwang Zhao <zhjwpku@gmail.com>)
List pgsql-hackers
On Wed, Oct 23, 2024 at 4:29 PM Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>
> > On 23 Oct 2024, at 20:57, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
> >
> > I'll think how to restructure flow there...
>
> OK, I've understood how it should be structured. PFA v5. Sorry for the noise.

I think this would be a bit nicer:

while (BufferIsValid(buf = read_stream_next_buffer(stream, NULL)))
{
    block = btvacuumpage(&vstate, buf);
    if (info->report_progress)
        pgstat_progress_update_param(PROGRESS_SCAN_BLOCKS_DONE, block);
}

Maybe change btvacuumpage() to return the block number to avoid the
extra BufferGetBlockNumber() calls (those add up).

While looking at this, I started to wonder whether it is incorrect that
we do not call pgstat_progress_update_param() for the blocks that
btvacuumpage() backtracks to and reads (on master as well).
btvacuumpage() may actually have scanned more than one block, so...
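
To make the concern concrete, here is a toy model of that loop. It is
only a sketch: every name in it (ToyStream, ToyProgress,
toy_stream_next, toy_vacuum_page, toy_vacuum_scan) is hypothetical and
none of it is PostgreSQL code. It assumes the page routine returns the
block number it was handed, and it models backtracking as a single
extra page read for one chosen block, which is a simplification.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not PostgreSQL types. */
typedef unsigned int ToyBlock;
#define TOY_INVALID_BLOCK ((ToyBlock) 0xFFFFFFFFu)

typedef struct ToyStream
{
    ToyBlock next;      /* next block the stream will hand out */
    ToyBlock nblocks;   /* total number of blocks in the toy index */
} ToyStream;

typedef struct ToyProgress
{
    ToyBlock last_reported; /* value given to the progress counter */
    unsigned pages_read;    /* pages actually read, incl. backtracks */
} ToyProgress;

/* Hand out blocks 0..nblocks-1, then signal end of stream. */
static ToyBlock
toy_stream_next(ToyStream *stream)
{
    assert(stream != NULL);
    if (stream->next >= stream->nblocks)
        return TOY_INVALID_BLOCK;
    return stream->next++;
}

/*
 * Process one block and return its number, so the caller needs no
 * separate lookup.  Assumption: visiting backtrack_from costs exactly
 * one extra read of an earlier page.
 */
static ToyBlock
toy_vacuum_page(ToyBlock blkno, ToyBlock backtrack_from,
                ToyProgress *progress)
{
    progress->pages_read++;
    if (blkno == backtrack_from)
        progress->pages_read++;     /* the backtracked page */
    return blkno;
}

/* Drive the stream; report only the returned block as progress. */
ToyProgress
toy_vacuum_scan(ToyBlock nblocks, ToyBlock backtrack_from)
{
    ToyStream   stream = {0, nblocks};
    ToyProgress progress = {TOY_INVALID_BLOCK, 0};
    ToyBlock    blkno;

    while ((blkno = toy_stream_next(&stream)) != TOY_INVALID_BLOCK)
        progress.last_reported =
            toy_vacuum_page(blkno, backtrack_from, &progress);

    return progress;
}
```

In this toy, toy_vacuum_scan(5, 3) ends with last_reported at 4 while
pages_read is 6: the progress value tracks the scan position, not the
number of pages touched. Whether the real counter should account for
backtracked pages is the open question.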

Unrelated to code review, but btree index vacuum has the same issues
that kept us from committing streaming heap vacuum last release --
interactions between the buffer access strategy ring buffer size and
the larger reads -- one of which is an increase in the number of WAL
syncs and writes required. Thomas mentions it in [1], and [2] is the
thread where he proposes adding vectored writeback to address some of
these issues.

- Melanie

[1] https://www.postgresql.org/message-id/CA%2BhUKGKN3oy0bN_3yv8hd78a4%2BM1tJC9z7mD8%2Bf%2ByA%2BGeoFUwQ%40mail.gmail.com
[2] https://www.postgresql.org/message-id/flat/CA%2BhUKGK1in4FiWtisXZ%2BJo-cNSbWjmBcPww3w3DBM%2BwhJTABXA%40mail.gmail.com


