Re: Using read_stream in index vacuum
От | Melanie Plageman |
---|---|
Тема | Re: Using read_stream in index vacuum |
Дата | |
Msg-id | CAAKRu_aMiTqCB3PvUeCT46xSv6tkeKfqcnfpnvzLx4Yj_b+KDw@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: Using read_stream in index vacuum ("Andrey M. Borodin" <x4mmm@yandex-team.ru>) |
Ответы |
Re: Using read_stream in index vacuum
|
Список | pgsql-hackers |
On Sun, Oct 20, 2024 at 10:19 AM Andrey M. Borodin <x4mmm@yandex-team.ru> wrote: > > > > > On 20 Oct 2024, at 15:16, Junwang Zhao <zhjwpku@gmail.com> wrote: > > > > I'm not sure if I did not express myself correctly, I didn't mean to > > restart the stream, > > I mean we can create a new stream for each outer loop, I attached a > > refactor 0002 > > based on your 0001, correct me if I'm wrong. > > I really like how the code looks with this refactoring. But I think we need some input from Thomas. > Is it OK if we start handful of streams for 1 page at the end of vacuum scan? How costly is to start a new scan? You shouldn't need either loop in btvacuumscan(). For the inner loop: for (; scanblkno < num_pages; scanblkno++) { Buffer buf = read_stream_next_buffer(stream, NULL); btvacuumpage(&vstate, buf); if (info->report_progress) pgstat_progress_update_param(PROGRESS_SCAN_BLOCKS_DONE, scanblkno); } you should just be able to be do something like Buffer buf = read_stream_next_buffer(stream, NULL); if (BufferIsValid(buf)) btvacuumpage(&vstate, buf); Obviously I am eliding some details and clean-up and such. But your read stream callback should be responsible for advancing the block number and thus you shouldn't need to loop like this in btvacuumscan(). The whole point of the read stream callback provided by the caller is that the logic to get the next block should be contained there (read_stream_get_block() is an exception to this). For the outer loop, I feel like we have options. For example, maybe the read stream callback can call RelationGetNumberOfBlocks(). I mean maybe we don't want to have to take a relation extension lock in a callback. So, alternatively, we could add some kind of restartable flag to the read stream API. So, after the callback returns InvalidBlockNumber and the read_stream_next_buffer() returns NULL, we could call something like read_stream_update() or read_stream_restart() or something. We would have updated the BlockRangeReadStreamPrivate->last_exclusive value. In your case it might not be substantially different operationally than making new read streams (since you are not allocating memory for a per-buffer data structure). But, I think the code would read much better than making new read stream objects in a loop. - Melanie
В списке pgsql-hackers по дате отправления: