Re: Parallel heap vacuum

From	Masahiko Sawada
Subject	Re: Parallel heap vacuum
Date	August 27, 21:29:49
Msg-id	CAD21AoAtarAob7q4aMK044=7nGx9RjAfsUprroZJa4k4+FYZ6w@mail.gmail.com
In response to	Re: Parallel heap vacuum (Melanie Plageman <melanieplageman@gmail.com>)
Responses	Re: Parallel heap vacuum
List	pgsql-hackers

On Tue, Aug 26, 2025 at 8:55 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> On Wed, Jul 23, 2025 at 12:06 PM Andres Freund <andres@anarazel.de> wrote:
> >
> > On 2025-07-22 11:44:29 -0700, Masahiko Sawada wrote:
> > > Do you think it makes sense to implement the above idea that we launch
> > > parallel vacuum workers for heap through the same vacuumparallel.c
> > > codebase and maintain the consistent interface with parallel index
> > > vacuuming APIs?
> >
> > Yes, that might make sense. But wiring it up via tableam doesn't make sense.
>
> If you do parallel worker setup below heap_vacuum_rel(), then how are
> you supposed to use those workers to do non-heap table vacuuming?

IIUC a non-heap table AM can call parallel_vacuum_init() in its
relation_vacuum table AM callback implementation in order to
initialize parallel table vacuum, parallel index vacuum, or both.
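
As a rough illustration of the point above, here is a toy sketch of a
table AM's relation_vacuum callback choosing for itself what to run in
parallel. It is not code from the patch: every type and function in it
(SketchParallelVacuumState, sketch_parallel_vacuum_init(),
toy_am_relation_vacuum()) is a hypothetical stand-in, and the real
parallel_vacuum_init() has a different signature.

```c
#include <assert.h>
#include <stdbool.h>

/* All names below are hypothetical stand-ins, not PostgreSQL definitions. */
typedef struct SketchParallelVacuumState
{
    bool table_parallel;   /* table scan will use parallel workers */
    bool index_parallel;   /* index vacuum will use parallel workers */
    int  nworkers;         /* workers to launch; 0 means serial vacuum */
} SketchParallelVacuumState;

/*
 * Stand-in for parallel_vacuum_init(). Assumption: the caller states
 * whether it wants parallel table vacuum; index vacuum is parallel
 * whenever there is at least one index and at least one worker.
 */
static SketchParallelVacuumState
sketch_parallel_vacuum_init(bool want_table, int nindexes, int nrequested)
{
    SketchParallelVacuumState pvs;

    assert(nindexes >= 0 && nrequested >= 0);
    pvs.table_parallel = want_table && nrequested > 0;
    pvs.index_parallel = nindexes > 0 && nrequested > 0;
    pvs.nworkers =
        (pvs.table_parallel || pvs.index_parallel) ? nrequested : 0;
    return pvs;
}

/*
 * Toy relation_vacuum callback of a non-heap table AM: the AM itself
 * decides which phases are parallel and then runs its own vacuum logic.
 */
static SketchParallelVacuumState
toy_am_relation_vacuum(int nindexes, int nrequested,
                       bool am_supports_parallel_scan)
{
    SketchParallelVacuumState pvs =
        sketch_parallel_vacuum_init(am_supports_parallel_scan,
                                    nindexes, nrequested);

    /* ... the AM's own table scan and index cleanup would go here ... */
    return pvs;
}
```

With this shape, an AM that has no parallel table scan can still get
parallel index vacuuming simply by passing false for the table part.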

>
> All the code in vacuumparallel.c is invoked from below
> lazy_scan_heap(), so I don't see how having a
> vacuumparallel.c-specific callback struct solves the layering
> violation.

I think the layering problem we discussed is about where the callbacks
are declared; it seems odd to add new table AM callbacks that are
invoked only by another table AM callback. IIUC all the code in
vacuumparallel.c is already invoked from vacuumlazy.c even today. If we
think this design has a problem in terms of function layering, we can
refactor it as a separate patch.

> It seems like parallel index vacuuming setup would have to be done in
> vacuum_rel() if we want to reuse the same parallel workers for the
> table vacuuming and index vacuuming phases and allow for different
> table AMs to vacuum the tables in their own way using these parallel
> workers.

Hmm, let me clarify your idea, as I'm confused. If the parallel
context used for both table vacuuming and index vacuuming is set up in
vacuum_rel(), its DSM would also need to hold the data required by the
table AM to do parallel table vacuuming. To do that, table AMs
somehow need to report at least the necessary DSM size. How do table
AMs tell that to the parallel vacuum initialization function (e.g.,
parallel_vacuum_init()) in vacuum_rel()?
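
To make the question concrete, the kind of interface this would seem
to require looks roughly like the toy sketch below: the common
initialization code asks the table AM for its DSM size through a
callback and adds it to the space needed for index vacuuming. This is
only an illustration under my own assumptions; the struct, the
callback, and all function names are hypothetical and exist neither in
PostgreSQL nor in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback set a table AM would hand to the common code. */
typedef struct SketchTableVacuumCallbacks
{
    /* Bytes of shared memory the table AM needs for nworkers workers. */
    size_t (*estimate_dsm)(int nworkers, void *am_private);
} SketchTableVacuumCallbacks;

/* Toy AM-specific shared state, purely for illustration. */
typedef struct ToySharedHeader
{
    unsigned long nblocks_total;
} ToySharedHeader;

typedef struct ToyWorkerState
{
    unsigned long next_block;
    unsigned long nblocks_done;
} ToyWorkerState;

/* Toy AM: one shared header plus one state slot per worker. */
static size_t
toy_estimate_dsm(int nworkers, void *am_private)
{
    (void) am_private;          /* unused in this toy AM */
    assert(nworkers >= 0);
    return sizeof(ToySharedHeader) +
        (size_t) nworkers * sizeof(ToyWorkerState);
}

static const SketchTableVacuumCallbacks toy_callbacks = {
    toy_estimate_dsm
};

/*
 * Stand-in for the sizing step of a common initialization function:
 * start from the space needed for index vacuuming, then add whatever
 * the table AM reports. A NULL callback set means no parallel table
 * vacuum, so only the index part is counted.
 */
static size_t
sketch_estimate_total_dsm(size_t index_shared_size,
                          const SketchTableVacuumCallbacks *cb,
                          int nworkers, void *am_private)
{
    size_t total = index_shared_size;

    if (cb != NULL && cb->estimate_dsm != NULL)
        total += cb->estimate_dsm(nworkers, am_private);
    return total;
}
```

Even in this toy form, vacuum_rel() would have to obtain the callback
set from the table AM before calling table_relation_vacuum(), which is
the part I'm unsure about.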

Also, if we set up the parallel context in vacuum_rel(), we would
somehow need to pass it to the relation_vacuum table AM callback so
that they can use it during their own vacuum operation. Do you mean to
pass it via table_relation_vacuum()?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
