Re: Parallel heap vacuum
From | Masahiko Sawada |
---|---|
Subject | Re: Parallel heap vacuum |
Date | |
Msg-id | CAD21AoAtarAob7q4aMK044=7nGx9RjAfsUprroZJa4k4+FYZ6w@mail.gmail.com |
In reply to | Re: Parallel heap vacuum (Melanie Plageman <melanieplageman@gmail.com>) |
Responses | Re: Parallel heap vacuum |
List | pgsql-hackers |
On Tue, Aug 26, 2025 at 8:55 AM Melanie Plageman <melanieplageman@gmail.com> wrote:
>
> On Wed, Jul 23, 2025 at 12:06 PM Andres Freund <andres@anarazel.de> wrote:
> >
> > On 2025-07-22 11:44:29 -0700, Masahiko Sawada wrote:
> > > Do you think it makes sense to implement the above idea that we launch
> > > parallel vacuum workers for heap through the same vacuumparallel.c
> > > codebase and maintain the consistent interface with parallel index
> > > vacuuming APIs?
> >
> > Yes, that might make sense. But wiring it up via tableam doesn't make sense.
>
> If you do parallel worker setup below heap_vacuum_rel(), then how are
> you supposed to use those workers to do non-heap table vacuuming?

IIUC, a non-heap table AM can call parallel_vacuum_init() in its
relation_vacuum callback implementation to initialize parallel table
vacuum, parallel index vacuum, or both.

>
> All the code in vacuumparallel.c is invoked from below
> lazy_scan_heap(), so I don't see how having a
> vacuumparallel.c-specific callback struct solves the layering
> violation.

I think the layering problem we discussed is about where the callbacks
are declared; it seems odd to add new table AM callbacks that are
invoked only by another table AM callback. IIUC, even today all the
code in vacuumparallel.c is invoked from vacuumlazy.c. If we think this
design has a layering problem, we can refactor it in a separate patch.

> It seems like parallel index vacuuming setup would have to be done in
> vacuum_rel() if we want to reuse the same parallel workers for the
> table vacuuming and index vacuuming phases and allow for different
> table AMs to vacuum the tables in their own way using these parallel
> workers.

Hmm, let me make sure I understand your idea, as I'm confused. If the
parallel context used for both table vacuuming and index vacuuming is
set up in vacuum_rel(), its DSM would also need to hold the data that
the table AM requires for parallel table vacuuming. For that, table AMs
need to report at least the DSM size they require. How would a table AM
tell that to the parallel vacuum initialization function (e.g.,
parallel_vacuum_init()) called from vacuum_rel()? Also, if we set up
the parallel context in vacuum_rel(), we would somehow need to pass it
to the relation_vacuum table AM callback so that the AM can use it
during its own vacuum operation. Do you mean passing it via
table_relation_vacuum()?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
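
To make the DSM-sizing question above concrete, here is a minimal standalone sketch of one way a table AM could report its shared-memory needs to common setup code through a callback struct. Every name in it (SketchTableVacCallbacks, estimate_dsm, sketch_heap_estimate_dsm, sketch_parallel_vacuum_dsm_size) and every byte count is hypothetical and chosen only for illustration; none of this is PostgreSQL's actual API or the design proposed in the patch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical callback struct: a table AM fills this in so that shared
 * parallel vacuum setup code can ask how much DSM the AM needs.
 */
typedef struct SketchTableVacCallbacks
{
    /* Returns the bytes of shared memory the AM needs for nworkers workers. */
    size_t (*estimate_dsm)(int nworkers);
} SketchTableVacCallbacks;

/*
 * Hypothetical heap AM implementation: a fixed header plus one
 * fixed-size state slot per worker. The sizes are made up.
 */
static size_t
sketch_heap_estimate_dsm(int nworkers)
{
    const size_t header_bytes = 64;
    const size_t per_worker_bytes = 128;

    assert(nworkers >= 0);
    return header_bytes + per_worker_bytes * (size_t) nworkers;
}

/*
 * Stand-in for the sizing step of a shared initialization function:
 * total DSM = the index vacuum portion + whatever the table AM requests.
 * A NULL callback struct means the AM does no parallel table vacuuming.
 */
static size_t
sketch_parallel_vacuum_dsm_size(const SketchTableVacCallbacks *cb,
                                int nworkers, size_t index_part_bytes)
{
    size_t total = index_part_bytes;

    if (cb != NULL && cb->estimate_dsm != NULL)
        total += cb->estimate_dsm(nworkers);
    return total;
}
```

In this sketch the AM passes its callbacks to the setup code, which matches the case where initialization is called from inside the AM's own vacuum routine. If setup moved up to vacuum_rel(), the same struct would have to be obtained from the AM before its vacuum callback runs, which is the open question raised in the message.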