Re: POC: Parallel processing of indexes in autovacuum
From | Daniil Davydov |
---|---|
Subject | Re: POC: Parallel processing of indexes in autovacuum |
Date | |
Msg-id | CAJDiXgigcF3CMY86oREdQvxUDaUDFihkK9f78rdEyLTLeB0hdA@mail.gmail.com |
In response to | Re: POC: Parallel processing of indexes in autovacuum (Sami Imseih <samimseih@gmail.com>) |
Responses | Re: POC: Parallel processing of indexes in autovacuum |
List | pgsql-hackers |
On Fri, May 2, 2025 at 11:58 PM Sami Imseih <samimseih@gmail.com> wrote:
>
> I am generally -1 on the idea of autovacuum performing parallel
> index vacuum, because I always felt that the parallel option should
> be employed in a targeted manner for a specific table. if you have a bunch
> of large tables, some more important than others, a/c may end
> up using parallel resources on the least important tables and you
> will have to adjust a/v settings per table, etc to get the right table
> to be parallel index vacuumed by a/v.

Hm, this is a good point. I think I should clarify one thing: in practice, a common situation is that a user has one huge table across all databases (with 80+ indexes created on it). Of course, in general there may be several such tables, but we can still adjust the autovac_idx_parallel_min_rows parameter. If a table has a lot of dead tuples => it is actively used => the table is important (?).

Also, if the user really can determine the "importance" of each table, we can provide an appropriate table option. Tables with this option set will be processed in parallel in priority order. What do you think about such an idea?

>
> Also, with the TIDStore improvements for index cleanup, and the practical
> elimination of multi-pass index vacuums, I see this being even less
> convincing as something to add to a/v.

If I understood correctly, the point is that TIDStore can hold so many tuples that a second pass is practically never needed. But the number of passes does not affect the presented optimization in any way. What matters is the large number of indexes that must be processed. Even within a single pass we can have a 40% increase in speed.

>
> Now, If I am going to allocate extra workers to run vacuum in parallel, why
> not just provide more autovacuum workers instead so I can get more tables
> vacuumed within a span of time?

For now, only one process can clean up a table's indexes, so I don't see how increasing the number of a/v workers will help in the situation that I mentioned above. Also, we don't consume additional resources during autovacuum in this patch: the total number of a/v workers is always <= autovacuum_max_workers.

BTW, see the v2 patch attached to this email (bug fixes) :-)

--
Best regards,
Daniil Davydov
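
Editorial note: the two constraints described in the reply above (a dead-tuple threshold for going parallel, and a worker total that never exceeds autovacuum_max_workers) can be illustrated with a small sketch. This is not code from the patch. The function name, its arguments, and the rule that the leader handles one index itself are assumptions made for illustration; only the names autovac_idx_parallel_min_rows and autovacuum_max_workers come from the message.

```python
def plan_parallel_index_workers(dead_tuples, n_indexes, min_rows,
                                max_workers, active_workers):
    """Hypothetical sketch: how many helper workers one autovacuum
    leader could borrow for index processing.

    min_rows       stands in for autovac_idx_parallel_min_rows
    max_workers    stands in for autovacuum_max_workers
    active_workers counts workers already running, leader included
    """
    # Below the dead-tuple threshold, or with a single index,
    # the leader simply processes everything itself.
    if dead_tuples < min_rows or n_indexes < 2:
        return 0

    # Helpers come out of the same pool as regular autovacuum workers,
    # so the total never exceeds max_workers.
    free_slots = max(max_workers - active_workers, 0)

    # Assumption: the leader takes one index, so at most
    # n_indexes - 1 helpers can do useful work.
    return min(free_slots, n_indexes - 1)


if __name__ == "__main__":
    # One huge table with 80 indexes, pool of 3 workers, only the leader active.
    print(plan_parallel_index_workers(1_000_000, 80, 100_000, 3, 1))
```

With a pool of 3 and only the leader running, the sketch grants 2 helpers regardless of the 80 indexes, which is the sense in which no resources beyond the configured pool are used.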