Re: [HACKERS] Block level parallel vacuum
| From | Amit Kapila |
|---|---|
| Subject | Re: [HACKERS] Block level parallel vacuum |
| Date | |
| Msg-id | CAA4eK1K+2qucdnyAk-eZ7zOezsyhNz8B6K0bOV_Ah9TouOi8-A@mail.gmail.com |
| In response to | Re: [HACKERS] Block level parallel vacuum (Masahiko Sawada <sawada.mshk@gmail.com>) |
| Responses | Re: [HACKERS] Block level parallel vacuum |
| List | pgsql-hackers |
On Fri, Oct 4, 2019 at 7:34 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> On Fri, Oct 4, 2019 at 2:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>>
>>> I'd also prefer to use at most maintenance_work_mem during parallel
>>> vacuum regardless of the number of parallel workers. This is the current
>>> implementation. In lazy vacuum, maintenance_work_mem is used to
>>> record the item pointers of dead tuples. This is done by the leader
>>> process, and worker processes just refer to them when vacuuming dead
>>> index tuples. Even if the user sets a small maintenance_work_mem,
>>> parallel vacuum would still be helpful because index vacuuming would
>>> still take time. So I thought we should cap the number of parallel
>>> workers by the number of indexes rather than by maintenance_work_mem.
>>>
>>
>> Isn't that true only if we never use maintenance_work_mem during index cleanup? However, I think we do use it during index cleanup; see, for example, ginInsertCleanup. Before reaching any conclusion about what to do about this, we first need to establish whether this is a problem. If I am correct, only some index cleanups (such as for gin indexes) use maintenance_work_mem, so we need to consider that point while designing a solution.
>>
> I got your point. Currently, single-process lazy vacuum could
> consume up to (maintenance_work_mem * 2) memory, because
> we do index cleanup while holding the dead tuple space, as you
> mentioned. And ginInsertCleanup is also called at the beginning of
> ginbulkdelete. In the current parallel lazy vacuum, each parallel vacuum
> worker could consume additional memory, apart from the memory used by
> the heap scan, depending on the implementation of the target index AM.
> Given the current single and parallel vacuum implementations, it would
> be better to control the total amount of memory rather than the number
> of parallel workers. So one approach I came up with is to make all
> vacuum workers use (maintenance_work_mem / # of participants) as
> their new maintenance_work_mem.

Yeah, we can do something like that, but I am not clear whether the current memory usage for Gin indexes is correct. I have started a new thread; let's discuss it there.
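
To make the arithmetic behind the proposal concrete, here is a minimal sketch of the two memory bounds being discussed. It is not PostgreSQL code: the function names are hypothetical, and the model rests on assumptions that the thread does not settle. It assumes that the leader counts as one participant, that the dead tuple space is still sized from the full maintenance_work_mem, and that each participant's index AM (gin, for example) may use up to its whole budget during cleanup.

```python
def per_participant_budget_kb(maintenance_work_mem_kb: int, participants: int) -> int:
    """Hypothetical helper: split one maintenance_work_mem evenly
    across all vacuum participants (assumed to include the leader)."""
    if participants < 1:
        raise ValueError("participants must be >= 1")
    return maintenance_work_mem_kb // participants


def worst_case_kb(maintenance_work_mem_kb: int, participants: int, divide: bool) -> int:
    """Upper bound on memory under the assumptions stated above.

    The dead tuple space is assumed to take one full maintenance_work_mem.
    Each participant's index cleanup is assumed to use its whole budget:
    the full setting when divide is False, the per-participant share
    when divide is True.
    """
    dead_tuple_space_kb = maintenance_work_mem_kb
    if divide:
        index_budget_kb = per_participant_budget_kb(maintenance_work_mem_kb, participants)
    else:
        index_budget_kb = maintenance_work_mem_kb
    return dead_tuple_space_kb + participants * index_budget_kb
```

Under this model, a single process reproduces the (maintenance_work_mem * 2) figure from the quoted text. With maintenance_work_mem at 64MB and four participants, the undivided bound grows to 320MB, while dividing the budget keeps it at 128MB. Whether the dead tuple space should also come out of the divided budget is left open here.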