Re: [HACKERS] Block level parallel vacuum

From: Yura Sokolov
Subject: Re: [HACKERS] Block level parallel vacuum
Date:
Msg-id: c820e50d-6097-9980-f024-2adc3c646f3f@gmail.com
In reply to: Re: [HACKERS] Block level parallel vacuum  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses: Re: [HACKERS] Block level parallel vacuum
List: pgsql-hackers
Excuse me for being noisy.

Increasing vacuum's ring buffer speeds vacuum up by as much as 6 times:
https://www.postgresql.org/message-id/flat/20170720190405.GM1769%40tamriel.snowman.net
This is a one-line change.

How much improvement does parallel vacuum give?

31.10.2018 3:23, Masahiko Sawada wrote:
> On Tue, Oct 30, 2018 at 5:30 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>
>> On Tue, Aug 14, 2018 at 9:31 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>>
>>> On Thu, Nov 30, 2017 at 11:09 AM, Michael Paquier
>>> <michael.paquier@gmail.com> wrote:
>>>> On Tue, Oct 24, 2017 at 5:54 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>>>> Yeah, I was thinking the commit is relevant to this issue, but as
>>>>> Amit mentioned, this error is emitted by DROP SCHEMA CASCADE.
>>>>> I haven't found the cause of this issue yet. With the previous
>>>>> version of the patch, autovacuum workers were working with one
>>>>> parallel worker but never dropped relations. So it's possible that
>>>>> the error was not related to the patch, but anyway I'll continue
>>>>> to work on that.
>>>>
>>>> This depends on the extension lock patch from
>>>> https://www.postgresql.org/message-id/flat/CAD21AoCmT3cFQUN4aVvzy5chw7DuzXrJCbrjTU05B+Ss=Gn1LA@mail.gmail.com/
>>>> if I am following correctly. So I propose to mark this patch as
>>>> returned with feedback for now, and come back to it once the root
>>>> problems are addressed. Feel free to correct me if you think that's
>>>> not adapted.
>>>
>>> I've re-designed the parallel vacuum patch. Attached is the latest
>>> version. As discussed so far, this patch depends on the extension
>>> lock patch[1]. However, I think we can discuss the design of
>>> parallel vacuum independently from that patch. That's why I'm
>>> proposing the new patch. In this patch, I restructured and refined
>>> lazy_scan_heap() because it was a single big function, not suitable
>>> for parallelization.
>>>
>>> The parallel vacuum worker processes keep waiting for commands from
>>> the parallel vacuum leader process. Before entering each phase of lazy
>>> vacuum, such as scanning the heap, vacuuming indexes and vacuuming the
>>> heap, the leader process advances all workers' state to the next
>>> state. Vacuum worker processes do the job according to their state and
>>> wait for the next command after finishing. Also, before entering the
>>> next phase, the leader process does some preparation work while the
>>> vacuum workers are sleeping; for example, clearing the shared dead
>>> tuple space before entering the 'scanning heap' phase. The status of
>>> the vacuum workers is stored in a DSM area pointed to by WorkerState
>>> variables, and controlled by the leader process. For the basic design
>>> and performance improvements please refer to my presentation at PGCon
>>> 2018[2].
>>>
>>> The number of parallel vacuum workers is determined according to
>>> either the table size or the PARALLEL option in the VACUUM command.
>>> The maximum number of parallel workers is
>>> max_parallel_maintenance_workers.
>>>
>>> I've separated the code for the vacuum worker process into
>>> backends/commands/vacuumworker.c, and created an
>>> includes/commands/vacuum_internal.h file to declare the definitions
>>> for lazy vacuum.
>>>
>>> For autovacuum, this patch allows the autovacuum worker process to
>>> use the parallel option according to the relation size or the
>>> reloption. Regarding autovacuum delay: since there are no slots for
>>> parallel workers of autovacuum in AutoVacuumShmem, this patch doesn't
>>> support changing the autovacuum delay configuration while vacuum is
>>> running.
>>>
>>
>> Attached rebased version patch to the current HEAD.
>>
>>> Please apply this patch together with the extension lock patch[1]
>>> when testing, as this patch can try to extend visibility map pages
>>> concurrently.
>>>
>>
>> Because the patch leads to performance degradation when bulk-loading
>> into a partitioned table, I think the original proposal, which makes
>> group locking conflict for relation extension locks, is the more
>> realistic approach. So I worked on this with a simple patch instead
>> of [1]. Attached are three patches:
>>
>> * The 0001 patch exposes some static functions such as
>> heap_parallelscan_startblock_init so that the parallel vacuum code
>> can use them.
>> * The 0002 patch makes group locking conflict for relation extension
>> locks.
>> * The 0003 patch adds a parallel option to lazy vacuum.
>>
>> Please review them.
>>
> 
> Oops, forgot to attach patches.
> 
> Regards,
> 
> --
> Masahiko Sawada
> NIPPON TELEGRAPH AND TELEPHONE CORPORATION
> NTT Open Source Software Center
> 


