Re: [HACKERS] Block level parallel vacuum

Поиск
Список
Период
Сортировка
От Masahiko Sawada
Тема Re: [HACKERS] Block level parallel vacuum
Дата
Msg-id CAD21AoDhAutvKbQ37Btf4taMVbQaOaSvOpxpLgu814T1-OqYGg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] Block level parallel vacuum  (Amit Kapila <amit.kapila16@gmail.com>)
Ответы Re: [HACKERS] Block level parallel vacuum  (Amit Kapila <amit.kapila16@gmail.com>)
Список pgsql-hackers
On Sun, Nov 25, 2018 at 2:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sat, Nov 24, 2018 at 5:47 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > On Tue, Oct 30, 2018 at 2:04 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> >

Thank you for the comment.

> > I could see that you have put a lot of effort on this patch and still
> > we are not able to make much progress mainly I guess because of
> > relation extension lock problem.  I think we can park that problem for
> > some time (as already we have invested quite some time on it), discuss
> > a bit about actual parallel vacuum patch and then come back to it.
> >
>
> Today, I was reading this and previous related thread [1] and it seems
> to me multiple people Andres [2], Simon [3] have pointed out that
> parallelization for index portion is more valuable.  Also, some of the
> results [4] indicate the same.  Now, when there are no indexes,
> parallelizing heap scans also have benefit, but I think in practice we
> will see more cases where the user wants to vacuum tables with
> indexes.  So how about if we break this problem in the following way
> where each piece give the benefit of its own:
> (a) Parallelize index scans wherein the workers will be launched only
> to vacuum indexes.  Only one worker per index will be spawned.
> (b) Parallelize per-index vacuum.  Each index can be vacuumed by
> multiple workers.
> (c) Parallelize heap scans where multiple workers will scan the heap,
> collect dead TIDs and then launch multiple workers for indexes.
>
> I think if we break this problem into multiple patches, it will reduce
> the scope of each patch and help us in making progress.   Now, it's
> been more than 2 years that we are trying to solve this problem, but
> still didn't make much progress.  I understand there are various
> genuine reasons and all of that work will help us in solving all the
> problems in this area.  How about if we first target problem (a) and
> once we are done with that we can see which of (b) or (c) we want to
> do first?

Thank you for suggestion. It seems good to me. We would get a nice
performance scalability even by only (a), and vacuum will get more
powerful by (b) or (c). Also, (a) would not require to resovle the
relation extension lock issue IIUC. I'll change the patch and submit
to the next CF.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


В списке pgsql-hackers по дате отправления:

Предыдущее
От: John Naylor
Дата:
Сообщение: Re: inconsistency and inefficiency in setup_conversion()
Следующее
От: Amit Kapila
Дата:
Сообщение: Re: Undo logs