Re: [HACKERS] Block level parallel vacuum

From Masahiko Sawada
Subject Re: [HACKERS] Block level parallel vacuum
Date
Msg-id CA+fd4k4T2egAfj5TY9WsYrqDfPamLAM-p8o14WCwKJn8z5jkMg@mail.gmail.com
In response to Re: [HACKERS] Block level parallel vacuum  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: [HACKERS] Block level parallel vacuum
List pgsql-hackers
On Mon, 11 Nov 2019 at 19:29, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Nov 11, 2019 at 12:26 PM Masahiko Sawada
> <masahiko.sawada@2ndquadrant.com> wrote:
> >
> > On Mon, 11 Nov 2019 at 15:06, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Mon, Nov 11, 2019 at 9:57 AM Masahiko Sawada
> > > <masahiko.sawada@2ndquadrant.com> wrote:
> > > >
> > > > Good point. gin and bloom do a certain heavy work during cleanup and
> > > > during bulkdelete as you mentioned. Brin does it during cleanup, and
> > > > hash and gist do it during bulkdelete. There are three types of index
> > > > AM just inside postgres code. An idea I came up with is that we can
> > > > control parallel vacuum and parallel cleanup separately.  That is,
> > > > adding a variable amcanparallelcleanup and we can do parallel cleanup
> > > > on only indexes of which amcanparallelcleanup is true.
> > > >
>
> This is what I mentioned in my email as a second option (whether to
> expose via IndexAM).  I am not sure if we can have a new variable just
> for this.
>
> > > > IndexBulkDelete
> > > > can be stored locally if both amcanparallelvacuum and
> > > > amcanparallelcleanup of an index are false because only the leader
> > > > process deals with such indexes. Otherwise we need to store it in DSM
> > > > as always.
> > > >
> > > IIUC, amcanparallelcleanup will be true for those indexes which do
> > > heavy work during cleanup irrespective of whether bulkdelete is called
> > > or not e.g. gin?
> >
> > Yes, I guess that gin and brin set amcanparallelcleanup to true (gin
> > might set amcanparallelvacuum to true as well).
> >
> > >  If so, along with an amcanparallelcleanup flag, we
> > > need to consider vacrelstats->num_index_scans right? So if
> > > vacrelstats->num_index_scans == 0 then we need to launch parallel
> > > worker for all the indexes that support amcanparallelvacuum and if
> > > vacrelstats->num_index_scans > 0 then only for those that have
> > > amcanparallelcleanup as true.
> >
> > Yes, you're right. But this won't work fine for brin indexes who don't
> > want to participate in parallel vacuum but always want to participate
> > in parallel cleanup.
> >
> > After more thoughts, I think we can have a ternary value: never,
> > always, once. If it's 'never' the index never participates in parallel
> > cleanup. I guess hash indexes use 'never'. Next, if it's 'always' the
> > index always participates regardless of vacrelstats->num_index_scans. I
> > guess gin, brin and bloom use 'always'. Finally if it's 'once' the
> > index participates in parallel cleanup only when it's the first time
> > (that is, vacrelstats->num_index_scans == 0), I guess btree, gist and
> > spgist use 'once'.
> >
>
> I think this 'once' option is confusing especially because it also
> depends on 'num_index_scans' which the IndexAM has no control over.
> It might be that the option name is not good, but I am not sure.
> Another thing is that for brin indexes, we don't want bulkdelete to
> participate in parallelism.

I thought brin should set amcanparallelvacuum to false and
amcanparallelcleanup to 'always'.
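
To make the proposal concrete, here is a rough standalone sketch of how
the boolean amcanparallelvacuum and the ternary amcanparallelcleanup
could drive the leader's decision. It is not code from the patch: the
enum, the struct, the function names, and the per-AM settings for btree
and hash bulkdelete are all assumptions made for illustration.

```c
#include <stdbool.h>

/* Hypothetical ternary for the cleanup phase (names are illustrative). */
typedef enum ParallelCleanupMode
{
    PARALLEL_CLEANUP_NEVER,     /* never join parallel cleanup */
    PARALLEL_CLEANUP_ALWAYS,    /* join regardless of num_index_scans */
    PARALLEL_CLEANUP_ONCE       /* join only when num_index_scans == 0 */
} ParallelCleanupMode;

/* Hypothetical stand-in for the two per-AM settings discussed above. */
typedef struct IndexAmSketch
{
    const char *name;
    bool        amcanparallelvacuum;    /* bulkdelete phase */
    ParallelCleanupMode amcanparallelcleanup;   /* cleanup phase */
} IndexAmSketch;

/* Settings as guessed in this thread; btree/hash bulkdelete are assumed. */
const IndexAmSketch brin_sketch = {"brin", false, PARALLEL_CLEANUP_ALWAYS};
const IndexAmSketch gin_sketch = {"gin", true, PARALLEL_CLEANUP_ALWAYS};
const IndexAmSketch btree_sketch = {"btree", true, PARALLEL_CLEANUP_ONCE};
const IndexAmSketch hash_sketch = {"hash", true, PARALLEL_CLEANUP_NEVER};

/* Should a worker handle this index during bulkdelete? */
bool
bulkdelete_in_parallel(const IndexAmSketch *am)
{
    return am->amcanparallelvacuum;
}

/* Should a worker handle this index during cleanup? */
bool
cleanup_in_parallel(const IndexAmSketch *am, int num_index_scans)
{
    switch (am->amcanparallelcleanup)
    {
        case PARALLEL_CLEANUP_ALWAYS:
            return true;
        case PARALLEL_CLEANUP_ONCE:
            return num_index_scans == 0;
        case PARALLEL_CLEANUP_NEVER:
            break;
    }
    return false;
}
```

With these settings brin is skipped by workers during bulkdelete but is
always handed to them for cleanup, while btree joins parallel cleanup
only if no bulkdelete pass has run.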

> Do we want to have separate variables for
> ambulkdelete and amvacuumcleanup which decides whether the particular
> phase can be done in parallel?

You mean adding variables to ambulkdelete and amvacuumcleanup as
function arguments? If so, isn't it too late to tell the leader whether
the particular phase can be done in parallel?

> Another possibility could be to just
> have one variable (say uint16 amparallelvacuum) which will tell us all
> the options but I don't think that will be a popular approach
> considering all the other methods and variables exposed.  What do you
> think?

Adding only one variable that can have flags would also be a good
idea, instead of having multiple variables for each option. For
instance, the FDW API uses such an interface (see eflags of
BeginForeignScan).
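
As a rough illustration of the single-variable approach, the options
could be packed into bits of one uint16 field. The flag names and helper
functions below are hypothetical and exist only for this sketch; they
are not taken from the patch or from any existing header.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bits for a single uint16 amparallelvacuum field. */
#define AMPV_BULKDELETE         (1 << 0)    /* bulkdelete can run in a worker */
#define AMPV_CLEANUP_ALWAYS     (1 << 1)    /* cleanup can always run in a worker */
#define AMPV_CLEANUP_FIRST      (1 << 2)    /* cleanup in a worker only if no
                                             * bulkdelete pass has happened */

/* Leader-side check for the bulkdelete phase. */
bool
can_parallel_bulkdelete(uint16_t flags)
{
    return (flags & AMPV_BULKDELETE) != 0;
}

/* Leader-side check for the cleanup phase. */
bool
can_parallel_cleanup(uint16_t flags, int num_index_scans)
{
    if (flags & AMPV_CLEANUP_ALWAYS)
        return true;
    if (flags & AMPV_CLEANUP_FIRST)
        return num_index_scans == 0;
    return false;
}
```

Under this sketch brin would set only AMPV_CLEANUP_ALWAYS, gin could set
AMPV_BULKDELETE | AMPV_CLEANUP_ALWAYS, and an AM that sets no bits is
always processed by the leader.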

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


