Re: [HACKERS] Block level parallel vacuum

From: Dilip Kumar
Subject: Re: [HACKERS] Block level parallel vacuum
Date:
Msg-id: CAFiTN-voSVoc0JR1sf_AFQeE_F1FfRrH2uNiCqraUPcrcx_+dQ@mail.gmail.com
In reply to: Re: [HACKERS] Block level parallel vacuum  (Masahiko Sawada <masahiko.sawada@2ndquadrant.com>)
Responses: Re: [HACKERS] Block level parallel vacuum  (Masahiko Sawada <masahiko.sawada@2ndquadrant.com>)
List: pgsql-hackers
On Wed, Nov 13, 2019 at 11:39 AM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> On Wed, 13 Nov 2019 at 12:43, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Tue, Nov 12, 2019 at 5:31 PM Masahiko Sawada
> > <masahiko.sawada@2ndquadrant.com> wrote:
> > >
> > > On Tue, 12 Nov 2019 at 20:29, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > >
> > > > On Tue, Nov 12, 2019 at 4:04 PM Masahiko Sawada
> > > > <masahiko.sawada@2ndquadrant.com> wrote:
> > > > >
> > > > > On Mon, 11 Nov 2019 at 17:57, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > > > >
> > > > > > On Tue, Oct 29, 2019 at 12:37 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > > > > I realized that v31-0006 patch doesn't work fine so I've attached the
> > > > > > > updated version patch that also incorporated some comments I got so
> > > > > > > far. Sorry for the inconvenience. I'll apply your 0001 patch and also
> > > > > > > test the total delay time.
> > > > > > >
> > > > > > While reviewing the 0002 patch, I have a doubt about how we are
> > > > > > dividing maintenance_work_mem:
> > > > > >
> > > > > > +prepare_index_statistics(LVShared *lvshared, Relation *Irel, int nindexes)
> > > > > > +{
> > > > > > + /* Compute the new maintenance_work_mem value for index vacuuming */
> > > > > > + lvshared->maintenance_work_mem_worker =
> > > > > > + (nindexes_mwm > 0) ? maintenance_work_mem / nindexes_mwm :
> > > > > > maintenance_work_mem;
> > > > > > +}
> > > > > > Is it fair to just consider the number of indexes which use
> > > > > > maintenance_work_mem?  Or do we need to consider the number of
> > > > > > workers as well?  My point is: suppose there are 10 indexes which
> > > > > > will use maintenance_work_mem but we are launching just 2 workers;
> > > > > > then what is the point of dividing maintenance_work_mem by 10?
> > > > > >
> > > > > > IMHO the calculation should be like this
> > > > > > lvshared->maintenance_work_mem_worker = (nindexes_mwm > 0) ?
> > > > > > maintenance_work_mem / Min(nindexes_mwm, nworkers)  :
> > > > > > maintenance_work_mem;
> > > > > >
> > > > > > Am I missing something?
> > > > >
> > > > > No, I think you're right. On the other hand, I think that dividing it
> > > > > by the number of indexes that will use maintenance_work_mem makes
> > > > > sense when the parallel degree > the number of such indexes. Suppose
> > > > > the table has 2 such indexes and there are 10 workers; then we should
> > > > > divide maintenance_work_mem by 2 rather than 10, because at most 2
> > > > > indexes that use maintenance_work_mem can be processed in parallel at
> > > > > a time.
> > > > >
> > > > Right, that's the reason I suggested dividing by Min(nindexes_mwm, nworkers).
> > >
> > > Thanks! I'll fix it in the next version patch.
> > >
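(To spell out the calculation we seem to be converging on, roughly something
like the sketch below; this is only a sketch using the names from the quoted
patch, with nworkers standing for the number of workers we actually launch:)

    /*
     * Divide maintenance_work_mem by the number of index vacuums that can
     * actually use it at the same time: never more than the number of
     * indexes that use maintenance_work_mem, and never more than the number
     * of launched workers.  E.g. 10 such indexes but only 2 workers ->
     * divide by 2; 2 such indexes and 10 workers -> also divide by 2.
     */
    lvshared->maintenance_work_mem_worker =
        (nindexes_mwm > 0) ?
        maintenance_work_mem / Min(nindexes_mwm, nworkers) :
        maintenance_work_mem;
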
> > One more comment.
> >
> > +lazy_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
> > + int nindexes, IndexBulkDeleteResult **stats,
> > + LVParallelState *lps)
> > +{
> > + ....
> >
> > + if (ParallelVacuumIsActive(lps))
> > + {
> >
> > +
> > + lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
> > + stats, lps);
> > +
> > + }
> > +
> > + for (idx = 0; idx < nindexes; idx++)
> > + {
> > + /*
> > + * Skip indexes that we have already vacuumed during parallel index
> > + * vacuuming.
> > + */
> > + if (ParallelVacuumIsActive(lps) && !IndStatsIsNull(lps->lvshared, idx))
> > + continue;
> > +
> > + lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
> > +   vacrelstats->old_live_tuples);
> > + }
> > +}
> >
> > In this function, if ParallelVacuumIsActive, we perform the parallel
> > vacuum for all the indexes for which parallel vacuum is supported, and
> > once that is over we vacuum the remaining indexes for which parallel
> > vacuum is not supported.  But my question is: inside
> > lazy_parallel_vacuum_or_cleanup_indexes, we wait for all the workers
> > to finish their job and only then start the sequential vacuuming.
> > Shouldn't we start that immediately, as soon as the leader's
> > participation in the parallel vacuum is over?
>
> If we do that, then while the leader process is sequentially vacuuming
> the indexes that don't support parallel vacuum, some workers might
> still be vacuuming other indexes. Isn't that a problem?

I am not sure what could be the problem.

> If it's not a problem,
> I think we can tie the indexes that don't support parallel vacuum to
> the leader and do the parallel index vacuum.

I am not sure whether we can do that.  If we do the parallel vacuum
from the leader for the indexes which don't support the parallel
option, then we will unnecessarily allocate shared memory for those
indexes (index stats).  Moreover, I think it could also cause a
problem in a multi-pass vacuum if we try to copy their stats into the
shared memory.

I think a simple option would be that as soon as the leader's
participation is over, we loop over all the indexes that don't support
parallelism in that phase, and after completing that we wait for the
parallel workers to finish.
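
Roughly, the flow inside lazy_parallel_vacuum_or_cleanup_indexes() that I
have in mind would be something like the sketch below. Only the names that
already appear in the quoted snippet (lazy_vacuum_index, IndStatsIsNull,
lps->lvshared, vacrelstats) are taken from the patch; the pcxt field and the
leader's participation call are placeholders for whatever the patch actually
uses:

    /* Launch workers; they vacuum the indexes that have stats slots in DSM */
    LaunchParallelWorkers(lps->pcxt);

    /* Leader joins the parallel index vacuum as one more participant */
    leader_participate_in_parallel_vacuum(...);   /* placeholder */

    /*
     * The leader's share of the parallel work is done.  Before waiting,
     * vacuum the indexes that don't support parallel vacuum (their stats
     * slots in shared memory are NULL, so no worker will touch them).
     */
    for (idx = 0; idx < nindexes; idx++)
    {
        if (!IndStatsIsNull(lps->lvshared, idx))
            continue;
        lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
                          vacrelstats->old_live_tuples);
    }

    /* Only now wait for the parallel workers to finish */
    WaitForParallelWorkersToFinish(lps->pcxt);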

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


