Re: [HACKERS] Block level parallel vacuum

From: Masahiko Sawada
Subject: Re: [HACKERS] Block level parallel vacuum
Msg-id: CA+fd4k6nEbxgXHO-wuUKgixPNcrDho0vY-AR1571-OCOdovd-A@mail.gmail.com
In reply to: Re: [HACKERS] Block level parallel vacuum  (Amit Kapila <amit.kapila16@gmail.com>)
Responses: Re: [HACKERS] Block level parallel vacuum  (Amit Kapila <amit.kapila16@gmail.com>)
           Re: [HACKERS] Block level parallel vacuum  (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers
On Wed, 18 Dec 2019 at 19:06, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Dec 18, 2019 at 12:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Dec 18, 2019 at 11:46 AM Masahiko Sawada
> > <masahiko.sawada@2ndquadrant.com> wrote:
> > >
> > > On Wed, 18 Dec 2019 at 15:03, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > I was analyzing your changes related to ReinitializeParallelDSM() and
> > > > it seems like we might launch more workers than necessary for the
> > > > bulkdelete phase.  While creating a parallel context, we used the
> > > > maximum of "workers required for the bulkdelete phase" and "workers
> > > > required for cleanup", but now if the number of workers required for
> > > > the bulkdelete phase is less than for the cleanup phase (as mentioned
> > > > by you in one example), then we would launch more workers than needed
> > > > for the bulkdelete phase.
> > >
> > > Good catch. Currently, when creating a parallel context, the number of
> > > workers passed to CreateParallelContext() is set not only to
> > > pcxt->nworkers but also to pcxt->nworkers_to_launch. We would need to
> > > specify the number of workers to actually launch either when creating
> > > the parallel context or after it is created. Alternatively, we could
> > > call ReinitializeParallelDSM() even the first time we run index vacuum.
> > >
> >
> > How about just having a ReinitializeParallelWorkers() which, for now,
> > can be called only from vacuum, even for the first time, before
> > launching the workers?
> >
>
> See the attached for what I have in mind.  A few other comments:
>
> 1.
> + shared->disable_delay = (params->options & VACOPT_FAST);
>
> This should be part of the third patch.
>
> 2.
> +lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
> + LVRelStats *vacrelstats, LVParallelState *lps,
> + int nindexes)
> {
> ..
> ..
> + /* Cap by the worker we computed at the beginning of parallel lazy vacuum */
> + nworkers = Min(nworkers, lps->pcxt->nworkers);
> ..
> }
>
> This should be an Assert.  In no case can the computed number of
> workers be more than what we have in the context.
>
> 3.
> + if (((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0) ||
> + ((vacoptions & VACUUM_OPTION_PARALLEL_CLEANUP) != 0))
> + nindexes_parallel_cleanup++;
>
> I think the second condition should be VACUUM_OPTION_PARALLEL_COND_CLEANUP.
>
> I have fixed the above comments, and some I gave earlier [1], in the
> attached patch.  The attached patch is a diff on top of
> v36-0002-Add-parallel-option-to-VACUUM-command.

Thank you!
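
For reference, here is a minimal sketch of what I imagine
ReinitializeParallelWorkers() could look like, assuming it lives next
to ReinitializeParallelDSM() in access/transam/parallel.c (the version
in your attached patch may differ):

void
ReinitializeParallelWorkers(ParallelContext *pcxt, int nworkers_to_launch)
{
    /*
     * A caller can only ask to launch fewer workers than the context
     * was created with, never more than pcxt->nworkers.
     */
    Assert(pcxt->nworkers >= nworkers_to_launch);
    pcxt->nworkers_to_launch = nworkers_to_launch;
}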

- /* Cap by the worker we computed at the beginning of parallel lazy vacuum */
- nworkers = Min(nworkers, lps->pcxt->nworkers);
+ /*
+ * The number of workers required for parallel vacuum phase must be less
+ * than the number of workers with which parallel context is initialized.
+ */
+ Assert(lps->pcxt->nworkers >= nworkers);

Regarding the above change in your patch, I think we still need to cap
the number of workers by lps->pcxt->nworkers, because the number of
workers computed from lps->nindexes_parallel_XXX can be larger than the
number determined when creating the parallel context, for example when
max_parallel_maintenance_workers is smaller than the number of indexes
that can be vacuumed in parallel in the bulkdelete phase.
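
In other words, something like the following sketch (ignoring the
conditional-cleanup case; the for_cleanup and nindexes_parallel_* names
follow the patch):

/* Determine the number of indexes processable in this phase */
nworkers = lps->lvshared->for_cleanup ?
    lps->nindexes_parallel_cleanup :
    lps->nindexes_parallel_bulkdel;

/* The leader process will also vacuum one index itself */
nworkers--;

/*
 * Cap by the number of workers the parallel context was initialized
 * with; max_parallel_maintenance_workers may have limited that below
 * the per-phase computation above.
 */
nworkers = Min(nworkers, lps->pcxt->nworkers);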

>
> Few other comments which I have not fixed:
>
> 4.
> + if (Irel[i]->rd_indam->amusemaintenanceworkmem)
> + nindexes_mwm++;
> +
> + /* Skip indexes that don't participate parallel index vacuum */
> + if (vacoptions == VACUUM_OPTION_NO_PARALLEL ||
> + RelationGetNumberOfBlocks(Irel[i]) < min_parallel_index_scan_size)
> + continue;
>
> Won't we need to count the indexes that use maintenance_work_mem only
> among the indexes that can participate in a parallel vacuum?  If so,
> the above checks need to be reversed.

You're right. Fixed.
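
With the checks reversed, the hunk now reads roughly like this (a
sketch based on the quoted code):

/* Skip indexes that don't participate in parallel index vacuum */
if (vacoptions == VACUUM_OPTION_NO_PARALLEL ||
    RelationGetNumberOfBlocks(Irel[i]) < min_parallel_index_scan_size)
    continue;

/* Count maintenance_work_mem users only among participating indexes */
if (Irel[i]->rd_indam->amusemaintenanceworkmem)
    nindexes_mwm++;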

>
> 5.
> /*
> + * Remember indexes that can participate parallel index vacuum and use
> + * it for index statistics initialization on DSM because the index
> + * size can get bigger during vacuum.
> + */
> + can_parallel_vacuum[i] = true;
>
> I am not able to understand the second part of the comment ("because
> the index size can get bigger during vacuum.").  What is its
> relevance?

I meant that an index can get bigger even while vacuum is running. So
we need to check the index sizes and decide which indexes participate
in parallel index vacuum in one place, up front.
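
That is, the participation decision is made once, up front, roughly
like this (a sketch; amparallelvacuumoptions follows the patch):

for (i = 0; i < nindexes; i++)
{
    uint8    vacoptions = Irel[i]->rd_indam->amparallelvacuumoptions;

    /*
     * Decide here, once, whether this index participates; rechecking
     * RelationGetNumberOfBlocks() later could give a different answer
     * because the index can grow during vacuum.
     */
    can_parallel_vacuum[i] =
        (vacoptions != VACUUM_OPTION_NO_PARALLEL &&
         RelationGetNumberOfBlocks(Irel[i]) >= min_parallel_index_scan_size);
}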

>
> 6.
> +/*
> + * Vacuum or cleanup indexes that can be processed by only the leader process
> + * because these indexes don't support parallel operation at that phase.
> + * Therefore this function must be called by the leader process.
> + */
> +static void
> +vacuum_indexes_leader(Relation *Irel, int nindexes,
> IndexBulkDeleteResult **stats,
> +   LVRelStats *vacrelstats, LVParallelState *lps)
> {
> ..
>
> Why have you changed the order of the nindexes parameter?  I think in
> the previous patch it was the last parameter, and that seems to be a
> better place for it.

Since some existing code places nindexes right after *Irel, I thought it
would be more understandable, but I'm also fine with the previous order.

> Also, I think after the latest modifications, you can remove the
> second sentence in the above comment ("Therefore this function must be
> called by the leader process.").

Fixed.

>
> 7.
> + for (i = 0; i < nindexes; i++)
> + {
> + bool leader_only = (get_indstats(lps->lvshared, i) == NULL ||
> +    skip_parallel_vacuum_index(Irel[i], lps->lvshared));
> +
> + /* Skip the indexes that can be processed by parallel workers */
> + if (!leader_only)
> + continue;
>
> It is better to name this variable skip_index or something like that.

Fixed.
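
The loop now reads like this (a sketch; the call that actually vacuums
the index in the leader process is elided):

for (i = 0; i < nindexes; i++)
{
    bool    skip_index = (get_indstats(lps->lvshared, i) == NULL ||
                          skip_parallel_vacuum_index(Irel[i],
                                                     lps->lvshared));

    /* Skip the indexes that can be processed by parallel workers */
    if (!skip_index)
        continue;

    /* This index is skipped by the workers, so vacuum it here */
}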

Attached is the updated version of the patch. It incorporates the above
comments as well as the comments from Mahendra. I also fixed one bug in
determining which indexes are vacuumed in parallel, based on their
options and size. Please review it.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments
