Re: [HACKERS] Block level parallel vacuum

Поиск
Список
Период
Сортировка
От Masahiko Sawada
Тема Re: [HACKERS] Block level parallel vacuum
Дата
Msg-id CAD21AoBqw_EkWtBCnvvDsr2cyYqhJ5F0g-QQ3YQVwTAsjzsDQw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] Block level parallel vacuum  (Amit Kapila <amit.kapila16@gmail.com>)
Ответы Re: [HACKERS] Block level parallel vacuum  (Masahiko Sawada <sawada.mshk@gmail.com>)
Список pgsql-hackers
On Thu, Dec 20, 2018 at 3:38 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Dec 18, 2018 at 1:29 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > Attached the updated patches. I scaled back the scope of this patch.
> > The patch now includes only feature (a), that is it execute both index
> > vacuum and cleanup index in parallel. It also doesn't include
> > autovacuum support for now.
> >
> > The PARALLEL option works alomst same as before patch. In VACUUM
> > command, we can specify 'PARALLEL n' option where n is the number of
> > parallel workers to request. If the n is omitted the number of
> > parallel worekrs would be # of indexes -1.
> >
>
> I think for now this is okay, but I guess in furture when we make
> heapscans also parallel or maybe allow more than one worker per-index
> vacuum, then this won't hold good. So, I am not sure if below text in
> docs is most appropriate.
>
> +    <term><literal>PARALLEL <replaceable
> class="parameter">N</replaceable></literal></term>
> +    <listitem>
> +     <para>
> +      Execute index vacuum and cleanup index in parallel with
> +      <replaceable class="parameter">N</replaceable> background
> workers. If the parallel
> +      degree <replaceable class="parameter">N</replaceable> is omitted,
> +      <command>VACUUM</command> requests the number of indexes - 1
> processes, which is the
> +      maximum number of parallel vacuum workers since individual
> indexes is processed by
> +      one process. The actual number of parallel vacuum workers may
> be less due to the
> +      setting of <xref linkend="guc-max-parallel-workers-maintenance"/>.
> +      This option can not use with  <literal>FULL</literal> option.
>
> It might be better to use some generic language in docs, something
> like "If the parallel degree N is omitted, then vacuum decides the
> number of workers based on number of indexes on the relation which is
> further limited by max-parallel-workers-maintenance".

Thank you for the review.

I agreed your concern and the text you suggested.

>  I think you
> also need to mention in some way that you consider storage option
> parallel_workers.

Added.

>
> Few assorted comments:
> 1.
> +lazy_begin_parallel_vacuum_index(LVState *lvstate, bool for_cleanup)
> {
> ..
> +
> + LaunchParallelWorkers(lvstate->pcxt);
> +
> + /*
> + * if no workers launched, we vacuum all indexes by the leader process
> + * alone. Since there is hope that we can launch workers in the next
> + * execution time we don't want to end the parallel mode yet.
> + */
> + if (lvstate->pcxt->nworkers_launched == 0)
> + return;
>
> It is quite possible that the workers are not launched because we fail
> to allocate memory, basically when pcxt->nworkers is zero.  I think in
> such cases there is no use for being in parallel mode.  You can even
> detect that before calling lazy_begin_parallel_vacuum_index.

Agreed. we can stop preparation and exit parallel mode if
pcxt->nworkers got 0 after InitializeParallelDSM() .

>
> 2.
> static void
> +lazy_vacuum_all_indexes_for_leader(LVState *lvstate,
> IndexBulkDeleteResult **stats,
> +    LVTidMap *dead_tuples, bool do_parallel,
> +    bool for_cleanup)
> {
> ..
> + if (do_parallel)
> + lazy_begin_parallel_vacuum_index(lvstate, for_cleanup);
> +
> + for (;;)
> + {
> + IndexBulkDeleteResult *r = NULL;
> +
> + /*
> + * Get the next index number to vacuum and set index statistics. In parallel
> + * lazy vacuum, index bulk-deletion results are stored in the shared memory
> + * segment. If it's already updated we use it rather than setting it to NULL.
> + * In single vacuum, we can always use an element of the 'stats'.
> + */
> + if (do_parallel)
> + {
> + idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);
> +
> + if (lvshared->indstats[idx].updated)
> + r = &(lvshared->indstats[idx].stats);
> + }
>
> It is quite possible that we are not able to launch any workers in
> lazy_begin_parallel_vacuum_index, in such cases, we should not use
> parallel mode path, basically as written we can't rely on
> 'do_parallel' flag.
>

Fixed.

Attached new version patch.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: plpgsql plugin - stmt_beg/end is not called for top level blockof statements
Следующее
От: Alvaro Herrera
Дата:
Сообщение: Re: removal of dangling temp tables