Re: [HACKERS] Block level parallel vacuum

Поиск
Список
Период
Сортировка
От Masahiko Sawada
Тема Re: [HACKERS] Block level parallel vacuum
Дата
Msg-id CAD21AoDS7K-J-aPsTBxftOfzP4=8rGp_AM2uZKFjG-BFotvV0g@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] Block level parallel vacuum  (Haribabu Kommi <kommi.haribabu@gmail.com>)
Ответы Re: [HACKERS] Block level parallel vacuum
Re: [HACKERS] Block level parallel vacuum
Re: [HACKERS] Block level parallel vacuum
Список pgsql-hackers
On Wed, Jan 30, 2019 at 2:06 AM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>
>
> On Thu, Jan 24, 2019 at 1:16 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>
>>
>> Attached the latest patches.
>
>
> Thanks for the updated patches.
> Some more code review comments.
>

Thank you!

> +         started by a single utility command.  Currently, the parallel
> +         utility commands that support the use of parallel workers are
> +         <command>CREATE INDEX</command> and <command>VACUUM</command>
> +         without <literal>FULL</literal> option, and only when building
> +         a B-tree index.  Parallel workers are taken from the pool of
>
>
> I feel the above sentence may not give the proper picture, how about the
> adding following modification?
>
> <command>CREATE INDEX</command> only when building a B-tree index
> and <command>VACUUM</command> without <literal>FULL</literal> option.
>
>

Agreed.

>
> + * parallel vacuum, we perform both index vacuum and index cleanup in parallel.
> + * Individual indexes is processed by one vacuum process. At beginning of
>
> How about vacuum index and cleanup index similar like other places?
>
>
> + * memory space for dead tuples. When starting either index vacuum or cleanup
> + * vacuum, we launch parallel worker processes. Once all indexes are processed
>
> same here as well?
>
>
> + * Before starting parallel index vacuum and parallel cleanup index we launch
> + * parallel workers. All parallel workers will exit after processed all indexes
>
> parallel vacuum index and parallel cleanup index?
>
>

ISTM we're using like "index vacuuming", "index cleanup" and "FSM
vacuming" in vacuumlazy.c so maybe "parallel index vacuuming" and
"parallel index cleanup" would be better?

> + /*
> + * If there is already-updated result in the shared memory we
> + * use it. Otherwise we pass NULL to index AMs and copy the
> + * result to the shared memory segment.
> + */
> + if (lvshared->indstats[idx].updated)
> + result = &(lvshared->indstats[idx].stats);
>
> I didn't really find a need of the flag to differentiate the stats pointer from
> first run to second run? I don't see any problem in passing directing the stats
> and the same stats are updated in the worker side and leader side. Anyway no two
> processes will do the index vacuum at same time. Am I missing something?
>
> Even if this flag is to identify whether the stats are updated or not before
> writing them, I don't see a need of it compared to normal vacuum.
>

The passing stats = NULL to amvacuumcleanup and ambulkdelete means the
first time execution. For example, btvacuumcleanup skips cleanup if
it's not NULL.In the normal vacuum we pass NULL to ambulkdelete or
amvacuumcleanup when the first time calling. And they store the result
stats to the memory allocated int the local memory. Therefore in the
parallel vacuum I think that both worker and leader need to move it to
the shared memory and mark it as updated as different worker could
vacuum different indexes at the next time.

>
> + * Enter the parallel mode, allocate and initialize a DSM segment. Return
> + * the memory space for storing dead tuples or NULL if no workers are prepared.
> + */
>
> + pcxt = CreateParallelContext("postgres", "heap_parallel_vacuum_main",
> + request, true);
>
> But we are passing as serializable_okay flag as true, means it doesn't return
> NULL. Is it expected?
>
>

I think you're right. Since the request never be 0 and
serializable_okey is true it should not return NULL. Will fix.

> + initStringInfo(&buf);
> + appendStringInfo(&buf,
> + ngettext("launched %d parallel vacuum worker %s (planned: %d",
> +   "launched %d parallel vacuum workers %s (planned: %d",
> +   lvstate->pcxt->nworkers_launched),
> + lvstate->pcxt->nworkers_launched,
> + for_cleanup ? "for index cleanup" : "for index vacuum",
> + lvstate->pcxt->nworkers);
> + if (lvstate->options.nworkers > 0)
> + appendStringInfo(&buf, ", requested %d", lvstate->options.nworkers);
>
> what is the difference between planned workers and requested workers, aren't both
> are same?

The request is the parallel degree that is specified explicitly by
user whereas the planned is the actual number we planned based on the
number of indexes the table has. For example, if we do like 'VACUUM
(PARALLEL 3000) tbl' where the tbl has 4 indexes, the request is 3000
and the planned is 4. Also if max_parallel_maintenance_workers is 2
the planned is 2.

>
>
> - COMPARE_SCALAR_FIELD(options);
> - COMPARE_NODE_FIELD(rels);
> + if (a->options.flags != b->options.flags)
> + return false;
> + if (a->options.nworkers != b->options.nworkers)
> + return false;
>
> Options is changed from SCALAR to check, but why the rels check is removed?
> The options is changed from int to a structure so using SCALAR may not work
> in other function like _copyVacuumStmt and etc?

Agreed and will fix.

>
> +typedef struct VacuumOptions
> +{
> + VacuumFlag flags; /* OR of VacuumFlag */
> + int nworkers; /* # of parallel vacuum workers */
> +} VacuumOptions;
>
>
> Do we need to add NodeTag for the above structure? Because this structure is
> part of VacuumStmt structure.

Yes, I will add it.

>
>
> +        <application>vacuumdb</application> will require background workers,
> +        so make sure your <xref linkend="guc-max-parallel-workers-maintenance"/>
> +        setting is more than one.
>
> removing vacuumdb and changing it as "This option will ..."?
>
Agreed.

> I will continue the testing of this patch and share the details.
>

Thank you. I'll submit the updated patch set.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Stas Kelvich
Дата:
Сообщение: XLogInsert() of dangling pointer while logging replica identity
Следующее
От: Robert Haas
Дата:
Сообщение: Re: Delay locking partitions during query execution