Re: [HACKERS] Block level parallel vacuum
From: Masahiko Sawada
Subject: Re: [HACKERS] Block level parallel vacuum
Date:
Msg-id: CAD21AoAdue0hHLpPFMicRzfe8KX=CyLa257cWmkAfFoo3-YtOA@mail.gmail.com
In reply to: Re: [HACKERS] Block level parallel vacuum (Amit Kapila <amit.kapila16@gmail.com>)
Responses:
  Re: [HACKERS] Block level parallel vacuum (Dilip Kumar <dilipbalaut@gmail.com>)
  Re: [HACKERS] Block level parallel vacuum (Amit Kapila <amit.kapila16@gmail.com>)
  Re: [HACKERS] Block level parallel vacuum (Amit Kapila <amit.kapila16@gmail.com>)
List: pgsql-hackers
On Sat, Sep 21, 2019 at 9:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Jun 7, 2019 at 12:03 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > Since the previous version patch conflicts with current HEAD, I've
> > attached the updated version patches.
> >

Thank you for reviewing this patch!

> Review comments:
> ------------------------------
> *
>    indexes on the relation which further limited by
> +  <xref linkend="guc-max-parallel-workers-maintenance"/>.
>
> /which further/which is further

Fixed.

> *
> + * index vacuuming or index cleanup, we launch parallel worker processes. Once
> + * all indexes are processed the parallel worker processes exit and the leader
> + * process re-initializes the DSM segment while keeping recorded dead tuples.
>
> It is not clear from this comment why it re-initializes the DSM segment
> instead of destroying it once the index work is done by workers. Can
> you elaborate a bit more in the comment?

Added more explanation.

> *
> + * Note that all parallel workers live during one either index vacuuming or
>
> It seems usage of 'one' is not required in the above sentence.

Removed.

> *
> +
> +/*
> + * Compute the number of parallel worker process to request.
>
> /process/processes

Fixed.

> *
> +static int
> +compute_parallel_workers(Relation onerel, int nrequested, int nindexes)
> +{
> +    int parallel_workers = 0;
> +
> +    Assert(nrequested >= 0);
> +
> +    if (nindexes <= 1)
> +        return 0;
>
> I think here, in the beginning, you can also check if
> max_parallel_maintenance_workers is 0, and if so return.

Agreed, fixed.

> *
> In function compute_parallel_workers, don't we want to cap the number
> of workers based on maintenance_work_mem as we do in
> plan_create_index_workers?
>
> The basic point is how we want to treat maintenance_work_mem for
> this feature. Do we want all workers together to use at most
> maintenance_work_mem, or is each worker allowed to use
> maintenance_work_mem? I would prefer the former unless we have a good
> reason to follow the latter strategy.
>
> Accordingly, we might need to update the below paragraph in the docs:
> "Note that parallel utility commands should not consume substantially
> more memory than equivalent non-parallel operations. This strategy
> differs from that of parallel query, where resource limits generally
> apply per worker process. Parallel utility commands treat the
> resource limit <varname>maintenance_work_mem</varname> as a limit to
> be applied to the entire utility command, regardless of the number of
> parallel worker processes."

I'd also prefer to use at most maintenance_work_mem during parallel
vacuum regardless of the number of parallel workers, and that is the
current implementation. In lazy vacuum, maintenance_work_mem is used
to record the item pointers of dead tuples. This is done by the leader
process, and the worker processes just refer to them when vacuuming
dead index tuples. Even if the user sets a small amount of
maintenance_work_mem, parallel vacuum would still be helpful because
index vacuuming takes time in any case. So I thought we should cap the
number of parallel workers by the number of indexes rather than by
maintenance_work_mem.
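To make the capping concrete, a minimal sketch of the shape I have in
mind follows; the early return on max_parallel_maintenance_workers and
the final clamp are the changes discussed above, and the exact layout
is illustrative rather than the verbatim patch hunk:

static int
compute_parallel_workers(Relation onerel, int nrequested, int nindexes)
{
    int     parallel_workers = 0;

    Assert(nrequested >= 0);

    /* Parallel vacuum pays off only with at least two indexes */
    if (nindexes <= 1)
        return 0;

    /* Respect the GUC before doing anything else */
    if (max_parallel_maintenance_workers == 0)
        return 0;

    if (nrequested > 0)
    {
        /* At least one index is taken by the leader process */
        parallel_workers = Min(nrequested, nindexes - 1);
    }
    else
    {
        /* Cap by the number of indexes, not by maintenance_work_mem */
        parallel_workers = nindexes - 1;
    }

    return Min(parallel_workers, max_parallel_maintenance_workers);
}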
> *
> +static int
> +compute_parallel_workers(Relation onerel, int nrequested, int nindexes)
> +{
> +    int parallel_workers = 0;
> +
> +    Assert(nrequested >= 0);
> +
> +    if (nindexes <= 1)
> +        return 0;
> +
> +    if (nrequested > 0)
> +    {
> +        /* At least one index is taken by the leader process */
> +        parallel_workers = Min(nrequested, nindexes - 1);
> +    }
>
> I think here we always allow the leader to participate. It seems to
> me we should have some way to disable leader participation. During the
> development of previous parallel operations, we found it quite handy
> for catching bugs. We might want to mimic what has been done for
> parallel index builds with DISABLE_LEADER_PARTICIPATION.

Added a way to disable leader participation.

> *
> +/*
> + * DSM keys for parallel lazy vacuum. Unlike other parallel execution code,
> + * since we don't need to worry about DSM keys conflicting with plan_node_id
> + * we can use small integers.
> + */
> +#define PARALLEL_VACUUM_KEY_SHARED 1
> +#define PARALLEL_VACUUM_KEY_DEAD_TUPLES 2
> +#define PARALLEL_VACUUM_KEY_QUERY_TEXT 3
>
> I think it would be better if these keys were assigned numbers the way
> we do for other similar operations like create index. See the defines
> below in the code:
> /* Magic numbers for parallel state sharing */
> #define PARALLEL_KEY_BTREE_SHARED UINT64CONST(0xA000000000000001)
>
> This will make the code consistent with other parallel operations.

I skipped this comment according to your previous mail.

> *
> +begin_parallel_vacuum(LVRelStats *vacrelstats, Oid relid, BlockNumber nblocks,
> +                      int nindexes, int nrequested)
> {
> ..
> + est_deadtuples = MAXALIGN(add_size(sizeof(LVDeadTuples),
> ..
> }
>
> I think here you should use SizeOfLVDeadTuples as defined by the patch.

Fixed.

> *
> + keys++;
> +
> + /* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
> + maxtuples = compute_max_dead_tuples(nblocks, true);
> + est_deadtuples = MAXALIGN(add_size(sizeof(LVDeadTuples),
> +                           mul_size(sizeof(ItemPointerData), maxtuples)));
> + shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
> + keys++;
> +
> + shm_toc_estimate_keys(&pcxt->estimator, keys);
> +
> + /* Finally, estimate VACUUM_KEY_QUERY_TEXT space */
> + querylen = strlen(debug_query_string);
> + shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
> + shm_toc_estimate_keys(&pcxt->estimator, 1);
>
> The code style looks inconsistent here. In some cases, you are
> calling shm_toc_estimate_keys immediately after shm_toc_estimate_chunk
> and in other cases, you are accumulating keys. I think it is better
> to call shm_toc_estimate_keys immediately after shm_toc_estimate_chunk
> in all cases.

Fixed. But there is some code that calls shm_toc_estimate_keys for
multiple keys at once, for example in nbtsort.c and parallel.c. What is
the difference?
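For clarity, this is roughly what the fixed estimation code now looks
like, with each shm_toc_estimate_chunk immediately paired with a
shm_toc_estimate_keys call; it is a condensed sketch rather than the
exact patch hunk, and est_shared plus the surrounding declarations are
elided:

/* Estimate space for shared state -- PARALLEL_VACUUM_KEY_SHARED */
shm_toc_estimate_chunk(&pcxt->estimator, est_shared);
shm_toc_estimate_keys(&pcxt->estimator, 1);

/* Estimate space for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
maxtuples = compute_max_dead_tuples(nblocks, true);
est_deadtuples = MAXALIGN(add_size(SizeOfLVDeadTuples,
                                   mul_size(sizeof(ItemPointerData), maxtuples)));
shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
shm_toc_estimate_keys(&pcxt->estimator, 1);

/* Estimate space for the query text -- PARALLEL_VACUUM_KEY_QUERY_TEXT */
querylen = strlen(debug_query_string);
shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
shm_toc_estimate_keys(&pcxt->estimator, 1);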
> *
> +void
> +heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
> {
> ..
> + /* Set debug_query_string for individual workers */
> + sharedquery = shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_QUERY_TEXT, true);
> ..
> }
>
> I think the last parameter in shm_toc_lookup should be false. Is
> there a reason for passing it as true?

My bad, fixed.

> *
> +void
> +heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
> +{
> ..
> + /* Open table */
> + onerel = heap_open(lvshared->relid, ShareUpdateExclusiveLock);
> ..
> }
>
> I don't think it is a good idea to assume the lock mode is
> ShareUpdateExclusiveLock here. Tomorrow, if for some reason the lock
> level for the vacuum process changes, we might forget to update it
> here. I think it is better if we can get this information from the
> master backend.

So did you mean to declare the lock mode for lazy vacuum somewhere as
a global variable and use it in both try_relation_open in the leader
process and relation_open in the worker process? Otherwise we would
end up adding something like shared->lmode = ShareUpdateExclusiveLock
during parallel context initialization, which seems not to resolve
your concern.
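To make that alternative concrete, a rough sketch of carrying the lock
mode in the shared state follows; the lmode field is hypothetical and
does not exist in the current patch:

/* Leader: record the lock mode during parallel context initialization */
shared->relid = relid;
shared->lmode = ShareUpdateExclusiveLock;   /* hypothetical LVShared field */

/* Worker: in heap_parallel_vacuum_main(), avoid hard-coding the mode */
lvshared = (LVShared *) shm_toc_lookup(toc, PARALLEL_VACUUM_KEY_SHARED, false);
onerel = heap_open(lvshared->relid, lvshared->lmode);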
> *
> +end_parallel_vacuum(LVParallelState *lps, Relation *Irel, int nindexes)
> {
> ..
> + /* Shutdown worker processes and destroy the parallel context */
> + WaitForParallelWorkersToFinish(lps->pcxt);
> ..
> }
>
> Do we really need to call WaitForParallelWorkersToFinish here, given
> that it must already have been called in
> lazy_parallel_vacuum_or_cleanup_indexes by this time?

No, removed.

I've attached the updated version patch that incorporates your
comments, excluding some that need more discussion. I'll update it
again after that discussion.

Regards,

--
Masahiko Sawada

Attachments