Re: [HACKERS] Block level parallel vacuum

From: Masahiko Sawada
Subject: Re: [HACKERS] Block level parallel vacuum
Msg-id: CAD21AoARj=e=6_KOZnaR66jRkDmGaVdLcrt33Ua-zMUugKU3mQ@mail.gmail.com
In reply to: Re: [HACKERS] Block level parallel vacuum  (Amit Kapila <amit.kapila16@gmail.com>)
Responses: Re: [HACKERS] Block level parallel vacuum  (Amit Kapila <amit.kapila16@gmail.com>)
List: pgsql-hackers
On Fri, Oct 4, 2019 at 2:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Oct 2, 2019 at 7:29 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>
>> On Sat, Sep 21, 2019 at 9:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>> > *
>> > In function compute_parallel_workers, don't we want to cap the number
>> > of workers based on maintenance_work_mem as we do in
>> > plan_create_index_workers?
>> >
>> > The basic point is how we want to treat maintenance_work_mem for
>> > this feature.  Do we want all workers together to use at most
>> > maintenance_work_mem, or is each worker allowed to use
>> > maintenance_work_mem?  I would prefer the former unless we have a
>> > good reason to follow the latter strategy.
>> >
>> > Accordingly, we might need to update the below paragraph in docs:
>> > "Note that parallel utility commands should not consume substantially
>> > more memory than equivalent non-parallel operations.  This strategy
>> > differs from that of parallel query, where resource limits generally
>> > apply per worker process.  Parallel utility commands treat the
>> > resource limit <varname>maintenance_work_mem</varname> as a limit to
>> > be applied to the entire utility command, regardless of the number of
>> > parallel worker processes."
>>
>> I'd also prefer to use at most maintenance_work_mem during parallel
>> vacuum regardless of the number of parallel workers. This is the
>> current implementation. In lazy vacuum, maintenance_work_mem is used
>> to record the item pointers of dead tuples. This is done by the leader
>> process, and worker processes just refer to them when vacuuming dead
>> index tuples. Even if the user sets a small maintenance_work_mem, a
>> parallel vacuum would still be helpful because index vacuuming still
>> takes time. So I thought we should cap the number of parallel workers
>> by the number of indexes rather than by maintenance_work_mem.
>>
>
> Isn't that true only if we never use maintenance_work_mem during index
> cleanup?  However, I think we are using it during index cleanup; see,
> for example, ginInsertCleanup.  I think before reaching any conclusion
> about what to do about this, first we need to establish whether this is
> a problem.  If I am correct, then only some of the index cleanups (like
> gin index) use maintenance_work_mem, so we need to consider that point
> while designing a solution for this.
>

I got your point. Currently a single-process lazy vacuum could consume
at most (maintenance_work_mem * 2) memory, because we do index cleanup
while still holding the dead tuple space, as you mentioned. And
ginInsertCleanup is also called at the beginning of ginbulkdelete. In
the current parallel lazy vacuum, each parallel vacuum worker could
consume additional memory apart from the memory used by the heap scan,
depending on the implementation of the target index AM. Given the
current single and parallel vacuum implementations, it would be better
to control the total amount of memory rather than the number of
parallel workers. So one approach I came up with is to make each vacuum
worker use (maintenance_work_mem / # of participants) as its
maintenance_work_mem. That might be too small in some cases, but it
doesn't consume more memory than a single lazy vacuum, as long as the
index AM doesn't consume more memory regardless of
maintenance_work_mem. I think it really depends on the implementation
of the index AM.
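
For illustration, the idea would roughly look like this (just a sketch,
not the actual patch; the variable names are made up):

    /*
     * Sketch only: divide maintenance_work_mem evenly among the leader
     * and the launched workers so that the total stays within the
     * single-process limit.  maintenance_work_mem is in kB.
     */
    int     nparticipants = pcxt->nworkers_launched + 1;   /* leader + workers */
    int     worker_mwm = maintenance_work_mem / nparticipants;

    /* don't let the per-participant share drop below a sane floor, say 1MB */
    worker_mwm = Max(worker_mwm, 1024);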

>>
>> > *
>> > + keys++;
>> > +
>> > + /* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
>> > + maxtuples = compute_max_dead_tuples(nblocks, true);
>> > + est_deadtuples = MAXALIGN(add_size(sizeof(LVDeadTuples),
>> > +    mul_size(sizeof(ItemPointerData), maxtuples)));
>> > + shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
>> > + keys++;
>> > +
>> > + shm_toc_estimate_keys(&pcxt->estimator, keys);
>> > +
>> > + /* Finally, estimate VACUUM_KEY_QUERY_TEXT space */
>> > + querylen = strlen(debug_query_string);
>> > + shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
>> > + shm_toc_estimate_keys(&pcxt->estimator, 1);
>> >
>> > The code style looks inconsistent here.  In some cases, you are
>> > calling shm_toc_estimate_keys immediately after shm_toc_estimate_chunk
>> > and in other cases, you are accumulating keys.  I think it is better
>> > to call shm_toc_estimate_keys immediately after shm_toc_estimate_chunk
>> > in all cases.
>>
>> Fixed. But there is some code that calls shm_toc_estimate_keys for
>> multiple keys at once, for example in nbtsort.c and parallel.c. What
>> is the difference?
>>
>
> We can do it either way, depending on the situation.  For example, in
> nbtsort.c, there is an if check based on which the 'number of keys' can
> vary.  I think here we should try to write it in a way that does not
> confuse the reader about why it is done in a particular way.  This is
> the reason I told you to be consistent.

Understood. Thank you for the explanation!
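
To make it concrete, the consistent pattern looks roughly like this (a
sketch based on the snippet quoted above, not the exact patch hunk):

    /* Estimate size for dead tuples -- PARALLEL_VACUUM_KEY_DEAD_TUPLES */
    maxtuples = compute_max_dead_tuples(nblocks, true);
    est_deadtuples = MAXALIGN(add_size(sizeof(LVDeadTuples),
                                       mul_size(sizeof(ItemPointerData), maxtuples)));
    shm_toc_estimate_chunk(&pcxt->estimator, est_deadtuples);
    shm_toc_estimate_keys(&pcxt->estimator, 1);

    /* Estimate size for the query text -- PARALLEL_VACUUM_KEY_QUERY_TEXT */
    querylen = strlen(debug_query_string);
    shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
    shm_toc_estimate_keys(&pcxt->estimator, 1);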

>
>>
>> >
>> > *
>> > +void
>> > +heap_parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
>> > +{
>> > ..
>> > + /* Open table */
>> > + onerel = heap_open(lvshared->relid, ShareUpdateExclusiveLock);
>> > ..
>> > }
>> >
>> > I don't think it is a good idea to assume the lock mode as
>> > ShareUpdateExclusiveLock here.  Tomorrow, if due to some reason there
>> > is a change in lock level for the vacuum process, we might forget to
>> > update it here.  I think it is better if we can get this information
>> > from the master backend.
>>
>> So did you mean to declare the lock mode for lazy vacuum somewhere as
>> a global variable and use it in both try_relation_open in the leader
>> process and relation_open in the worker process? Otherwise we would
>> end up adding something like shared->lmode =
>> ShareUpdateExclusiveLock during parallel context initialization, which
>> seems not to resolve your concern.
>>
>
> I was thinking that we could find a way to pass the lock mode we used
> in vacuum_rel, but I guess we would need to pass it through multiple
> functions, which will be a bit inconvenient.  OTOH, today, I checked
> nbtsort.c (_bt_parallel_build_main) and found that there, too, we are
> using it directly instead of passing it from the master backend.  I
> think we can leave it as you have in the patch, but add a comment on
> why it is okay to use that lock mode?

Yeah, agreed.
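
Something along these lines in heap_parallel_vacuum_main, for example
(just a draft of the comment, not the final wording):

    /*
     * Open the target relation.  This must match the lock mode the leader
     * took in vacuum_rel(); parallel vacuum is performed only for vacuums
     * running under ShareUpdateExclusiveLock, so it is okay to hard-code
     * it here (as _bt_parallel_build_main does for its lock level) rather
     * than passing it from the leader.
     */
    onerel = heap_open(lvshared->relid, ShareUpdateExclusiveLock);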

Regards,

--
Masahiko Sawada


