Re: [HACKERS] Block level parallel vacuum

Поиск
Список
Период
Сортировка
От Amit Kapila
Тема Re: [HACKERS] Block level parallel vacuum
Дата
Msg-id CAA4eK1JEQ2y3uNucNopDjK8pse6xSe5=_oknoWfRQvAF=VqsBA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] Block level parallel vacuum  (Amit Kapila <amit.kapila16@gmail.com>)
Ответы Re: [HACKERS] Block level parallel vacuum  (Dilip Kumar <dilipbalaut@gmail.com>)
Re: [HACKERS] Block level parallel vacuum  (Masahiko Sawada <sawada.mshk@gmail.com>)
Список pgsql-hackers
On Sat, Oct 12, 2019 at 4:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Sat, Oct 12, 2019 at 11:29 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
>
> I see a much bigger problem with the way this patch collects the index
> stats in shared memory.  IIUC, it allocates the shared memory (DSM)
> for all the index stats, in the same way, considering its size as
> IndexBulkDeleteResult.  For the first time, it gets the stats from
> local memory as returned by ambulkdelete/amvacuumcleanup call and then
> copies it in shared memory space.  There onwards, it always updates
> the stats in shared memory by pointing each index stats to that
> memory.  In this scheme, you overlooked the point that an index AM
> could choose to return a larger structure of which
> IndexBulkDeleteResult is just the first field.  This generally
> provides a way for ambulkdelete to communicate additional private data
> to amvacuumcleanup.  We use this idea in the gist index, see how
> gistbulkdelete and gistvacuumcleanup works. The current design won't
> work for such cases.
>

Today, I looked at gistbulkdelete and gistvacuumcleanup closely and I
have a few observations about those which might help us to solve this
problem for gist indexes:
1. Are we using memory context GistBulkDeleteResult->page_set_context?
 It seems to me it is not being used.
2. Each time we perform gistbulkdelete, we always seem to reset the
GistBulkDeleteResult stats, see gistvacuumscan.  So, how will it
accumulate it for the cleanup phase when the vacuum needs to call
gistbulkdelete multiple times because the available space for
dead-tuple is filled.  It seems to me like we only use the stats from
the very last call to gistbulkdelete.
3. Do we really need to give the responsibility of deleting empty
pages (gistvacuum_delete_empty_pages) to gistvacuumcleanup.  Can't we
do it in gistbulkdelte?  I see one advantage of postponing it till the
cleanup phase which is if somehow we can accumulate stats over
multiple calls of gistbulkdelete, but I am not sure if it is feasible.
At least, the way current code works, it seems that there is no
advantage to postpone deleting empty pages till the cleanup phase.

If we avoid postponing deleting empty pages till the cleanup phase,
then we don't have the problem for gist indexes.

This is not directly related to this patch, so we can discuss these
observations in a separate thread as well, but before that, I wanted
to check your opinion to see if this makes sense to you as this will
help us in moving this patch forward.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: nil socket
Дата:
Сообщение: Re: Minimal logical decoding on standbys
Следующее
От: Dilip Kumar
Дата:
Сообщение: Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions