Re: parallel vacuum comments

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: parallel vacuum comments
Дата
Msg-id 20211104190003.l5547auul46k7d5v@alap3.anarazel.de
обсуждение исходный текст
Ответ на Re: parallel vacuum comments  (Masahiko Sawada <sawada.mshk@gmail.com>)
Ответы Re: parallel vacuum comments  (Masahiko Sawada <sawada.mshk@gmail.com>)
Список pgsql-hackers
Hi,

On 2021-11-01 10:44:34 +0900, Masahiko Sawada wrote:
> On Sun, Oct 31, 2021 at 6:21 AM Andres Freund <andres@anarazel.de> wrote:
> >   But even though we have this space optimized bitmap thing, we actually need
> >   more memory allocated for each index, making this whole thing pointless.
> 
> Right. But is better to change to use booleans?

It seems very clearly better to me. We shouldn't use complicated stuff like

#define SizeOfLVShared (offsetof(LVShared, bitmap) + sizeof(bits8))
#define GetSharedIndStats(s) \
    ((LVSharedIndStats *)((char *)(s) + ((LVShared *)(s))->offset))
#define IndStatsIsNull(s, i) \
    (!(((LVShared *)(s))->bitmap[(i) >> 3] & (1 << ((i) & 0x07))))

when there's reason / benefit.


> > - Imo it's pretty confusing to have functions like
> >   lazy_parallel_vacuum_indexes() (in 13, renamed in 14) that "Perform index
> >   vacuum or index cleanup with parallel workers.", based on
> >   lps->lvshared->for_cleanup.
> 
> Okay. We need to set lps->lvshared->for_cleanup to tell worker do
> either index vacuum or index cleanup. So it might be better to pass
> for_cleanup flag down to the functions in addition to setting
> lps->lvshared->for_cleanup.

Yup.


> > - I don't like some of the new names introduced in 14. E.g.
> >   "do_parallel_processing" is way too generic.
> 
> I listed the function names that probably needs to be renamed from
> that perspecti:
> 
> * do_parallel_processing
> * do_serial_processing_for_unsafe_indexes
> * parallel_process_one_index
> 
> Is there any other function that should be renamed?

parallel_processing_is_safe().

I don't like that there's three different naming patterns for parallel
things. There's do_parallel_*, there's parallel_, and there's
(begin|end|compute)_parallel_*.


> > - On a higher level, a lot of this actually doesn't seem to belong into
> >   vacuumlazy.c, but should be somewhere more generic. Pretty much none of this
> >   code is heap specific.  And vacuumlazy.c is large enough without the parallel
> >   code.
> 
> I don't come with an idea to make them more generic. Could you
> elaborate on that?

To me the the job that the parallel vacuum stuff does isn't really specific to
heap. Any table AM supporting indexes is going to need to do something pretty
much like it (it's calling indexam stuff). Most of the stuff in vacuumlazy.c
is very heap specific - you're not going to be able to reuse lazy_scan_heap()
or such. Before the parallel vacuum stuff, the index specific code in
vacuumlazy.c was fairly limited - but now it's a nontrivial amount of code.

Based on a quick look
  parallel_vacuum_main(), parallel_processing_is_safe(),
  parallel_stats_for_idx(), end_parallel_vacuum(), begin_parallel_vacuum(),
  compute_parallel_vacuum_workers(), parallel_process_one_index(),
  do_serial_processing_for_unsafe_indexes(), do_parallel_processing(),
  do_parallel_vacuum_or_cleanup(), do_parallel_lazy_cleanup_all_indexes(),
  do_parallel_lazy_vacuum_all_indexes(),

don't really belong in vacuumlazy.c. but should be in vacuum.c or a new
file. Of course that requires a bit of work, because of the heavy reliance on
LVRelState, but afaict there's not really an intrinsic need to use that.

Greetings,

Andres Freund



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Peter Eisentraut
Дата:
Сообщение: Re: Time to drop plpython2?
Следующее
От: Peter Geoghegan
Дата:
Сообщение: Re: parallel vacuum comments