Re: [HACKERS] Block level parallel vacuum

From: Masahiko Sawada
Subject: Re: [HACKERS] Block level parallel vacuum
Date:
Msg-id: CA+fd4k7Eq928jYsYeNHDJM0TC+qGF8kxG0Ok7r2rErX2=2aANQ@mail.gmail.com
In response to: Re: [HACKERS] Block level parallel vacuum  (Amit Kapila <amit.kapila16@gmail.com>)
Responses: Re: [HACKERS] Block level parallel vacuum
List: pgsql-hackers
On Mon, 4 Nov 2019 at 14:02, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Nov 1, 2019 at 2:21 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I think the two approaches make parallel vacuum workers wait in
> > different ways: in approach (a) the vacuum delay works as if the
> > vacuum were performed by a single process, whereas in approach (b)
> > the vacuum delay works for each worker independently.
> >
> > Suppose that the total number of blocks to vacuum is 10,000, the
> > cost per block is 10, the cost limit is 200 and the sleep time is 5
> > ms. In a single-process vacuum the total sleep time is 2,500ms (=
> > (10,000 * 10 / 200) * 5). Approach (a) is the same, 2,500ms,
> > because all parallel vacuum workers use the shared balance value
> > and a worker sleeps once the balance value exceeds the limit. In
> > approach (b), since the cost limit is divided evenly, each worker's
> > limit is 40 (e.g. with a parallel degree of 5). Assuming each
> > worker processes blocks evenly, the total sleep time of all workers
> > is 12,500ms (= (2,000 * 10 / 40) * 5 * 5). I think that's why we
> > can compute the sleep time of approach (b) by dividing the total
> > value by the number of parallel workers.
> >
> > IOW, approach (b) makes parallel vacuum delay much more than both a
> > normal vacuum and a parallel vacuum with approach (a), even with the
> > same settings. Which behavior do we expect?
> >
>
> Yeah, this is an important thing to decide.  I don't think that the
> conclusion you are drawing is correct, because if that were true then
> the same would apply to the current autovacuum work division, where we
> divide the cost_limit among workers but the cost_delay is the same
> (see autovac_balance_cost).  Basically, if we consider the delay time
> of each worker independently, then it would appear that the parallel
> vacuum delay with approach (b) is more, but that is true only if the
> workers run serially, which is not the case.
>
> > I thought the vacuum delay for parallel vacuum should work as if it
> > were a single-process vacuum, as we did for memory usage. I might be
> > missing something. If we prefer approach (b), I should change the
> > patch so that the leader process divides the cost limit evenly.
> >
>
> I am also not completely sure which approach is better, but I
> slightly lean towards approach (b).

Can we get the same sleep time as approach (b) if we divide the cost
limit by the number of workers and keep the shared cost balance (i.e.
approach (a) with the cost limit divided)? Currently approach (b) seems
better, but I'm concerned that it might unnecessarily delay vacuum if
some indexes are very small or if bulk-deletion of an index does almost
nothing, as with brin.
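
To spell out the arithmetic, here is a rough back-of-the-envelope
sketch. It only uses the example numbers above (10,000 blocks, cost 10
per block, cost limit 200, 5ms delay, 5 workers) and is not code from
the patch:

/*
 * Illustrative sleep-time arithmetic for the example upthread.
 */
#include <stdio.h>

int
main(void)
{
	const double total_cost = 10000 * 10;	/* blocks * cost per block */
	const double limit = 200;				/* cost limit */
	const double delay_ms = 5;				/* sleep time per delay */
	const int	nworkers = 5;

	/* single-process vacuum, and approach (a): shared balance, one limit */
	double		single = total_cost / limit * delay_ms;

	/* approach (b): limit divided per worker, each worker sleeps on its own */
	double		per_worker = (total_cost / nworkers) / (limit / nworkers) * delay_ms;
	double		approach_b = per_worker * nworkers;

	printf("single process / approach (a): %.0f ms\n", single);		/* 2500 ms */
	printf("approach (b), all workers:     %.0f ms\n", approach_b);	/* 12500 ms */
	return 0;
}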

>
>   I think we need input from some other
> people as well.  I will start a separate thread to discuss this and
> see if that helps to get the input from others.

+1

--
Masahiko Sawada  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


