Re: Resetting spilled txn statistics in pg_stat_replication

From Masahiko Sawada
Subject Re: Resetting spilled txn statistics in pg_stat_replication
Date
Msg-id CA+fd4k5DBP+X6swgYi462N-3uBN7WXX8apx9SyUtk9J=cy+9YQ@mail.gmail.com
In response to Re: Resetting spilled txn statistics in pg_stat_replication  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Resetting spilled txn statistics in pg_stat_replication  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Tue, 13 Oct 2020 at 14:53, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Oct 13, 2020 at 11:05 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > Amit Kapila <amit.kapila16@gmail.com> writes:
> > >> It is possible that MAXALIGN stuff is playing a role here and/or the
> > >> background transaction stuff. I think if we go with the idea of
> > >> testing spill_txns and spill_count being positive then the results
> > >> will be stable. I'll write a patch for that.
> >
> > Here's our first failure on a MAXALIGN-8 machine:
> >
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=grison&dt=2020-10-13%2005%3A00%3A08
> >
> > So this is just plain not stable.  It is odd though.  I can
> > easily think of mechanisms that would cause the WAL volume
> > to occasionally be *more* than the "typical" case.  What
> > would cause it to be *less*, if MAXALIGN is ruled out?
> >
>
> The original theory I gave above [1] is that an autovacuum
> transaction gets interleaved. Let me try to explain in a bit more detail.
> Say when transaction T-1 is performing Insert ('INSERT INTO stats_test
> SELECT 'serialize-topbig--1:'||g.i FROM generate_series(1, 5000)
> g(i);') a parallel autovacuum transaction occurs. The problem as seen
> in buildfarm will happen when autovacuum transaction happens after 80%
> or more of the Insert is done.
>
> In such a situation we will start decoding 'Insert' first and need to
> spill multiple times due to the amount of changes (more than threshold
> logical_decoding_work_mem) and then before we encounter Commit of
> transaction that performed Insert (and probably some more changes from
> that transaction) we will encounter a small transaction (autovacuum
> transaction).  The decode of that small transaction will send the
> stats collected till now which will lead to the problem shown in
> buildfarm.

That seems like a plausible scenario.

I think this probably also explains why spill_count varied slightly
while spill_txns was still 1. The spill_count value depends on how
many times the process had spilled the transaction's changes before
encountering the commit of the autovacuum transaction. Since we keep
the spill statistics per reorder buffer, not per transaction, that is
possible.
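
To illustrate the point, here is a rough toy model in Python. It is not
PostgreSQL code, only a sketch of the behaviour described in this thread.
The names decode() and make_wal(), the limit of 400 changes, and the
per-transaction memory accounting are all assumptions made up for the
example. The only things taken from the thread are that the counters
belong to the reorder buffer as a whole and that decoding the commit of
any transaction sends whatever has been collected so far.

```python
def make_wal(total_changes, autovac_after):
    """Build a toy WAL stream: a big insert (xid 1) with a small
    autovacuum-like transaction (xid 2) committing in the middle."""
    wal = [("change", 1)] * autovac_after
    wal += [("change", 2), ("commit", 2)]
    wal += [("change", 1)] * (total_changes - autovac_after)
    wal.append(("commit", 1))
    return wal


def decode(wal, work_mem_limit):
    """Return the (spill_txns, spill_count) snapshot sent at each commit.

    Assumption: memory is counted in changes per transaction, and a
    transaction spills whenever it holds more than work_mem_limit changes.
    """
    buffered = {}         # xid -> changes currently held in memory
    spilled_xids = set()  # transactions that spilled at least once
    spill_count = 0       # spill events, counted for the whole buffer
    reports = []

    for kind, xid in wal:
        if kind == "change":
            buffered[xid] = buffered.get(xid, 0) + 1
            if buffered[xid] > work_mem_limit:
                # Over the limit: spill this transaction's changes.
                spilled_xids.add(xid)
                spill_count += 1
                buffered[xid] = 0
        elif kind == "commit":
            buffered.pop(xid, None)
            # Counters are per buffer, so any commit reports them,
            # including a commit of a transaction that never spilled.
            reports.append((len(spilled_xids), spill_count))
    return reports


# The small transaction commits after 4500 or 4100 of the 5000 changes.
print(decode(make_wal(5000, 4500), 400))  # [(1, 11), (1, 12)]
print(decode(make_wal(5000, 4100), 400))  # [(1, 10), (1, 12)]
```

In both runs the first snapshot already has spill_txns = 1, but its
spill_count depends on where the small transaction's commit falls in the
stream, which is the kind of variation seen in the buildfarm.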

Regards,

-- 
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


