Re: Replication slot stats misgivings

Поиск
Список
Период
Сортировка
От Amit Kapila
Тема Re: Replication slot stats misgivings
Дата
Msg-id CAA4eK1Li_m6WVkHpcf4437+b1kAg4zbWc90q5ynjWD93Xen5Xw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Replication slot stats misgivings  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
On Thu, Apr 29, 2021 at 8:50 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Amit Kapila <amit.kapila16@gmail.com> writes:
> > This is the first test and inserts just one small record, so how it
> > can lead to spill of data. Do you mean to say that may be some
> > background process has written some transaction which leads to a spill
> > of data?
>
> autovacuum, say?
>
> > Yeah, something like this could happen. Another possibility here could
> > be that before the stats collector has processed drop and create
> > messages, we have enquired about the stats which lead to it giving us
> > the old stats. Note, that we don't wait for 'drop' or 'create' message
> > to be delivered. So, there is a possibility of the same. What do you
> > think?
>
> You should take a close look at the stats test in the main regression
> tests.  We had to jump through *high* hoops to get that to be stable,
> and yet it still fails semi-regularly.  This looks like pretty much the
> same thing, and so I'm pessimistically inclined to guess that it will
> never be entirely stable.
>

True, it is possible that we can't make it entirely stable but I would
like to try some more before giving up on this. Otherwise, I guess the
other possibility is to remove some of the latest tests added or
probably change them to be more forgiving. For example, we can change
the currently failing test to not check 'spill*' count and rely on
just 'total*' count which will work even in scenarios we discussed for
this failure but it will reduce the efficiency/completeness of the
test case.

> (At least not before the fabled stats collector rewrite, which may well
> introduce some entirely new set of failure modes.)
>
> Do we really need this test in this form?  Perhaps it could be converted
> to a TAP test that's a bit more forgiving.
>

We have a TAP test for slot stats but there we are checking some
scenarios across the restart. We can surely move these tests also
there but it is not apparent to me how it can create a difference?

-- 
With Regards,
Amit Kapila.



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Peter Geoghegan
Дата:
Сообщение: Re: [BUG]"FailedAssertion" reported in lazy_scan_heap() when running logical replication
Следующее
От: Thomas Munro
Дата:
Сообщение: Re: WIP: WAL prefetch (another approach)