Re: Testing autovacuum wraparound (including failsafe)

Поиск
Список
Период
Сортировка
От Masahiko Sawada
Тема Re: Testing autovacuum wraparound (including failsafe)
Дата
Msg-id CAD21AoDVhkXp8HjpFO-gp3TgL6tCKcZQNxn04m01VAtcSi-5sA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Testing autovacuum wraparound (including failsafe)  (Masahiko Sawada <sawada.mshk@gmail.com>)
Ответы Re: Testing autovacuum wraparound (including failsafe)  (Ian Lawrence Barwick <barwick@gmail.com>)
Список pgsql-hackers
Hi,

On Tue, Feb 1, 2022 at 11:58 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Fri, Jun 11, 2021 at 10:19 AM Andres Freund <andres@anarazel.de> wrote:
> >
> > Hi,
> >
> > On 2021-06-10 16:42:01 +0300, Anastasia Lubennikova wrote:
> > > Cool. Thank you for working on that!
> > > Could you please share a WIP patch for the $subj? I'd be happy to help with
> > > it.
> >
> > I've attached the current WIP state, which hasn't evolved much since
> > this message... I put the test in src/backend/access/heap/t/001_emergency_vacuum.pl
> > but I'm not sure that's the best place. But I didn't think
> > src/test/recovery is great either.
> >
>
> Thank you for sharing the WIP patch.
>
> Regarding point (1) you mentioned (StartupSUBTRANS() takes a long time
> for zeroing out all pages), how about using single-user mode instead
> of preparing the transaction? That is, after pg_resetwal we check the
> ages of datfrozenxid by executing a query in single-user mode. That
> way, we don’t need to worry about autovacuum concurrently running
> while checking the ages of frozenxids. I’ve attached a PoC patch that
> does the scenario like:
>
> 1. start cluster with autovacuum=off and create tables with a few data
> and make garbage on them
> 2. stop cluster and do pg_resetwal
> 3. start cluster in single-user mode
> 4. check age(datfrozenxid)
> 5. stop cluster
> 6. start cluster and wait for autovacuums to increase template0,
> template1, and postgres datfrozenxids

The above steps are wrong.

I think we can expose a function in an extension used only by this
test in order to set nextXid to a future value with zeroing out
clog/subtrans pages. We don't need to fill all clog/subtrans pages
between oldestActiveXID and nextXid. I've attached a PoC patch for
adding this regression test and am going to register it to the next
CF.

BTW, while testing the emergency situation, I found there is a race
condition where anti-wraparound vacuum isn't invoked with the settings
autovacuum = off, autovacuum_max_workers = 1. AN autovacuum worker
sends a signal to the postmaster after advancing datfrozenxid in
SetTransactionIdLimit(). But with the settings, if the autovacuum
launcher attempts to launch a worker before the autovacuum worker who
has signaled to the postmaster finishes, the launcher exits without
launching a worker due to no free workers. The new launcher won’t be
launched until new XID is generated (and only when new XID % 65536 ==
0). Although autovacuum_max_workers = 1 is not mandatory for this
test, it's easier to verify the order of operations.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Robert Haas
Дата:
Сообщение: Re: Emit extra debug message when executing extension script.
Следующее
От: Yugo NAGATA
Дата:
Сообщение: Add a test for "cannot truncate foreign table"