Обсуждение: Autovac cancellation is broken in v14

Поиск
Список
Период
Сортировка

Autovac cancellation is broken in v14

От
Jeff Janes
Дата:
If I create a large table with "CREATE TABLE ... AS SELECT ... from generate_series(1,3e7)" with no explicit transactions, then once it is done I wait for autovac to kick in, then when I try to build an index on that table (or drop the table) the autovac doesn't go away on its own.

Bisects down to:

commit 5788e258bb26495fab65ff3aa486268d1c50b123
Author: Andres Freund <andres@anarazel.de>
Date:   Wed Jul 15 15:35:07 2020 -0700

    snapshot scalability: Move PGXACT->vacuumFlags to ProcGlobal->vacuumFlags.

Which makes sense given the parts of the code this touches, although I don't understand exactly what the problem is.  The problem persists in HEAD (77c7267c37).

Cheers,

Jeff

Re: Autovac cancellation is broken in v14

От
Jeff Janes
Дата:

On Thu, Aug 27, 2020 at 3:10 PM Jeff Janes <jeff.janes@gmail.com> wrote:
If I create a large table with "CREATE TABLE ... AS SELECT ... from generate_series(1,3e7)" with no explicit transactions, then once it is done I wait for autovac to kick in, then when I try to build an index on that table (or drop the table) the autovac doesn't go away on its own.

After a bit more poking at this, I think we are checking if we ourselves are an autovac process, not doing the intended check of whether the other guy is one.

Where would be a good spot to add a regression test for this?  "isolation_regression" ?
Cheers,

Jeff

Вложения

Re: Autovac cancellation is broken in v14

От
Andres Freund
Дата:
Hi,

On 2020-08-27 16:20:30 -0400, Jeff Janes wrote:
> On Thu, Aug 27, 2020 at 3:10 PM Jeff Janes <jeff.janes@gmail.com> wrote:
> 
> > If I create a large table with "CREATE TABLE ... AS SELECT ... from
> > generate_series(1,3e7)" with no explicit transactions, then once it is done
> > I wait for autovac to kick in, then when I try to build an index on that
> > table (or drop the table) the autovac doesn't go away on its own.
> >
> 
> After a bit more poking at this, I think we are checking if we ourselves
> are an autovac process, not doing the intended check of whether the other
> guy is one.

Ugh, good catch.


> Where would be a good spot to add a regression test for this?
> "isolation_regression" ?

I'm not immediately sure how we could write a good test for this,
particularly not in the isolation tests. We'd basically have to make
sure that a table needs autovacuuming, then sleep for long enough for
autovacuum to have come around, and block autovacuum from making
progress. That latter is doable by holding a pin on a page it needs to
freeze, e.g. using a cursor.  I suspect all of that would at least
require a TAP test, and might still be too fragile.

Other ideas?

Regards,

Andres



Re: Autovac cancellation is broken in v14

От
Andres Freund
Дата:
Hi,

On 2020-08-27 14:35:06 -0700, Andres Freund wrote:
> On 2020-08-27 16:20:30 -0400, Jeff Janes wrote:
> > On Thu, Aug 27, 2020 at 3:10 PM Jeff Janes <jeff.janes@gmail.com> wrote:
> > 
> > > If I create a large table with "CREATE TABLE ... AS SELECT ... from
> > > generate_series(1,3e7)" with no explicit transactions, then once it is done
> > > I wait for autovac to kick in, then when I try to build an index on that
> > > table (or drop the table) the autovac doesn't go away on its own.
> > >
> > 
> > After a bit more poking at this, I think we are checking if we ourselves
> > are an autovac process, not doing the intended check of whether the other
> > guy is one.
> 
> Ugh, good catch.

Pushed the fix.


> > Where would be a good spot to add a regression test for this?
> > "isolation_regression" ?
> 
> I'm not immediately sure how we could write a good test for this,
> particularly not in the isolation tests. We'd basically have to make
> sure that a table needs autovacuuming, then sleep for long enough for
> autovacuum to have come around, and block autovacuum from making
> progress. That latter is doable by holding a pin on a page it needs to
> freeze, e.g. using a cursor.  I suspect all of that would at least
> require a TAP test, and might still be too fragile.

Perhaps the easiest way for this would be to have an option to have
manual VACUUMs be interruptible by other backends. That seems like a
useful option anyway? I'll start a new thread.

Greetings,

Andres Freund