Обсуждение: Autovacuum worker spawning strategy

Поиск
Список
Период
Сортировка

Autovacuum worker spawning strategy

От
Rafael Thofehrn Castro
Дата:
Hello all,

While investigating a problem in a PG14 instance I noticed that autovacuum workers
stop processing other databases when a database has a temporary table with age
older than `autovacuum_freeze_max_age`. To test that I added a custom logline showing
which database the about to spawned autovacuum worker will target. Here are the details:

```
test=# select oid,datname from pg_database;
  oid  |  datname  
-------+-----------
 13757 | postgres
 32850 | test
     1 | template1
 13756 | template0
(4 rows)
```

Here are the loglines under normal circumstances:

```
2022-07-19 11:24:29.406 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=13756)
2022-07-19 11:24:44.406 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=1)
2022-07-19 11:24:59.406 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=32850)
2022-07-19 11:25:14.406 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=13757)
2022-07-19 11:25:29.417 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=13756)
2022-07-19 11:25:44.417 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=1)
2022-07-19 11:25:59.418 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=32850)
2022-07-19 11:26:14.417 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=13757)
2022-07-19 11:26:29.429 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=13756)
2022-07-19 11:26:44.430 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=1)
2022-07-19 11:26:59.432 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=32850)
2022-07-19 11:27:14.429 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=13757)
2022-07-19 11:27:29.442 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=13756)
2022-07-19 11:27:44.441 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=1)
2022-07-19 11:27:59.446 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=32850)
2022-07-19 11:28:14.442 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=13757)
2022-07-19 11:28:29.454 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=13756)
2022-07-19 11:28:44.454 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=1)
2022-07-19 11:28:59.458 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=32850)
2022-07-19 11:29:14.443 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=13757)
2022-07-19 11:29:29.465 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=32850)
2022-07-19 11:29:44.485 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=1)
2022-07-19 11:29:59.499 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=13756)
```

But when I create a temp table and make it older than `autovacuum_freeze_max_age`
I get this:

```
2022-07-19 11:30:14.496 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=13757)
2022-07-19 11:30:29.495 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=13757)
2022-07-19 11:30:44.507 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=13757)
2022-07-19 11:30:59.522 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=13757)
2022-07-19 11:31:14.536 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=13757)
2022-07-19 11:31:29.551 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=13757)
2022-07-19 11:31:44.565 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=13757)
2022-07-19 11:31:59.579 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=13757)
2022-07-19 11:32:14.591 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=13757)
2022-07-19 11:32:29.606 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=13757)
2022-07-19 11:32:44.619 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=13757)
2022-07-19 11:32:59.631 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=13757)
2022-07-19 11:33:14.643 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=13757)
2022-07-19 11:33:29.655 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=13757)
2022-07-19 11:33:44.667 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=13757)
2022-07-19 11:33:59.679 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=13757)
2022-07-19 11:34:14.694 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=13757)
2022-07-19 11:34:29.707 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=13757)
2022-07-19 11:34:44.719 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=13757)
2022-07-19 11:34:59.732 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=13757)
2022-07-19 11:35:14.743 -03 [18627] WARNING:  AUTOVACUUM WORKER SPAWNED (db=13757)
```

This is actually expected behavior judging by the code logic: https://github.com/postgres/postgres/blob/master/src/backend/postmaster/autovacuum.c#L1201

PG prioritizes databases that need to be frozen and since a temporary table can't
be frozen by a process other than the session that created it, that DB will remain
a priority until the table is dropped.

I acknowledge that having a temp table existing for long enough to reach `autovacuum_freeze_max_age`
is a problem itself as the table will never be frozen and if age reaches 2 billion
the instance will shut down. That being said, perhaps there is room for improvement
in the AV worker spawning strategy to avoid leaving other DBs in the dark.

This database where I spotted the problem is from a customer that consumes 100m xacts/hour
and makes extensive uses of temp tables to load data, so that scenario can actually
happen.

Regards,

Rafael Castro — https://www.EnterpriseDB.com/

Re: Autovacuum worker spawning strategy

От
Nathan Bossart
Дата:
On Tue, Jul 19, 2022 at 12:40:06PM -0300, Rafael Thofehrn Castro wrote:
> PG prioritizes databases that need to be frozen and since a temporary table
> can't
> be frozen by a process other than the session that created it, that DB will
> remain
> a priority until the table is dropped.
> 
> I acknowledge that having a temp table existing for long enough to reach
> `autovacuum_freeze_max_age`
> is a problem itself as the table will never be frozen and if age reaches 2
> billion
> the instance will shut down. That being said, perhaps there is room for
> improvement
> in the AV worker spawning strategy to avoid leaving other DBs in the dark.
> 
> This database where I spotted the problem is from a customer that consumes
> 100m xacts/hour
> and makes extensive uses of temp tables to load data, so that scenario can
> actually
> happen.

I wonder if it's worth tracking a ѕeparate datfrozenxid that does not
include stuff that is beyond autovacuum's control, like temporary tables.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com