Re: PG12 autovac issues

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: PG12 autovac issues
Дата
Msg-id 20200326025956.zqypnlspv5k5pb42@alap3.anarazel.de
обсуждение исходный текст
Ответ на Re: PG12 autovac issues  (Michael Paquier <michael@paquier.xyz>)
Ответы Re: PG12 autovac issues
Список pgsql-general
Hi,

On 2020-03-26 10:43:36 +0900, Michael Paquier wrote:
> On Wed, Mar 25, 2020 at 10:39:17AM -0500, Justin King wrote:
> > Mar 25 14:48:26 cowtn postgres[39875]: [35298-1] 2020-03-25
> > 14:48:26.329 GMT [39875] DEBUG:  skipping redundant vacuum to prevent
> > wraparound of table "postgres.pg_catalog.pg_tablespace"
> > Mar 25 14:48:26 cowtn postgres[39875]: [35299-1] 2020-03-25
> > 14:48:26.339 GMT [39875] DEBUG:  skipping redundant vacuum to prevent
> > wraparound of table "postgres.pg_catalog.pg_auth_members"
> > Mar 25 14:48:26 cowtn postgres[39875]: [35300-1] 2020-03-25
> > 14:48:26.350 GMT [39875] DEBUG:  skipping redundant vacuum to prevent
> > wraparound of table "postgres.pg_catalog.pg_replication_origin"

FWIW, this kind of thing is why I think the added skipping logic is a
bad idea. Silently skipping things like this (same with the "bogus"
logic in datfrozenxid computation) is dangerous. I think we should
seriously consider backing this change out.

And if not, then we should at least include enough detail in the message
to be able to debug this.


> >  postgres=# SELECT oid::regclass, age(relfrozenxid), relfrozenxid FROM
> > pg_class WHERE relfrozenxid <> 0 ORDER BY age(relfrozenxid) DESC LIMIT
> > 1;
> >     oid    |    age    | relfrozenxid
> > -----------+-----------+--------------
> >  pg_authid | 202793549 |   4284570172
> 
> Ugh.  I think that this is exactly the thing I was suspecting
> previously:
> - The database stats look sane.
> - The relation stats don't look good and visibly are put in such a
> state that only one type of jobs gets triggered (non-aggressive but
> anti-wraparound), which just keep being skipped and the relation stats
> don't get refreshed.  (Such autovacuum jobs should never happen and we
> have some underlying issues that will need separate care).

Hm. Why is this a state that is clearly invalid compared to pg_database?
Seems to precisely match

> postgres=# SELECT datname, age(datfrozenxid), datfrozenxid FROM
> pg_database ORDER BY age(datfrozenxid) DESC LIMIT 1;
>  datname  |    age    | datfrozenxid
> ----------+-----------+--------------
>  postgres | 202773709 |   4284570172

And why should this lead to anti-wraparound vacuums not happening? This
is older than the the cutoff age?

xid 4284570172 having the age of 202 million xids suggests that
ReadNewTransactionId() is approx 192376585. Which comports with the log
saying: oldest xmin: 189591147.


Or are you saying that you conclude that the relcache entry is somehow
out of date? It sure is interesting that all of the tables that hit the
"skipping redundant vacuum" condition are shared tables.

Greetings,

Andres Freund



В списке pgsql-general по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: PG12 autovac issues
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: PG12 autovac issues