On Sun, May 1, 2016 at 10:39 PM, McCoy, Shawn <shamccoy@amazon.com> wrote:
> I have been debugging a problem on a 9.3.10 Postgres database cluster with
> over 1200 databases. 10 workers, increased maintenance_work_mem, auto
> vacuum settings to run more frequently than default. What I will notice is
> that autovacuum will run for a week or so and traverse databases as
> expected. I will be able to see that age(datfrozenxid) for all 1200
> databases will stay close to autovacuum_freeze_max_age as desired.
>
> Then, suddenly I will see it get “stuck”. Autovacuum launcher will not
> launch worker processes even though databases start to age past
> autovacuum_freeze_max_age. If I create a list of databases and sort by
> age(datfrozenxid), connect to the database with the oldest and execute a
> simple: "vacuum freeze pg_database;”, autovacuum springs back into action.
>
> It’s never the same database where autovacuum seems to get “stuck”. I’m
> attempting to gather more debugging information, but, also can’t understand
> why simply doing a “vacuum freeze pg_database” breaks up the jam.
>
> Any thoughts?
So when it's stuck, there are no AV worker processes running at all,
for a sustained period of time?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company