BUG #13373: Imminent transaction id wrap around in spite of daily VACUUM

Поиск
Список
Период
Сортировка
От ms@clickware.de
Тема BUG #13373: Imminent transaction id wrap around in spite of daily VACUUM
Дата
Msg-id 20150528225245.15331.65804@wrigleys.postgresql.org
обсуждение исходный текст
Ответы Re: BUG #13373: Imminent transaction id wrap around in spite of daily VACUUM  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-bugs
The following bug has been logged on the website:

Bug reference:      13373
Logged by:          Marc Schablewski
Email address:      ms@clickware.de
PostgreSQL version: 9.3.4
Operating system:   SuSE Enterprise Linux 11.3
Description:

Today one of our servers started complaining about getting close to the
transaction limit and that our database "mydb" should be VACUUMed to prevent
shutdown and transaction ID wrap around, although the database is VACUUMed
every night and autovacuum is enabled.

2015-05-28 01:42:04 CEST ::1(44588) 40015 WARNING:  database "mydb" must be
vacuumed within 11000000 transactions
2015-05-28 01:42:04 CEST ::1(44588) 40015 HINT:  To avoid a database
shutdown, execute a database-wide VACUUM in that database.
        You might also need to commit or roll back old prepared
transactions.

Sadly this condition remained undetected until the automatic shutdown.

Since the docs are a bit unclear on what kind of VACUUM one should run in
this case, I started the database in single user mode and executed a regular
VACUUM and a VACUUM FREEZE. Both commands ran for a while (~45 min) before
returning and printing the same warning message about VACUUMing the
database. Each time the remaining transaction count was reduced by one.

As stated before, the database is VACUUMed once every night and autovacuum
is enabled. We also do not use prepared transaction. So I wonder how the
database got into this state.

While analyzing the problem I noticed some other strange things about the
database that might or might not be connected to our problem. There are
still a lot (~100) of temporary tables left, even after closing all
connections / shutting down the database.

When I use this statement, taken from chapter 23.1.4. of the docs:

    SELECT c.oid::regclass as table_name,
           greatest(age(c.relfrozenxid),age(t.relfrozenxid)) as age
    FROM pg_class c
    LEFT JOIN pg_class t ON c.reltoastrelid = t.oid
    WHERE c.relkind = 'r';

all tables have an "age" of "3", except for most of the temp tables. Their
"age" ranges from a couple of hundreds to millions:

        ----
         1: table_name = "pg_temp_3.temp_table0" (typeid = 2205, len = 4,
typmod = -1, byval = t)
         2: age = "2146483651"  (typeid = 23, len = 4, typmod = -1, byval =
t)
        ----
         1: table_name = "pg_temp_48.temp_table1" (typeid = 2205, len = 4,
typmod = -1, byval = t)
         2: age = "901" (typeid = 23, len = 4, typmod = -1, byval = t)
        ----

The highest "age" is also the "age" of the database "mydb" (statement also
taken from chapter 23.1.4.):

        SELECT datname, age(datfrozenxid) FROM pg_database;

        ----
         1: datname = "template1"       (typeid = 19, len = 64, typmod = -1,
byval = f)
         2: age = "50000001"    (typeid = 23, len = 4, typmod = -1, byval =
t)
        ----
         1: datname = "template0"       (typeid = 19, len = 64, typmod = -1,
byval = f)
         2: age = "2068988610"  (typeid = 23, len = 4, typmod = -1, byval =
t)
        ----
         1: datname = "postgres"        (typeid = 19, len = 64, typmod = -1,
byval = f)
         2: age = "111784253"   (typeid = 23, len = 4, typmod = -1, byval =
t)
        ----
         1: datname = "mydb"     (typeid = 19, len = 64, typmod = -1, byval
= f)
         2: age = "2146483651"  (typeid = 23, len = 4, typmod = -1, byval =
t)
        ----

I have to admit that I am not sure how to interpret these "age" values.

Secondly, there are a lot of files in pg_clog. The oldest ones are going
back to October 2014 and they seem to have been created continuously since
then. The whole directory is about 512MB in size. The docs state that it
should only be about 1/10th of that size with the default setting of
autovacuum_freeze_max_age.

All parameters relating to vacuum or autovacuum are unchanged in our
postgresql.conf, i.e we are using their default values.

The next thing I'll try is to drop all those temporary schemas and tables
and doing another VACUUM afterwards. I suspect one of them might cause the
issue, but I would be thankful for other ideas.

Regards,
    Marc

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: pg_get_functiondef() does not show LEAKPROOF for leakproof functions
Следующее
От: Sandeep Thakkar
Дата:
Сообщение: Re: BUG #13365: pg9.4.0 isn't support windows xp