Re: Autovacuum to prevent wraparound tries to consume xid

From	Alexander Korotkov
Subject	Re: Autovacuum to prevent wraparound tries to consume xid
Date	May 22, 2016, 15:24:39
Msg-id	CAPpHfdv3aK7XjkhL1EpZSd2E-S2p0hwG5Y=c5dR4O4o-sMfmRg@mail.gmail.com
In response to	Re: Autovacuum to prevent wraparound tries to consume xid (Amit Kapila <amit.kapila16@gmail.com>)
Responses	Re: Autovacuum to prevent wraparound tries to consume xid
List	pgsql-hackers

On Sun, May 22, 2016 at 12:39 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Mar 28, 2016 at 4:35 PM, Alexander Korotkov <a.korotkov@postgrespro.ru> wrote:
Hackers,

One of our customers ran into a near-xid-wraparound situation. The xid counter reached the xidStopLimit value, so no transactions could be executed in normal mode. But what I noticed is strange behaviour of the autovacuum to prevent wraparound. It vacuums tables, updates pg_class and pg_database, but then fails with the "database is not accepting commands to avoid wraparound data loss in database" message. We ended up in a situation where, according to pg_database, the maximum age of any database was less than 200 million, but transactions couldn't be executed because ShmemVariableCache wasn't updated (checked with gdb).

I've reproduced this situation on my laptop as follows:

1) Connect gdb, do "set ShmemVariableCache->nextXid = ShmemVariableCache->xidStopLimit"
2) Stop postgres
3) Make some fake clog: "dd bs=1m if=/dev/zero of=/usr/local/pgsql/data/pg_clog/07FF count=1024"
4) Start postgres

Then I found the same situation as in the customer's database. The autovacuum to prevent wraparound regularly produced the following messages in the log:

ERROR: database is not accepting commands to avoid wraparound data loss in database "template1"
HINT: Stop the postmaster and vacuum that database in single-user mode.
You might also need to commit or roll back old prepared transactions.

Finally, all databases were frozen:

# SELECT datname, age(datfrozenxid) FROM pg_database;
  datname  │   age
───────────┼──────────
 template1 │        0
 template0 │        0
 postgres  │ 50000000
(3 rows)

but no transactions could be executed (ShmemVariableCache wasn't updated).

After some debugging I found that vac_truncate_clog consumes an xid just to produce a warning. I wrote a simple patch which replaces GetCurrentTransactionId() with ShmemVariableCache->nextXid. That completely fixes the situation for me: ShmemVariableCache was successfully updated.

As per your latest patch, you are using ReadNewTransactionId() to get the nextXid, which is then used to check whether any database's frozenxid has already wrapped. Now, isn't the value of nextXID in your patch the same as lastSaneFrozenXid in most cases (I mean, there is only a small window in which some new transaction might have started, advancing ShmemVariableCache->nextXid)? So isn't relying on the lastSaneFrozenXid check sufficient?

Hmm... So, this code already contains a comparison with lastSaneFrozenXid. Thus, the current code compares against both lastSaneFrozenXid and myXID. I see no comment clarifying why this should be so. In my opinion, we can just remove myXID along with its checks. Git shows that Tom Lane committed the lastSaneFrozenXid and lastSaneMinMulti checks, in addition to the myXID check, in 78db307b.
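
As a rough illustration of why the two checks look redundant, here is a small standalone sketch of a circular xid comparison, written in the style of TransactionIdPrecedes(). The names (DemoXid, demo_xid_precedes, demo_frozen_xid_looks_wrapped) are hypothetical, and the sketch assumes that both checks simply ask whether a datfrozenxid appears to be "in the future" relative to a reference xid. It is not the actual vac_truncate_clog() code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 32-bit transaction id type for this sketch. */
typedef uint32_t DemoXid;

/* Xids below this value are special and are compared as plain integers. */
#define DEMO_FIRST_NORMAL_XID ((DemoXid) 3)

/* Circular "a is older than b" comparison, modulo 2^32. */
static bool
demo_xid_precedes(DemoXid a, DemoXid b)
{
    if (a < DEMO_FIRST_NORMAL_XID || b < DEMO_FIRST_NORMAL_XID)
        return a < b;

    return (int32_t) (a - b) < 0;
}

/*
 * A datfrozenxid looks wrapped (or bogus) when the reference xid
 * precedes it, that is, when the frozen xid appears to be in the future.
 */
static bool
demo_frozen_xid_looks_wrapped(DemoXid reference, DemoXid datfrozenxid)
{
    return demo_xid_precedes(reference, datfrozenxid);
}
```

If the two reference values differ only by the few xids assigned in a small window, say 1000 and 1005, the check gives the same answer for any datfrozenxid outside that window.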

Tom, what do you think? Could we remove myXID from vac_truncate_clog()?

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com

The Russian Postgres Company
