Database corruption event, unlockable rows, possibly bogus virtual xids? Invalid backend server xid

Поиск

Список

Период

Сортировка

От	Ned Wolpert
Тема	Database corruption event, unlockable rows, possibly bogus virtual xids? Invalid backend server xid
Дата	21 февраля 2013 г. 22:59:53
Msg-id	CAFehBF=ktRXFex1tQOAx-cXUaph-jBXkhuSfO9oqgH_4PtNoOA@mail.gmail.com обсуждение исходный текст
Ответы	Re: Database corruption event, unlockable rows, possibly bogus virtual xids? Invalid backend server xid (Kevin Grittner <kgrittn@ymail.com>) Re: Database corruption event, unlockable rows, possibly bogus virtual xids? Invalid backend server xid (Tom Lane <tgl@sss.pgh.pa.us>)
Список	pgsql-general

Дерево обсуждения

(I originally posted this to pgsql-admin and was pointed to here instead.)

Folks-

  I'm doing a postmortem on a corruption event we had. I have an idea on
what happened, but not sure. I figure I'd share what happened and see if
I'm close to right here.

  Event: Running 9.1.6 with hot-standby, archiving 4 months of wal files,
and even a nightly pg_dump all. 50G database.  Trying to update or delete a
row in a small (21 row, but heavily used table) would lock up completely.
Never finish. Removed all clients, restarted the db instance, no joy. Check
pg_stat_activity, and nothing that wasn't idle.... run the delete, locked
up.

  Ran (SELECT*FROM pg_class JOIN pg_locks ON pg_locks.relation =
pg_class.oid;) with no clients touching this row, fresh restart of the db,
and saw virtualtransactions against this same table where the values would
be -1/nnnnn were nnnnn was a huge number.  Turns out we had about 159
entries from different tables in the database.  Checked hot-standby and, of
course, no locks or anything.  Switched to hot-standby.

  Hot-standby instantly gained these locks, Also noticed that 2 prepared
transactions migrated to the hot-standby. Binary upgraded to 9.1.8, locks
still existed. Ended up executing the one thing we knew would work. Take
the site down, pg_dumpall to fresh instance. Everything is fine.

A little more background: We were running 9.1.4 back when 9.1.6 came out.
We saw there was possible corruption issues and did a binary upgrade and
reindexing. Everything seemed to be fine, but I wonder if we really had
problems back then. We rebuilt the hot-standby after the binary upgrade via
normal restore and wal-file replays.  I should also note that this row that
had the lock on it that would not go away, was created by an app server
that was killed (via -9) since it was non-responsive, and the row 'create
date' (in db and logs) is the exact time the app server was killed.

I was wondering if a) these virtualtransactions that start with '-1/'
indicate a problem, b) if this could have happened from pre 9.1.6
corruption that was fixed in 9.1.6. Or, could this have occurred when we
killed that app server? Or.... am I looking in the wrong place.

I do still have the old data directories so I can start them up and check
out the dataset. Any advice?

Currently running environment: CentOS 5.x
Used the http://yum.postgresql.org/9.1 repo...
Currently using PostgreSQL 9.1.8 on x86_64-unknown-linux-gnu, compiled by
gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-52), 64-bit
but was on 9.1.6 at failure time.
The hardware on both primary and secondary servers are in good condition,
raid5 via a hardware card with no issues.  Each a 2x4x2 (2 hyperthreaded
Xeon E5620 2.4Ghz, 4 cores each, 16 threads total) 32G Ram.

Data from pg_dumpall checked out fine; no data loss occurred that we could
tell. Just this transaction lock and what seemed like invalid backend ids
listed in the virtualtransaction from the pg_locks table/view.
(-1/verylargenumber)






--
Virtually, Ned Wolpert

"Settle thy studies, Faustus, and begin..."   --Marlowe

В списке pgsql-general по дате отправления:

Предыдущее

От: bricklen
Дата: 21 февраля 2013 г., 21:37:27
Сообщение: Re: View to show privileges on views/tables/sequences/foreign tables

Следующее

От: Kevin Grittner
Дата: 21 февраля 2013 г., 23:03:56
Сообщение: Re: Full text and removing dashes from names

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Database corruption event, unlockable rows, possibly bogus virtual xids? Invalid backend server xid

Предыдущее

Следующее