Backends stalled in 'startup' state: index corruption

Поиск
Список
Период
Сортировка
От Greg Sabino Mullane
Тема Backends stalled in 'startup' state: index corruption
Дата
Msg-id 20120524173325.GA5533@tinybird.home
обсуждение исходный текст
Ответы Re: Backends stalled in 'startup' state: index corruption  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
Yesterday I had a client that experienced a sudden high load on
one of their servers (8.3.5 - yes, I know. Those of you with
clients will understand). When I checked, almost all connections
were in a "startup" state, very similar to this thread:

http://postgresql.1045698.n5.nabble.com/9-1-3-backends-getting-stuck-in-startup-td5670712.html

Running a strace showed a lot of semop activity, and the logs showed a
successful connection, then a 5 minute plus wait before a query was issued.
So obviously, blocking on something. Unlike the thread above, I *did* find
problems in the system catalogs. For example, both pg_class and pg_index
gave warnings like this for every index during a VACUUM FULL
VERBOSE tablename:

WARNING: index "pg_class_relname_nsp_index" contains 7712 row versions, but table contains 9471 row versions
HINT:  Rebuild the index with REINDEX.

A REINDEX did not solve the problem (initially), as a REINDEX followed
by a VAC showed the same warning and hint.

The next step was dropping to standalone mode, but before that could
be done, the REINDEXes fixed the problem (no warnings, no stalled
connections). So my questions are:

* Why would a REINDEX not fix the problem as the hint suggested?
Why would it then start "working"?

* What exactly is the relationship between bad indexes and shared
memory locks?

* Is there some other emergency fix when it happens, such as killing
all backends and hoping you kill the one that is actually holding
the lock (if any was).

* Did anything in the 8.3 series fix this? I saw nothing relevant in
the release notes for everything up to 8.3.18 (which it will be on soon).

--
Greg Sabino Mullane greg@endpoint.com
End Point Corporation
PGP Key: 0x14964AC8

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Florian Pflug
Дата:
Сообщение: Re: [RFC] Interface of Row Level Security
Следующее
От: Alex Hunsaker
Дата:
Сообщение: Re: plperl_helpers.h fix for clang