clog problems after dump/restore

From: Bernhard Schrader
Subject: clog problems after dump/restore
Msg-id: 52D7E2A5.6060905@innogames.de
List: pgsql-admin

Hi List,

Today I come to you with a problem on some of our DB servers, which have
a clog problem, in particular this one:

2014-01-16 10:13:01 UTC 12765ERROR:  could not access status of transaction 26619670
2014-01-16 10:13:01 UTC 12765DETAIL:  Could not open file "pg_clog/0019": No such file or directory.

The servers run version 9.3.1 on Debian Wheezy / 7.1.
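
The segment file name in the error above follows from the transaction
ID. A minimal sketch of the arithmetic, assuming the default layout
(8 kB pages, 2 status bits per transaction, 32 pages per pg_clog
segment file); the function name is illustrative, not a PostgreSQL API:

```python
# Assumed defaults: 8 kB pages, 2 status bits per transaction
# (4 transactions per byte), 32 pages per pg_clog segment file.
BLOCK_SIZE = 8192
XACTS_PER_BYTE = 4
PAGES_PER_SEGMENT = 32
XACTS_PER_SEGMENT = BLOCK_SIZE * XACTS_PER_BYTE * PAGES_PER_SEGMENT


def clog_segment_name(xid: int) -> str:
    """Return the pg_clog file name that would hold the status of xid."""
    return f"{xid // XACTS_PER_SEGMENT:04X}"


print(clog_segment_name(26619670))  # prints 0019
```

Under these assumptions each segment covers 1,048,576 transaction IDs,
so transaction 26619670 falls into segment 25, which is 0019 in hex.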

So far, these servers have normally not been managed by me but by a
colleague. Some time ago (about two months) he had to upgrade the
operating system to Wheezy and also upgraded Postgres to 9.3.1. I
_cannot_ fully clarify how he did it, but according to my research it
went this way:

He installed a new server with Wheezy and PostgreSQL 9.2.4, did a
dump/restore from old to new, then installed 9.3.1 and ran the
pg_upgrade upgrade process.

10 days ago he applied this dd "fix", which you can find on Google if
you search enough:

"dd if=/dev/zero of=/var/lib/postgresql/9.3/main/pg_clog/002D bs=256k count=1"

Which, AFAIK, is no fix at all, as the state of the transactions is
still unclear.
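
To illustrate why a zero-filled segment leaves the transaction state
open, here is a sketch of how a status would be read from a segment.
It assumes the default layout (2 bits per transaction, four per byte,
low-order bits first) and the status codes 0 = in progress,
1 = committed, 2 = aborted, 3 = sub-committed. The helper is
hypothetical and works on an in-memory byte string, not on a real
pg_clog file:

```python
SEGMENT_BYTES = 256 * 1024  # what "bs=256k count=1" writes
XACTS_PER_BYTE = 4          # assumed: 2 status bits per transaction
XACTS_PER_SEGMENT = SEGMENT_BYTES * XACTS_PER_BYTE

STATUS_NAMES = {
    0: "in progress",
    1: "committed",
    2: "aborted",
    3: "sub-committed",
}


def xact_status(segment: bytes, xid: int) -> str:
    """Decode the 2-bit status of xid from the bytes of its segment."""
    offset = xid % XACTS_PER_SEGMENT           # position inside the segment
    byte = segment[offset // XACTS_PER_BYTE]   # byte holding the 2 bits
    shift = (offset % XACTS_PER_BYTE) * 2      # low-order bits come first
    return STATUS_NAMES[(byte >> shift) & 0x03]


zeroed = bytes(SEGMENT_BYTES)  # same content as the dd output
print(xact_status(zeroed, 26619670))  # prints: in progress
```

Under these assumptions every transaction in a zero-filled segment
reads as "in progress", never as "committed", so the file only silences
the error; rows whose hint bits were not yet set may no longer be
treated as committed.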

If I go through the logfile, the first occurrence of the clog problem
is on the 4th of January, the "dd" command was run on the 5th, and the
clog error reoccurred on the 11th on one of the servers. By now, two
more machines have clog errors.
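
A small sketch for collecting these errors from a log file, assuming
the message format shown earlier in this mail with each message on a
single line; the function name and the sample text are illustrative
only:

```python
import re
from collections import defaultdict

# Matches the date at the start of the line and the failing transaction ID.
ERROR_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2}) .*could not access status of transaction (\d+)"
)


def clog_errors_by_day(log_text: str) -> dict:
    """Map each date to the sorted transaction IDs that failed on it."""
    by_day = defaultdict(set)
    for line in log_text.splitlines():
        match = ERROR_RE.match(line)
        if match:
            by_day[match.group(1)].add(int(match.group(2)))
    return {day: sorted(xids) for day, xids in by_day.items()}


sample = (
    "2014-01-16 10:13:01 UTC 12765ERROR:  could not access status of "
    "transaction 26619670\n"
    "2014-01-16 10:13:01 UTC 12765DETAIL:  Could not open file "
    '"pg_clog/0019": No such file or directory.\n'
)
print(clog_errors_by_day(sample))
```

Grouping by day makes it easier to see when the errors first appeared
on each server and whether they came back after the dd.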

On the machine where it reoccurred, another colleague ran the dd again
and dumped the DB.

After that I dropped the DB and restored the dump. AFAIR this is the
only way to get rid of the clog issue completely, even if the dump
itself might not be completely consistent in terms of the data.
Is this correct?

If not, how can I fix this issue? Is there any way?

The other question is how this could happen. Yes, we have 9.3.1, but
as far as I understood, the clog bug from December only occurs with
hot_standby.

So far I plan to upgrade to 9.3.2 ASAP, but first I would like to know
what a valid fix would look like. If there is only a fix that might
end up with some missing data, this wouldn't be the worst.

P.S. Do you know any good documentation about clog? It is quite
unclear to me how it works exactly.

Regards,

Bernhard



