Re: Unable to restart postgres - database system was interrupted

From: Tom Lane
Subject: Re: Unable to restart postgres - database system was interrupted
Date:
Msg-id: 10905.1165351755@sss.pgh.pa.us
In reply to: Re: Unable to restart postgres - database system was  (andy rost <Andy.Rost@noaa.gov>)
List: pgsql-general
andy rost <Andy.Rost@noaa.gov> writes:
> I'm curious about a couple of things. Why didn't the logs reflect the
> problem that it noticed when it tried to restart on 2006-12-04 (what I
> mean by that, is Postgres thought the server had been interrupted on
> 2006-12-02 16:45 yet the logs for that date and time don't show that
> anything unusual happened).

Probably nothing did.  That message is actually just reporting the
last-update timestamp found in $PGDATA/global/pg_control, which was
probably updated during a routine checkpoint or log segment switch.
IOW it's not the time of a problem, but the time the server was last
known to be functioning normally.

The question is why do you have a two-day-stale copy of pg_control :-(
... it should certainly have been updated many times since then.
In particular, given your log entries that indicate normal shutdown at
2006-12-04 10:30:11, pg_control *should* have contained a timestamp
equal to that (plus or minus a second or so at most).
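
One way to check this is to compare the "pg_control last modified" line
printed by pg_controldata with the shutdown time in the server log. The
following is only a rough sketch: the helper names are made up for
illustration, and the timestamp format is an assumption that depends on
the server version and locale.

```python
from datetime import datetime


def parse_pg_controldata(text):
    """Turn pg_controldata output into a dict of label -> value strings."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only; timestamps contain colons too.
        label, sep, value = line.partition(":")
        if sep:
            fields[label.strip()] = value.strip()
    return fields


def control_file_age(fields, shutdown_time, fmt="%a %b %d %H:%M:%S %Y"):
    """Return shutdown_time minus the 'pg_control last modified' time.

    fmt is an assumed format (C locale); adjust it to match what your
    pg_controldata actually prints.
    """
    modified = datetime.strptime(fields["pg_control last modified"], fmt)
    return shutdown_time - modified
```

Feed it the captured output of pg_controldata for the data directory and
the shutdown time from the log; a difference of more than a second or so
after a clean shutdown means pg_control is stale.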

> Secondly, how did Postgres know at the restart that a) a problem had
> occurred sometime in the past and b) a specific set of transaction logs
> is required to get back up again.

Again, this is based on the checkpoint pointer found in pg_control;
it wants xlog files starting at where the last checkpoint is alleged
to be by pg_control.  It'd seem that pg_control is a lot older than
what is in pg_xlog/.  I suspect if you checked the logs you'd find
that 0000000100000065000000F7 corresponds to about 2006-12-02 16:45.
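
For reference, the xlog file name is three 8-digit hex fields: timeline,
log file number, and segment number within that log file. A small sketch
that decodes a name into the location where the segment starts, assuming
the default 16MB segment size (the function name is just for
illustration):

```python
def decode_xlog_name(name, segment_size=16 * 1024 * 1024):
    """Split a 24-hex-digit xlog file name into its three fields.

    segment_size defaults to 16MB, the stock build setting; a server
    compiled with a different size needs a different value.
    """
    if len(name) != 24:
        raise ValueError("expected 24 hex digits, got %r" % name)
    timeline = int(name[0:8], 16)
    log_id = int(name[8:16], 16)
    segment = int(name[16:24], 16)
    # Location of the first byte of the segment, in the usual
    # logid/offset hex notation.
    start = "%X/%X" % (log_id, segment * segment_size)
    return {
        "timeline": timeline,
        "log_id": log_id,
        "segment": segment,
        "start": start,
    }
```

The "start" value can be compared with the checkpoint location that
pg_control reports to see which segment the server will ask for first.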

The only previous instances that I can recall of something like this
were in databases that are normally mounted on NFS volumes, and because
of some NFS problem or other the database volume had become dismounted,
leaving the postmaster seeing directories underneath the mount point
on the root volume --- and in particular a different copy of pg_control.
Usually this causes all hell to break loose immediately, though, so if
you hadn't had any signs of trouble or missing data before you stopped
the database, I doubt that could be the explanation.
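
If you want to rule that scenario out before starting the postmaster, a
start script could refuse to proceed when the volume is not mounted. A
minimal sketch, with the function name and both paths being
placeholders for whatever your installation uses:

```python
import os


def data_dir_problems(pgdata, mount_point):
    """Return a list of reasons not to start the postmaster on pgdata."""
    problems = []
    # If the volume is dismounted, mount_point is just an ordinary
    # directory on the parent filesystem.
    if not os.path.ismount(mount_point):
        problems.append("%s is not a mount point" % mount_point)
    # pg_control lives in global/ under the data directory.
    control = os.path.join(pgdata, "global", "pg_control")
    if not os.path.isfile(control):
        problems.append("%s is missing" % control)
    return problems
```

Note that this only catches a dismounted volume; it cannot tell a stale
pg_control from a current one.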

            regards, tom lane
