Re: Detecting database corruption

From: Andrew Sullivan
Subject: Re: Detecting database corruption
Msg-id: 20040120184300.GA4768@phlogiston.dyndns.org
In reply to: Re: Detecting database corruption (Jack Orenstein <jorenstein@reference-info.com>)
List: pgsql-general

On Mon, Jan 19, 2004 at 02:45:27PM -0500, Jack Orenstein wrote:
> > If this means, "Does the database usually check for corruption?" the
> > answer is, "Not as a matter of course."
>
> Do you mean that this happens in a few select situations? Or that
> there are configuration flags that can be used to enable such checks?

There have been occasional reports of such corruption, but it seems
always to come down to bad hardware.  There are no flags to check for
this as a part of regular operations, although you'd certainly get an
error if you tried to retrieve bad data.

> Database corruption is a concern for two reasons. First, if it ever
> does occur, we have to be able to deal with the situation gracefully,
> even if that means nothing beyond a clean shutdown of the
> application.

In the cases where people experience it, what usually shows up is
some sort of inability to access data that is supposed to be in a
place on the disk, but turns out not to be.  You get error messages
about missing tuples or mangled data, or else a core dump.  I think in such
cases you probably would indeed want to shut down your application.
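
To make the "shut down your application" idea concrete, here is a
minimal sketch of the pattern.  It uses Python's built-in sqlite3
module purely as a stand-in so the example is self-contained; with
PostgreSQL you would catch whatever error class your driver raises
instead.  The names fetch_or_shutdown and on_shutdown are made up for
illustration.

```python
import sqlite3


def fetch_or_shutdown(conn, query, on_shutdown):
    """Run a query; on any database error, trigger a clean shutdown.

    on_shutdown is a hypothetical callback supplied by the application
    (flush logs, stop accepting work, and so on).
    """
    try:
        return conn.execute(query).fetchall()
    except sqlite3.DatabaseError as exc:
        # A data-access error here may or may not mean corruption, but
        # the safe reaction is the same: stop using the database.
        on_shutdown(exc)
        raise
```

The point is only that data-access errors are treated as fatal rather
than retried blindly; telling real corruption apart from an ordinary
bad query is a separate problem this sketch does not address.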

> Second, we are struggling with the IDE vs. fsync issue,
> that has come up on this mailing list. We definitely have to support
> IDE drives, and we're trying to determine how to balance performance
> against other concerns. If we do end up leaving IDE caching enabled,
> then my understanding is that corruption is a real possibility, (or
> have I drawn the wrong conclusion on this point?)

This is a different problem.  My best advice is, "get a UPS with a
brain."  A UPS which will keep your system up for 10 minutes and
which will shut it down as soon as the battery kicks in is pretty
cheap.  That, plus regular testing and maintenance of it, is likely
to prevent most of the problematic cases you might run into here.

Most fsync worries actually have to do with losing data rather than
data corruption: fsync is called when a transaction commits, and if
the hardware is lying about whether the bits are actually on the
disk, you might lose some things you think are committed.  You can
apparently tolerate some data loss anyway, so in this case it's not
too big a deal.
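
For anyone unfamiliar with what fsync actually does, here is a rough
sketch of the write-then-fsync pattern in Python.  This is not how
PostgreSQL itself is implemented, just an illustration of the system
call; durable_append and the file path are hypothetical names.

```python
import os


def durable_append(path, payload):
    """Append payload to path, then ask the OS to flush it to disk."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        view = memoryview(payload)
        while view:
            # os.write may write fewer bytes than requested.
            written = os.write(fd, view)
            view = view[written:]
        # fsync returns once the OS believes the data is on stable
        # storage.  A drive with write caching enabled can report
        # success while the bits are still only in its own cache.
        os.fsync(fd)
    finally:
        os.close(fd)
```

If the power goes out after os.fsync returns but before a lying drive
has really written its cache, the "committed" record is simply gone,
which is the data-loss case described above.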

A

--
Andrew Sullivan  | ajs@crankycanuck.ca
Music is no business of mine.
        --Marge Simpson
