Re: Understanding fsync (was: Need Help Recovering from Botched Upgrade Attempt)

From: Greg Smith
Subject: Re: Understanding fsync (was: Need Help Recovering from Botched Upgrade Attempt)
Date:
Msg-id: Pine.GSO.4.64.0806181358230.11228@westnet.com
In reply to: Understanding fsync (was: Need Help Recovering from Botched Upgrade Attempt)  (Sam Mason <sam@samason.me.uk>)
Responses: Re: Understanding fsync (was: Need Help Recovering from Botched Upgrade Attempt)  (Sam Mason <sam@samason.me.uk>)
List: pgsql-general

On Wed, 18 Jun 2008, Sam Mason wrote:

> Isn't fsync only a side-effect of having a write-back cache between
> programs and the disk?  This means its only purpose is to ensure that
> the cache is consistent with what's on disk.  Because all programs
> running within a system are running on top of the cache they don't know
> or care whether the cache actually matches up to the disk.

Most programs don't.  PostgreSQL writes to the database in two stages:
the WAL, followed by an fsync, then later to the main database files.
You can't trust the WAL will be around for recovery until the first fsync
returns.  The checkpoint process makes sure everything that went into the
WAL then made it to the main database files, and again it doesn't trust
that it's really on disk until the fsync returns.
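
To make the two stages concrete, here is a toy sketch in Python.  It is
not how PostgreSQL is implemented; the function names (append_durably,
commit, checkpoint, demo) and the file names are made up for
illustration, and a real WAL has record formats, checksums and segment
files that this ignores.  The only point is the ordering: nothing is
treated as safe until the fsync for that stage has returned.

```python
import os
import tempfile


def append_durably(path, record):
    """Append bytes to path and return only after fsync has returned."""
    with open(path, "ab") as f:
        f.write(record)
        f.flush()             # move Python's buffer into the OS cache
        os.fsync(f.fileno())  # ask the OS to push its cache to the device


def commit(wal_path, record):
    """Stage 1: the change is only trusted once the WAL fsync returns."""
    append_durably(wal_path, record)


def checkpoint(wal_path, data_path):
    """Stage 2: copy logged records to the data file, fsync, then clear the log."""
    with open(wal_path, "rb") as f:
        pending = f.read()
    append_durably(data_path, pending)
    # Only now, with the data file fsync done, is the log copy disposable.
    open(wal_path, "wb").close()


def demo():
    """Run one commit and one checkpoint in a scratch directory."""
    with tempfile.TemporaryDirectory() as d:
        wal = os.path.join(d, "wal.log")      # hypothetical file names
        data = os.path.join(d, "table.dat")
        commit(wal, b"row-1\n")
        checkpoint(wal, data)
        with open(data, "rb") as f:
            return f.read(), os.path.getsize(wal)
```

Calling demo() leaves the record in the data file and an empty log, which
is the state the checkpoint is supposed to reach.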

> Therefore, if I understand things correctly, the state of fsync
> shouldn't matter in this use case.  It's equally borken independent of
> the state of fsync.

Quite borken indeed, and fsync has nothing to do with it.  The theory
proposed is that since no writes were done, the backup should be
consistent.  This is quite wrong.  The most obvious case showing that is
one where a time-driven checkpoint occurred (as happens every 5 minutes by
default) while you were in the middle of backing up.  Let's say the main
database files are backed up before the checkpoint, but the backup is
still working through some giant archival table.  The checkpoint happens;
it updates the earlier files already in the backup.  The checkpoint
finishes, and erases the WAL files it no longer needs.  Now the backup
makes its way to the WAL files.  You're screwed when you try to recover
this database from the backup.  The database files in the backup don't
have the latest updates, and the WAL can't recover them because it already
cleared its copy of them out thinking they weren't needed anymore.  You'll
be lucky to get the database to start at all.  It's missing data you
thought was committed before the backup started, and who knows what subtle
corruption you'll find.
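
The ordering problem above can be shown with a toy model in Python.
Everything here is invented for illustration (the table names "accounts"
and "archive", the single WAL entry, the function simulate_backup); it
models files as dictionary entries and says nothing about PostgreSQL's
real on-disk layout.  It only replays the sequence of events described:
early files copied, checkpoint runs, WAL cleared, WAL copied last.

```python
def simulate_backup(checkpoint_during_backup):
    """Return the state you get by restoring a naive file-level backup."""
    # Live system: one committed change sits in the WAL, not yet in the files.
    data = {"accounts": "old", "archive": "old"}
    wal = [("accounts", "new")]

    backup = {}
    backup["accounts"] = data["accounts"]  # small file, copied early

    if checkpoint_during_backup:
        # The checkpoint writes the logged change into the data files...
        for name, value in wal:
            data[name] = value
        # ...and then clears the WAL it believes is no longer needed.
        wal.clear()

    backup["archive"] = data["archive"]    # giant table, copied slowly
    backup_wal = list(wal)                 # the WAL is copied last

    # Recovery: replay whatever WAL the backup captured over its data files.
    restored = dict(backup)
    for name, value in backup_wal:
        restored[name] = value
    return restored
```

Without the checkpoint, replaying the copied WAL brings "accounts" up to
"new".  With the checkpoint in the middle, the restored copy still says
"old": the file was copied too early and the WAL entry was gone by the
time the backup reached it.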

Now, in reality, even time-driven checkpoints don't do anything if there
hasn't been activity, so it may very well be the case that any one
database backup is fine.  But you can't ignore the requirement to do a
pg_start_backup before making a filesystem level backup and expect you'll
get that lucky--sooner or later you will get a backup that won't restore
if you keep that up.
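
As a rough sketch of what wrapping the copy looks like from a script,
assuming a Python DB-API cursor whose driver uses %s placeholders (as
psycopg2 does): the helper name filesystem_backup, the label "nightly",
the copy_data_directory callable and the RecordingCursor stand-in are all
hypothetical.  pg_stop_backup is the matching call that ends backup mode.
These are the function names in the server versions current as of this
thread; a complete procedure also needs WAL archiving configured, which
this sketch does not cover.

```python
def filesystem_backup(cursor, copy_data_directory, label="nightly"):
    """Bracket a file-level copy with pg_start_backup / pg_stop_backup."""
    cursor.execute("SELECT pg_start_backup(%s)", (label,))
    try:
        copy_data_directory()  # e.g. a tar or rsync of the data directory
    finally:
        # Always leave backup mode, even if the copy itself failed.
        cursor.execute("SELECT pg_stop_backup()")


class RecordingCursor:
    """Stand-in cursor that records statements instead of running them."""

    def __init__(self):
        self.statements = []

    def execute(self, sql, params=None):
        self.statements.append((sql, params))


def demo_backup():
    """Show the statement order without needing a database connection."""
    cursor = RecordingCursor()
    copied = []
    filesystem_backup(cursor, lambda: copied.append("data directory"))
    return cursor.statements, copied
```

With a real connection you would pass an actual cursor and a function
that performs the copy; the stand-in is only there so the ordering can be
checked without a server.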

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
