Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

Поиск
Список
Период
Сортировка
От Craig Ringer
Тема Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS
Дата
Msg-id CAMsr+YEu6fDBiOuuHJO1krKpQT_FgeMUpizk-+2mA4dfGpz7Tw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS  (Mark Dilger <hornschnorter@gmail.com>)
Список pgsql-hackers
On 10 April 2018 at 04:25, Mark Dilger <hornschnorter@gmail.com> wrote:

> I was reading this thread up until now as meaning that the standby could
> receive corrupt WAL data and become corrupted.

Yes, it can, but not directly through the first error.

What can happen is that we think a block got written when it didn't.

If our in memory state diverges from our on disk state, we can make
subsequent WAL writes based on that wrong information. But that's
actually OK, since the standby will have replayed the original WAL
correctly.

I think the only time we'd run into trouble is if we evict the good
(but not written out) data from s_b and the fs buffer cache, then
later read in the old version of a block we failed to overwrite. Data
checksums (if enabled) might catch it unless the write left the whole
block stale. In that case we might generate a full page write with the
stale block and propagate that over WAL to the standby.

So I'd say standbys are relatively safe - very safe if the issue is
caught promptly, and less so over time. But AFAICS WAL-based
replication (physical or logical) is not a perfect defense for this.

However, remember, if your storage system is free of any sort of
overprovisioning, is on a non-network file system, and doesn't use
multipath (or sets it up right) this issue *is exceptionally unlikely
to affect you*.

-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Thomas Munro
Дата:
Сообщение: Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS
Следующее
От: Alvaro Herrera
Дата:
Сообщение: Re: Excessive PostmasterIsAlive calls slow down WAL redo