Re: Postgres, fsync, and OSs (specifically linux)

From Andres Freund
Subject Re: Postgres, fsync, and OSs (specifically linux)
Msg-id 20180427231043.of7vhjmhx4qzexm7@alap3.anarazel.de
In reply to Re: Postgres, fsync, and OSs (specifically linux)  (Bruce Momjian <bruce@momjian.us>)
Responses Re: Postgres, fsync, and OSs (specifically linux)  (Bruce Momjian <bruce@momjian.us>)
List pgsql-hackers
Hi,

On 2018-04-27 19:04:47 -0400, Bruce Momjian wrote:
> On Fri, Apr 27, 2018 at 03:28:42PM -0700, Andres Freund wrote:
> > - We need more aggressive error checking on close(), for ENOSPC and
> >   EIO. In both cases afaics we'll have to trigger a crash recovery
> >   cycle. It's entirely possible to end up in a loop on NFS etc, but I
> >   don't think there's a way around that.
> 
> If the no-space or write failures are persistent, as you mentioned
> above, what is the point of going into crash recovery --- why not just
> shut down?

Well, I mentioned that as an alternative in my email. But for one, we
don't really have cases where we do that right now; for another, we can't
really differentiate between a transient and a non-transient state. It's
entirely possible that the admin of the system that ran out of space
fixes things, clearing up the problem.
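
To make the close() point concrete, here is a rough sketch of the kind of
check being discussed. It is not PostgreSQL code: the names CloseOutcome,
classify_close_errno and checked_close are hypothetical, and what the
caller does with a "needs recovery" result (e.g. triggering a crash
recovery cycle) is deliberately left out.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical result type; not an existing PostgreSQL definition. */
typedef enum
{
    CLOSE_OK,             /* close() succeeded */
    CLOSE_NEEDS_RECOVERY, /* ENOSPC or EIO: written data may be lost */
    CLOSE_OTHER_ERROR     /* any other errno, e.g. EBADF or EINTR */
} CloseOutcome;

/* Map an errno value reported by close() to an outcome. */
CloseOutcome
classify_close_errno(int err)
{
    switch (err)
    {
        case 0:
            return CLOSE_OK;
        case ENOSPC:
        case EIO:
            /* The two cases singled out in this thread. */
            return CLOSE_NEEDS_RECOVERY;
        default:
            return CLOSE_OTHER_ERROR;
    }
}

/*
 * Close fd and report how the caller should react.
 *
 * close() is not retried on failure: on Linux the descriptor is released
 * even when close() returns an error, so a retry could hit an unrelated
 * descriptor that was opened in the meantime.
 */
CloseOutcome
checked_close(int fd)
{
    assert(fd >= 0);

    if (close(fd) == 0)
        return CLOSE_OK;

    return classify_close_errno(errno);
}
```

Note that the sketch cannot tell a transient failure from a persistent
one; it only reports that close() failed with ENOSPC or EIO, so the
caller would have to treat both the same way.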


> Also, since we can't guarantee that we can write any persistent state
> to storage, we have no way of preventing infinite crash recovery
> loops, which, based on inconsistent writes, might make things worse.

How would it make things worse?


> An additional feature we have talked about is running some kind of
> notification shell script to inform administrators, similar to
> archive_command.  We need this too when sync replication fails.

To me that seems like a feature independent of this thread.

Greetings,

Andres Freund

