Re: subscriptionCheck failures on nightjar

From Andres Freund
Subject Re: subscriptionCheck failures on nightjar
Date
Msg-id 20190213215147.cjbymfojf6xndr4t@alap3.anarazel.de
In response to Re: subscriptionCheck failures on nightjar  (Thomas Munro <thomas.munro@enterprisedb.com>)
Responses Re: subscriptionCheck failures on nightjar  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: subscriptionCheck failures on nightjar  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers
Hi,

On 2019-02-14 09:52:33 +1300, Thomas Munro wrote:
> On Thu, Feb 14, 2019 at 8:11 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Andres Freund <andres@anarazel.de> writes:
> > > I was kinda pondering just open coding it.  I am not yet convinced that
> > > my idea of just using an open FD isn't the least bad approach for the
> > > issue at hand.  What precisely is the NFS issue you're concerned about?
> >
> > I'm not sure that fsync-on-FD after the rename will work, considering that
> > the issue here is that somebody might've unlinked the file altogether
> > before we get to doing the fsync.  I don't have a hard time believing that
> > that might result in a failure report on NFS or similar.  Yeah, it's
> > hypothetical, but the argument that we need a repeat fsync at all seems
> > equally hypothetical.
> >
> > > Right now fsync_fname_ext isn't exposed outside fd.c...
> >
> > Mmm.  That makes it easier to consider changing its API.
> 
> Just to make sure I understand: it's OK for the file not to be there
> when we try to fsync it by name, because a concurrent checkpoint can
> remove it, having determined that we don't need it anymore?  In other
> words, we really needed either missing_ok=true semantics, or to use
> the fd we already had instead of the name?

I'm not yet sure that that's actually something that's supposed to
happen; I need to spend some time analysing how this actually
happens. Normally the contents of the slot should actually prevent it
from being removed (as they're newer than
ReplicationSlotsComputeLogicalRestartLSN()). I kind of wonder if that's
a bug in the drop logic in newer releases.
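
For readers following the thread, the two options Thomas describes above
(missing_ok semantics when fsyncing by name, or fsyncing through the fd
that is already open) could be sketched roughly as follows. This is a
minimal POSIX illustration, not PostgreSQL's code: fsync_by_name() and
write_rename_fsync_fd() are hypothetical names, and the real
fsync_fname_ext() in fd.c has its own signature and error reporting.
Whether fsync on an fd whose file has since been unlinked succeeds on
NFS is exactly the open question Tom raises; the sketch does not settle it.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: fsync a regular file by name.  With missing_ok set,
 * a file that has already been removed (ENOENT) is treated as success.
 * Returns 0 on success, -1 on failure with errno set.
 */
static int
fsync_by_name(const char *path, bool missing_ok)
{
    int fd = open(path, O_RDWR);

    if (fd < 0)
    {
        if (missing_ok && errno == ENOENT)
            return 0;           /* concurrently removed: nothing to flush */
        return -1;
    }
    if (fsync(fd) != 0)
    {
        int save_errno = errno;

        close(fd);
        errno = save_errno;
        return -1;
    }
    return close(fd);
}

/*
 * Hypothetical helper: write a temp file, rename it into place, and repeat
 * the fsync through the fd we still hold, so the final name is never
 * looked up again after the rename.
 */
static int
write_rename_fsync_fd(const char *tmppath, const char *path,
                      const void *data, size_t len)
{
    int  fd = open(tmppath, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    bool ok;
    int  save_errno;

    if (fd < 0)
        return -1;

    ok = write(fd, data, len) == (ssize_t) len  /* short write is a failure */
        && fsync(fd) == 0                       /* flush before the rename */
        && rename(tmppath, path) == 0
        && fsync(fd) == 0;                      /* repeat fsync, via the fd */
    save_errno = errno;

    if (close(fd) != 0 && ok)
        return -1;
    if (!ok)
    {
        errno = save_errno;
        return -1;
    }
    return 0;
}
```

With missing_ok = false, fsync_by_name() reports ENOENT for a file that
was removed in the meantime, which is the kind of failure being discussed
here; the fd-based variant avoids the second name lookup altogether.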

Greetings,

Andres Freund

