Обсуждение: fsync failure in durable_unlink ignored in xlog.c?

Поиск
Список
Период
Сортировка

fsync failure in durable_unlink ignored in xlog.c?

От
Mark Dilger
Дата:
Hackers,

I have seen other lengthy discussions about fsync semantics, and if this
question is being addressed there, this question might not be relevant.

Two calls to durable_unlink at log level DEBUG1 are ignoring the
return value.  Other calls at ERROR and FATAL are likewise ignoring
the return value, though those make perfect sense to me.  There may
be a reason why logging a debug message inside durable_unlink and
continuing along is safe, but it is not clear from the structure of the
code.

In InstallXLogFileSegment, durable_unlink(path, DEBUG1) is called
without the return value being checked followed by a call to
durable_link_or_rename, and perhaps that second call works
whether the durable_unlink succeeded or failed, but the logic of that is
not at all clear.

In do_pg_stop_backup, durable_unlink(TABLESPACE_MAP, DEBUG1)
is similarly called without the return value being checked.

This code appears to have been changed in
1b02be21f271db6bd3cd43abb23fa596fcb6bac3.

Is this code safe against fsync failures?  If so, can I get an explanation
that I might put into a code comment patch?

mark



Re: fsync failure in durable_unlink ignored in xlog.c?

От
Andres Freund
Дата:
Hi,

On 2019-05-23 10:46:02 -0700, Mark Dilger wrote:
> I have seen other lengthy discussions about fsync semantics, and if this
> question is being addressed there, this question might not be relevant.
> 
> Two calls to durable_unlink at log level DEBUG1 are ignoring the
> return value.  Other calls at ERROR and FATAL are likewise ignoring
> the return value, though those make perfect sense to me.  There may
> be a reason why logging a debug message inside durable_unlink and
> continuing along is safe, but it is not clear from the structure of the
> code.
> 
> In InstallXLogFileSegment, durable_unlink(path, DEBUG1) is called
> without the return value being checked followed by a call to
> durable_link_or_rename, and perhaps that second call works
> whether the durable_unlink succeeded or failed, but the logic of that is
> not at all clear.
> 
> In do_pg_stop_backup, durable_unlink(TABLESPACE_MAP, DEBUG1)
> is similarly called without the return value being checked.
> 
> This code appears to have been changed in
> 1b02be21f271db6bd3cd43abb23fa596fcb6bac3.
> 
> Is this code safe against fsync failures?  If so, can I get an explanation
> that I might put into a code comment patch?

What's the danger you're thinking of here? The issue with ignoring fsync
failures is that it could be the one signal about data corruption we get
for a write()/fsync() that failed - i.e. that durability cannot be
guaranteed. But we don't care about the file contents of those files.

Greetings,

Andres Freund



Re: fsync failure in durable_unlink ignored in xlog.c?

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> On 2019-05-23 10:46:02 -0700, Mark Dilger wrote:
>> Is this code safe against fsync failures?  If so, can I get an explanation
>> that I might put into a code comment patch?

> What's the danger you're thinking of here? The issue with ignoring fsync
> failures is that it could be the one signal about data corruption we get
> for a write()/fsync() that failed - i.e. that durability cannot be
> guaranteed. But we don't care about the file contents of those files.

Hmm ... if we don't care, why are we issuing an fsync at all?

            regards, tom lane



Re: fsync failure in durable_unlink ignored in xlog.c?

От
Mark Dilger
Дата:
On Thu, May 23, 2019 at 11:07 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Andres Freund <andres@anarazel.de> writes:
> > On 2019-05-23 10:46:02 -0700, Mark Dilger wrote:
> >> Is this code safe against fsync failures?  If so, can I get an explanation
> >> that I might put into a code comment patch?
>
> > What's the danger you're thinking of here? The issue with ignoring fsync
> > failures is that it could be the one signal about data corruption we get
> > for a write()/fsync() that failed - i.e. that durability cannot be
> > guaranteed. But we don't care about the file contents of those files.
>
> Hmm ... if we don't care, why are we issuing an fsync at all?

Tom's question is about as far as my logic went.  It seemed the fsync
must be important, or the author of this code wouldn't have put such
an expensive operation in that spot, and if so, then how can it be safe
to ignore whether the fsync returned an error.  Beyond that, I do not
have a specific danger in mind.

mark



Re: fsync failure in durable_unlink ignored in xlog.c?

От
Andres Freund
Дата:
Hi,

On 2019-05-23 14:06:57 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2019-05-23 10:46:02 -0700, Mark Dilger wrote:
> >> Is this code safe against fsync failures?  If so, can I get an explanation
> >> that I might put into a code comment patch?
> 
> > What's the danger you're thinking of here? The issue with ignoring fsync
> > failures is that it could be the one signal about data corruption we get
> > for a write()/fsync() that failed - i.e. that durability cannot be
> > guaranteed. But we don't care about the file contents of those files.
> 
> Hmm ... if we don't care, why are we issuing an fsync at all?

Fair point. I think we do care in most of those cases, but we don't need
to trigger a PANIC. We'd be in trouble if e.g. an older tablespace map
file were to "revive" later. Looking at the cases:

- durable_unlink(TABLESPACE_MAP, DEBUG1) - we definitely care about a
  failure to unlink/remove, but *not* about ENOENT, because that's expected.

- /* Force installation: get rid of any pre-existing segment file */
  durable_unlink(path, DEBUG1);

  same.

- RemoveXlogFile():
  rc = durable_unlink(path, LOG);

  It's probably *tolerable* to fail here. Not sure why this is a
  durable_unlink(LOG) - doesn't make a ton of sense to me.

- durable_unlink(BACKUP_LABEL_FILE, ERROR);

  This is a "whaa, bad shit is happening" kind of situation. But
  crashing probably would make it even worse, because we'd restart
  assuming we're restoring from a backup.

ISTM that durable_unlink() for the first two cases really needs a
separate 'missing_ok' type argument. And that the reason for using
DEBUG1 here is solely an outcome of that not existing.

Greetings,

Andres Freund