Re: Changeset Extraction v7.0 (was logical changeset generation)

From: Andres Freund
Subject: Re: Changeset Extraction v7.0 (was logical changeset generation)
Msg-id: 20140122154858.GK21170@alap3.anarazel.de
In response to: Re: Changeset Extraction v7.0 (was logical changeset generation)  (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: Changeset Extraction v7.0 (was logical changeset generation)  (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers

On 2014-01-22 10:14:27 -0500, Robert Haas wrote:
> On Wed, Jan 22, 2014 at 9:48 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> > On 2014-01-18 08:35:47 -0500, Robert Haas wrote:
> >> > I am not sure I understand that point. We can either update the
> >> > in-memory bit before performing the on-disk operations or
> >> > afterwards. Either way, there's a way to be inconsistent if the disk
> >> > operation fails somewhere inbetween (it might fail but still have
> >> > deleted the file/directory!). The normal way to handle that in other
> >> > places is PANICing when we don't know so we recover from the on-disk
> >> > state.
> >> > I really don't see the problem here? Code doesn't get more robust by
> >> > doing s/PANIC/ERROR/, rather the contrary. It takes extra smarts to only
> >> > ERROR, often that's not warranted.
> >>
> >> People get cranky when the database PANICs because of a filesystem
> >> failure.  We should avoid that, especially when it's trivial to do so.
> >>  The update to shared memory should be done second and should be set
> >> up to be no-fail.
> >
> > I don't see how that would help. If we fail during unlink/rmdir, we
> > don't really know at which point we failed.
> 
> This doesn't make sense to me.  unlink/rmdir are atomic operations.

Yes, individual operations should be, but you cannot be sure whether a
rename()/unlink() will survive a crash until the directory is
fsync()ed. So, what is one going to do if the unlink succeeded, but the
fsync didn't?
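
To illustrate the point about durability, here is a minimal POSIX sketch (not PostgreSQL code) of an unlink followed by an fsync of the parent directory. The function name remove_durably and the RemoveResult codes are hypothetical, introduced only for this example. It shows the awkward third outcome: the entry is already gone, but the caller cannot be sure the removal survives a crash.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch; not part of PostgreSQL. */
typedef enum
{
    REMOVE_OK,              /* unlink and directory fsync both succeeded */
    REMOVE_UNLINK_FAILED,   /* unlink itself reported an error */
    REMOVE_NOT_DURABLE      /* entry removed, but durability is unknown */
} RemoveResult;

/*
 * Remove "path" and then fsync the directory that contained it.
 * "parent_dir" must be the directory holding "path".
 */
static RemoveResult
remove_durably(const char *path, const char *parent_dir)
{
    int fd;
    int rc;

    assert(path != NULL && parent_dir != NULL);

    if (unlink(path) != 0)
        return REMOVE_UNLINK_FAILED;

    /* From here on the entry is gone in memory, but maybe not on disk. */
    fd = open(parent_dir, O_RDONLY);
    if (fd < 0)
        return REMOVE_NOT_DURABLE;

    rc = fsync(fd);
    close(fd);

    return (rc == 0) ? REMOVE_OK : REMOVE_NOT_DURABLE;
}
```

A caller that gets REMOVE_NOT_DURABLE has no simple way to report a plain error and carry on, because the in-memory and on-disk views may already disagree.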

Deletion currently works like:

    if (rename(path, tmppath) != 0)
        ereport(ERROR,
                (errcode_for_file_access(),
                 errmsg("could not rename \"%s\" to \"%s\": %m",
                        path, tmppath)));

    /* make sure no partial state is visible after a crash */
    fsync_fname(tmppath, false);
    fsync_fname("pg_replslot", true);

    if (!rmtree(tmppath, true))
    {
        ereport(ERROR,
                (errcode_for_file_access(),
                 errmsg("could not remove directory \"%s\": %m",
                        tmppath)));
    }

If we fail between the rename() and the fsync_fname() we don't really
know which state we are in. We'd also have to add code to handle
incomplete slot directories, which currently only exists for startup, to
other places.
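
As a rough sketch of what "handling incomplete slot directories" could look like outside of startup, the following scans a directory for leftovers of an interrupted deletion. The ".tmp" suffix and the names is_leftover_name and count_leftovers are assumptions made for this example, not taken from the patch, and the sketch only detects leftovers rather than removing them.

```c
#include <dirent.h>
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if "name" ends with the assumed ".tmp" suffix that marks a
 * directory renamed for deletion but not yet removed, else 0.
 */
static int
is_leftover_name(const char *name)
{
    static const char suffix[] = ".tmp";
    size_t nlen = strlen(name);
    size_t slen = sizeof(suffix) - 1;

    return nlen > slen && strcmp(name + nlen - slen, suffix) == 0;
}

/*
 * Count leftover entries in "dir_path".
 * Returns -1 if the directory cannot be opened.
 */
static int
count_leftovers(const char *dir_path)
{
    DIR *dir = opendir(dir_path);
    struct dirent *de;
    int found = 0;

    if (dir == NULL)
        return -1;

    while ((de = readdir(dir)) != NULL)
    {
        /* A real cleanup pass would remove the entry and fsync dir_path. */
        if (is_leftover_name(de->d_name))
            found++;
    }

    closedir(dir);
    return found;
}
```

Every code path that can observe a slot directory would need a check along these lines, which is the extra complexity referred to above.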

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


