Re: PATCH: track last known XLOG segment in control file

From Andres Freund
Subject Re: PATCH: track last known XLOG segment in control file
Date
Msg-id 20151212223948.GS14789@awork2.anarazel.de
In response to Re: PATCH: track last known XLOG segment in control file  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: PATCH: track last known XLOG segment in control file  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-hackers
On 2015-12-12 23:28:33 +0100, Tomas Vondra wrote:
> On 12/12/2015 11:20 PM, Andres Freund wrote:
> >On 2015-12-12 22:14:13 +0100, Tomas Vondra wrote:
> >>this is the second improvement proposed in the thread [1] about ext4 data
> >>loss issue. It adds another field to control file, tracking the last known
> >>WAL segment. This does not eliminate the data loss, just the silent part of
> >>it when the last segment gets lost (due to forgetting the rename, deleting
> >>it by mistake or whatever). The patch makes sure the cluster refuses to
> >>start if that happens.
> >
> >Uh, that's fairly expensive. In many cases it'll significantly
> >increase the number of fsyncs.
> 
> It should do exactly 1 additional fsync per WAL segment. Or do you think
> otherwise?

Which is nearly doubling the number of fsyncs, for a good number of
workloads. And it does so to a separate file, i.e. it's not like these
writes and the flushes can be combined. In workloads where pg_xlog is on
a separate partition it'll add the only source of fsyncs besides
checkpoint to the main data directory.
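
To put rough numbers on the "nearly doubling" point, here is a back-of-the-envelope model. It is only a sketch: it assumes the patch adds exactly one control-file fsync per WAL segment (as stated above), and the function names and the flushes-per-segment parameter are made up for illustration, not taken from the patch.

```python
def fsync_counts(segments, wal_flushes_per_segment):
    """Return (baseline, patched) fsync totals for a toy workload model."""
    # Baseline: only the WAL file itself is flushed.
    baseline = segments * wal_flushes_per_segment
    # Assumption: one extra fsync of the control file per WAL segment.
    # It goes to a separate file, so it cannot be merged with a WAL flush.
    patched = baseline + segments
    return baseline, patched


def overhead_ratio(segments, wal_flushes_per_segment):
    """Ratio of patched to baseline fsync counts."""
    baseline, patched = fsync_counts(segments, wal_flushes_per_segment)
    return patched / baseline


if __name__ == "__main__":
    # Workload that flushes WAL about once per segment (e.g. bulk load).
    print(overhead_ratio(100, 1))
    # Workload with many small commits per segment.
    print(overhead_ratio(100, 1000))
```

In this model the relative cost depends entirely on how often the workload already flushes WAL within a segment: with about one flush per segment the count doubles, while with many commits per segment the extra fsync is a small fraction.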

> > I've a bit of a hard time believing this'll be worthwhile.
> 
> The trouble is protections like this only seem worthwhile after the fact,
> when something happens. I think it's reasonable protection against issues
> similar to the one I reported ~2 weeks ago. YMMV.

Meh. That argument can be used to justify just about anything.

Obviously we should be more careful about fsyncing files, including the
directories. I do plan to come back to your recent patch.

> > Additionally this doesn't seem to take WAL replay into account?
> 
> I think the comparison in StartupXLOG needs to be less strict, to allow
> cases when we actually replay more WAL segments. Is that what you mean?

What I mean is that the value isn't updated during recovery, afaics. You
could argue that minRecoveryPoint is that, in a way.
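
A toy illustration of the recovery concern, under loud assumptions: segments are modelled as plain integers, the control-file field is a single number, and both check functions are hypothetical stand-ins rather than the actual comparison in StartupXLOG.

```python
def strict_check(last_known, present):
    """Strict variant: the newest segment on disk must be the recorded one."""
    return max(present, default=-1) == last_known


def relaxed_check(last_known, present):
    """Relaxed variant: the newest segment on disk must be at least the recorded one."""
    return max(present, default=-1) >= last_known


if __name__ == "__main__":
    # Recovery replayed segments 5 and 6, but the recorded value stayed at 4
    # because (by assumption) it is not updated during recovery.
    after_recovery = {3, 4, 5, 6}
    print(strict_check(4, after_recovery), relaxed_check(4, after_recovery))

    # The recorded segment 5 is missing from disk: both variants refuse.
    lost_segment = {3, 4}
    print(strict_check(5, lost_segment), relaxed_check(5, lost_segment))
```

With a stale recorded value the strict comparison rejects a perfectly healthy cluster, which is why it would need loosening; the relaxed form still catches the case where the recorded segment is gone.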

Greetings,

Andres Freund


