Re: fsync-pgdata-on-recovery tries to write to more files than previously

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: fsync-pgdata-on-recovery tries to write to more files than previously
Дата
Msg-id 20150526170720.GB5310@alap3.anarazel.de
обсуждение исходный текст
Ответ на Re: fsync-pgdata-on-recovery tries to write to more files than previously  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: fsync-pgdata-on-recovery tries to write to more files than previously  (Andres Freund <andres@anarazel.de>)
Re: fsync-pgdata-on-recovery tries to write to more files than previously  (Andres Freund <andres@anarazel.de>)
Re: fsync-pgdata-on-recovery tries to write to more files than previously  (Abhijit Menon-Sen <ams@2ndQuadrant.com>)
Список pgsql-hackers
On 2015-05-26 10:41:12 -0400, Tom Lane wrote:
> Yeah.  Perhaps I missed it, but was the original patch motivated by
> actual failures that had been seen in the field, or was it just a
> hypothetical concern?

I'd mentioned that it might be a good idea to do this while
investingating a bug with unlogged tables that several parties had
reported. That patch has been fixed in a more granular fashion. From
there it took on kind of a life on its own.

It is somewhat interesting that similar code has been used in
pg_upgrade, via initdb -S, for a while now, without, to my knowledge, it
causing reported problem. I think the relevant difference is that that
code doesn't follow symlinks.  It's obviously also less exercised and
poeople might just have fixed up permissions when encountering troubles.

Abhijit, do you recall why the code was changed to follow all symlinks
in contrast to explicitly going through the tablespaces as initdb -S
does? I'm pretty sure early versions of the patch pretty much had a
verbatim copy of the initdb logic?  That logic is missing pg_xlog btw,
which is bad for pg_upgrade.



I *can* reproduce corrupted clusters without this without trying too
hard. I yesterday wrote a test for the crash testing infrastructure I've
on and off worked on (since 2011. Uhm) and I could reproduce a bunch of
corrupted clusters without this patch.  When randomly triggering crash
restarts shortly afterwards follwed by a simulated hw restart (stop
accepting writes via linux device mapper) while concurrent COPYs are
running, I can trigger a bunch of data corruptions.

Since then my computer in berlin has done 440 testruns with the patch,
and 450 without. I've gotten 7 errors without, 0 with. But the
instrumentation for detecting errors is really shitty (pkey lookup for
every expected row) and doesn't yet keep meaningful diagnosistics. So
this isn't particularly bulletproof either way.

I can't tell whether the patch is just masking yet another problem due
to different timings caused by the fsync, or whether it's fixing the
problem that we can forget to sync WAL segments.

Greetings,

Andres Freund



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: fsync-pgdata-on-recovery tries to write to more files than previously
Следующее
От: Andres Freund
Дата:
Сообщение: Re: fsync-pgdata-on-recovery tries to write to more files than previously