Re: Protecting against unexpected zero-pages: proposal

From Greg Stark
Subject Re: Protecting against unexpected zero-pages: proposal
Date
Msg-id AANLkTi=p_p2_QPbtHVVcUQzPk7LDwiWr7ixxxW81pTQz@mail.gmail.com
In reply to Re: Protecting against unexpected zero-pages: proposal  (Gurjeet Singh <singh.gurjeet@gmail.com>)
Responses Re: Protecting against unexpected zero-pages: proposal  (Aidan Van Dyk <aidan@highrise.ca>)
List pgsql-hackers
On Sun, Nov 7, 2010 at 4:23 AM, Gurjeet Singh <singh.gurjeet@gmail.com> wrote:
> I understand that it is a pretty low-level change, but IMHO the change is
> minimal and is being applied in well understood places. All the assumptions
> listed have been effective for quite a while, and I don't see these
> assumptions being affected in the near future. Most crucial assumptions we
> have to work with are, that XLogPtr{n, 0xFFFFFFFF} will never be used, and
> that mdextend() is the only place that extends a relation (until we
> implement an md.c sibling, say flash.c or tape.c; the last change to md.c
> regarding mdextend() was in January 2007).

I think the assumption that isn't tested here is what happens if the
server crashes. The logic may work fine as long as nothing goes wrong,
but if something does, it has to be fool-proof.

I think having zero-filled blocks at the end of the file if it has
been extended but hasn't been fsynced is an expected failure mode of a
number of filesystems. The log replay can't assume seeing such a block
is a problem since that may be precisely the result of the crash that
caused the replay. And if you disable checking for this during WAL
replay then you've lost your main chance to actually detect the
problem.
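
As a rough illustration of the ambiguity described above, here is a minimal
sketch in C. It is not PostgreSQL code: SKETCH_BLCKSZ, page_is_all_zeros(),
classify_page() and the page_verdict enum are hypothetical names invented for
this example, and the 8192-byte page size is an assumption. The point it tries
to show is that an all-zero page looks identical whether it came from an
un-fsynced extension or from corruption, so the only thing a checker can vary
is whether it is running during replay.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption for this sketch: an 8 kB page, not read from any real config. */
#define SKETCH_BLCKSZ 8192

/* Hypothetical verdicts a checker could return for one page. */
typedef enum
{
    PAGE_OK,             /* page has some non-zero content */
    PAGE_ZERO_EXPECTED,  /* all zeros, seen during replay: may be crash fallout */
    PAGE_ZERO_SUSPECT    /* all zeros, seen in normal operation: worth reporting */
} page_verdict;

/* Returns true if every byte in the page buffer is zero. */
static bool
page_is_all_zeros(const unsigned char *page, size_t len)
{
    assert(page != NULL);
    for (size_t i = 0; i < len; i++)
    {
        if (page[i] != 0)
            return false;
    }
    return true;
}

/*
 * Hypothetical policy: the page contents alone cannot distinguish a crash
 * artifact from corruption, so the verdict depends only on in_recovery.
 * Tolerating zero pages during replay is exactly what gives up the chance
 * to detect the problem there.
 */
static page_verdict
classify_page(const unsigned char *page, size_t len, bool in_recovery)
{
    if (!page_is_all_zeros(page, len))
        return PAGE_OK;
    return in_recovery ? PAGE_ZERO_EXPECTED : PAGE_ZERO_SUSPECT;
}
```

Calling classify_page() on the same zeroed buffer with in_recovery set to true
and then false yields two different verdicts for byte-identical input, which
is the weakness in a nutshell.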

Another issue -- though I think a manageable one -- is that I expect
we'll want to be using posix_fallocate() sometime soon. That will
allow efficient guaranteed pre-allocated space with better contiguous
layout than currently. But ext4 can only pretend to give zero-filled
blocks, not any random bitpattern we request. I can see this being an
optional feature that is just not compatible with using
posix_fallocate() though.
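
For readers unfamiliar with the call, a small sketch of the incompatibility
follows. It is only an illustration under stated assumptions, not how md.c
extends relations: SKETCH_BLCKSZ, preallocate_blocks() and
block_reads_as_zeros() are hypothetical names, and the 8192-byte block size
is assumed. posix_fallocate() itself is the real POSIX call; it takes a file
descriptor, an offset and a length, and has no parameter for a fill pattern.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumption for this sketch: an 8 kB block. */
#define SKETCH_BLCKSZ 8192

/*
 * Extend the file by nblocks blocks past its current end.
 * nblocks must be greater than zero. Returns 0 on success or an error
 * number; posix_fallocate() returns the error directly rather than
 * setting errno.
 */
static int
preallocate_blocks(int fd, off_t nblocks)
{
    struct stat st;

    assert(nblocks > 0);
    if (fstat(fd, &st) != 0)
        return errno;
    return posix_fallocate(fd, st.st_size, nblocks * (off_t) SKETCH_BLCKSZ);
}

/*
 * Read one block back and report whether it is entirely zero.
 * Returns false on a short read or a read error as well.
 */
static bool
block_reads_as_zeros(int fd, off_t blockno)
{
    unsigned char buf[SKETCH_BLCKSZ];
    ssize_t n = pread(fd, buf, sizeof buf, blockno * (off_t) SKETCH_BLCKSZ);

    if (n != (ssize_t) sizeof buf)
        return false;
    for (size_t i = 0; i < sizeof buf; i++)
    {
        if (buf[i] != 0)
            return false;
    }
    return true;
}
```

After preallocate_blocks() succeeds, the new blocks should read back as
zeros, so a scheme that relies on freshly extended pages carrying a non-zero
marker would need an extra write pass over the preallocated range.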

It does seem like this is kind of part and parcel of adding checksums
to blocks. It's arguably kind of silly to add checksums to blocks but
have a commonly produced bitpattern in corruption cases go
undetected.

-- 
greg

