Re: Torn page hazard in ginRedoUpdateMetapage()

From Tom Lane
Subject Re: Torn page hazard in ginRedoUpdateMetapage()
Date
Msg-id 11920.1336018594@sss.pgh.pa.us
In response to Re: Torn page hazard in ginRedoUpdateMetapage()  (Daniel Farina <daniel@heroku.com>)
Responses Re: Torn page hazard in ginRedoUpdateMetapage()  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Daniel Farina <daniel@heroku.com> writes:
> On Wed, May 2, 2012 at 6:06 PM, Noah Misch <noah@leadboat.com> wrote:
>> Can we indeed assume that all support-worthy filesystems align the start of
>> every file to a physical sector?  I know little about modern filesystem
>> design, but these references leave me wary of that assumption:
>> 
>> http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg14690.html
>> http://en.wikipedia.org/wiki/Block_suballocation
>> 
>> If it is a safe assumption, we could exploit it elsewhere.

> Not to say whether this is safe or not, but it *is* exploited
> elsewhere, as I understand it: the pg_control information, whose
> justification for its safety is its small size.  That may point to a
> very rare problem with pg_control rather than the safety of the assumption
> it makes.

I think it's somewhat common now for filesystems to attempt to optimize
very small files (on the order of a few dozen bytes) in that way.  It's
hard to see what the upside would be of changing the conventional storage
allocation when the file is sector-sized or larger; the file system does
have to be prepared to rewrite the file on demand, and moving it from
one place to another isn't cheap.
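
To make the hazard under discussion concrete, here is a minimal sketch of
when a small write can touch more than one physical sector.  The function
name, the 512-byte sector size, and the offsets in the usage note are all
assumptions for illustration; nothing here is taken from the PostgreSQL
source or from any particular filesystem.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true if a write of len bytes starting at
 * physical byte offset phys_off touches more than one sector.
 * sector_size is an assumed atomic-write unit; real devices may differ.
 */
bool
write_spans_sectors(uint64_t phys_off, uint64_t len, uint64_t sector_size)
{
    assert(sector_size > 0 && len > 0);

    uint64_t first = phys_off / sector_size;             /* sector of first byte */
    uint64_t last = (phys_off + len - 1) / sector_size;  /* sector of last byte */

    return first != last;
}
```

With a 512-byte sector, a 512-byte write at a sector-aligned offset such as
0 stays within one sector, while the same write at offset 100 spans two.
That second case is roughly what could happen if a filesystem packed a small
file at an unaligned physical location.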

That Wikipedia reference argues for doing this type of optimization on
the last partial block of a file, which is entirely irrelevant for our
purposes since we always ask for page-multiples of space.  (The fact
that much of that might be useless padding is, I think, unknown to the
filesystem.)
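
The arithmetic behind that point can be sketched as follows, assuming the
default 8192-byte page size and a filesystem block size chosen by the
caller.  The macro and function names are hypothetical and exist only for
this example.

```c
#include <assert.h>
#include <stdint.h>

#define ASSUMED_BLCKSZ 8192     /* assumed page size; PostgreSQL's default */

/*
 * Hypothetical helper: number of bytes left over in the last, partially
 * filled filesystem block of a file that is exactly npages pages long.
 * Zero means there is no partial tail block to suballocate.
 */
uint64_t
partial_tail_bytes(uint64_t npages, uint64_t fs_block_size)
{
    assert(fs_block_size > 0);
    return (npages * ASSUMED_BLCKSZ) % fs_block_size;
}
```

For block sizes that divide the page size (1024 or 4096, say) the result is
always zero, so tail suballocation has nothing to act on.  The argument does
depend on that divisibility: with a hypothetical 16384-byte filesystem block,
a one-page file would leave an 8192-byte partial tail.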

Having said all that, I wasn't really arguing that this was a guaranteed
safe thing for us to rely on; just pointing out that it's quite likely
that the issue hasn't been seen in the field because of this type of
consideration.
        regards, tom lane

