Re: Initdb-time block size specification

From Andres Freund
Subject Re: Initdb-time block size specification
Date
Msg-id 20230630225909.ecthnlfvlnk3ij2k@awork3.anarazel.de
In response to Re: Initdb-time block size specification (Bruce Momjian <bruce@momjian.us>)
Responses Re: Initdb-time block size specification (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-hackers
On 2023-06-30 18:37:39 -0400, Bruce Momjian wrote:
> On Sat, Jul  1, 2023 at 12:21:03AM +0200, Tomas Vondra wrote:
> > On 6/30/23 23:53, Bruce Momjian wrote:
> > > For a 4kB write, to say it is not partially written would be to require
> > > the operating system to guarantee that the 4kB write is not split into
> > > smaller writes which might each be atomic because smaller atomic writes
> > > would not help us.
> > 
> > Right, that's the dance we do to protect against torn pages. But Andres
> > suggested that if you have modern storage and configure it correctly,
> > writing with 4kB pages would be atomic. So we wouldn't need to do this
> > FPI stuff, eliminating pretty significant source of write amplification.
> 
> I agree the hardware is atomic for 4k writes, but do we know the OS
> always issues 4k writes?

When using a sector size of 4K you *can't* make smaller writes via normal
paths. The addressing unit is in sectors. The details obviously differ between
storage protocols, but you pretty much always just specify a start sector and a
number of sectors to be operated on.
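
To illustrate the addressing point, here is a minimal sketch of how a byte
range maps onto a (start sector, sector count) request. The function name
`to_sector_request` and its parameters are hypothetical and do not correspond
to any kernel or protocol API; the 4096-byte default is just the sector size
assumed in this discussion.

```python
from typing import Tuple


def to_sector_request(offset: int, length: int,
                      sector_size: int = 4096) -> Tuple[int, int]:
    """Translate a byte range into a (start_sector, sector_count) pair.

    Hypothetical helper: a request can only name whole sectors, so any
    range that is not sector-aligned cannot be expressed at all.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    if offset % sector_size or length % sector_size:
        raise ValueError("byte range is not expressible in whole sectors")
    return offset // sector_size, length // sector_size
```

Under these assumptions, `to_sector_request(8192, 4096)` describes one sector
starting at sector 2, while `to_sector_request(8192, 512)` raises an error
because a 512-byte write has no representation with 4K sectors.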

Obviously the kernel could read 4k, modify 512 bytes in-memory, and then write
4k back, but that shouldn't be a danger here.  There might also be debug
interfaces to allow reading/writing in different increments, but that'd not be
something happening during normal operation.
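
The read-modify-write case can be sketched with a toy in-memory device. This
is only a model under stated assumptions: `SectorDevice` and `update_bytes`
are made-up names, and the "device" is a bytearray that rejects anything but
whole-sector writes. It shows that a 512-byte change still reaches the device
as a single full-sector write.

```python
SECTOR_SIZE = 4096  # assumed logical sector size


class SectorDevice:
    """Toy device that only accepts whole-sector reads and writes."""

    def __init__(self, num_sectors: int) -> None:
        self.data = bytearray(num_sectors * SECTOR_SIZE)
        self.writes = []  # log of (start_sector, sector_count) per write

    def read_sectors(self, start: int, count: int) -> bytes:
        begin = start * SECTOR_SIZE
        return bytes(self.data[begin:begin + count * SECTOR_SIZE])

    def write_sectors(self, start: int, payload: bytes) -> None:
        if len(payload) == 0 or len(payload) % SECTOR_SIZE:
            raise ValueError("payload must be a whole number of sectors")
        begin = start * SECTOR_SIZE
        self.data[begin:begin + len(payload)] = payload
        self.writes.append((start, len(payload) // SECTOR_SIZE))


def update_bytes(dev: SectorDevice, offset: int, new_bytes: bytes) -> None:
    """Change a sub-sector byte range using only whole-sector I/O."""
    if not new_bytes:
        raise ValueError("nothing to write")
    first = offset // SECTOR_SIZE
    last = (offset + len(new_bytes) - 1) // SECTOR_SIZE
    # Read every sector the range touches, patch it in memory, write it back.
    buf = bytearray(dev.read_sectors(first, last - first + 1))
    rel = offset - first * SECTOR_SIZE
    buf[rel:rel + len(new_bytes)] = new_bytes
    dev.write_sectors(first, bytes(buf))
```

For example, `update_bytes(dev, 4096 + 512, b"\x01" * 512)` on a fresh
`SectorDevice(4)` leaves `dev.writes` as `[(1, 1)]`: one write of one full
sector, never a 512-byte write.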


