Re: Postgres, fsync, and OSs (specifically linux)

From: Thomas Munro
Subject: Re: Postgres, fsync, and OSs (specifically linux)
Msg-id: CAEepm=3VioiGiNaUNCPZoZB63GAKkdVN-LyHE0Os1Hh+mu5Psw@mail.gmail.com
In reply to: Re: Postgres, fsync, and OSs (specifically linux)  (Simon Riggs <simon@2ndquadrant.com>)
Responses: Re: Postgres, fsync, and OSs (specifically linux)  (Thomas Munro <thomas.munro@enterprisedb.com>)
List: pgsql-hackers
On Sun, Apr 29, 2018 at 10:42 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 28 April 2018 at 09:15, Andres Freund <andres@anarazel.de> wrote:
>> On 2018-04-28 08:25:53 -0700, Simon Riggs wrote:
>>> The people I've spoken to so far have encouraged us to continue
>>> working with the filesystem layer, offering encouragement of our
>>> decision to use filesystems.
>>
>> There's a lot of people disagreeing with it too.
>
> Specific recent verbal feedback from OpenLDAP was that the project
> adopted DIO and found no benefit in doing so, with regret the other
> way from having tried.

I'm not sure if OpenLDAP is really comparable.  The big three RDBMSs +
MySQL started like us and eventually switched to direct IO, I guess at
a time when direct IO support matured in OSs and their own IO
scheduling was thought to be superior.  I'm pretty sure they did that
because they didn't like wasting RAM on double buffering and had
better ideas about IO scheduling.  From some googling this morning:

DB2: The Linux/Unix/Windows edition changed its default to DIO ("NO
FILESYSTEM CACHING") in release 9.5 in 2007[1], but it can still do
buffered IO if you ask for it.

Oracle: Around the same time or earlier, in the Linux 2.4 era, Oracle
apparently supported direct IO ("FILESYSTEMIO_OPTIONS = DIRECTIO" (or
SETALL for DIRECTIO + ASYNCH)) on big iron Unix but didn't yet use it
on Linux[2].  There were some amusing emails from Linus Torvalds on
this topic[3].  I'm not sure what the default value of
FILESYSTEMIO_OPTIONS is on each operating system today, or when it
changed; is it SETALL everywhere by now?  I wonder if they stuck with
buffered IO for a time on Linux, despite the availability of direct
IO, because they thought it was more reliable or more performant.

SQL Server: I couldn't find any evidence that they've even kept the
option to use buffered IO (which must have existed in the ancestral
code base).  Can it still do buffered IO?  It's a different situation
though, since it targets a reduced set of platforms.

MySQL: The default is still buffered ("innodb_flush_method = fsync" as
opposed to "O_DIRECT") but O_DIRECT is supported and widely
recommended, so it sounds like it's usually a win.  Maybe not on
smaller systems though?

On MySQL, however, there are anecdotal reports of performance
suffering on some systems when you turn on O_DIRECT.  If that's true,
it's interesting to speculate about why that might be, as it would
probably also apply to us in early versions (optimistic explanation:
the kernel's stretchy page cache allows people to get away with a
poorly tuned buffer pool size?  pessimistic explanation: MySQL's own
page reclamation or IO scheduling (asynchronous write-back, write
clustering, read-ahead etc) is not as good as the OS's, but that
effect is hidden by a suitably powerful disk subsystem with its own
magic caching?)  Note that its O_DIRECT setting *also* calls fsync()
to flush filesystem meta-data (necessary if the file was extended); I
wonder if that is exposed to write-back error loss.
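
To make the "direct IO write, then fsync() because the file was
extended" pattern concrete, here is a minimal C sketch for Linux.  It
is only an illustration under stated assumptions: the helper name
write_block_direct, the 4096-byte BLOCK_SIZE, and the fallback to
buffered IO when the filesystem rejects O_DIRECT are all hypothetical,
and this is not a claim about how MySQL or any other system actually
implements it.

```c
#define _GNU_SOURCE             /* exposes O_DIRECT on Linux/glibc */
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Assumed alignment; real code should ask the device/filesystem. */
#define BLOCK_SIZE 4096

/*
 * Hypothetical helper: write one zero-filled block at block number
 * `blockno` using direct IO, then fsync() so that metadata changes
 * (such as a new file size) are flushed as well.
 * Returns 0 on success, -1 on any failure.
 */
int write_block_direct(const char *path, off_t blockno)
{
    void *buf = NULL;
    int ok = 0;

    int fd = open(path, O_WRONLY | O_CREAT | O_DIRECT, 0600);
    /* Assumption: some filesystems reject O_DIRECT with EINVAL,
     * so fall back to buffered IO for the sake of the example. */
    if (fd < 0 && errno == EINVAL)
        fd = open(path, O_WRONLY | O_CREAT, 0600);
    if (fd < 0)
        return -1;

    /* O_DIRECT needs an aligned buffer, offset and length. */
    if (posix_memalign(&buf, BLOCK_SIZE, BLOCK_SIZE) != 0) {
        close(fd);
        return -1;
    }
    memset(buf, 0, BLOCK_SIZE);

    /* The data bypasses the page cache, but the file may have been
     * extended, so fsync() is still called.  Any fsync() error is
     * treated as a failed write. */
    if (pwrite(fd, buf, BLOCK_SIZE, blockno * BLOCK_SIZE) == BLOCK_SIZE &&
        fsync(fd) == 0)
        ok = 1;

    free(buf);
    if (close(fd) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Calling write_block_direct("datafile", 0) and then
write_block_direct("datafile", 1) should leave a file of two blocks.
The open question above is what a caller can safely conclude if that
trailing fsync() reports an error.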

[1] https://www.ibm.com/support/knowledgecenter/en/SSEPGG_9.5.0/com.ibm.db2.luw.admin.dbobj.doc/doc/c0051304.html
[2] http://www.ixora.com.au/notes/direct_io.htm
[3] https://lkml.org/lkml/2002/5/11/58

-- 
Thomas Munro
http://www.enterprisedb.com

