>>>>> "Thomas" == Thomas Munro <thomas.munro@enterprisedb.com> writes:
Thomas> * it's also been claimed that readahead heuristics are not
Thomas> defeated on Linux or FreeBSD, which isn't too surprising
Thomas> because you'd expect it to be about blocks being faulted in,
Thomas> not syscalls
I don't know about Linux, but on FreeBSD, readahead/writebehind is
tracked at the level of open files but implemented at the level of
read/write clustering. I have patched kernels in the past to improve the
performance in mixed read/write cases; pg would benefit on unpatched
kernels from using separate file opens for backend reads and writes.
(The typical bad scenario is doing a create index, or other seqscan that
updates hint bits, on a freshly-restored table; the alternation of
reading block N and writing block N-x destroys the readahead/writebehind
since they use a common offset.)
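
A minimal sketch of what "separate file opens for reads and writes" could
look like in a backend. The struct and function names here (rel_fds,
rel_open, rel_read_block, rel_write_block, rel_close) are hypothetical and
not taken from the PostgreSQL source; only open(), pread(), pwrite() and
close() are real calls. Whether this actually helps depends on the kernel
keeping its sequential-access state per open file, as described above.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define BLCKSZ 8192             /* PostgreSQL's default block size */

/* Hypothetical holder: two independent opens of the same file. */
struct rel_fds {
    int rfd;                    /* used only for reads */
    int wfd;                    /* used only for writes */
};

/* Open path twice, once per direction; returns 0 on success, -1 on error. */
static int rel_open(struct rel_fds *r, const char *path)
{
    r->rfd = open(path, O_RDONLY);
    if (r->rfd < 0)
        return -1;
    r->wfd = open(path, O_WRONLY);
    if (r->wfd < 0) {
        close(r->rfd);
        return -1;
    }
    return 0;
}

/* Reads go through rfd only, so only reads touch its per-file state. */
static ssize_t rel_read_block(const struct rel_fds *r, off_t blkno, void *buf)
{
    return pread(r->rfd, buf, BLCKSZ, blkno * BLCKSZ);
}

/* Writes go through wfd only, for the same reason. */
static ssize_t rel_write_block(const struct rel_fds *r, off_t blkno,
                               const void *buf)
{
    return pwrite(r->wfd, buf, BLCKSZ, blkno * BLCKSZ);
}

static void rel_close(struct rel_fds *r)
{
    close(r->rfd);
    close(r->wfd);
}
```

A seqscan that sets hint bits would then call rel_read_block() for block N
and rel_write_block() for block N-x, so each descriptor sees a purely
sequential stream of offsets.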
The code that detects sequential behavior cannot distinguish between
pread() and lseek+read; it looks only at the actual offset of the
current request compared to the previous one for the same fp.
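
To make the effect concrete, here is a toy model of an offset-based
sequential detector. This is an illustration only, not the FreeBSD code:
the names (toy_file, toy_access, toy_scan) and the simple "reset to zero on
any non-sequential request" rule are assumptions. The detector's only
inputs are the offset and length of each request, so a pread() and an
lseek+read at the same offset look identical to it.

```c
#include <stddef.h>
#include <sys/types.h>

#define BLCKSZ 8192             /* PostgreSQL's default block size */

/* Hypothetical per-open-file state; not the kernel's actual structure. */
struct toy_file {
    off_t next_offset;          /* where the previous request ended */
    int   seq_count;            /* length of the current sequential run */
};

/*
 * Record one request.  Only the offset matters: how the caller arrived at
 * it (pread, or lseek followed by read) is invisible here.
 */
static int toy_access(struct toy_file *f, off_t offset, size_t len)
{
    if (offset == f->next_offset)
        f->seq_count++;         /* continues where the last request ended */
    else
        f->seq_count = 0;       /* a jump: the sequential run is lost */
    f->next_offset = offset + (off_t) len;
    return f->seq_count;
}

/*
 * Scan nblocks blocks, reading block n and then writing block n - lag.
 * rd and wr may point to the same toy_file (one shared open) or to two
 * different ones (separate opens).  Returns the run length seen by the
 * final read.
 */
static int toy_scan(struct toy_file *rd, struct toy_file *wr,
                    int nblocks, int lag)
{
    int run = 0;

    for (int n = 0; n < nblocks; n++) {
        run = toy_access(rd, (off_t) n * BLCKSZ, BLCKSZ);
        if (n >= lag)
            toy_access(wr, (off_t) (n - lag) * BLCKSZ, BLCKSZ);
    }
    return run;
}
```

With one shared toy_file, the run length drops to zero as soon as the
writes start and never recovers; with two separate toy_file structs, the
reader's run length keeps growing for the whole scan.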
Thomas> +1 for adopting pread()/pwrite() in PG12.
ditto
--
Andrew (irc:RhodiumToad)