Jonah H. Harris wrote:
> fadvise is a kludge.
I don't think it's a kludge at all. posix_fadvise() is a pretty nice and
clean interface for hinting to the kernel which pages you're going to
access in the near future. I can't immediately come up with a cleaner
interface to do that.
Compared to async I/O, it's a helluva lot simpler to add a few
posix_fadvise() calls to an application than to switch to a completely
different paradigm. And because posix_fadvise() is just a hint, the OS
can prioritize accordingly, whereas all async I/O requests look the
same.
> While it will help, it still makes us completely
> reliant on the OS.
That's not a bad thing in my opinion. The OS knows the I/O hardware,
disk layout, utilization, and so forth, and is in a much better position
to do I/O scheduling than a user process. The only advantage a user
process has is that it knows better what pages it's going to need, and
posix_fadvise() is a good interface to let the user process tell the
kernel that.
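
To make that concrete, here is a minimal sketch of the idea, not actual
PostgreSQL code. The helper names (prefetch_blocks, read_block) and the
8192-byte BLOCK_SIZE are assumptions made up for illustration; the only
real interfaces used are posix_fadvise() with POSIX_FADV_WILLNEED and
pread(). It assumes a platform that provides posix_fadvise(), such as
Linux.

```c
#define _XOPEN_SOURCE 600   /* expose posix_fadvise() and pread() */
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK_SIZE 8192     /* assumed block size for this sketch */

/*
 * Hypothetical helper: hint the kernel that the listed blocks will be
 * read soon. Returns 0 on success, or the first nonzero error number
 * from posix_fadvise() (it returns the error directly rather than
 * setting errno).
 */
int
prefetch_blocks(int fd, const long *blocks, size_t nblocks)
{
    assert(blocks != NULL || nblocks == 0);

    for (size_t i = 0; i < nblocks; i++)
    {
        int rc = posix_fadvise(fd, (off_t) blocks[i] * BLOCK_SIZE,
                               BLOCK_SIZE, POSIX_FADV_WILLNEED);

        if (rc != 0)
            return rc;
    }
    return 0;
}

/*
 * Hypothetical helper: read one block with an ordinary synchronous
 * pread(). The read path is the same whether or not a hint was issued.
 */
ssize_t
read_block(int fd, long blockno, char *buf)
{
    return pread(fd, buf, BLOCK_SIZE, (off_t) blockno * BLOCK_SIZE);
}
```

The caller would pass the whole batch of block numbers it knows it will
need to prefetch_blocks() first, then fetch them one by one with
read_block(). The reads stay plain synchronous calls; the kernel is free
to act on the hints, reorder them, or ignore them.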
> IIRC, we currently have support for rings in the buffer pool, which we could read
> directly into.
The rings won't help you a bit. They're just a different way to choose
victim buffers.
-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com