On Sunday 21 November 2010 23:19:30 Martijn van Oosterhout wrote:
> For a similar problem we had (kernel buffering too much) we had success
> using the fadvise and madvise DONTNEED syscalls to force the data to
> exit the cache much sooner than it would otherwise. This was on Linux
> and it had the side-effect that the data was deleted from the kernel
> cache, which we wanted, but probably isn't appropriate here.
Yep, that works fine, although it has the drawback that the data has to be
read back in again if archiving/SR is enabled.
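For reference, a minimal sketch of the fadvise variant, assuming a POSIX system
with posix_fadvise(); drop_from_cache() is a hypothetical helper name used only
for illustration. DONTNEED only drops clean pages, hence the fdatasync() first:

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: write out a file's dirty pages and then hint to the
 * kernel that its cached pages are no longer needed.
 * Returns 0 on success, -1 if fdatasync() fails (errno is set), or a
 * positive error number if posix_fadvise() fails.
 */
int
drop_from_cache(int fd)
{
    assert(fd >= 0);

    /* Dirty pages are not discarded by DONTNEED, so flush them first. */
    if (fdatasync(fd) != 0)
        return -1;

    /* offset 0, len 0 means "from the start to the end of the file". */
    return posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
}
```

Note that posix_fadvise() returns the error number directly instead of setting
errno, and that the call is only advice the kernel is free to ignore.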
> There is also sync_file_range, but that's Linux-specific, although
> close to what you want I think. It would allow you to work with blocks
> smaller than 1GB.
Unfortunately that puts the data under quite high write-out pressure inside
the kernel, which is not what you actually want because it significantly
limits reordering and such.
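To make the discussion concrete, a rough sketch of what such a call looks like,
assuming Linux with glibc; start_writeback() is a hypothetical wrapper name, and
the choice of SYNC_FILE_RANGE_WRITE alone (without the WAIT flags) is my
assumption about the intended usage:

```c
#define _GNU_SOURCE             /* needed for sync_file_range() */
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: ask the kernel to start writing out the dirty pages
 * in [offset, offset + nbytes) without waiting for the I/O to finish.
 * Returns 0 on success, -1 on error with errno set.
 */
int
start_writeback(int fd, off_t offset, off_t nbytes)
{
    assert(fd >= 0);

    /*
     * SYNC_FILE_RANGE_WRITE only initiates write-out; it gives no durability
     * guarantee and does not flush file metadata.
     */
    return sync_file_range(fd, offset, nbytes, SYNC_FILE_RANGE_WRITE);
}
```

Calling that per chunk while writing is what lets you work in units smaller
than a whole segment, but it is exactly this call that pushes the range into
write-out right away.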
It would be nicer if you could get a mix of both semantics (looking at it,
depending on the approach, that seems to be about a 10-line patch to the
kernel), i.e. indicate that you want the pages written soonish, but without
putting them at the head of the writeout queue.
Andres