Jeff Janes wrote:
> Do you know where this competition is happening? Is it on the
> platters, or is it in the hard drive write cache (I thought high-end
> hardware had tagged writes to avoid that), or in the kernel?
>
Kernel. Linux systems with lots of memory will happily queue up
gigabytes of dirty data in their write cache, only getting serious
about writing it out to disk when an fsync demands it.
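
If you want to watch that backlog build up, something like the sketch
below is one way to do it. It assumes a Linux box where /proc/meminfo
reports "Dirty" and "Writeback" in kB; the function names are mine,
purely for illustration.

```python
def dirty_kb(meminfo_text):
    """Extract the Dirty and Writeback values (in kB) from /proc/meminfo text."""
    wanted = {"Dirty", "Writeback"}
    found = {}
    for line in meminfo_text.splitlines():
        # Lines look like "Dirty:              2048 kB"
        name, sep, rest = line.partition(":")
        if sep and name in wanted:
            found[name] = int(rest.split()[0])
    return found


def read_dirty_kb(path="/proc/meminfo"):
    """Convenience wrapper (Linux only): read the live counters from the kernel."""
    with open(path) as f:
        return dirty_kb(f.read())
```

Polling read_dirty_kb() once a second during a checkpoint should show
Dirty climbing and then draining once the fsync calls start.
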
> This makes sense if we just need to append to a queue. But once the
> queue is full and we are about to do a backend fsync, might it make
> sense to do a little more work to look for dups?
>
One of the paths I'd like to follow is experimenting with both sorting
writes by file and looking for duplication in the queues. I think a
basic, simple sync spreading approach needs to get finished first,
though; this sort of thing would then be an optimization on top of it.
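
To make the idea concrete, here is a rough sketch of what I mean by
sorting by file and dropping duplicates. This is not the PostgreSQL
code; FsyncRequest, its fields, and compact_queue are hypothetical
names made up for the example.

```python
from collections import namedtuple

# Hypothetical request record: which segment of which relation file
# needs to be fsync'd. Not the actual PostgreSQL structure.
FsyncRequest = namedtuple("FsyncRequest", ["relfile", "segment"])


def compact_queue(queue):
    """Drop duplicate requests, then order the rest by (relfile, segment)."""
    seen = set()
    unique = []
    for req in queue:
        if req not in seen:  # a repeat request adds nothing; one fsync covers it
            seen.add(req)
            unique.append(req)
    unique.sort()  # tuples sort by relfile first, then segment
    return unique
```

A full queue with lots of repeated entries shrinks to one entry per
file segment, which may free up enough room to skip the backend fsync
altogether.
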
--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services and Support www.2ndQuadrant.us