Re: Should io_method=worker remain the default?
From: Jeff Davis
Subject: Re: Should io_method=worker remain the default?
Msg-id: d2018eee32e211bdfc505862e9ae24b55cec5af0.camel@j-davis.com
In reply to: Re: Should io_method=worker remain the default? (Thomas Munro <thomas.munro@gmail.com>)
Responses: Re: Should io_method=worker remain the default?
List: pgsql-hackers
On Mon, 2025-09-08 at 14:39 +1200, Thomas Munro wrote:
> Some raw thoughts on this topic, and how we got here: This type of
> extreme workload, namely not doing any physical I/O, just copying the
> same data from the kernel page cache to the buffer pool over and over
> again,

Isn't that one of the major selling points of AIO? It does "real
readahead" from kernel buffers into PG buffers ahead of time, so that
the backend doesn't have to do the memcpy and checksum calculation.

The benefit will be even larger when AIO eventually enables effective
Direct IO, so that shared_buffers can be a larger share of system
memory and we don't need to move back and forth between kernel buffers
and PG buffers (and recalculate the checksum).

The only problem right now is that it doesn't (yet) work great at
higher concurrency, because: (a) we don't adapt well to saturated
workers; and (b) there's lock contention on the queue if there are more
workers. I believe both of those problems can be solved in 19.

> is also the workload where io_method=worker can beat
> io_method=io_uring (and other native I/O methods I've prototyped),
> assuming io_workers is increased to a level that keeps up.

Right, when the workers are not saturated, they *increase* the
parallelism because there are more processes doing the work (unless
you run into lock contention).

> didn't matter much before the
> checksums-by-default change went in just a little ahead of basic AIO
> support.

Yeah, the trade-offs are much different when checksums are on vs. off.

> Interesting that it shows up so clearly for Andres but not for you.

When I increase the io_workers count, it does seem to be limited by
lock contention (at least I'm seeing evidence with -DLWLOCK_STATS). I
suppose my case is just below some threshold.

> BTW There are already a couple of levels of signal suppression: if
> workers are not idle then we don't set any latches, and even if we
> did, SetLatch() only sends signals when the recipient is actually
> waiting, which shouldn't happen when the pool is completely busy.

Oh, good to know.

> + nextWakeupWorker = (nextWakeupWorker + 1) % io_workers;
>
> FWIW, I experimented extensively with wakeup distribution schemes

My patch was really just to test the two hypotheses; I wasn't
proposing it. But I was curious whether a simpler scheme might be just
as good, and it looks like you already considered and rejected it.

> I would value your feedback and this type of analysis on the thread
> about automatic tuning for v19.

OK, I will continue the tuning discussion there.

Regarding $SUBJECT: it looks like others are just fine with worker
mode as the default in 18. I have added discussion links to the "no
change" entry in the Open Items list.

I think we'll probably see some of the effects (worker saturation or
lock contention) from my test case appear in real workloads, but
affected users can change to sync mode until we sort these things out
in 19.

Regards,
	Jeff Davis
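P.S. For anyone skimming the thread without the patch handy, here is a
minimal sketch of the round-robin wakeup idea behind the line quoted
above, just to make the hypothesis concrete. The function name, the
latch-array parameter, and the surrounding structure are made up for
illustration; the real worker-pool bookkeeping in the AIO code is
organized differently.

    /* Hypothetical illustration only -- not the actual patch. */
    #include "postgres.h"
    #include "storage/latch.h"

    static int  nextWakeupWorker = 0;

    /*
     * Wake the "next" IO worker in a simple cycle instead of always
     * signalling the same one.  io_worker_latches and nworkers are
     * assumed to come from the worker pool's shared state.
     */
    static void
    WakeupNextIoWorker(Latch **io_worker_latches, int nworkers)
    {
        int     target = nextWakeupWorker;

        nextWakeupWorker = (nextWakeupWorker + 1) % nworkers;

        /*
         * Per Thomas's note above, SetLatch() only sends a signal when
         * the recipient is actually waiting, so waking a busy worker
         * is cheap.
         */
        SetLatch(io_worker_latches[target]);
    }

The point of the sketch is only that the distribution scheme itself is
trivial; whether it helps depends on the saturation and contention
effects discussed above.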