Re: Should io_method=worker remain the default?
From: Andres Freund
Subject: Re: Should io_method=worker remain the default?
Msg-id: adywrhdn5zcfeldzug2pkzbdxizactr46goh4doiztvexrqomy@lgvl26gui63a
In reply to: Should io_method=worker remain the default? (Jeff Davis <pgsql@j-davis.com>)
Responses: Re: Should io_method=worker remain the default?
           Re: Should io_method=worker remain the default?
List: pgsql-hackers
Hi,

On 2025-09-02 23:47:48 -0700, Jeff Davis wrote:
> Has there already been a discussion about leaving the default as
> io_method=worker? There was an Open Item for this, which was closed as
> "Won't Fix", but the links don't explain why as far as I can see.
>
> I tested a concurrent scan-heavy workload (see below) where the data
> fits in memory, and "worker" seems to be 30% slower than "sync" with
> default settings.
>
> Test summary: 32 connections each perform repeated sequential scans.
> Each connection scans a different 1GB partition of the same table. I
> used partitioning and a predicate to make it easier to script in
> pgbench.

32 parallel seq scans of a large relation, with default shared_buffers, fully cached in the OS page cache, seems like a pretty absurd workload. That's not to say we shouldn't spend some effort to avoid regressions for it, but it also doesn't seem worth focusing all that much on. Or is there a real-world scenario this is actually emulating?

I think the regression is not due to anything inherent to worker, but due to pressure on AioWorkerSubmissionQueueLock - at least that's what I'm seeing on an older two-socket machine. It's possible the bottleneck is different on a newer machine (my newer workstation is busy with another benchmark right now).

*If* we actually care about this workload, we can make pgaio_worker_submit_internal() acquire that lock conditionally and perform the IOs synchronously instead. That seems to help here, sufficiently to make worker the same as sync - although plenty of contention remains from "the worker side", which can't just acquire the lock conditionally.

But I'm really not sure doing > 30GB/s of repeated reads from the page cache is a particularly useful thing to optimize. I see a lot of unrelated contention, e.g. on the BufferMappingLock - unsurprising, it's a really extreme workload... If I instead just increase s_b, I get 2x the throughput...

Greetings,

Andres Freund
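For illustration, a minimal sketch of the conditional-acquire fallback described above. LWLockConditionalAcquire() and pgaio_io_perform_synchronously() are real PostgreSQL routines; the lock variable, the queue/wakeup helpers, and the exact signature here are assumptions standing in for the plumbing in src/backend/storage/aio/method_worker.c, not the actual patch:

    #include "postgres.h"
    #include "storage/aio_internal.h"
    #include "storage/lwlock.h"

    /*
     * Hypothetical sketch: submit staged IOs to the worker queue as usual
     * when the submission queue lock is free, but when it is contended,
     * execute the IOs synchronously in this backend (as io_method=sync
     * would) instead of waiting on AioWorkerSubmissionQueueLock.
     *
     * submission_queue_lock, pgaio_worker_queue_insert() and
     * pgaio_worker_wake() are assumed names for the real queue plumbing.
     */
    static void
    pgaio_worker_submit_internal(int nios, PgAioHandle *ios[])
    {
        if (LWLockConditionalAcquire(submission_queue_lock, LW_EXCLUSIVE))
        {
            /* Uncontended path: enqueue and wake an idle IO worker. */
            for (int i = 0; i < nios; i++)
                pgaio_worker_queue_insert(ios[i]);
            LWLockRelease(submission_queue_lock);
            pgaio_worker_wake();
        }
        else
        {
            /*
             * Contended path: rather than piling onto the lock, perform
             * the IOs synchronously here.
             */
            for (int i = 0; i < nios; i++)
                pgaio_io_perform_synchronously(ios[i]);
        }
    }

Note that this only helps on the submission side; the IO workers dequeuing from the queue still have to take the lock unconditionally, which is where the remaining contention mentioned above comes from.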