Re: Should we update the random_page_cost default value?
From:        Andres Freund
Subject:     Re: Should we update the random_page_cost default value?
Date:
Msg-id:      aqz34bqmh6v6r6bplgflid3buhdkv45dkkbx6y6gq34dx4gp42@rcz2est27arz
In reply to: Re: Should we update the random_page_cost default value? (Tomas Vondra <tomas@vondra.me>)
Responses:   Re: Should we update the random_page_cost default value?
List:        pgsql-hackers
Hi,

On 2025-10-07 16:23:36 +0200, Tomas Vondra wrote:
> On 10/7/25 14:08, Tomas Vondra wrote:
> > ...
> >>>>>> I think doing this kind of measurement via normal SQL query processing is
> >>>>>> almost always going to have too many other influences. I'd measure using fio
> >>>>>> or such instead. It'd be interesting to see fio numbers for your disks...
> >>>>>>
> >>>>>> fio --directory /srv/fio --size=8GiB --name test --invalidate=0 --bs=$((8*1024)) --rw read --buffered 0 --time_based=1 --runtime=5 --ioengine pvsync --iodepth 1
> >>>>>> vs --rw randread
> >>>>>>
> >>>>>> gives me 51k/11k for sequential/rand on one SSD and 92k/8.7k for another.
> >>>>>
> >>>>> I can give it a try. But do we really want to strip "our" overhead with
> >>>>> reading data?
> >
> > I got this on the two RAID devices (NVMe and SATA):
> >
> > NVMe: 83.5k / 15.8k
> > SATA: 28.6k / 8.5k
> >
> > So the same ballpark / ratio as your test. Not surprising, really.
>
> FWIW I do see about this number in iostat. There's a 500M test running
> right now, and iostat reports this:
>
> Device       r/s     rkB/s ... rareq-sz ... %util
> md1     15273.10 143512.80 ...     9.40 ... 93.64
>
> So it's not like we're issuing far fewer I/Os than the SSD can handle.

Not really related to this thread: IME iostat's utilization is pretty much
useless for anything other than "is something happening at all", and even
that is not reliable. I don't know the full reason for it, but I long ago
learned to just discount it.

I ran

fio --directory /srv/fio --size=8GiB --name test --invalidate=0 --bs=$((8*1024)) --rw read --buffered 0 --time_based=1 --runtime=100 --ioengine pvsync --iodepth 1 --rate_iops=40000

a few times in a row, while watching iostat. Sometimes utilization is 100%,
sometimes it's 0.2%. Whereas if I run without rate limiting, utilization
never goes above 71%, despite doing more iops.

And then it gets completely useless if you use a deeper iodepth, because
there's just not a good way to compute something like a utilization number
once you take parallel IO processing into account.

fio --directory /srv/fio --size=8GiB --name test --invalidate=0 --bs=$((8*1024)) --rw read --buffered 0 --time_based=1 --runtime=100 --ioengine io_uring --iodepth 1 --rw randread

iodepth  util    iops
1        94%     9.3k
2        99.6%   18.4k
4        100%    35.9k
8        100%    68.0k
16       100%    123k

Greetings,

Andres Freund
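P.S. For completeness, a minimal sketch of the same sequential-vs-random
comparison with machine-readable output, so the ratio doesn't have to be read
off by eye. It assumes fio's JSON output format plus jq and bc being
available; the directory, runtimes, and job names are arbitrary placeholders,
not anything taken from the numbers above.

  # 8kB reads, direct I/O, queue depth 1 -- same shape as the commands above
  fio --directory=/srv/fio --size=8GiB --name=seqread --invalidate=0 --bs=8k \
      --rw=read --buffered=0 --time_based=1 --runtime=30 --ioengine=pvsync --iodepth=1 \
      --output-format=json > seq.json

  # Same job, but random reads
  fio --directory=/srv/fio --size=8GiB --name=randread --invalidate=0 --bs=8k \
      --rw=randread --buffered=0 --time_based=1 --runtime=30 --ioengine=pvsync --iodepth=1 \
      --output-format=json > rand.json

  # Ratio of sequential to random read IOPS (the 51k/11k style numbers quoted above)
  seq_iops=$(jq '.jobs[0].read.iops' seq.json)
  rand_iops=$(jq '.jobs[0].read.iops' rand.json)
  echo "seq/rand IOPS ratio: $(echo "$seq_iops / $rand_iops" | bc -l)"

The printed number is just the raw IOPS gap of the device; how (or whether)
that should translate into a random_page_cost default is what this thread is
about.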