Re: pgcon unconference / impact of block size on performance

From: Tomas Vondra
Subject: Re: pgcon unconference / impact of block size on performance
Date:
Msg-id: 31c3f2cd-5ce9-6130-4c06-2700fad0a970@enterprisedb.com
In response to: RE: pgcon unconference / impact of block size on performance  (Jakub Wartak <Jakub.Wartak@tomtom.com>)
Responses: RE: pgcon unconference / impact of block size on performance  (Jakub Wartak <Jakub.Wartak@tomtom.com>)
List: pgsql-hackers

On 6/7/22 15:48, Jakub Wartak wrote:
> Hi,
> 
>> The really
>> puzzling thing is why is the filesystem so much slower for smaller pages. I mean,
>> why would writing 1K be 1/3 of writing 4K?
>> Why would a filesystem have such effect?
> 
> Ha! I don't care at this point as 1 or 2kB seems too small to handle many real world scenarios ;)
> 

I think that's not quite true - a lot of OLTP workloads use fairly narrow
rows, and when they store more data it usually ends up in TOAST, so again
split into smaller rows. It's true smaller pages would reduce some of the
limits (number of columns, index tuple size, ...), and that might be an issue.

Independently of that, it seems like an interesting behavior and it
might tell us something about how to optimize for larger pages.
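To illustrate the kind of limit that shrinks with the page: a btree index tuple must fit three to a page, so the practical ceiling scales with the block size. A minimal sketch (the exact figures subtract page and item header overhead, so the numbers below are rough approximations, not the precise PostgreSQL limits):

```python
# Rough sketch: a btree index tuple must fit three to a page,
# so the practical limit is roughly BLCKSZ / 3 (the real limit
# subtracts header overhead; these numbers are approximations).
for blcksz in (1024, 2048, 4096, 8192):
    approx_limit = blcksz // 3
    print(f"{blcksz:5d}-byte pages -> index tuple limit ~{approx_limit} bytes")
```

With 1kB pages the ceiling drops to roughly 340 bytes, which starts to bite for wider index keys.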

>>> b) Another thing that you could also include in testing is that I've spotted a
>> couple of times that single-threaded fio might be the limiting factor (numjobs=1 by
>> default), so I've tried with numjobs=2,group_reporting=1 and got the output below
>> on ext4 defaults, even while dropping caches (echo 3) each loop iteration -
>> - something that I cannot explain (ext4 direct I/O caching effect? how's that
>> even possible? reproduced several times, even with numjobs=1) - the point being
>> 206643 1kB IOPS @ ext4 direct-io > 131783 1kB IOPS @ raw, which smells like some
>> caching effect, because for randwrite it does not happen. I've triple-checked with
>> iostat -x... it cannot be any internal device cache, as with direct I/O that doesn't
>> happen:
>>>
>>> [root@x libaio-ext4]# grep -r -e 'write:' -e 'read :' *
>>> nvme/randread/128/1k/1.txt:  read : io=12108MB, bw=206644KB/s,
>>> iops=206643, runt= 60001msec [b]
>>> nvme/randread/128/2k/1.txt:  read : io=18821MB, bw=321210KB/s,
>>> iops=160604, runt= 60001msec [b]
>>> nvme/randread/128/4k/1.txt:  read : io=36985MB, bw=631208KB/s,
>>> iops=157802, runt= 60001msec [b]
>>> nvme/randread/128/8k/1.txt:  read : io=57364MB, bw=976923KB/s,
>>> iops=122115, runt= 60128msec
>>> nvme/randwrite/128/1k/1.txt:  write: io=1036.2MB, bw=17683KB/s,
>>> iops=17683, runt= 60001msec [a, as before]
>>> nvme/randwrite/128/2k/1.txt:  write: io=2023.2MB, bw=34528KB/s,
>>> iops=17263, runt= 60001msec [a, as before]
>>> nvme/randwrite/128/4k/1.txt:  write: io=16667MB, bw=282977KB/s,
>>> iops=70744, runt= 60311msec [reproduced benefit, as per earlier email]
>>> nvme/randwrite/128/8k/1.txt:  write: io=22997MB, bw=391839KB/s,
>>> iops=48979, runt= 60099msec
>>>
>>
>> No idea what might be causing this. BTW so you're not using direct-io to access
>> the raw device? Or am I just misreading this?
> 
> Both scenarios (raw and fs) had direct=1 set. I just cannot understand how having direct I/O enabled (which
> disables caching) achieves better read IOPS on ext4 than on the raw device... isn't that a contradiction?
> 

Thanks for the clarification. Not sure what might be causing this. Did
you use the same parameters (e.g. iodepth) in both cases?
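One way to rule out parameter drift between the two runs is a single fio job file that fixes everything in [global] and varies only the target. A sketch (the device path and mount point are placeholders, not the paths from the tests above):

```ini
; Sketch of a fio job file pinning all parameters so the raw-device
; and filesystem runs differ only in the target file.
; /dev/nvme0n1 and /mnt/ext4/testfile are placeholder paths.
[global]
rw=randread
bs=1k
direct=1
ioengine=libaio
iodepth=128
numjobs=1
runtime=60
time_based=1

[raw]
filename=/dev/nvme0n1

[ext4]
filename=/mnt/ext4/testfile
size=10g
```

Running each section separately (fio --section=raw jobfile, then --section=ext4) makes any remaining IOPS gap attributable to the filesystem layer rather than to mismatched settings.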


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


