Re: Optimize kernel readahead using buffer access strategy

From: KONDO Mitsumasa
Subject: Re: Optimize kernel readahead using buffer access strategy
Msg-id: 52858333.9030905@lab.ntt.co.jp
In reply to: Re: Optimize kernel readahead using buffer access strategy (Claudio Freire <klaussfreire@gmail.com>)
Responses: Re: Optimize kernel readahead using buffer access strategy (Claudio Freire <klaussfreire@gmail.com>)
List: pgsql-hackers

Hi Claudio,

(2013/11/14 22:53), Claudio Freire wrote:
> On Thu, Nov 14, 2013 at 9:09 AM, KONDO Mitsumasa
> <kondo.mitsumasa@lab.ntt.co.jp> wrote:
>> I created a patch that improves disk reads and the use of the OS file cache. It
>> optimizes the kernel readahead parameter using the buffer access strategy and
>> posix_fadvise() in various disk-read situations.
>>
>> In a typical OS, the readahead parameter is decided dynamically from the
>> disk-read situation. If disk reads continue for a long time, the readahead
>> parameter becomes large. However, because it is based on an empirical or
>> heuristic algorithm, it causes wasted disk reads and evicts useful OS file
>> cache in some cases. This hurts disk-read performance considerably.
>
> It would be relevant to know which kernel you used for those tests.
I used CentOS 6.4, whose kernel version is 2.6.32-358.23.2.el6.x86_64, for this test.


> A while back, I tried to use posix_fadvise to prefetch index pages.
I searched for your past work. Are you talking about this mailing-list thread,
or is there a more recent discussion? Your patch looks interesting, but it was
not submitted to a CommitFest and the discussion stopped.
http://www.postgresql.org/message-id/CAGTBQpZzf70n0PYJ=VQLd+jb3wJGo=2TXmY+SkJD6G_vjC5QNg@mail.gmail.com

> I ended up finding out that interleaving posix_fadvise with I/O like
> that severely hinders (ie: completely disables) the kernel's read-ahead
> algorithm.
Your patch sets readahead to the maximum when a query uses an index range scan.
Is that right? I think your patch assumes that pages are ordered by index data.
That assumption is only partially true. If it were always true, we would not
need the CLUSTER command. In actuality, running CLUSTER gives better performance
than not running it.
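
The CLUSTER point can be illustrated with a minimal sketch in Python. The block numbers below are made up for illustration, not measured data, and sequential_fraction is a hypothetical helper: it counts how often an index-ordered scan touches the block that physically follows the previous one, which is the pattern kernel readahead can exploit.

```python
def sequential_fraction(block_order):
    """Fraction of accesses that land on the block right after the previous one."""
    if len(block_order) < 2:
        return 0.0
    pairs = zip(block_order, block_order[1:])
    hits = sum(1 for prev, cur in pairs if cur == prev + 1)
    return hits / (len(block_order) - 1)


# Blocks visited in index order (illustrative numbers only).
clustered = [0, 1, 2, 3, 4, 5, 6, 7]    # table physically ordered like the index
unclustered = [5, 0, 7, 2, 6, 1, 4, 3]  # same blocks, scattered on disk

print(sequential_fraction(clustered))    # every access follows the previous block
print(sequential_fraction(unclustered))  # no access follows the previous block
```

With the scattered order, aggressive readahead would mostly fetch blocks the scan does not need next.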

> How exactly did you set up those benchmarks? pg_bench defaults?
My detailed test settings are as follows.
* Server info
  CPU: Intel(R) Xeon(R) CPU E5645 @ 2.40GHz (2U/12C)
  RAM: 6GB -> I reduced it intentionally with an OS parameter, because tests
       with large memory take a long time.
  HDD: SEAGATE Model: ST2000NM0001 @ 7200rpm * 1
  RAID: none

* postgresql.conf (summarized)
  shared_buffers = 600MB (10% of RAM = 6GB)
  work_mem = 1MB
  maintenance_work_mem = 64MB
  wal_level = archive
  fsync = on
  archive_mode = on
  checkpoint_segments = 300
  checkpoint_timeout = 15min
  checkpoint_completion_target = 0.7
 

* pgbench settings
pgbench -j 4 -c 32 -T 600 pgbench


> pg_bench does not exercise heavy sequential access patterns, or long
> index scans. It performs many single-page index lookups per
> transaction and that's it.
Yes, you are right. It is also a fact that performance improves in these
situations.

> You may want to try your patch with more
> real workloads, and maybe you'll confirm what I found out last time I
> messed with posix_fadvise. If my experience is still relevant, those
> patterns will have suffered a severe performance penalty with this
> patch, because it will disable kernel read-ahead on sequential index
> access. It may still work for sequential heap scans, because the
> access strategy will tell the kernel to do read-ahead, but many other
> access methods will suffer.
The decisive difference from your patch is that my patch uses a buffer hint
control architecture, so it can control readahead more smartly in some cases.
However, my patch is still in progress and needs more improvement. I am going
to add a way to control readahead through a GUC, so that users can freely
select the readahead parameter in their transactions.
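
To make the idea of hint-driven readahead concrete, here is a minimal sketch in Python; it is not the patch itself. The strategy labels, the mapping, and the helper names choose_advice and apply_advice are assumptions made for illustration. Only os.posix_fadvise and the POSIX_FADV_* constants are real, and they are available on Unix only.

```python
import os

# Hypothetical access-strategy labels mapped to POSIX advice names.
# The labels are illustrative only; they are not PostgreSQL identifiers.
ADVICE_NAME_BY_STRATEGY = {
    "sequential_scan": "POSIX_FADV_SEQUENTIAL",  # ask for more aggressive readahead
    "index_lookup": "POSIX_FADV_RANDOM",         # ask for readahead to be turned off
}


def choose_advice(strategy):
    """Return the advice name for a strategy, falling back to the kernel default."""
    return ADVICE_NAME_BY_STRATEGY.get(strategy, "POSIX_FADV_NORMAL")


def apply_advice(fd, strategy):
    """Hint the kernel once for the whole file (offset 0, length 0 = to end of file)."""
    name = choose_advice(strategy)
    os.posix_fadvise(fd, 0, 0, getattr(os, name))
    return name
```

The hint is given once per file and access pattern rather than interleaved with every read, which is the usage pattern the quoted concern about disabling read-ahead refers to.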

> Try OLAP-style queries.
I have the DBT-3 (TPC-H) benchmark tools. If you don't like TPC-H, could you
tell me about a good OLAP benchmark tool?

Regards,
-- 
Mitsumasa KONDO
NTT Open Source Software