gaussian distribution pgbench

From KONDO Mitsumasa
Subject gaussian distribution pgbench
Date
Msg-id 523BEE6D.3080500@lab.ntt.co.jp
Replies Re: gaussian distribution pgbench  (Kevin Grittner <kgrittn@ymail.com>)
Re: gaussian distribution pgbench  (Fabien COELHO <coelho@cri.ensmp.fr>)
Re: gaussian distribution pgbench  (Peter Eisentraut <peter_e@gmx.net>)
List pgsql-hackers
Hello,

I have created a gaussian distribution patch for pgbench that accesses records
with gaussian frequency, and I am submitting it to this commit fest.

* Purpose of this patch
In typical transaction workloads, clients rarely access all records with equal
probability. I think gaussian-distributed access is the most common transaction
pattern in general. My patch approximates this access pattern.

I think this patch is useful not only for simulating a general access pattern,
but also for developing new features such as more effective use and freeing of
shared_buffers, readahead optimization in the OS, and faster tuple-level
locking.


* Usage
It is easy to use: just pass -g with a standard deviation threshold parameter.
With a larger standard deviation threshold, pgbench's access pattern is
concentrated on a narrower set of records. The minimum standard deviation
threshold is 2.

Here is an example run:
> [mitsu-ko@localhost postgresql]$ bin/pgbench -g 10 -c 16 -j 8 -T 300
> starting vacuum...end.
> transaction type: TPC-B (sort of)
> scaling factor: 1
> standard deviation threshold: 10.00000
> access probability of top 20%, 10% and 5% records: 0.95450 0.68269 0.38292
> query mode: simple
> number of clients: 16
> number of threads: 8
> duration: 300 s
> number of transactions actually processed: 566367
> tps = 1887.821409 (including connections establishing)
> tps = 1887.949390 (excluding connections establishing)

"access probability" shows the probability of accessing the top N% of records
in this benchmark. It increases as the standard deviation threshold parameter
increases.
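
As a rough cross-check of the figures in the output above, here is a minimal
Python sketch (not code from the patch). It assumes that the interval
[-threshold, threshold] of a standard normal distribution is mapped linearly
onto the records, and it ignores the small probability mass outside that
interval. The function name top_fraction_probability is hypothetical.

```python
import math


def top_fraction_probability(threshold, fraction):
    """Probability that a standard normal draw falls in the central
    `fraction` of the interval [-threshold, threshold].

    Assumption: the records are mapped linearly onto that interval, so the
    most frequently accessed `fraction` of records is the central slice.
    Mass outside the interval is ignored (negligible for large thresholds).
    """
    half_width = threshold * fraction
    return math.erf(half_width / math.sqrt(2.0))


for fraction in (0.20, 0.10, 0.05):
    p = top_fraction_probability(10.0, fraction)
    print(f"top {fraction:.0%}: {p:.5f}")
```

Under these assumptions a threshold of 10 gives 0.95450, 0.68269 and 0.38292,
which matches the "access probability" line in the output above.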

The attached PNG files "gausian_2.png" and "gaussian_10.png" show the gaussian
distribution access pattern produced by my patch. "no_gaussian.png" shows a run
without the -g option (normal). I think they show that my patch achieves a
gaussian distribution access pattern.


* Approach
It replaces the uniform random number generator with a gaussian random number
generator based on the Box-Muller transform. The standard deviation threshold
parameter is then used to normalize the generated values and to map the normal
distribution onto the records. The mapping is linear, from a floating-point
value to an integer value.
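
To illustrate the idea, here is a minimal Python sketch of this kind of
generator. It is not the code from the patch: the function name
gaussian_record_id and its parameters are hypothetical, and redrawing values
that fall outside the threshold is my assumption about how the tails could be
handled.

```python
import math
import random


def gaussian_record_id(min_id, max_id, threshold, rng=random):
    """Pick an integer in [min_id, max_id] with a bell-shaped frequency."""
    while True:
        # Box-Muller transform: two uniform deviates -> one standard normal.
        u1 = 1.0 - rng.random()  # in (0, 1], so log(u1) is always defined
        u2 = rng.random()
        z = math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)
        # Assumption: values outside [-threshold, threshold) are redrawn.
        if -threshold <= z < threshold:
            break
    # Linear mapping: [-threshold, threshold) -> [0, 1) -> integer id.
    unit = (z + threshold) / (2.0 * threshold)
    record_id = min_id + int(unit * (max_id - min_id + 1))
    # Guard against floating-point rounding at the upper edge.
    return min(max_id, record_id)


if __name__ == "__main__":
    rng = random.Random(42)
    ids = [gaussian_record_id(1, 100000, 10.0, rng) for _ in range(5)]
    print(ids)
```

With a larger threshold, the same bell curve is squeezed into a narrower band
around the middle of the id range, so fewer records receive most of the
accesses.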


* Other
I have also written other patches that give more accurate benchmark results in
pgbench, and I will submit them to this commit fest. They are along the lines of
the checkpoint patch I submitted in the past. They work well, too!


Any questions?

Best regards,
--
Mitsumasa KONDO
NTT Open Source Software Center

Attachments
