Discussion: pgbench results arent accurate


pgbench results arent accurate

From
Mariel Cherkassky
Date:
Hey,
I installed a new Postgres 9.6 on both of my machines. I'm trying to measure the difference in performance between the machines, but it seems that the results aren't accurate.
I did 2 tests:

1) In the first test the scale was set to 100:
pgbench -i -s 100 -U postgres -d bench -h machine_name
pgbench -U postgres -d bench  -h machine_name -j 2 -c 16 -T 300
RUN   TPS-machine1   TPS-machine2
1     697            555
2     732            861
3     784            842


2) In this test the scale was set to 10000:
pgbench -i -s 10000 -U postgres -d bench -h machine_name
pgbench -U postgres -d bench --progress=30 -h machine_name -j 2 -c 16 -T 300 
RUN   TPS-MACHINE1   TPS-MACHINE2
1     103            60
2     63             66
3     74             83
4     56             61
5     75             53
6     73             60
7     62             53

In both cases, after the initialization I restarted the database and cleared the cache (echo 1 > /proc/sys/vm/drop_caches) once. I didn't shut down the machine at any point during the runs.

Now, I was hoping that the TPS would be almost the same on each machine across all the runs. In other words, I wanted to see that the TPS on machine1 stays almost the same from run to run, but I see that the values aren't accurate.

Any idea what might cause the differences between runs?

RE: pgbench results arent accurate

From
Greg Clough
Date:

> I installed a new postgres 9.6 on both of my machines.

 

Where is your storage?  Is it local, or on a SAN?  A SAN will definitely have a cache, so possibly there is another layer of cache that you’re not accounting for.

 

Greg Clough.




This e-mail, including accompanying communications and attachments, is strictly confidential and only for the intended recipient. Any retention, use or disclosure not expressly authorised by IHSMarkit is prohibited. This email is subject to all waivers and other terms at the following link: https://ihsmarkit.com/Legal/EmailDisclaimer.html

Please visit www.ihsmarkit.com/about/contact-us.html for contact information on our offices worldwide.

Re: pgbench results arent accurate

From
Mark Kirkwood
Date:
If you have not amended any Postgres config parameters, then you'll get
checkpoints approximately every 5 min. Thus a pgbench run time of
5 min will sometimes miss and sometimes hit a checkpoint in progress,
which will hugely impact test results.

I tend to do pgbench runs of about 2x checkpoint_timeout (i.e. 10 min
for default configurations). Also, for increased repeatability, I
manually trigger a checkpoint immediately before each run.
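
The procedure above could be sketched roughly as follows. This is only an illustration, not a script from the thread: the helper names run_seconds and bench_commands are made up, and the host, database, and client settings are copied from the original post. The commands are printed rather than executed so they can be reviewed first. The database name is passed to pgbench as a positional argument because in pgbench 9.6 the -d switch is documented as the debug option, not as a database selector.

```shell
#!/bin/sh
# Sketch only: run_seconds and bench_commands are hypothetical helper names.

# Run length in seconds: twice checkpoint_timeout
# (the default checkpoint_timeout is 5min, i.e. 300 s).
run_seconds() {
    echo $(( 2 * $1 ))
}

# Print the two commands for one benchmark run instead of executing them.
# Arguments: host, database name, checkpoint_timeout in seconds.
bench_commands() {
    host=$1 db=$2 timeout_s=$3
    # Force a checkpoint so every run starts from the same state.
    echo "psql -U postgres -h $host -d $db -c 'CHECKPOINT;'"
    # Same client settings as in the original post, longer duration.
    echo "pgbench -U postgres -h $host -j 2 -c 16 -T $(run_seconds "$timeout_s") $db"
}

bench_commands machine_name bench 300
```

Piping the printed lines to sh would actually run them. If checkpoint_timeout has been changed, the third argument should be changed to match.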

regards

Mark



Re: pgbench results arent accurate

From
Mariel Cherkassky
Date:
As Greg suggested, I'm updating the list: each VM runs on its own dedicated ESX host, and every ESX host has its own local disks.
I ran it once on two different servers that have the same hardware and the same PostgreSQL database (version and configuration). The results:
pgbench -i -s 6  pgbench -p 5432 -U postgres
 pgbench -c 16 -j 4 -T 5 -U postgres pgbench
MACHINE 1
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 6
query mode: simple
number of clients: 16
number of threads: 4
duration: 5 s
number of transactions actually processed: 669
latency average = 122.633 ms
tps = 130.470828 (including connections establishing)
tps = 130.620286 (excluding connections establishing)

MACHINE 2

pgbench -c 16 -j 4 -T 600 -U postgres -p 5433 pgbench
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 6
query mode: simple
number of clients: 16
number of threads: 4
duration: 600 s
number of transactions actually processed: 2393723
latency average = 4.011 ms
tps = 3989.437514 (including connections establishing)
tps = 3989.473036 (excluding connections establishing)

Any idea what could cause such a difference? Both machines have 20 cores and 65 GB of RAM.

On Thu, 13 Dec 2018 at 15:54, Mariel Cherkassky <mariel.cherkassky@gmail.com> wrote:
Ok, I'll do that. Thanks.

On Thu, 13 Dec 2018 at 15:54, Greg Clough <Greg.Clough@ihsmarkit.com> wrote:

Hmmm... sounds like you’ve got most of it covered.  It may be a good idea to send that last message back to the list, as maybe others will have better ideas.

 

Greg.

 

From: Mariel Cherkassky <mariel.cherkassky@gmail.com>
Sent: Thursday, December 13, 2018 1:45 PM
To: Greg Clough <Greg.Clough@ihsmarkit.com>
Subject: Re: pgbench results arent accurate

 

Each machine is the only VM on its own dedicated ESX host. Each ESX host has local disks.

 


Re: pgbench results arent accurate

From
Mark Kirkwood
Date:
Hi, I can see two issues making you get variable results:

1/ Number of clients > scale factor

Using -c16 and -s 6 means you are largely benchmarking lock contention 
for a row in the branches table (it has 6 rows in your case). So 
randomness in *which* rows each client tries to lock will make for 
unwanted variation.


2/ Short run times

That 1st run is 5s duration. This will be massively influenced by the 
above point about randomness for locking a branches row.


I'd recommend:

- always run at least -T600

- use -s of at least 1.5x your largest -c setting (I usually use -s 100 
for testing 1-32 clients).
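
As a rough illustration of the two recommendations above, a small pre-flight check could look like this. The function names min_scale and check_run are hypothetical, and the thresholds (600 s, 1.5x the client count) are simply the rules of thumb from this message, not limits enforced by pgbench.

```shell
#!/bin/sh
# Sketch only: min_scale and check_run are hypothetical helper names.

# Smallest scale factor for a given client count: 1.5 * clients, rounded up.
min_scale() {
    echo $(( (3 * $1 + 1) / 2 ))
}

# Warn about settings that tend to give unstable results.
# Arguments: scale factor (-s), clients (-c), duration in seconds (-T).
# Returns non-zero if any warning was printed.
check_run() {
    scale=$1 clients=$2 seconds=$3
    status=0
    if [ "$scale" -lt "$(min_scale "$clients")" ]; then
        echo "scale $scale is too small for $clients clients; use at least $(min_scale "$clients")"
        status=1
    fi
    if [ "$seconds" -lt 600 ]; then
        echo "run time of $seconds s is short; use at least 600 s"
        status=1
    fi
    return $status
}

# The first run reported in the thread: -s 6, -c 16, -T 5.
check_run 6 16 5 || true
```

With -s 6 and -c 16, sixteen clients compete for updates on six branch rows, which is the contention described in point 1.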

regards

Mark



Re: pgbench results arent accurate

From
Merlin Moncure
Date:
On Wed, Dec 12, 2018 at 6:54 AM Mariel Cherkassky <mariel.cherkassky@gmail.com> wrote:
> Hey,
> I installed a new postgres 9.6 on both of my machines. I'm trying to measure the differences between the performances in each machine but it seems that the results arent accurate.
> I did 2 tests :

Better phrased, I'd say the results aren't _stable_ -- 'inaccurate' suggests that pgbench is giving erroneous results; you've provided no evidence of that.
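
One way to put a number on stability is the run-to-run spread. The sketch below is an illustration, not something posted in the thread: the function name spread is made up, and the choice of the sample standard deviation divided by the mean (coefficient of variation) is an assumption about what a reasonable stability measure might be. The inputs are the scale-10000 TPS figures from the first message.

```shell
#!/bin/sh
# Sketch only: spread is a hypothetical helper name.

# Print the mean and the coefficient of variation (sample stddev / mean, in %)
# of the TPS values given as arguments. Needs at least two values.
spread() {
    printf '%s\n' "$@" | awk '
        { sum += $1; sumsq += $1 * $1; n++ }
        END {
            mean = sum / n
            var = (sumsq - n * mean * mean) / (n - 1)
            if (var < 0) var = 0    # guard against rounding error
            printf "mean=%.1f cv=%.1f%%\n", mean, 100 * sqrt(var) / mean
        }'
}

spread 103 63 74 56 75 73 62   # machine1, scale 10000
spread 60 66 83 61 53 60 53    # machine2, scale 10000
```

A small coefficient of variation across repeated runs indicates stable results; it says nothing about whether pgbench measured each run correctly.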

Storage performance can seem random; numerous complex processes and caching layers sit between the software layer and the storage. Some are within the database, some are within the underlying operating system, and some are within the storage itself. Spinning media is also notoriously capricious: various hard-to-control factors (such as precisely where the data sits on the platter) can influence data seek and fetch times.

I think we can look ahead to a not-too-distant future where storage performance will be less important to typical database performance than it is today. Clever people who are willing and able to buy appropriate hardware essentially already live in this world, but the enterprise storage industry seems strongly inclined to postpone this day of reckoning as long as possible, for obviously selfish reasons.


merlin