On Wed, Sep 20, 2017 at 11:45 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Aug 29, 2017 at 1:57 AM, Mithun Cy <mithun.cy@enterprisedb.com> wrote:
>> All TPS are median of 3 runs
>> Clients  TPS-With Patch 05  TPS-Base      %Diff
>> 1        752.461117         755.186777    -0.3%
>> 64       32171.296537       31202.153576  +3.1%
>> 128      41059.660769       40061.929658  +2.49%
>>
>> I will do some profiling and find out why this case is not costing us
>> some performance due to caching overhead.
>
> So, this shows only a 2.49% improvement at 128 clients but in the
> earlier message you reported a 39% speedup at 256 clients. Is that
> really correct? There's basically no improvement up to threads = 2 x
> CPU cores, and then after that it starts to improve rapidly? What
> happens at intermediate points, like 160, 192, 224 clients?

I think there is some confusion here. The results above are for the
pgbench simple-update (-N) test, where the cached snapshot gets
invalidated. I ran it to check whether frequent cache invalidation
causes any regression, and did not find any.

The target test for the patch is the read-only case [1], where the
performance improvement is as high as 39% (at 256 threads) on Cthulhu
(an 8-socket NUMA machine with 64 CPU cores). At 64 threads (= CPU
cores) we have a 5% improvement, and at 128 clients (= 2 x CPU cores =
hyperthreads) we have a 17% improvement.

Clients  BASE CODE      With patch     %Imp
64       452475.929144  476195.952736  5.2422730281
128      556207.727932  653256.029012  17.4482115595
256      494336.282804  691614.000463  39.9075941867
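
For anyone who wants to cross-check the %Imp column, here is a minimal
sketch of the arithmetic, assuming %Imp means (patched - base) / base *
100. The names `results` and `pct_improvement` are only illustrative and
are not part of the patch or of pgbench.

```python
# Median TPS values copied from the read-only table above:
# clients -> (base TPS, patched TPS)
results = {
    64: (452475.929144, 476195.952736),
    128: (556207.727932, 653256.029012),
    256: (494336.282804, 691614.000463),
}


def pct_improvement(base, patched):
    """Relative TPS gain of the patched build over base, in percent.

    Assumption: this is how the %Imp column was derived.
    """
    return (patched - base) / base * 100.0


for clients, (base, patched) in results.items():
    print(f"{clients:>4} clients: {pct_improvement(base, patched):.2f}%")
```

Under that assumption, the computed values should agree with the %Imp
column to the precision shown there.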

[1] cache_the_snapshot_performance.ods

Thanks and Regards
Mithun C Y
EnterpriseDB:
http://www.enterprisedb.com