Re: Speed up Clog Access by increasing CLOG buffers

From Dilip Kumar
Subject Re: Speed up Clog Access by increasing CLOG buffers
Date
Msg-id CAFiTN-tCqDucbikxue0x0dwNOAa6yzjMxJNgXGQm90jVVRSj0Q@mail.gmail.com
In response to Re: Speed up Clog Access by increasing CLOG buffers  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: Speed up Clog Access by increasing CLOG buffers  (Dilip Kumar <dilipbalaut@gmail.com>)
List pgsql-hackers
On Thu, Oct 20, 2016 at 9:03 PM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

> In the results you've posted on 10/12, you've mentioned a regression with 32
> clients, where you got 52k tps on master but only 48k tps with the patch (so
> ~10% difference). I have no idea what scale was used for those tests,

That test was with scale factor 300 on a 4-socket POWER machine. I need
to repeat that test with multiple readings to confirm whether it was a
real regression or just run-to-run variation. I will do that soon and
post the results.
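
For what it's worth, a minimal sketch of how such a repeated run could be
scripted (the run count, output file names and exact pgbench flags here are
illustrative assumptions, not my actual invocation):

# Hypothetical sketch: repeat the same 30-minute pgbench run a few times so
# that run-to-run variation can be separated from a real regression.
# Assumes a scale-factor-300 database and the same 32-client prepared-mode
# settings as the test described above.
for run in 1 2 3; do
    pgbench -M prepared -c 32 -j 32 -T 1800 postgres \
        > head_s300_c32_run${run}.txt
done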

> and I
> see no such regression in the current results (but you only report results
> for some of the client counts).

This test is on an 8-socket x86 machine. At scale factor 1000 I have
given readings for all client counts (32, 64, 96, 192), but at scale
factor 300 I posted only the 192-client results, because on this machine
I did not see much load on ClogControlLock at scale factor 300.
>
> Also, which of the proposed patches have you been testing?
I tested with the GroupLock patch.

> Can you collect and share a more complete set of data, perhaps based on the
> scripts I use to do tests on the large machine with 36/72 cores, available
> at https://bitbucket.org/tvondra/hp05-results ?

From my last run I did not share the data for the 8-socket x86 machine
at scale factor 300 with 32, 64 and 96 clients. I already have that
data, so I am sharing it here. (Please let me know if you want to see
some other client count; for that I would need to run another test.)
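
For context, the wait-event numbers attached to each run below are counts of
samples per (wait_event_type, wait_event) pair, presumably gathered by
sampling pg_stat_activity while pgbench was running; the large count with an
empty wait event would be backends that were not waiting (or waiting on
something not instrumented) when sampled. Roughly, such counts can be
collected with a sketch like the following (the one-second interval and the
file names are illustrative assumptions, not my exact script):

# Hypothetical sketch: sample pg_stat_activity once a second for the whole
# benchmark and count how often each wait event was observed.
while true; do
    psql -At -F ' | ' \
         -c "SELECT wait_event_type, wait_event FROM pg_stat_activity" \
         >> wait_samples.txt
    sleep 1
done &
sampler=$!

# ... pgbench run happens here ...

kill $sampler
sort wait_samples.txt | uniq -c | sort -rn > wait_event_counts.txt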

Head:
scaling factor: 300
query mode: prepared
number of clients: 32
number of threads: 32
duration: 1800 s
number of transactions actually processed: 77233356
latency average: 0.746 ms
tps = 42907.363243 (including connections establishing)
tps = 42907.546190 (excluding connections establishing)
[dilip.kumar@cthulhu bin]$ cat 300_32_ul.txt
111757                  |
  3666
  1289  LWLockNamed     | ProcArrayLock
  1142  Lock            | transactionid
   318  LWLockNamed     | CLogControlLock
   299  Lock            | extend
   109  LWLockNamed     | XidGenLock
    70  LWLockTranche   | buffer_content
    35  Lock            | tuple
    29  LWLockTranche   | lock_manager
    14  LWLockTranche   | wal_insert
     1  Tuples only is on.
     1  LWLockNamed     | CheckpointerCommLock

Group Lock Patch:

scaling factor: 300
query mode: prepared
number of clients: 32
number of threads: 32
duration: 1800 s
number of transactions actually processed: 77544028
latency average: 0.743 ms
tps = 43079.783906 (including connections establishing)
tps = 43079.960331 (excluding connections establishing)
112209                  |
  3718
  1402  LWLockNamed     | ProcArrayLock
  1070  Lock            | transactionid
   245  LWLockNamed     | CLogControlLock
   188  Lock            | extend
    80  LWLockNamed     | XidGenLock
    76  LWLockTranche   | buffer_content
    39  LWLockTranche   | lock_manager
    31  Lock            | tuple
     7  LWLockTranche   | wal_insert
     1  Tuples only is on.
     1  LWLockTranche   | buffer_mapping

Head:
number of clients: 64
number of threads: 64
duration: 1800 s
number of transactions actually processed: 76211698
latency average: 1.512 ms
tps = 42339.731054 (including connections establishing)
tps = 42339.930464 (excluding connections establishing)
[dilip.kumar@cthulhu bin]$ cat 300_64_ul.txt
215734                  |
  5106  Lock            | transactionid
  3754  LWLockNamed     | ProcArrayLock
  3669
  3267  LWLockNamed     | CLogControlLock
   661  Lock            | extend
   339  LWLockNamed     | XidGenLock
   310  Lock            | tuple
   289  LWLockTranche   | buffer_content
   205  LWLockTranche   | lock_manager
    50  LWLockTranche   | wal_insert
     2  LWLockTranche   | buffer_mapping
     1  Tuples only is on.
     1  LWLockTranche   | proc
 

GroupLock patch:
scaling factor: 300
query mode: prepared
number of clients: 64
number of threads: 64
duration: 1800 s
number of transactions actually processed: 76629309
latency average: 1.503 ms
tps = 42571.704635 (including connections establishing)
tps = 42571.905157 (excluding connections establishing)
[dilip.kumar@cthulhu bin]$ cat 300_64_ul.txt
217840                  |
  5197  Lock            | transactionid
  3744  LWLockNamed     | ProcArrayLock
  3663
   966  Lock            | extend
   849  LWLockNamed     | CLogControlLock
   372  Lock            | tuple
   305  LWLockNamed     | XidGenLock
   199  LWLockTranche   | buffer_content
   184  LWLockTranche   | lock_manager
    35  LWLockTranche   | wal_insert
     1  Tuples only is on.
     1  LWLockTranche   | proc
     1  LWLockTranche   | buffer_mapping
 

Head:
scaling factor: 300
query mode: prepared
number of clients: 96
number of threads: 96
duration: 1800 s
number of transactions actually processed: 77663593
latency average: 2.225 ms
tps = 43145.624864 (including connections establishing)
tps = 43145.838167 (excluding connections establishing)
302317                  |
 18836  Lock            | transactionid
 12912  LWLockNamed     | CLogControlLock
  4120  LWLockNamed     | ProcArrayLock
  3662
  1700  Lock            | tuple
  1305  Lock            | extend
  1030  LWLockTranche   | buffer_content
   828  LWLockTranche   | lock_manager
   730  LWLockNamed     | XidGenLock
   107  LWLockTranche   | wal_insert
     4  LWLockTranche   | buffer_mapping
     1  Tuples only is on.
     1  LWLockTranche   | proc
     1  BufferPin       | BufferPin
 

Group Lock Patch:
scaling factor: 300
query mode: prepared
number of clients: 96
number of threads: 96
duration: 1800 s
number of transactions actually processed: 61608756
latency average: 2.805 ms
tps = 44385.885080 (including connections establishing)
tps = 44386.297364 (excluding connections establishing)
[dilip.kumar@cthulhu bin]$ cat 300_96_ul.txt
237842                  |
 14379  Lock            | transactionid
  3335  LWLockNamed     | ProcArrayLock
  2850
  1374  LWLockNamed     | CLogControlLock
  1200  Lock            | tuple
   992  Lock            | extend
   717  LWLockNamed     | XidGenLock
   625  LWLockTranche   | lock_manager
   259  LWLockTranche   | buffer_content
   105  LWLockTranche   | wal_insert
     4  LWLockTranche   | buffer_mapping
     2  LWLockTranche   | proc

Head:
scaling factor: 300
query mode: prepared
number of clients: 192
number of threads: 192
duration: 1800 s
number of transactions actually processed: 65930726
latency average: 5.242 ms
tps = 36621.827041 (including connections establishing)
tps = 36622.064081 (excluding connections establishing)
[dilip.kumar@cthulhu bin]$ cat 300_192_ul.txt
437848                  |
118966  Lock            | transactionid
 88869  LWLockNamed     | CLogControlLock
 18558  Lock            | tuple
  6183  LWLockTranche   | buffer_content
  5664  LWLockTranche   | lock_manager
  3995  LWLockNamed     | ProcArrayLock
  3646
  1748  Lock            | extend
  1635  LWLockNamed     | XidGenLock
   401  LWLockTranche   | wal_insert
    33  BufferPin       | BufferPin
     5  LWLockTranche   | proc
     3  LWLockTranche   | buffer_mapping
 

GroupLock Patch:
scaling factor: 300
query mode: prepared
number of clients: 192
number of threads: 192
duration: 1800 s
number of transactions actually processed: 82616270
latency average: 4.183 ms
tps = 45894.737813 (including connections establishing)
tps = 45894.995634 (excluding connections establishing)
120372  Lock            | transactionid
 16346  Lock            | tuple
  7489  LWLockTranche   | lock_manager
  4514  LWLockNamed     | ProcArrayLock
  3632
  3310  LWLockNamed     | CLogControlLock
  2287  LWLockNamed     | XidGenLock
  2271  Lock            | extend
   709  LWLockTranche   | buffer_content
   490  LWLockTranche   | wal_insert
    30  BufferPin       | BufferPin
    10  LWLockTranche   | proc
     6  LWLockTranche   | buffer_mapping
 

Summary: On the 8-socket x86 machine at scale factor 300, I did not
observe significant waits on ClogControlLock up to 96 clients. At 192
clients we can see significant waits on ClogControlLock, but still not
as bad as what we see on POWER.


>
> I've taken some time to build a simple web-based reports from the results
> collected so far (also included in the git repository), and pushed them
> here:
>
>     http://tvondra.bitbucket.org
>
> For each of the completed runs, there's a report comparing tps for different
> client counts with master and the three patches (average tps, median and
> stddev), and it's possible to download a more thorough text report with wait
> event stats, comparison of individual runs etc.

I saw your report; presenting the results this way gives a very clear picture.
>
> If you want to cooperate on this, I'm available - i.e. I can help you get
> the tooling running, customize it etc.

That would be really helpful; then next time I can present my reports in
the same format.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


