Discussion: why after increase the hash table partitions, TPMC decrease
We use BenchmarkSQL to run a TPC-C test on PostgreSQL 9.3.3.

Before the test we set the number of BenchmarkSQL clients to about 800, and we increased the number of hash partitions from 16 to 1024 in order to reduce contention on the hash partition locks.

We expected that more partitions would reduce lock contention and therefore increase TPMC. But the test showed the opposite: after the change to 1024, TPMC did not increase but decreased.

Why is that?

We modified the following macro definitions:

NUM_BUFFER_PARTITIONS 1024
LOG2_NUM_PREDICATELOCK_PARTITIONS 10
LOG2_NUM_LOCK_PARTITIONS 10
I have already increased MAX_SIMUL_LWLOCKS to make sure it is large enough.
Total RAM is 130 GB and shared_buffers is set to 16 GB. Neither CPU nor I/O is saturated; 50% of the CPUs are idle. So I think PostgreSQL may be blocked somewhere internally.
From: Amit Kapila [mailto:amit.kapila16@gmail.com]
Sent: September 2, 2014 19:31
To: Xiaoyulei
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] why after increase the hash table partitions, TPMC decrease
On Tue, Sep 2, 2014 at 2:09 PM, Xiaoyulei <xiaoyulei@huawei.com> wrote:
>
>
>
> We use benchmarksql to start tpcc test in postgresql 9.3.3.
>
> Before test we set benchmarksql client number about 800. And we increase the hash partitions from 16 to 1024 , in order to reduce the hash partition locks competition.
>
> We expect that after increase the number of partitions, reduces lock competition, TPMC should be increased.
I think you can expect some increase mainly if your test is
read-only and you have enough RAM to hold all the data.
In other cases there can be I/O, due to which you
might not see any increase.
> But the test results on the contrary, after modified to 1024, TPMC did not increase, but decrease.
>
> Why such result?
>
> We modify the following macro definition:
>
> NUM_BUFFER_PARTITIONS 1024
>
> LOG2_NUM_PREDICATELOCK_PARTITIONS 10
>
> LOG2_NUM_LOCK_PARTITIONS 10
Increasing these numbers might lead to the error
"too many LWLocks taken" unless you also increase
MAX_SIMUL_LWLOCKS. You could check whether the server
log contains any errors; those might lead to a decrease in
performance.
Another side effect is that increasing the above numbers
will increase shared memory usage.
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
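The relationship between the partition count and MAX_SIMUL_LWLOCKS can be sketched in a few lines of C. This is only an illustration under stated assumptions, not PostgreSQL source: the helper names `partition_for` and `can_hold_all_partitions` are hypothetical, the modulo mapping reflects how the partition macros work as far as I know, and the limit of 100 used in the comments is an assumed stock value that should be checked against lwlock.c in your tree. The premise is that a code path which takes every partition lock of one table at once needs at least as many simultaneous LWLock slots as there are partitions.

```c
#include <assert.h>
#include <stdint.h>

/* Value from the thread: the modified build uses 2^10 = 1024 lock partitions. */
#define LOG2_NUM_LOCK_PARTITIONS 10
#define NUM_LOCK_PARTITIONS (1 << LOG2_NUM_LOCK_PARTITIONS)

/*
 * Hypothetical helper: map a hash code to a partition number.
 * Assumes a plain modulo mapping over the partition count.
 */
static uint32_t
partition_for(uint32_t hashcode, uint32_t num_partitions)
{
    assert(num_partitions > 0);
    return hashcode % num_partitions;
}

/*
 * Hypothetical helper: can a single backend hold every partition lock
 * of one table at the same time without exceeding the per-backend limit?
 * max_simul_lwlocks stands in for MAX_SIMUL_LWLOCKS (assumed, e.g. 100).
 */
static int
can_hold_all_partitions(uint32_t num_partitions, uint32_t max_simul_lwlocks)
{
    return num_partitions <= max_simul_lwlocks;
}
```

With 16 partitions and an assumed limit of 100, `can_hold_all_partitions` returns true; with 1024 partitions it returns false until the limit is raised to at least 1024, which is the situation behind the "too many LWLocks taken" error mentioned above.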
>
> I already modified MAX_SIMUL_LWLOCKS to make sure it is enough.
Okay.
>
>
> Total RAM is 130G, and I set shared_buffers 16G, CPU and IO is not full. 50% CPUs are idle.
BenchmarkSQL is about half reads, so I think it should be effective.

I don't think BufFreelistLock takes much time; it just gets a buffer from a list, which should be very fast.

The test server has 2 CPUs with 12 cores each, 24 processors in total. CPU idle time is over 50%, and I/O utilization is only 10% (the data is on SSD).

I ran perf on one PostgreSQL process. The hot spot is hash search. The perf data file is attached.

     3.63%  postgres  postgres           [.] hash_search_with_hash_value
     3.10%  postgres  postgres           [.] AllocSetAlloc
     3.04%  postgres  postgres           [.] LWLockAcquire
     2.73%  postgres  postgres           [.] _bt_compare
     2.66%  postgres  postgres           [.] SearchCatCache
     2.18%  postgres  postgres           [.] ExecInitExpr
     2.11%  postgres  postgres           [.] GetSnapshotData
     1.57%  postgres  postgres           [.] PinBuffer
     1.41%  postgres  postgres           [.] XLogInsert
     1.36%  postgres  libc-2.11.3.so     [.] _int_malloc
     1.31%  postgres  postgres           [.] LWLockRelease
     1.09%  postgres  libc-2.11.3.so     [.] __GI_memcpy
     0.89%  postgres  postgres           [.] _bt_checkkeys
     0.82%  postgres  libc-2.11.3.so     [.] __strncpy_ssse3
     0.81%  postgres  postgres           [.] palloc
     0.81%  postgres  postgres           [.] fmgr_info_cxt_security
     0.76%  postgres  postgres           [.] equal
     0.75%  postgres  postgres           [.] s_lock
     0.73%  postgres  postgres           [.] heap_hot_search_buffer

>From: Amit Kapila [mailto:amit.kapila16@gmail.com]
>Sent: Tuesday, September 02, 2014 10:44 PM
>To: Xiaoyulei
>Cc: pgsql-hackers@postgresql.org
>Subject: Re: Reply: [HACKERS] why after increase the hash table partitions, TPMC decrease
>
>On Tue, Sep 2, 2014 at 5:20 PM, Xiaoyulei <xiaoyulei@huawei.com> wrote:
>>
>> I already modified MAX_SIMUL_LWLOCKS to make sure it is enough.
>
>Okay.
>
>>
>> Total RAM is 130G, and I set shared_buffers 16G, CPU and IO is not full. 50% CPUs are idle.
>
>As far as I understand, benchmarkSQL measures an OLTP
>workload performance which means it contains mix of reads
>and writes, now I am not sure how you have identified that
>increasing buffer partitions can improve the performance.
>Have you used any profiling?
>
>> So I think maybe pg is blocked by some place in itself.
>
>Yeah, there's another lock BufFreelistLock which is a major
>cause of contention in buffer allocation and for which already
>work is in progress for 9.5. However as mentioned previously,
>that will be useful mainly for Read only loads.
>
>With Regards,
>Amit Kapila.
>EnterpriseDB: http://www.enterprisedb.com
Attachments
On Tue, Sep 2, 2014 at 11:02 PM, Xiaoyulei <xiaoyulei@huawei.com> wrote:
> benchmarSQL has about half reads. So I think it should be effective.
>
> I don't think BufFreelistLock take much time, it just get a buffer from list. It should be very fast.

You're wrong. That list is usually empty right now; so it does a
linear scan of the buffer pool looking for a good eviction candidate.

> The test server has 2 CPUs and 12 cores in each CPU. 24 processor totally. CPU Idle time is over 50%. IO only 10%(data is in SSD)
>
> I perf one process of pg. The hot spot is hash search. Attachment is perf data file.

I think you need to pass -g to perf so that you get a call-graph
profile. Then you should be able to expand the entry for
hash_search_with_hash_value() and see what's calling it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
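The point about the empty free list can be made concrete with a small C sketch of a victim search. This is a simplified clock-sweep style scan written for illustration, not PostgreSQL's actual buffer strategy code: `SketchBuffer` and `sketch_get_victim` are hypothetical names, and the assumption is that the whole search runs while one global lock (BufFreelistLock in the thread's terms) is held. The `inspected` counter shows how much work can pile up under that lock when no free buffer is available.

```c
#include <assert.h>

/* Hypothetical, simplified buffer header. */
typedef struct
{
    int refcount;     /* > 0 means pinned by some backend, cannot be evicted */
    int usage_count;  /* recent-use counter, decremented by the sweep */
} SketchBuffer;

/*
 * Scan the pool circularly starting at *next_victim and return the index
 * of the first unpinned buffer whose usage_count has dropped to zero.
 * Returns -1 if a full pass finds only pinned buffers.
 * *inspected receives the number of buffers examined, all of which would
 * happen under the single lock in this sketch.
 */
static int
sketch_get_victim(SketchBuffer *bufs, int nbufs, int *next_victim, int *inspected)
{
    int pinned_in_a_row = 0;

    assert(nbufs > 0);
    *inspected = 0;

    while (pinned_in_a_row < nbufs)
    {
        int idx = *next_victim;
        SketchBuffer *buf = &bufs[idx];

        *next_victim = (*next_victim + 1) % nbufs;
        (*inspected)++;

        if (buf->refcount == 0)
        {
            if (buf->usage_count == 0)
                return idx;          /* good eviction candidate */
            buf->usage_count--;      /* give it another lap before eviction */
            pinned_in_a_row = 0;
        }
        else
            pinned_in_a_row++;       /* pinned, skip it */
    }
    return -1;                       /* every buffer is pinned */
}
```

In this sketch a pool of three buffers with usage counts 2, 0 (pinned) and 1 needs six inspections before a victim is found, so the cost of "just getting a buffer" grows with how recently the buffers were used, not with the length of a free list.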