Re: testing ProcArrayLock patches
От | Andres Freund |
---|---|
Тема | Re: testing ProcArrayLock patches |
Дата | |
Msg-id | 201111190408.16385.andres@anarazel.de обсуждение исходный текст |
Ответ на | Re: testing ProcArrayLock patches ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>) |
Список | pgsql-hackers |
On Saturday, November 19, 2011 12:18:07 AM Kevin Grittner wrote: > Andres Freund <andres@anarazel.de> wrote: > > I think opannotate -a -s produces output with instructions/code > > intermingled. > > Thanks. I'll check out perf later (thanks for the tips!), but for > now, here's the function which was at the top of my oprofile > results, annotated with those options. I'm afraid it's a bit > intimidating to me -- the last time I did much with X86 assembly > language was in the mid-80s, on an 80286. :-/ While my assembly knoweldge surely isn't from the 80s be assured that I find it intimidating as well ;) > Hopefully, since > this is at the top of the oprofile results when running with > prepared statements, it will be of use to somebody. I think in quite many situations hash_search_with_hash_value is rather noticeable in the profiles. Even without concurrency... Looking at your annotation output the code seems to be almost entirely stalled waiting for memory. The first stall is after the first reading memory access which is likely to be uncached (the first cacheline of the HTAB is accessed before but that will be in the cache). The interesting thing is that I would have expected a higher likelihood for this to stay in the cache. 2225 0.0165 : 70b543: mov (%rdi),%r15 :static inline uint32 :calc_bucket(HASHHDR *hctl, uint32 hash_val) :{ : uint32 bucket; : : bucket = hash_val & hctl->high_mask; 4544 0.0337 : 70b546: and 0x2c(%r15),%ebx : if (bucket > hctl->max_bucket)53409 0.3958 : 70b54a: cmp 0x28(%r15),%ebx : 70b54e: jbe 70b554 <hash_search_with_hash_value+0x34> So a stall here is not that surprising. Here we fetch data from memory which is unlikely to be prefetchable and then require the result from that fetch. Note how segp = hashp->dir[segment_num]; is distributed over line 52, 64, 83. : segp = hashp->dir[segment_num]; 2062 0.0153 : 70b562: shr %cl,%eax 309 0.0023 : 70b564: mov %eax,%eax 643 0.0048 : 70b566: mov (%rdx,%rax,8),%rbp : : if (segp == NULL)43329 0.3211 : 70b56a: test %rbp,%rbp The next cacheline is referenced here. Again a fetch from memory which is soon after needed to continue. Unless I misunderstood the code-flow this disproves my theory that we might have many collisions as that test seems to be outside the test ( : prevBucketPtr = &segp[segment_ndx]; : currBucket = *prevBucketPtr; 122 9.0e-04 : 70b586: mov 0x0(%rbp),%rbx : : /* : * Follow collision chain looking for matching key : */ : match= hashp->match; /* save one fetch in inner loop */ : keysize = hashp->keysize; /* ditto */99903 0.7404 : 70b58a: mov %rax,0x18(%rsp) : : while (currBucket != NULL) 1066 0.0079 : 70b58f: test %rbx,%rbx line 136 is the first time the contents of the current bucket is needed. Thats why the test is so noticeable. : currBucket = *prevBucketPtr; 655 0.0049 : 70b5a3: mov (%rbx),%rbx : * Follow collision chain looking for matching key : */ : match= hashp->match; /* save one fetch in inner loop */ : keysize = hashp->keysize; /* ditto */ : : while (currBucket !=NULL) 608 0.0045 : 70b5a6: test %rbx,%rbx : 70b5a9: je 70b5d0 <hash_search_with_hash_value+0xb0> : { : if (currBucket->hashvalue == hashvalue && 3504 0.0260 : 70b5ab: cmp %r12d,0x8(%rbx)98486 0.7299 : 70b5af: nop 1233 0.0091 : 70b5b0: jne 70b5a0<hash_search_with_hash_value+0x80> That covers all the slow points in the function. And unless I am missing something those are all the fetched cachelines of that function... For HASH_FIND that is. So I think that reinforces my belive that ordinary cachemisses are the culprit here. Which is to be excepted in a hashtable... Andres PS: No idea whether that rambling made sense to anyone... But I looked at that function fo the first time ;)
В списке pgsql-hackers по дате отправления: