"Simon Riggs" <simon@2ndquadrant.com> writes:
> But why MemoryContextSwitchTo ?
Because (a) it's so small that inlining it will probably be a net code
savings rather than expenditure, and (b) it does have noticeable cost.
For example, in this gprof profile taken Saturday:
     % cumulative     self              self    total
  time    seconds  seconds    calls  ms/call  ms/call  name
 31.25      22.40    22.40                             _mcount
  3.31      24.77     2.37   704032     0.00     0.02  IndexNext
  2.82      26.79     2.02  2112850     0.00     0.00  AllocSetAlloc
  2.48      28.57     1.78  2821112     0.00     0.00  LockBuffer
  2.13      30.10     1.53   701932     0.00     0.01  heap_release_fetch
  1.97      31.51     1.41  6310394     0.00     0.00  MemoryContextSwitchTo
  1.97      32.92     1.41   699632     0.00     0.00  int8inc
  1.66      34.11     1.19  1886388     0.00     0.00  LWLockAcquire
  1.62      35.27     1.16   474244     0.00     0.00  hash_search
  1.56      36.39     1.12  2109900     0.00     0.00  AllocSetReset
  1.46      37.44     1.05   701901     0.00     0.00  _bt_restscan
  1.42      38.46     1.02  2109079     0.00     0.00  memset
  1.39      39.46     1.00   701901     0.00     0.00  _bt_step
  1.24      40.35     0.89   701833     0.00     0.00  ExecEvalExprSwitchContext
  1.20      41.21     0.86   704143     0.00     0.00  _bt_checkkeys
  1.17      42.05     0.84  1886388     0.00     0.00  LWLockRelease
  1.17      42.89     0.84   701901     0.00     0.00  _bt_next
  1.05      43.64     0.75   701833     0.00     0.00  HeapTupleSatisfiesSnapshot
  1.03      44.38     0.74   704144     0.00     0.01  btgettuple
  1.03      45.12     0.74                             $$dyncall
  1.02      45.85     0.73  2110119     0.00     0.00  AllocSetCheck
  0.91      46.50     0.65   706412     0.00     0.01  ReleaseAndReadBuffer
(all else below 1%)
the only thing I see in that list that looks reasonable to inline is
MemoryContextSwitchTo. (This is ye olde test_setup/test_run case on
a single processor, which is not very interesting lock-wise but I wanted
to reconfirm that we weren't spending a large fraction of the runtime
inside bufmgr.)
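
To make point (a) concrete, here is a minimal sketch of what an inlined MemoryContextSwitchTo could look like. The struct layout, the file-local global, and the switch_and_restore_demo caller are stand-ins I made up for illustration, not the actual backend declarations. The function body is little more than a load and a store, which is why call overhead can rival the useful work at 6310394 calls.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the backend's context struct (assumption: the real one
 * carries many more fields; only identity matters for this sketch). */
typedef struct MemoryContextData
{
    const char *name;
} MemoryContextData;

typedef MemoryContextData *MemoryContext;

/* Stand-in for the backend global of the same name. */
static MemoryContext CurrentMemoryContext = NULL;

/*
 * Inline version: save the current context, install the new one, and
 * hand back the old one so the caller can restore it later.
 */
static inline MemoryContext
MemoryContextSwitchTo(MemoryContext context)
{
    MemoryContext old = CurrentMemoryContext;

    assert(context != NULL);
    CurrentMemoryContext = context;
    return old;
}

/* Hypothetical caller showing the usual switch/restore pattern.
 * Returns 1 if the original context is back in place afterwards. */
static int
switch_and_restore_demo(void)
{
    static MemoryContextData per_query = {"per-query"};
    static MemoryContextData per_tuple = {"per-tuple"};
    MemoryContext oldcontext;

    CurrentMemoryContext = &per_query;

    oldcontext = MemoryContextSwitchTo(&per_tuple);
    /* ... allocations made here would land in per_tuple ... */
    MemoryContextSwitchTo(oldcontext);

    return oldcontext == &per_query && CurrentMemoryContext == &per_query;
}
```

Every caller pairs a switch with a restore, so each inlined site costs only a few instructions, which is the basis for expecting a net code-size savings rather than growth.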
regards, tom lane