I have a moderately loaded postgres server running 7.2beta4 (I wanted to
try out the live vacuum) that turns out to spend the majority of its CPU
time in kernel land. With only a handful of running processes, postgres
induces tens of thousands of context switches per second. Practically the
only thing it does with all that CPU time is call semop() in a tight
loop. Here is a snippet of the strace output:
[pid 11410] 0.000064 <... semop resumed> ) = 0
[pid 11409] 0.000020 <... semop resumed> ) = 0
[pid 11410] 0.000024 semop(1179648, 0xbfffe658, 1 <unfinished ...>
[pid 11409] 0.000027 semop(1179648, 0xbfffe488, 1 <unfinished ...>
[pid 11407] 0.000027 semop(1179648, 0xbfffe8b8, 1 <unfinished ...>
[pid 11409] 0.000022 <... semop resumed> ) = 0
[pid 11406] 0.000018 <... semop resumed> ) = 0
[pid 11409] 0.000023 semop(1179648, 0xbfffe468, 1 <unfinished ...>
[pid 11406] 0.000026 semop(1179648, 0xbfffe958, 1) = 0
[pid 11406] 0.000057 semop(1179648, 0xbfffe9f8, 1 <unfinished ...>
[pid 11408] 0.000037 <... semop resumed> ) = 0
[pid 11408] 0.000029 semop(1179648, 0xbfffe4d8, 1) = 0
[pid 11411] 0.000038 <... semop resumed> ) = 0
[pid 11408] 0.000023 semop(1179648, 0xbfffe4d8, 1 <unfinished ...>
[pid 11411] 0.000026 semop(1179648, 0xbfffe498, 1) = 0
[pid 11407] 0.000040 <... semop resumed> ) = 0
[pid 11411] 0.000024 semop(1179648, 0xbfffe658, 1 <unfinished ...>
[pid 11407] 0.000027 semop(1179648, 0xbfffe8a8, 1) = 0
[pid 11410] 0.000038 <... semop resumed> ) = 0
[pid 11407] 0.000024 semop(1179648, 0xbfffe918, 1 <unfinished ...>
[pid 11410] 0.000026 semop(1179648, 0xbfffe618, 1) = 0
[pid 11410] 0.000058 semop(1179648, 0xbfffe6a8, 1 <unfinished ...>
[pid 11409] 0.000024 <... semop resumed> ) = 0
[pid 11409] 1.214166 semop(1179648, 0xbfffe428, 1) = 0
[pid 11406] 0.000063 <... semop resumed> ) = 0
[pid 11406] 0.000031 semop(1179648, 0xbfffe9f8, 1) = 0
[pid 11406] 0.000051 semop(1179648, 0xbfffe8f8, 1 <unfinished ...>
Performance on this database is poor. Since there is little or no
block I/O, I assume postgres is wasting its CPU allocation on this
semaphore traffic and the context switches it causes.
Does anyone else see this? Is there a config option to tune the locking
behavior? Any other workarounds?
The machine is a 2-way x86 running Linux 2.4. I brought this up on
linux-kernel, and the consensus there was that it is not a scheduler
problem.
-jwb