Occasional giant spikes in CPU load

Most of the time Postgres runs nicely, but two or three times a day we get a huge spike in CPU load that lasts just a
short time -- the load jumps to 10-20. Today it hit 100. Sometimes days go by with no spike events.
During these spikes, the system is completely unresponsive (you can't even log in via ssh).

I managed to capture one such event using top(1) with the "batch" option as a background process.  See output below -
it shows 19 active postgres processes, but I think it missed the bulk of the spike.
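For anyone who wants to reproduce this kind of capture, here is a minimal sketch of a watcher that runs top in batch mode only when many processes are runnable at once. It is not the script used for the output below. It reads the procs_running counter from /proc/stat on Linux, since that is an instantaneous count rather than a smoothed load average. The threshold, the polling interval, the log file name and all function names are assumptions made up for illustration.

```python
import subprocess
import time

# Hypothetical values for illustration; tune them for the machine being watched.
RUNNABLE_THRESHOLD = 16            # about two runnable tasks per CPU on an 8-CPU box
POLL_SECONDS = 1.0                 # how often to sample /proc/stat
LOG_PATH = "spike_snapshots.log"   # where snapshots are appended


def parse_procs_running(stat_text):
    """Return the procs_running value from /proc/stat contents, or None if absent."""
    for line in stat_text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] == "procs_running":
            return int(fields[1])
    return None


def should_capture(runnable, threshold=RUNNABLE_THRESHOLD):
    """Decide whether the current runnable count looks like a spike."""
    return runnable is not None and runnable >= threshold


def capture_top():
    """Run top once in batch mode (-b) for a single iteration (-n 1)."""
    result = subprocess.run(
        ["top", "-b", "-n", "1"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


def watch(log_path=LOG_PATH, threshold=RUNNABLE_THRESHOLD, poll_seconds=POLL_SECONDS):
    """Poll /proc/stat forever and append a top snapshot whenever a spike is seen."""
    while True:
        with open("/proc/stat") as stat_file:
            runnable = parse_procs_running(stat_file.read())
        if should_capture(runnable, threshold):
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            with open(log_path, "a") as log:
                log.write(f"=== {stamp} procs_running={runnable} ===\n")
                log.write(capture_top())
        time.sleep(poll_seconds)
```

Calling watch() from a background process starts the loop. If the machine is starved of CPU, the watcher itself may not get scheduled in time, so a sketch like this can still miss the peak of a spike.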

For some reason, every postgres backend suddenly decides (is told?) to do something.  When this happens, the system
becomes unusable for anywhere from ten seconds to a minute or so, depending on how much web traffic stacks up behind this
event. We have two servers, one offline and one public, and they both do this, so it's not caused by actual web traffic
(and the Apache logs don't show any HTTP activity correlated with the spikes).

I thought, based on other posts, that this might be a background-writer problem, but it's not I/O, it's all CPU as far as
I can tell.

Any ideas where I can look to find what's triggering this?

8 CPUs, 8 GB memory
8-disk RAID10 (10k SATA)
Postgres 8.3.0
Fedora 8, kernel is 2.6.24.4-64.fc8
Diffs from the original postgresql.conf:

max_connections = 1000
shared_buffers = 2000MB
work_mem = 256MB
max_fsm_pages = 16000000
max_fsm_relations = 625000
synchronous_commit = off
wal_buffers = 256kB
checkpoint_segments = 30
effective_cache_size = 4GB
escape_string_warning = off

Thanks,
Craig


top - 11:24:59 up 81 days, 20:27,  4 users,  load average: 0.98, 0.83, 0.92
Tasks: 366 total,  20 running, 346 sleeping,   0 stopped,   0 zombie
Cpu(s): 30.6%us,  1.5%sy,  0.0%ni, 66.3%id,  1.5%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   8194800k total,  8118688k used,    76112k free,       36k buffers
Swap:  2031608k total,   169348k used,  1862260k free,  7313232k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
18972 postgres  20   0 2514m  11m 8752 R   11  0.1   0:00.35 postmaster
10618 postgres  20   0 2514m  12m 9456 R    9  0.2   0:00.54 postmaster
10636 postgres  20   0 2514m  11m 9192 R    9  0.1   0:00.45 postmaster
25903 postgres  20   0 2514m  11m 8784 R    9  0.1   0:00.21 postmaster
10626 postgres  20   0 2514m  11m 8716 R    6  0.1   0:00.45 postmaster
10645 postgres  20   0 2514m  12m 9352 R    6  0.2   0:00.42 postmaster
10647 postgres  20   0 2514m  11m 9172 R    6  0.1   0:00.51 postmaster
18502 postgres  20   0 2514m  11m 9016 R    6  0.1   0:00.23 postmaster
10641 postgres  20   0 2514m  12m 9296 R    5  0.2   0:00.36 postmaster
10051 postgres  20   0 2514m  13m  10m R    4  0.2   0:00.70 postmaster
10622 postgres  20   0 2514m  12m 9216 R    4  0.2   0:00.39 postmaster
10640 postgres  20   0 2514m  11m 8592 R    4  0.1   0:00.52 postmaster
18497 postgres  20   0 2514m  11m 8804 R    4  0.1   0:00.25 postmaster
18498 postgres  20   0 2514m  11m 8804 R    4  0.1   0:00.22 postmaster
10341 postgres  20   0 2514m  13m   9m R    2  0.2   0:00.57 postmaster
10619 postgres  20   0 2514m  12m 9336 R    1  0.2   0:00.38 postmaster
15687 postgres  20   0 2321m  35m  35m R    0  0.4   8:36.12 postmaster


