Re: Server hitting 100% CPU usage, system comes to a crawl.
| From | Brian Fehrle |
|---|---|
| Subject | Re: Server hitting 100% CPU usage, system comes to a crawl. |
| Date | |
| Msg-id | 4EA9BBDD.5050109@consistentstate.com |
| In response to | Server hitting 100% CPU usage, system comes to a crawl. (Brian Fehrle <brianf@consistentstate.com>) |
| List | pgsql-general |
Also, I'm not having any issue with the database restarting itself, simply becoming unresponsive / slow to respond, to the point where just sshing to the box takes about 30 seconds if not longer. Performing a pg_ctl restart on the cluster resolves the issue.

I looked through the logs for any segmentation faults, none found. In fact the only thing in my log that seems to be 'bad' are the following.

Oct 27 08:53:18 <snip> postgres[17517]: [28932839-1] user=<snip>,db=<snip> ERROR: deadlock detected
Oct 27 11:49:22 <snip> postgres[608]: [19-1] user=<snip>,db=<snip> ERROR: could not serialize access due to concurrent update

I don't believe these occurred too close to the slowdown.

- Brian F

On 10/27/2011 02:09 PM, Brian Fehrle wrote:
> On 10/27/2011 01:48 PM, Scott Marlowe wrote:
>> On Thu, Oct 27, 2011 at 12:39 PM, Brian Fehrle
>> <brianf@consistentstate.com> wrote:
>>> Looking at top, I see no SWAP usage, very little IOWait, and there are a
>>> large number of postmaster processes at 100% cpu usage (makes sense, at this
>>> point there are 150 or so queries currently executing on the database).
>>>
>>> Tasks: 713 total, 44 running, 668 sleeping, 0 stopped, 1 zombie
>>> Cpu(s): 4.4%us, 92.0%sy, 0.0%ni, 3.0%id, 0.0%wa, 0.0%hi, 0.3%si, 0.2%st
>>> Mem: 134217728k total, 131229972k used, 2987756k free, 462444k buffers
>>> Swap: 8388600k total, 296k used, 8388304k free, 119029580k cached
>> OK, a few points. 1: You've got a zombie process. Find out what's
>> causing that, it could be a trigger of some type for this behaviour.
>> 2: You're 92% sys. That's bad. It means the OS is chewing up 92% of
>> your 32 cores doing something. what tasks are at the top of the list
>> in top?
>>
> Out of the top 50 processes in top, 48 of them are postmasters, one is
> syslog, and one is psql. Each of the postmasters have a high %CPU, the
> top ones being 80% and higher, the rest being anywhere between 30% -
> 60%. Would postmaster 'queries' that are running attribute to the sys
> CPU usage, or should they be under the 'us' CPU usage?
>
>
>> Try running vmstat 10 for a minute or so then look at cs and int
>> columns. If cs or int is well over 100k there could be an issue with
>> thrashing, where your app is making some change to the db that
>> requires all backends to be awoken at once and the machine just falls
>> over under the load.
>
> We've restarted the postgresql cluster, so the issue is not happening
> at this moment. but running a vmstat 10 had my 'cs' average at 3K and
> 'in' averaging around 9.5K.
>
> - Brian F
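
For anyone wanting to repeat the vmstat check suggested above while the slowdown is actually happening, here is a rough Python sketch that averages the 'in' and 'cs' columns from captured vmstat output and compares them against the 100k figure mentioned in the thread. The function names, the script name in the usage line, and the choice to average every sample row (including vmstat's first since-boot row) are my own assumptions, not anything from the thread or from PostgreSQL itself.

```python
import sys

# Rule-of-thumb figure quoted in the thread ("well over 100k"); an assumption, not a hard limit.
THRESHOLD = 100_000


def average_vmstat_columns(vmstat_text, columns=("in", "cs")):
    """Average the named columns over every sample row in vmstat output."""
    header = None
    sums = {name: 0 for name in columns}
    samples = 0
    for line in vmstat_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # The column-name line is the one that contains every requested name.
        if all(name in fields for name in columns):
            header = fields
            continue
        # Skip the "procs ---memory---" banner and anything that is not a numeric row.
        if (header is None or len(fields) != len(header)
                or not all(f.isdigit() for f in fields)):
            continue
        for name in columns:
            sums[name] += int(fields[header.index(name)])
        samples += 1
    if samples == 0:
        raise ValueError("no vmstat sample rows found")
    return {name: sums[name] / samples for name in columns}


def looks_like_thrashing(averages, threshold=THRESHOLD):
    """Return True if any averaged column exceeds the threshold."""
    return any(value > threshold for value in averages.values())


if __name__ == "__main__":
    averages = average_vmstat_columns(sys.stdin.read())
    print(averages, "possible thrashing" if looks_like_thrashing(averages) else "looks normal")
```

Something like `vmstat 10 6 | python3 vmstat_check.py` (script name hypothetical) would feed it a minute of samples. With the numbers reported after the restart (cs around 3K, in around 9.5K) it would report nothing unusual, so the check is only meaningful when run during the slowdown.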