Re: Server hitting 100% CPU usage, system comes to a crawl.

From: Brian Fehrle
Subject: Re: Server hitting 100% CPU usage, system comes to a crawl.
Message-ID: 4EA9CB99.8090808@consistentstate.com
In reply to: Server hitting 100% CPU usage, system comes to a crawl.  (Brian Fehrle <brianf@consistentstate.com>)
Replies: Re: Server hitting 100% CPU usage, system comes to a crawl.
List: pgsql-general
On 10/27/2011 02:50 PM, Tom Lane wrote:
> Brian Fehrle <brianf@consistentstate.com> writes:
>> Hi all, need some help/clues on tracking down a performance issue.
>> PostgreSQL version: 8.3.11
>> I've got a system that has 32 cores and 128 gigs of ram. We have
>> connection pooling set up, with about 100 - 200 persistent connections
>> open to the database. Our applications then use these connections to
>> query the database constantly, but when a connection isn't currently
>> executing a query, it's <IDLE>. On average, at any given time, there are
>> 3 - 6 connections that are actually executing a query, while the rest
>> are <IDLE>.
>> About once a day, queries that normally take just a few seconds slow way
>> down, and start to pile up, to the point where instead of just having
>> 3-6 queries running at any given time, we get 100 - 200. The whole
>> system comes to a crawl, and looking at top, the CPU usage is 99%.
> This is jumping to a conclusion based on insufficient data, but what you
> describe sounds a bit like the sinval queue contention problems that we
> fixed in 8.4.  Some prior reports of that:
> http://archives.postgresql.org/pgsql-performance/2008-01/msg00001.php
> http://archives.postgresql.org/pgsql-performance/2010-06/msg00452.php
>
> If your symptoms match those, the best fix would be to update to 8.4.x
> or later, but a stopgap solution would be to cut down on the number of
> idle backends.
>
>             regards, tom lane
That sounds fairly close to the issue I am seeing. The main differences
are that my spike lasts much longer than a few minutes and is only
resolved when the cluster is restarted. Also, the top output in that
second link shows most of the CPU time in 'user', whereas mine is mostly
in 'sys'.

Is there anything more I can look at to get more information on this
'sinval queue contention problem'?
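
Since the stopgap suggested above is to cut down on idle backends, one
thing that can be checked from the database side is how many backends
are idle when the spike hits. A minimal Python sketch, assuming the 8.3
pg_stat_activity view and its current_query column, where idle sessions
report <IDLE> or <IDLE> in transaction. The function name and the sample
rows are hypothetical, and fetching the rows is left to whatever driver
is in use:

```python
from collections import Counter

# Query to run on the 8.3 server; current_query is the 8.3-era column name.
ACTIVITY_SQL = "SELECT current_query FROM pg_stat_activity"


def tally_backends(current_queries):
    """Count idle, idle-in-transaction, and active backends.

    current_queries is an iterable of current_query values, one per backend.
    Anything that is not an idle marker is counted as active.
    """
    counts = Counter(idle=0, idle_in_transaction=0, active=0)
    for query in current_queries:
        text = (query or "").strip()
        if text == "<IDLE>":
            counts["idle"] += 1
        elif text == "<IDLE> in transaction":
            counts["idle_in_transaction"] += 1
        else:
            counts["active"] += 1
    return dict(counts)


# Hypothetical sample rows standing in for the result of ACTIVITY_SQL.
sample = ["<IDLE>", "<IDLE>", "SELECT 1", "<IDLE> in transaction"]
print(tally_backends(sample))
```

Comparing the tally during normal operation with the tally during a
spike would show whether the idle count stays in the 100 - 200 range
while the active count climbs.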

Also, could having my CPU usage high in 'sys' rather than 'us' be a red
flag, or is that normal?
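
To put numbers on the 'us' versus 'sy' split outside of top, the
aggregate "cpu" line of /proc/stat can be sampled twice and compared. A
minimal sketch, assuming a Linux host and the usual field order on that
line (user, nice, system, idle, iowait, irq, softirq, steal); the
function name and the sample lines are made up for illustration:

```python
def cpu_split(before, after):
    """Return (user %, system %) of CPU time between two /proc/stat 'cpu' lines."""

    def fields(line):
        parts = line.split()
        if not parts or parts[0] != "cpu":
            raise ValueError("expected the aggregate 'cpu' line from /proc/stat")
        return [int(value) for value in parts[1:]]

    deltas = [b - a for a, b in zip(fields(before), fields(after))]
    # Sum only the first eight fields; guest time, if present, is already
    # included in the user field.
    total = sum(deltas[:8])
    if total <= 0:
        raise ValueError("samples must be taken at two different times")
    user, system = deltas[0], deltas[2]
    return 100.0 * user / total, 100.0 * system / total


# Made-up samples; on a real host, read the first line of /proc/stat twice,
# a few seconds apart.
first = "cpu  0 0 0 0 0 0 0 0"
second = "cpu  100 0 700 200 0 0 0 0"
user_pct, system_pct = cpu_split(first, second)
print("user %.1f%%, system %.1f%%" % (user_pct, system_pct))
```

A large system share means the time is being spent in the kernel rather
than in user-space code.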

- Brian F
