Re: High SYS CPU - need advise

From Merlin Moncure
Subject Re: High SYS CPU - need advise
Date
Msg-id CAHyXU0y8-XeBY9h1snUvnfuASYMa37md5kpOwSj+=g1t6XQOMw@mail.gmail.com
In response to Re: High SYS CPU - need advise  (John R Pierce <pierce@hogranch.com>)
Responses Re: High SYS CPU - need advise  (Vlad <marchenko@gmail.com>)
List pgsql-general
On Mon, Nov 19, 2012 at 10:50 AM, Vlad <marchenko@gmail.com> wrote:
> I just did a little experiment: extracted the top four queries that were
> executed the longest during stall times and launched a pgbench test with 240
> clients. Yet I wasn't able to put the server into a stall with that. Although
> load average was hitting 120+, it was all user cpu, single digit % system.
> The system remained pretty responsive (on a keypress), in contrast to when a
> high-sys-cpu stall happens, showing similar LA numbers.
>
> This makes me think that there is probably a different query or condition
> (other than the ones I tried) that is responsible for creating high-sys-cpu.
> Also, from my limited knowledge of postgresql and kernel internals, this
> experiment calls into question the idea that the linux scheduler might be
> the cornerstone....
>
> Here is a longer log, first column is PID. High-sys-cpu stall happened
> around 10:42:37
> https://dl.dropbox.com/u/109778/postgresql-2012-11-19_103425-stripped.log
>
> I'll try to put pgbouncer in place and repeat, although this will take
> longer to set up.


yeah.  interesting -- contention was much higher this time and that
changes things.  strange how it was missed earlier.

you're getting bounced around a lot in lwlock especially
(unfortunately we don't know which one).  I'm going to hazard another
guess:  maybe the trigger is that once the number of contending
backends exceeds some critical number (probably based on the number of
cores), cache line bouncing sets in and you see a quick cpu spike,
which causes more backends to block and pile up.  That spike doesn't
last long, because the spinlocks quickly accumulate delay counts and
then punt to the scheduler, which is unable to cope.  The exact reason
why this is happening to you in exactly this way (I've never seen it)
is unclear.  Also the line between symptom and cause is difficult to
draw.
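
To make the "accumulate delay counts, then punt to the scheduler" part of that guess concrete, here is a rough sketch of a spin-then-sleep acquire loop.  It is not a copy of PostgreSQL's spinlock code: the function name, the exception, the constants and the capped backoff rule are all assumptions picked for illustration.

```python
import random
import time

# All constants below are illustrative assumptions, not PostgreSQL's values.
SPINS_PER_DELAY = 100   # busy-wait attempts before each sleep
MIN_DELAY_S = 0.001     # length of the first sleep
MAX_DELAY_S = 1.0       # upper bound on a single sleep
MAX_DELAYS = 1000       # sleeps allowed before the lock is declared stuck


class StuckSpinlock(RuntimeError):
    """Raised when the lock could not be taken within MAX_DELAYS sleeps."""


def acquire_with_backoff(try_lock, sleep=time.sleep, rng=random.random):
    """Spin on try_lock() and return how many sleeps were needed.

    try_lock is a hypothetical callable that returns True once the lock
    has been taken.  Each sleep hands the CPU back to the OS scheduler.
    """
    delays = 0
    cur_delay = MIN_DELAY_S
    while True:
        # Phase 1: spin in user space, hoping the holder releases quickly.
        for _ in range(SPINS_PER_DELAY):
            if try_lock():
                return delays
        # Phase 2: spinning failed, so give up the CPU for a while.
        if delays >= MAX_DELAYS:
            raise StuckSpinlock("gave up after %d sleeps" % delays)
        sleep(cur_delay)
        delays += 1
        # Grow the sleep by a random factor so waiters do not wake in lockstep.
        cur_delay = min(MAX_DELAY_S, cur_delay * (1.0 + rng()))
```

With many backends in phase 2 at once, every one of them is repeatedly sleeping and being woken, which is where a scheduler could end up doing most of the work.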

unfortunately, in your case spinlock re-scheduling isn't helping.  log
entries like this one:
18764 [2012-11-19 10:43:50.124 CST] LOG:  JJ spin delay from file
sinvaladt.c line 512 delay 212, pointer 0x7f514959a394 at character 29

suggest major problems.  you're dangerously close to a stuck
spinlock, which is lights out for the database.
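
If it helps to see which call sites are worst, something like the following could summarize those entries from the stripped log.  The line format is inferred only from the single sample quoted above (assumed to sit on one line in the actual file), and the function name and the shape of the result are made up for this sketch.

```python
import re
from collections import defaultdict

# Pattern inferred from the one sample entry; treat it as an assumption.
SPIN_RE = re.compile(
    r"^(?P<pid>\d+) \[(?P<ts>[^\]]+)\] LOG:\s+JJ spin delay from file "
    r"(?P<file>\S+) line (?P<line>\d+) delay (?P<delay>\d+)"
)


def summarize_spin_delays(lines):
    """Group spin-delay entries by (source file, line number).

    Returns a dict mapping (file, line) to the number of entries seen
    and the largest delay value reported for that call site.
    """
    stats = defaultdict(lambda: {"count": 0, "max_delay": 0})
    for text in lines:
        m = SPIN_RE.match(text)
        if m is None:
            continue  # not a spin-delay entry
        entry = stats[(m.group("file"), int(m.group("line")))]
        entry["count"] += 1
        entry["max_delay"] = max(entry["max_delay"], int(m.group("delay")))
    return dict(stats)


sample = [
    "18764 [2012-11-19 10:43:50.124 CST] LOG:  JJ spin delay from file "
    "sinvaladt.c line 512 delay 212, pointer 0x7f514959a394 at character 29",
]
summary = summarize_spin_delays(sample)
```

Sorting the result by max_delay would show quickly whether sinvaladt.c line 512 is the only hot spot or just one of several.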

merlin

