Re: High SYS CPU - need advise

From Merlin Moncure
Subject Re: High SYS CPU - need advise
Date
Msg-id CAHyXU0yshER2j=U5ahC443btRWLcvOFiB=R9iWO0Y=dKRiveNA@mail.gmail.com
In response to Re: High SYS CPU - need advise  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: High SYS CPU - need advise  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: High SYS CPU - need advise  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-general
On Thu, Nov 15, 2012 at 4:29 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> Merlin Moncure wrote:
>
>> ok, excellent.   reviewing the log, this immediately caught my eye:
>>
>> recvfrom(8, "\27\3\1\0@", 5, 0, NULL, NULL) = 5
>> recvfrom(8, "\327\327\nl\231LD\211\346\243@WW\254\244\363C\326\247\341\177\255\263~\327HDv-\3466\353"...,
>> 64, 0, NULL, NULL) = 64
>> select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
>> select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
>> select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
>> select(0, NULL, NULL, NULL, {0, 2000})  = 0 (Timeout)
>> select(0, NULL, NULL, NULL, {0, 3000})  = 0 (Timeout)
>> select(0, NULL, NULL, NULL, {0, 4000})  = 0 (Timeout)
>> select(0, NULL, NULL, NULL, {0, 6000})  = 0 (Timeout)
>> select(0, NULL, NULL, NULL, {0, 7000})  = 0 (Timeout)
>> select(0, NULL, NULL, NULL, {0, 8000})  = 0 (Timeout)
>> select(0, NULL, NULL, NULL, {0, 9000})  = 0 (Timeout)
>> semop(41713721, {{2, 1, 0}}, 1)         = 0
>> lseek(295, 0, SEEK_END)                 = 0
>> lseek(296, 0, SEEK_END)                 = 8192
>>
>> this is definitely pointing to spinlock issue.
>
> I met Rik van Riel (Linux kernel hacker) recently and we chatted about
> this briefly.  He strongly suggested that we should consider using
> futexes on Linux instead of spinlocks; the big advantage being that
> futexes sleep instead of spinning when contention is high.  That would
> reduce the system load in this scenario.

Well, so do postgres spinlocks, right?  When we overflow
spins_per_delay we go to pg_usleep, which proxies to select() --
postgres spinlocks are a hybrid implementation.  Moving to futexes is a
possible improvement in degenerate cases (that's another discussion),
but I'm not sure that I've exactly zeroed in on the problem.  Or am I
missing something?
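
To make the "hybrid" point concrete, here is a minimal sketch of a spin-then-sleep lock. It is not the actual s_lock.c code: the names hybrid_lock, sleep_usec and SPINS_PER_DELAY are made up for illustration, the spin limit is a fixed constant rather than an adaptive spins_per_delay, and the sleep is a fixed 1 ms with no backoff. The one part taken from the discussion is sleeping through select() with no file descriptors, which is what produces the select(0, NULL, NULL, NULL, {sec, usec}) lines in the strace output.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical fixed spin limit for this sketch only. */
#define SPINS_PER_DELAY 100

typedef struct
{
    atomic_flag held;           /* set while the lock is taken */
} hybrid_lock;

/* Sleep by calling select() with no file descriptors and a timeout. */
void
sleep_usec(long usec)
{
    struct timeval tv;

    tv.tv_sec = usec / 1000000L;
    tv.tv_usec = usec % 1000000L;
    (void) select(0, NULL, NULL, NULL, &tv);
}

/*
 * Spin on the flag; after SPINS_PER_DELAY failed attempts, sleep and
 * start spinning again.  Returns how many times the caller had to sleep.
 */
int
hybrid_lock_acquire(hybrid_lock *lock)
{
    int spins = 0;
    int sleeps = 0;

    while (atomic_flag_test_and_set_explicit(&lock->held,
                                             memory_order_acquire))
    {
        if (++spins >= SPINS_PER_DELAY)
        {
            sleep_usec(1000L);  /* fixed 1 ms here; no backoff in this sketch */
            sleeps++;
            spins = 0;
        }
    }
    return sleeps;
}

void
hybrid_lock_release(hybrid_lock *lock)
{
    atomic_flag_clear_explicit(&lock->held, memory_order_release);
}
```

A lock would be declared as hybrid_lock l = { ATOMIC_FLAG_INIT }; an uncontended acquire returns 0 and makes no system call at all, so select() only shows up in a trace when the spin limit is exhausted.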

What I've been scratching my head over is what code exactly would
cause an iterative sleep like the above.  The code is here:

  pg_usleep(cur_delay * 1000L);

  /* increase delay by a random fraction between 1X and 2X */
  cur_delay += (int) (cur_delay *
        ((double) random() / (double) MAX_RANDOM_VALUE) + 0.5);
  /* wrap back to minimum delay when max is exceeded */
  if (cur_delay > MAX_DELAY_MSEC)
    cur_delay = MIN_DELAY_MSEC;

...so cur_delay is supposed to increase in a nonlinear fashion.  I've
looked around the sleep, usleep, and latch calls and haven't yet
found anything that advances the timeout just like that (I need to do
another pass).  And we don't know for sure whether this is directly
related to the OP's problem.
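
For anyone who wants to play with the update rule, here is a small standalone sketch of it. The constants MIN_DELAY_MSEC = 1 and MAX_DELAY_MSEC = 1000 are my assumption and should be checked against s_lock.c for the version in question; next_delay and simulate_delays are made-up helper names, and rand()/RAND_MAX stands in for random()/MAX_RANDOM_VALUE. The random fraction is passed in explicitly so that individual steps can be checked by hand.

```c
#include <stdlib.h>

/* Assumed values for this sketch; verify against s_lock.c. */
#define MIN_DELAY_MSEC 1
#define MAX_DELAY_MSEC 1000

/*
 * One backoff step using the rule quoted above.  frac is the random
 * fraction in [0, 1] that the original derives from random().
 */
int
next_delay(int cur_delay, double frac)
{
    cur_delay += (int) (cur_delay * frac + 0.5);
    /* wrap back to the minimum when the maximum is exceeded */
    if (cur_delay > MAX_DELAY_MSEC)
        cur_delay = MIN_DELAY_MSEC;
    return cur_delay;
}

/* Fill out[0..n-1] with successive delays, starting from the minimum. */
void
simulate_delays(int *out, int n, unsigned seed)
{
    int cur = MIN_DELAY_MSEC;
    int i;

    srand(seed);
    for (i = 0; i < n; i++)
    {
        out[i] = cur;
        cur = next_delay(cur, (double) rand() / (double) RAND_MAX);
    }
}
```

Because of the (int) truncation, the increment at small delays is often 0 or 1: next_delay(1, 0.0) stays at 1, and next_delay(8, frac) gives 9 only when frac is between about 0.06 and 0.19. With a larger fraction the same step could go as high as 16.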

merlin

