Re: High SYS CPU - need advise

From Merlin Moncure
Subject Re: High SYS CPU - need advise
Date
Msg-id CAHyXU0wJzGCdg9gKdRdnWvXaATia42BZa=DCLYhQ=u-qLtG++w@mail.gmail.com
In reply to Re: High SYS CPU - need advise  (Jeff Janes <jeff.janes@gmail.com>)
Responses Re: High SYS CPU - need advise  (Vlad <marchenko@gmail.com>)
List pgsql-general
On Thu, Nov 15, 2012 at 6:07 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> On Thu, Nov 15, 2012 at 2:44 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
>
>>>> select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
>>>> select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
>>>> select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
>>>> select(0, NULL, NULL, NULL, {0, 2000})  = 0 (Timeout)
>>>> select(0, NULL, NULL, NULL, {0, 3000})  = 0 (Timeout)
>>>> select(0, NULL, NULL, NULL, {0, 4000})  = 0 (Timeout)
>>>> select(0, NULL, NULL, NULL, {0, 6000})  = 0 (Timeout)
>>>> select(0, NULL, NULL, NULL, {0, 7000})  = 0 (Timeout)
>>>> select(0, NULL, NULL, NULL, {0, 8000})  = 0 (Timeout)
>>>> select(0, NULL, NULL, NULL, {0, 9000})  = 0 (Timeout)
>
> This is not entirely inconsistent with the spinlock.  Note that 1000
> is repeated 3 times, and 5000 is missing.
>
> This might just be a misleading random sample we got here.  I've seen
> similar close spacing in some simulations I've run.
>
> It is not clear to me why we use a resolution of 1 msec here.  If the
> OS's implementation of select() eventually rounds to the nearest msec,
> that is its business.  But why do we have to lose intermediate
> precision due to its decision?
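
For anyone following along, here is a rough sketch of why whole-msec
resolution would produce the repeated 1000 usec sleeps seen in the
strace output. It assumes the spinlock backoff starts at 1 msec, grows
each sleep by a random fraction of up to 1x the current delay, and wraps
back to the minimum once it passes 1000 msec. The constants and the
function name backoff_delays are illustrative, not taken from this
thread or copied from the PostgreSQL source.

```python
import random

MIN_DELAY_MSEC = 1      # assumed lower bound of the sleep, in msec
MAX_DELAY_MSEC = 1000   # assumed upper bound before wrapping back


def backoff_delays(n, rng, whole_msec=True):
    """Return n successive sleep lengths in microseconds.

    Each sleep grows by a random fraction (0x up to 1x) of the current
    delay. With whole_msec=True the growth is rounded to an integer
    number of msec, so small delays can repeat; with whole_msec=False
    the fractional part is carried forward.
    """
    cur = float(MIN_DELAY_MSEC)
    out = []
    for _ in range(n):
        out.append(int(cur * 1000))
        growth = cur * rng.random()
        cur += int(growth + 0.5) if whole_msec else growth
        if cur > MAX_DELAY_MSEC:
            cur = float(MIN_DELAY_MSEC)  # wrap back to the minimum
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    print("whole msec:", backoff_delays(10, rng, whole_msec=True))
    print("fractional:", backoff_delays(10, rng, whole_msec=False))
```

With whole_msec=True, a 1 msec delay only grows when the random
fraction is at least 0.5, so a run of identical 1000 usec sleeps at the
start is expected under these assumptions. With whole_msec=False every
sleep carries its fractional growth forward, which is the intermediate
precision the question above is about.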

Yeah -- you're right, this is definitely spinlock issue.  Next steps:

*) in read-mostly workloads, we have a couple of known frequent
offenders, in particular the 'BufFreelistLock'.  One way we can
influence that guy is to significantly lower or raise shared_buffers.
So this is one thing to try.

*) failing that, the LWLOCK_STATS macro can be compiled in to give us
some information about the particular lock(s) we're contending on.
Hopefully it's an lwlock -- this will make diagnosing the problem easier.

*) if we're not blocking on an lwlock, it's possibly a buffer pin
related issue.  I've seen this before, for example on an index scan
that is dependent on a seq scan.  This long thread:

"http://postgresql.1045698.n5.nabble.com/9-2beta1-parallel-queries-ReleasePredicateLocks-CheckForSerializableConflictIn-in-the-oprofile-td5709812i100.html"
has a lot of information about that case and deserves a review.

*) we can consider experimenting with futex
(http://archives.postgresql.org/pgsql-hackers/2012-06/msg01588.php)
to see if things improve.  This is dangerous, and could crash your
server/eat your data, so fair warning.
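
If the LWLOCK_STATS route is taken, the per-backend counters need to be
summed before they say much. The sketch below assumes the stats land in
the server log as one line per lock per backend, shaped like
"PID <pid> lwlock <id>: shacq <n> exacq <n> blk <n>". That line format
is an assumption and may differ between PostgreSQL versions, and the
name rank_lwlocks is hypothetical. Mapping a numeric lock id back to a
name such as BufFreelistLock has to be done against the lwlock header of
the exact source version in use.

```python
import re
from collections import defaultdict

# Assumed line shape; adjust the pattern to whatever your build prints.
LINE_RE = re.compile(
    r"PID (?P<pid>\d+) lwlock (?P<lock>\d+): "
    r"shacq (?P<shacq>\d+) exacq (?P<exacq>\d+) blk (?P<blk>\d+)"
)


def rank_lwlocks(lines):
    """Sum counters per lock id across backends, most-blocked first."""
    totals = defaultdict(lambda: {"shacq": 0, "exacq": 0, "blk": 0})
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue  # skip unrelated log lines
        t = totals[int(m.group("lock"))]
        for key in ("shacq", "exacq", "blk"):
            t[key] += int(m.group(key))
    return sorted(totals.items(), key=lambda kv: kv[1]["blk"], reverse=True)


if __name__ == "__main__":
    sample = [  # made-up numbers, for illustration only
        "PID 4001 lwlock 0: shacq 0 exacq 120 blk 45",
        "PID 4002 lwlock 0: shacq 0 exacq 98 blk 30",
        "PID 4001 lwlock 4: shacq 900 exacq 12 blk 3",
    ]
    for lock_id, t in rank_lwlocks(sample):
        print(lock_id, t)
```

A lock whose blk total dwarfs the others is the one to look at first.
If nothing stands out, that points away from lwlocks and toward the
buffer pin case described above.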

merlin

