Re: High SYS CPU - need advise

From: Vlad
Subject: Re: High SYS CPU - need advise
Date:
Message-ID: 50A66150.2080805@gmail.com
In reply to: Re: High SYS CPU - need advise  (Merlin Moncure <mmoncure@gmail.com>)
Responses: Re: High SYS CPU - need advise  (Merlin Moncure <mmoncure@gmail.com>)
List: pgsql-general
Merlin,

> Yeah -- you're right, this is definitely spinlock issue.  Next steps:
>
> *) in mostly read workloads, we have a couple of known frequent
> offenders.  In particular the 'BufFreelistLock'.  One way we can
> influence that guy is to try and significantly lower/raise shared
> buffers.  So this is one thing to try.

The server has 32 GB of RAM, with 3.2 GB dedicated to shared buffers. I
increased that to 13.2 GB and the stall still happened. (I can control
the number of queries hitting PostgreSQL by shutting down our own
memcached-based app cache, so to test for the stall I temporarily shut
down a few memcached servers.)

> *) failing that, LWLOCK_STATS macro can be compiled in to give us some
> information about the particular lock(s) we're binding on.  Hopefully
> it's a lwlock -- this will make diagnosing the problem easier.

I've enabled that macro and am seeing lwlock messages flying by in the
log (see below), even when no high-sys-cpu stall is being observed.
Should I be looking for something in particular?


PID 17293 lwlock 5906: shacq 1 exacq 0 blk 0
PID 17293 lwlock 5932: shacq 1 exacq 0 blk 0
PID 17293 lwlock 5934: shacq 1 exacq 0 blk 0
PID 17293 lwlock 10854: shacq 1 exacq 0 blk 0
PID 17293 lwlock 10856: shacq 4 exacq 0 blk 0
PID 17293 lwlock 10858: shacq 15 exacq 0 blk 0
PID 17293 lwlock 16442: shacq 4 exacq 0 blk 0
PID 17293 lwlock 16596: shacq 1 exacq 0 blk 0
PID 17293 lwlock 20458: shacq 3 exacq 0 blk 0
PID 17293 lwlock 20460: shacq 10 exacq 0 blk 0
PID 17293 lwlock 20464: shacq 2 exacq 0 blk 0
PID 17293 lwlock 20466: shacq 1 exacq 0 blk 0
PID 17293 lwlock 20480: shacq 2 exacq 0 blk 0
PID 17293 lwlock 20482: shacq 1 exacq 0 blk 0
PID 17293 lwlock 20484: shacq 1 exacq 0 blk 0
PID 17293 lwlock 20508: shacq 12 exacq 0 blk 0
PID 17293 lwlock 20510: shacq 6 exacq 0 blk 0
PID 17293 lwlock 20938: shacq 11 exacq 0 blk 0
PID 17293 lwlock 20940: shacq 22 exacq 0 blk 0
PID 17293 lwlock 20942: shacq 28 exacq 0 blk 0
PID 17293 lwlock 20944: shacq 56 exacq 0 blk 0
PID 17293 lwlock 20946: shacq 1 exacq 0 blk 0
PID 17293 lwlock 20948: shacq 2 exacq 0 blk 0
PID 17293 lwlock 20950: shacq 11 exacq 0 blk 0
PID 17293 lwlock 20952: shacq 3 exacq 0 blk 0
PID 17293 lwlock 20954: shacq 7 exacq 0 blk 0
PID 17293 lwlock 20956: shacq 2 exacq 0 blk 0
PID 17293 lwlock 20958: shacq 1 exacq 0 blk 0
PID 17293 lwlock 20960: shacq 1 exacq 0 blk 0
PID 17293 lwlock 20962: shacq 2 exacq 0 blk 0
PID 17293 lwlock 20964: shacq 2 exacq 0 blk 0
PID 17293 lwlock 46624: shacq 24 exacq 0 blk 0
PID 17293 lwlock 68126: shacq 6 exacq 0 blk 0
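
One way to sift through output like the above is to total the counters per lock id across all backends and look at blk, which, as far as I understand LWLOCK_STATS, counts how often an acquire had to wait. In the sample above blk is 0 on every line, so nothing blocked in that backend. A minimal Python sketch, assuming the log line format shown above; the function names are made up for illustration:

```python
import re
from collections import defaultdict

# Matches lines such as: "PID 17293 lwlock 20944: shacq 56 exacq 0 blk 0"
LINE_RE = re.compile(
    r"PID (\d+) lwlock (\d+): shacq (\d+) exacq (\d+) blk (\d+)"
)


def summarize(lines):
    """Sum shacq/exacq/blk per lwlock id across all PIDs."""
    totals = defaultdict(lambda: [0, 0, 0])
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue  # skip log lines that are not LWLOCK_STATS output
        lock_id = int(m.group(2))
        # groups 3..5 are shacq, exacq, blk
        for i, value in enumerate(m.groups()[2:]):
            totals[lock_id][i] += int(value)
    return dict(totals)


def most_blocked(totals, top=10):
    """Return (lock_id, shacq, exacq, blk) rows with blk > 0, highest blk first."""
    rows = [
        (lock_id, *counts)
        for lock_id, counts in totals.items()
        if counts[2] > 0
    ]
    rows.sort(key=lambda row: row[3], reverse=True)
    return rows[:top]
```

Feeding it the server log, for example most_blocked(summarize(open(path))) with path pointing at wherever the log is written, would list the lock ids worth a closer look during a stall. An empty result means no lwlock acquire had to wait in the captured window.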


> *) if we're not blocking on lwlock, it's possibly a buffer pin related
> issue? I've seen this before, for example on an index scan that is
> dependent on an seq scan.  This long thread:
>
> http://postgresql.1045698.n5.nabble.com/9-2beta1-parallel-queries-ReleasePredicateLocks-CheckForSerializableConflictIn-in-the-oprofile-td5709812i100.html
> has a lot information about that case and deserves a review.
>
> *) we can consider experimenting with futex
> (http://archives.postgresql.org/pgsql-hackers/2012-06/msg01588.php)
> to see if things improve.  This is dangerous, and could crash your
> server/eat your data, so fair warning.
>
> merlin


-- vlad

