Re: CPU spikes and transactions

From: Jeff Janes
Subject: Re: CPU spikes and transactions
Msg-id: CAMkU=1xdDOEfGCAOX09DBh3n5r_=j+S9mpqrJnV02x3RZkUdBA@mail.gmail.com
In reply to: Re: CPU spikes and transactions  (Dave Owens <dave@teamunify.com>)
List: pgsql-performance
On Tue, May 13, 2014 at 4:04 PM, Dave Owens <dave@teamunify.com> wrote:
Hi,

Apologies for resurrecting this old thread, but it seems like this is better than starting a new conversation.

We are now running 9.1.13 and have doubled the CPU and memory.  So 2x 16 Opteron 6276 (32 cores total), and 64GB memory.  shared_buffers set to 20G, effective_cache_size set to 40GB.

We were able to record perf data during the latest incident of high CPU utilization. perf report is below:

Samples: 31M of event 'cycles', Event count (approx.): 16289978380877 
 44.74%       postmaster  [kernel.kallsyms]             [k] _spin_lock_irqsave                                     
 15.03%       postmaster  postgres                      [.] 0x00000000002ea937                                     
  3.14%       postmaster  postgres                      [.] s_lock                                                 
  2.30%       postmaster  [kernel.kallsyms]             [k] compaction_alloc                                       
  2.21%       postmaster  postgres                      [.] HeapTupleSatisfiesMVCC                                 


compaction_alloc points to the "transparent huge pages" kernel problem, while HeapTupleSatisfiesMVCC points to the problem of each backend taking the ProcArrayLock for every not-yet-committed tuple it encounters.  I don't know which of those leads to the _spin_lock_irqsave.  Transparent huge pages seems the more likely cause, but perhaps both contribute.

If it is the former, you can find other messages on this list about disabling it.  If it is the latter, your best bet is to commit your bulk inserts as soon as possible (this might be improved for 9.5, if we can figure out how to test the alternatives). Please let us know what works.

If lowering shared_buffers works, I wonder if disabling transparent huge page compaction might let you bring shared_buffers back up again.
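
To see how transparent huge pages and their compaction (defrag) are currently set, something like the following could be used. This is only a rough sketch: it assumes a Linux kernel that exposes the settings under /sys/kernel/mm/transparent_hugepage, or under /sys/kernel/mm/redhat_transparent_hugepage as RHEL/CentOS 6 kernels do. The function names active_setting and thp_status are made up for this illustration.

```python
import re
from pathlib import Path

# Assumed sysfs locations: the mainline directory, then the one
# used by RHEL/CentOS 6 kernels.
THP_DIRS = (
    Path("/sys/kernel/mm/transparent_hugepage"),
    Path("/sys/kernel/mm/redhat_transparent_hugepage"),
)


def active_setting(text):
    """Return the bracketed (active) value, e.g. 'always' from '[always] madvise never'."""
    match = re.search(r"\[([^\]]+)\]", text)
    return match.group(1) if match else None


def thp_status():
    """Read the 'enabled' and 'defrag' settings from the first THP directory found."""
    for base in THP_DIRS:
        if base.is_dir():
            status = {}
            for knob in ("enabled", "defrag"):
                path = base / knob
                status[knob] = active_setting(path.read_text()) if path.exists() else None
            return status
    return None  # no THP directory found (older kernel, or not Linux)


if __name__ == "__main__":
    print(thp_status())
```

If defrag reports anything other than "never", writing "never" to that file as root turns compaction off until the next reboot; the same applies to the enabled file if you want transparent huge pages off entirely.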


Cheers,

Jeff
