Discussion: Weird CPU utilization patterns with Postgres

Weird CPU utilization patterns with Postgres

From
István
Date:
Hi,

We are having a really interesting problem with our Postgres 9.3 instance in our infrastructure.

A few days ago our box started to show huge CPU spikes while the IO wait on the box remained negligible. After a while I installed perf and started to monitor the Postgres master process, and here is what I found:

Samples: 372K of event 'cycles', Event count (approx.): 110095222173
 93.65%  libc-2.12.so  [.] __strcoll_l
  0.97%  libc-2.12.so  [.] memcpy
  0.90%  postgres      [.] slot_getattr
  0.88%  postgres      [.] nocachegetattr
  0.64%  postgres      [.] varstr_cmp
  0.52%  libc-2.12.so  [.] __strcmp_sse42
  0.43%  postgres      [.] hash_any
  0.32%  postgres      [.] pg_detoast_datum_packed
  0.31%  libc-2.12.so  [.] __strlen_sse2
  0.22%  postgres      [.] bttextcmp
  0.18%  postgres      [.] ExecStoreTuple
  0.14%  postgres      [.] MemoryContextReset
  0.09%  postgres      [.] pgstat_end_function_usage
  0.08%  libc-2.12.so  [.] strcoll
  0.08%  postgres      [.] heap_hot_search_buffer
  0.07%  postgres      [.] lc_collate_is_c
  0.06%  [kernel]      [k] sys_semtimedop
  0.06%  postgres      [.] heap_page_prune_opt
  0.05%  postgres      [.] slot_getsomeattrs
  0.05%  postgres      [.] heap_fill_tuple
  0.04%  postgres      [.] hash_search
  0.03%  postgres      [.] GetMemoryChunkSpace
  0.03%  postgres      [.] heap_form_minimal_tuple
  0.03%  [kernel]      [k] update_queue
  0.02%  postgres      [.] ReadBufferExtended
  0.02%  postgres      [.] memcpy@plt

It seems that the box is spending almost all of its CPU time in __strcoll_l. Query performance is down: previously the box was able to sustain ~20 clients, and right now it can hardly keep up with 5.
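To make the profile above easier to read, the rows can be summed per shared object. The following is only a minimal Python sketch: the function name `overhead_by_object` is made up for illustration, the sample is a handful of rows copied from the perf output in this message, and the regular expression assumes the plain `percent  object  [type] symbol` row layout shown there.

```python
import re
from collections import defaultdict

# Matches perf report rows such as "  0.97%  libc-2.12.so  [.] memcpy".
# Assumes the simple four-column layout seen in this thread.
ROW = re.compile(r"^\s*(\d+(?:\.\d+)?)%\s+(\S+)\s+\[(.)\]\s+(\S+)\s*$")


def overhead_by_object(report_text):
    """Sum the overhead percentages of a perf report per shared object."""
    totals = defaultdict(float)
    for line in report_text.splitlines():
        match = ROW.match(line)
        if match:  # header lines and blank lines are skipped
            pct, shared_object, _kind, _symbol = match.groups()
            totals[shared_object] += float(pct)
    return dict(totals)


# A few rows copied from the profile in this message.
SAMPLE = """\
Samples: 372K of event 'cycles', Event count (approx.): 110095222173
 93.65%  libc-2.12.so  [.] __strcoll_l
  0.97%  libc-2.12.so  [.] memcpy
  0.90%  postgres      [.] slot_getattr
  0.64%  postgres      [.] varstr_cmp
  0.06%  [kernel]      [k] sys_semtimedop
"""

if __name__ == "__main__":
    totals = overhead_by_object(SAMPLE)
    for shared_object, pct in sorted(totals.items(), key=lambda kv: -kv[1]):
        print(f"{pct:6.2f}%  {shared_object}")
```

Run against the full report, this kind of summary shows at a glance that nearly all sampled cycles fall inside libc rather than in Postgres code itself.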

I am wondering what the root cause might be here.

Let me know if anybody has seen this before.

Regards,
Istvan





--
the sun shines for all


Re: Weird CPU utilization patterns with Postgres

From
Peter Geoghegan
Date:
On Fri, Dec 5, 2014 at 5:14 PM, István <leccine@gmail.com> wrote:
> I am wondering why the root cause might be here.

My guess would be that an important text-based sort operation began to
go to disk. The external sort code (tapesort) is known to do far more
comparisons than quicksort. With text sorts, you tend to see tapesort
very CPU bound, where that might not be the case with integer sorts.

I'm currently trying to fix this across the board [1], but my first
suggestion is to try enabling log_temp_files to see if external sorts
can be correlated with these stalls.
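As a rough illustration of that suggestion: with `log_temp_files = 0` in postgresql.conf (the value is a size threshold in kilobytes, 0 logs every temporary file, -1 disables the logging), the server writes a "temporary file: path ..., size ..." line for each temporary file it removes. The Python sketch below pulls those lines out of a log so their timestamps can be compared against the CPU spikes. The function name `parse_temp_files` and the sample log lines are hypothetical, and the text in front of `LOG:` depends on the local `log_line_prefix` setting.

```python
import re

# postgresql.conf (assumed setting for this sketch):
#   log_temp_files = 0    # log every temporary file; -1 turns this off
#
# Captures whatever log_line_prefix produced, then the path and byte size.
TEMP_FILE = re.compile(
    r'^(?P<prefix>.*?)LOG:\s+temporary file: '
    r'path "(?P<path>[^"]+)", size (?P<size>\d+)'
)


def parse_temp_files(lines):
    """Return (prefix, path, size_in_bytes) for each temporary-file log line."""
    events = []
    for line in lines:
        match = TEMP_FILE.search(line)
        if match:
            events.append(
                (
                    match.group("prefix").strip(),
                    match.group("path"),
                    int(match.group("size")),
                )
            )
    return events


# Hypothetical log excerpt; timestamps, paths and sizes are invented.
SAMPLE_LOG = [
    '2014-12-09 17:46:01 UTC LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp4242.0", size 104857600',
    '2014-12-09 17:46:30 UTC LOG:  checkpoint starting: time',
    '2014-12-09 17:47:12 UTC LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp4243.0", size 52428800',
]

if __name__ == "__main__":
    for prefix, path, size in parse_temp_files(SAMPLE_LOG):
        print(f"{prefix}  {size / (1024 * 1024):8.1f} MB  {path}")
```

If large temporary files show up at the same times as the stalls, that would be consistent with sorts spilling to disk; if none appear, the cause is probably elsewhere.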

[1] https://commitfest.postgresql.org/action/patch_view?id=1462
--
Regards,
Peter Geoghegan


Re: Weird CPU utilization patterns with Postgres

From
Peter Geoghegan
Date:
On Tue, Dec 9, 2014 at 5:46 PM, Peter Geoghegan
<peter.geoghegan86@gmail.com> wrote:
> I'm currently trying to fix this across the board [1], but my first
> suggestion is to try enabling log_temp_files to see if external sorts
> can be correlated with these stalls.

See also: http://www.postgresql.org/message-id/CAM3SWZTijoBPpqFF7mN3021Vvtu+5Fd1ymABQ8tLoV4zhfAqxA@mail.gmail.com

--
Regards,
Peter Geoghegan