long wait times in ProcessCatchupEvent()

From: bock@openit.de (Julian v. Bock)
Subject: long wait times in ProcessCatchupEvent()
Msg-id: lptyhw1wml.fsf@warpcore.i.openit.de
Replies: Re: long wait times in ProcessCatchupEvent()  (Tom Lane)
Re: long wait times in ProcessCatchupEvent()  (Craig James)
List: pgsql-performance

Hi

On our servers, under a certain workload, all backend processes
regularly (several times per minute) receive a SIGUSR1 and then spend
several seconds in ProcessCatchupEvent(). With 100-200 connections
(most of them idle) this causes the system load to skyrocket. I am not
really familiar with the code, but my wild guess is that the processes
spend most of their time waiting for spinlocks.

We have reduced the number of connections as much as possible for now,
but the catchup processing still accounts for roughly 50% of the total
CPU time. Has anyone experienced a similar problem?
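
For anyone who wants to put a number on the CPU share described above, here is a minimal sketch that samples per-process CPU ticks from /proc. It assumes a Linux host and that the backends appear under the command name `postgres`; the helper names are hypothetical and not part of PostgreSQL. It measures total CPU used by the backends, not the time spent in ProcessCatchupEvent() specifically.

```python
import os


def command_name(stat_line):
    """Extract the command name (field 2) from a /proc/<pid>/stat line."""
    return stat_line[stat_line.index("(") + 1 : stat_line.rindex(")")]


def cpu_ticks_from_stat(stat_line):
    """Return utime + stime (in clock ticks) from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may contain spaces,
    # so split on the last ')' instead of splitting the whole line.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state); utime and stime are fields 14 and 15.
    return int(rest[11]) + int(rest[12])


def sample_backends(name="postgres"):
    """Map pid -> CPU ticks for every process whose command name matches.

    The default name "postgres" is an assumption about the installation.
    """
    ticks = {}
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/stat") as f:
                line = f.read()
        except OSError:
            continue  # the process exited between listdir() and open()
        if command_name(line) == name:
            ticks[int(entry)] = cpu_ticks_from_stat(line)
    return ticks


def cpu_share(before, after, interval_s, n_cpus, hz):
    """Fraction of total CPU capacity used by processes seen in both samples."""
    used = sum(after[pid] - before[pid] for pid in after if pid in before)
    return used / (interval_s * n_cpus * hz)
```

To use it, call `sample_backends()` twice with a known pause in between and pass both results to `cpu_share()`, together with `os.cpu_count()` and `os.sysconf("SC_CLK_TCK")` for the tick rate.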

I can reproduce the issue on a test system with production data, but it
is not easy to pinpoint what exactly causes the problem. The queries
are basically tsearch2 full-text searches over moderately big tables
(~35GB). They are performed by functions which aggregate data from
partitions in temporary tables, cache some data, and perform
calculations before returning the result to the user.

The PostgreSQL version is 8.3.12; the test server has 8 amd64 cores
and 16GB of RAM. I experimented with shared_buffers between 1GB and
4GB, but it doesn't make much of a difference. Disk I/O doesn't seem to
be an issue here.

Regards,
Julian v. Bock

--
Julian v. Bock               Projektleitung Software-Entwicklung
OpenIT GmbH                  Tel +49 211 239 577-0
In der Steele 33a-41         Fax +49 211 239 577-10
D-40599 Düsseldorf           http://www.openit.de
________________________________________________________________
HRB 38815 Amtsgericht Düsseldorf             USt-Id DE 812951861
Geschäftsführer: Oliver Haakert, Maurice Kemmann

