Обсуждение: semop hanging - Postgres 9.4.4

Поиск

Список

Период

Сортировка

semop hanging - Postgres 9.4.4

От

Michał Nowak

Дата:

15 декабря 2015 г., 19:34:35

Hello,

Since couple of days on my Postgres cluster i've seen few times a hanged process. It looks like one of postgres workers
stuckon simple INSERT query. 

  PID USERNAME PRI NICE  SIZE   RES STATE   TIME   WCPU    CPU COMMAND
12173 postgres  20    0 1372M  345M run   166:18 24.47% 99.90% postgres: db01 db01 10.0.0.101(58713) INSERT


It utilizes 100% of one CPU and after 1-2 hours it starts to utilize memory. After another 1-2 hours OOM-Killer have to
takeaction. At the same time other postgres workers are properly processing queries.  

There's no error on syslog/dmesg. Strace:


semop(58261738, {{12, -1, 0}}, 1)       = 0
semop(58261738, {{12, -1, 0}}, 1)       = 0
semop(58261738, {{12, -1, 0}}, 1)       = 0
semop(58851580, {{0, 1, 0}}, 1)         = 0
semop(58261738, {{12, -1, 0}}, 1)       = 0
semop(58261738, {{12, -1, 0}}, 1)       = 0
semop(58261738, {{12, -1, 0}}, 1)       = 0
semop(57802972, {{1, 1, 0}}, 1)         = 0
select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
semop(58261738, {{12, -1, 0}}, 1)       = 0
semop(58261738, {{12, -1, 0}}, 1)       = 0
semop(58261738, {{12, -1, 0}}, 1)       = 0


Is this related to incorrect semaphores configuration?



Here's my current configuration:

sql01:~# ipcs -l

------ Shared Memory Limits --------
max number of segments = 4096
max seg size (kbytes) = 18014398509465599
max total shared memory (kbytes) = 18446744073642442748
min seg size (bytes) = 1

------ Semaphore Limits --------
max number of arrays = 2048
max semaphores per array = 250
max semaphores system wide = 512000
max ops per semop call = 100
semaphore max value = 32767

------ Messages Limits --------
max queues system wide = 32000
max size of message (bytes) = 8192
default max size of queue (bytes) = 16384


sql01:~# ipcs -u

------ Shared Memory Status --------
segments allocated 1
pages allocated 1
pages resident  1
pages swapped   0
Swap performance: 0 attempts     0 successes

------ Semaphore Status --------
used arrays = 257
allocated semaphores = 4369

------ Messages Status --------
allocated queues = 0
used headers = 0
used space = 0 bytes


Some values from postgresql.conf:
max_connections = 4096
effective_cache_size = 12GB
work_mem = 16MB
shared_buffers = 1GB


postgres=# select version();
                                               version
------------------------------------------------------------------------------------------------------
 PostgreSQL 9.4.4 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu 4.8.2-19ubuntu1) 4.8.2, 64-bit


I would appreciate any help.

Best Regards,

Marcin

Re: semop hanging - Postgres 9.4.4

От

Shreeyansh Dba

Дата:

16 декабря 2015 г., 10:44:44

On Wed, Dec 16, 2015 at 1:04 AM, Michał Nowak <minowack@wp.pl> wrote:

Hello,

Since couple of days on my Postgres cluster i've seen few times a hanged process. It looks like one of postgres workers stuck on simple INSERT query.

PID USERNAME PRI NICE SIZE RES STATE TIME WCPU CPU COMMAND
12173 postgres 20 0 1372M 345M run 166:18 24.47% 99.90% postgres: db01 db01 10.0.0.101(58713) INSERT

It utilizes 100% of one CPU and after 1-2 hours it starts to utilize memory. After another 1-2 hours OOM-Killer have to take action. At the same time other postgres workers are properly processing queries.

There's no error on syslog/dmesg. Strace:

semop(58261738, {{12, -1, 0}}, 1) = 0
semop(58261738, {{12, -1, 0}}, 1) = 0
semop(58261738, {{12, -1, 0}}, 1) = 0
semop(58851580, {{0, 1, 0}}, 1) = 0
semop(58261738, {{12, -1, 0}}, 1) = 0
semop(58261738, {{12, -1, 0}}, 1) = 0
semop(58261738, {{12, -1, 0}}, 1) = 0
semop(57802972, {{1, 1, 0}}, 1) = 0
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
semop(58261738, {{12, -1, 0}}, 1) = 0
semop(58261738, {{12, -1, 0}}, 1) = 0
semop(58261738, {{12, -1, 0}}, 1) = 0

Is this related to incorrect semaphores configuration?

Here's my current configuration:

sql01:~# ipcs -l

------ Shared Memory Limits --------
max number of segments = 4096
max seg size (kbytes) = 18014398509465599
max total shared memory (kbytes) = 18446744073642442748
min seg size (bytes) = 1

------ Semaphore Limits --------
max number of arrays = 2048
max semaphores per array = 250
max semaphores system wide = 512000
max ops per semop call = 100
semaphore max value = 32767

------ Messages Limits --------
max queues system wide = 32000
max size of message (bytes) = 8192
default max size of queue (bytes) = 16384

sql01:~# ipcs -u

------ Shared Memory Status --------
segments allocated 1
pages allocated 1
pages resident 1
pages swapped 0
Swap performance: 0 attempts 0 successes

------ Semaphore Status --------
used arrays = 257
allocated semaphores = 4369

------ Messages Status --------
allocated queues = 0
used headers = 0
used space = 0 bytes

Some values from postgresql.conf:
max_connections = 4096
effective_cache_size = 12GB
work_mem = 16MB
shared_buffers = 1GB

postgres=# select version();
version
------------------------------------------------------------------------------------------------------
PostgreSQL 9.4.4 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu 4.8.2-19ubuntu1) 4.8.2, 64-bit

I would appreciate any help.

Best Regards,

Marcin

--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

Hi Miachal,

May I know what is the RAM of server and how much shmmax is configured ?

You can check if any Zombi or defunct process are running on the server.

I want to know if any maintenance activity performed same time?

Regards,

Sagar Jadhav
Database Administrator

www.shreeyansh.com

Re: semop hanging - Postgres 9.4.4

От

Michał Nowak

Дата:

16 декабря 2015 г., 11:31:47

Hello,

> There is no zombi nor defunct process. Now i have 3 stucked processes.

When i try to pg_terminate_backend nothing happens:

strace:

semop(57802972, {{9, -1, 0}}, 1)        = 0
semop(57802972, {{9, -1, 0}}, 1)        = 0
select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
semop(57802972, {{9, -1, 0}}, 1)        = 0
--- SIGTERM {si_signo=SIGTERM, si_code=SI_USER, si_pid=4431, si_uid=105} ---
rt_sigreturn()                          = 139718030872480
select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
semop(57802972, {{9, -1, 0}}, 1)        = 0
semop(57802972, {{9, -1, 0}}, 1)        = 0
select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)


Best Regards,
Michal

Re: semop hanging - Postgres 9.4.4

От

Kevin Grittner

Дата:

16 декабря 2015 г., 13:41:44

On Tue, Dec 15, 2015 at 1:34 PM, Michał Nowak <minowack@wp.pl> wrote:

> postgres=# select version();
>                                                version
> ------------------------------------------------------------------------------------------------------
>  PostgreSQL 9.4.4 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu 4.8.2-19ubuntu1) 4.8.2, 64-bit

There is a bug fix in 9.4.5 which can cause a deadlock on
lightweight locking.

http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=bab959906911c97437f410a03b0346e6dd28d528

http://www.postgresql.org/support/versioning/

From a subsequent post:

> perf top:
>
>   17.04%  postgres                 [.] _bt_moveright
>   13.39%  postgres                 [.] LWLockAcquire
 >   8.82%  postgres                 [.] LWLockRelease
 >   8.11%  postgres                 [.] _bt_checkpage

The perf top output suggests the possibility of a corrupted index,
causing an endless loop.

Can you get a few stack traces from "stuck" backends?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: semop hanging - Postgres 9.4.4

От

Michał Nowak

Дата:

17 декабря 2015 г., 17:21:33

Hello,

Thanks for all replies.


> The perf top output suggests the possibility of a corrupted index,
> causing an endless loop.


Yes! Rebuilding indexes solves problem for this table. :)

But after 20h problem is back on another table. First table has about 2m records, second has only 200k.



> There is a bug fix in 9.4.5 which can cause a deadlock on
> lightweight locking.


I've just upgraded to 9.4.5 and hope that helps. I'll keep you posted.


Best Regards,
Michal

Re: semop hanging - Postgres 9.4.4

От

Michał Nowak

Дата:

21 декабря 2015 г., 13:18:25

Hello,

> I've just upgraded to 9.4.5 and hope that helps. I'll keep you posted.

After upgrade everything works fine for almost 4 days.

Thank you all for help.


Best Regards,
Michal

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: semop hanging - Postgres 9.4.4

semop hanging - Postgres 9.4.4

Re: semop hanging - Postgres 9.4.4

Re: semop hanging - Postgres 9.4.4

Re: semop hanging - Postgres 9.4.4

Re: semop hanging - Postgres 9.4.4

Re: semop hanging - Postgres 9.4.4