Discussion: Odp: Re: semop hanging - Postgres 9.4.4
Hi Sagar,
> May I know what is the RAM of server and how much shmmax is configured ?
> You can check if any Zombi or defunct process are running on the server.
> I want to know if any maintenance activity performed same time?
Thanks for your reply.
The server has 16 GB of RAM. The kernel shmmax has a strange value; I think it's the default on Ubuntu, as I haven't changed it.
kernel.shmall = 18446744073692774399
kernel.shmmax = 18446744073692774399
kernel.shmmni = 4096
I changed kernel.sem from the defaults to: 250 512000 100 2048
There are no zombie or defunct processes. Now I have 3 stuck processes.
First:
psql01:~# strace -ffp 14135
Process 14135 attached
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
semop(58261738, {{12, -1, 0}}, 1) = 0
semop(58261738, {{12, -1, 0}}, 1) = 0
semop(58261738, {{12, -1, 0}}, 1) = 0
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
semop(58261738, {{12, -1, 0}}, 1) = 0
semop(58261738, {{12, -1, 0}}, 1) = 0
semop(58261738, {{12, -1, 0}}, 1) = 0
semop(58261738, {{12, -1, 0}}, 1) = 0
semop(58261738, {{12, -1, 0}}, 1) = 0
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
Second:
psql01:~# strace -ffp 12712
Process 12712 attached
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
semop(57737434, {{6, 1, 0}}, 1) = 0
semop(57934048, {{2, -1, 0}}, 1) = 0
semop(57802972, {{7, 1, 0}}, 1) = 0
semop(57934048, {{2, -1, 0}}, 1) = 0
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
semop(57934048, {{2, -1, 0}}, 1) = 0
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
semop(57934048, {{2, -1, 0}}, 1) = 0
semop(57934048, {{2, -1, 0}}, 1) = 0
Third:
psql02:~# strace -ffp 18283
Process 18283 attached
semop(58523890, {{11, 1, 0}}, 1) = 0
semop(57802972, {{9, -1, 0}}, 1) = 0
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
semop(57802972, {{9, -1, 0}}, 1) = 0
semop(58818811, {{13, 1, 0}}, 1) = 0
semop(57802972, {{9, -1, 0}}, 1) = 0
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
semop(57802972, {{9, -1, 0}}, 1) = 0
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
perf top:
17.04% postgres [.] _bt_moveright
13.39% postgres [.] LWLockAcquire
8.82% postgres [.] LWLockRelease
8.11% postgres [.] _bt_checkpage
6.10% postgres [.] hash_search_with_hash_value
4.35% libc-2.19.so [.] __strcoll_l
3.16% postgres [.] 0x00000000002989ec
2.06% postgres [.] s_lock
1.99% postgres [.] _bt_compare
1.49% postgres [.] hash_any
1.41% postgres [.] 0x0000000000298a10
1.14% postgres [.] varstr_cmp
1.07% [kernel] [k] _raw_spin_unlock_irqrestore
1.00% libc-2.19.so [.] strlen
0.99% postgres [.] 0x0000000000298fdb
0.96% postgres [.] 0x00000000002989fc
0.80% postgres [.] LockBuffer
0.73% libc-2.19.so [.] 0x000000000009478e
0.60% postgres [.] ReadBufferExtended
0.49% postgres [.] bttextcmp
0.47% libc-2.19.so [.] 0x0000000000094787
0.44% [kernel] [k] finish_task_switch
0.40% postgres [.] _bt_relandgetbuf
0.40% postgres [.] ResourceOwnerForgetBuffer
0.40% libc-2.19.so [.] 0x0000000000094782
0.39% postgres [.] FunctionCall2Coll
0.38% postgres [.] ReleaseAndReadBuffer
0.31% postgres [.] pg_detoast_datum_packed
Best regards,
Michal
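The strace output above is easier to compare across backends once the semop calls are tallied. The following is a minimal sketch, not part of the original thread: the function name `tally_semops` and the sample text are illustrative, and the regular expression assumes the exact single-operation format shown above (one `{semnum, op, flags}` entry with numeric flags). A negative operation decrements the semaphore (the caller waits), and a positive one increments it (the caller signals another process). A backend that mostly decrements one semaphore between 1 ms `select` sleeps is likely waiting on locks rather than hung inside the kernel, but only a stack trace can confirm that.

```python
import re
from collections import Counter

# Matches lines such as: semop(58261738, {{12, -1, 0}}, 1) = 0
# Assumption: one operation per call and numeric flags, as in the trace above.
SEMOP_RE = re.compile(r"semop\((\d+), \{\{(\d+), (-?\d+), \d+\}\}, \d+\)")


def tally_semops(strace_text):
    """Count semop calls per (semaphore set id, semaphore number, operation)."""
    counts = Counter()
    for match in SEMOP_RE.finditer(strace_text):
        semid, semnum, op = (int(group) for group in match.groups())
        counts[(semid, semnum, op)] += 1
    return counts


# A few lines copied from the second process (PID 12712) in the post.
sample = """\
select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
semop(57737434, {{6, 1, 0}}, 1) = 0
semop(57934048, {{2, -1, 0}}, 1) = 0
semop(57802972, {{7, 1, 0}}, 1) = 0
semop(57934048, {{2, -1, 0}}, 1) = 0
"""

for (semid, semnum, op), n in tally_semops(sample).most_common():
    action = "decrement (wait)" if op < 0 else "increment (signal)"
    print(f"set {semid} sem {semnum}: {n} x {action} ({op:+d})")
```

Running it on a full capture (for example, the output of `strace -p <pid> -o trace.txt`) shows at a glance which semaphore each backend sleeps on and which other semaphores it signals.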
Hi Michal,
Sometimes terminating a query via the normal methods, pg_cancel_backend() and pg_terminate_backend(), fails and additional steps need to be taken. First locate the process:
ps -efly -C postgres | grep <stuck pid>
An OS-level kill is not recommended, but in some scenarios there is no alternative and the process has to be killed from the OS:
kill -11 <stuck pid>
You can choose shmmax based on the size of the database. We recommend setting shmmax to 6 GB and increasing shared_buffers to 2 GB.
Best regards,
Sagar(DBA)
Shreeyansh Technologies.
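The kernel values quoted in this thread can be sanity-checked with a little arithmetic. The sketch below is an illustration added for clarity, not something posted by either author. It relies on two facts outside the thread: recent Linux kernels define the default SHMMAX and SHMALL as ULONG_MAX - 2^24, and the four kernel.sem fields are, in order, SEMMSL, SEMMNS, SEMOPM and SEMMNI. The helper name `parse_kernel_sem` is hypothetical.

```python
# Value reported for kernel.shmmax and kernel.shmall in the thread.
REPORTED_SHMMAX = 18446744073692774399

# Assumption: 64-bit unsigned long, and a kernel whose default is
# ULONG_MAX - (1 << 24).
ULONG_MAX = 2**64 - 1
KERNEL_DEFAULT = ULONG_MAX - (1 << 24)

# Field order of the kernel.sem sysctl.
SEM_FIELDS = ("SEMMSL", "SEMMNS", "SEMOPM", "SEMMNI")


def parse_kernel_sem(value):
    """Map the four space-separated kernel.sem numbers to their names."""
    numbers = [int(part) for part in value.split()]
    if len(numbers) != len(SEM_FIELDS):
        raise ValueError("kernel.sem must contain exactly four numbers")
    return dict(zip(SEM_FIELDS, numbers))


def gib_to_bytes(gib):
    """Convert GiB to bytes, the unit kernel.shmmax is expressed in."""
    return gib * 1024**3


print("shmmax is the kernel default:", REPORTED_SHMMAX == KERNEL_DEFAULT)
print("kernel.sem:", parse_kernel_sem("250 512000 100 2048"))
print("6 GB shmmax in bytes:", gib_to_bytes(6))
```

If the comparison prints True, the "strange" value is simply the kernel default, effectively "no limit", rather than a misconfiguration. The kernel.sem breakdown reads as at most 250 semaphores per set, 512000 semaphores system-wide, 100 operations per semop call, and 2048 semaphore sets.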
On Wed, Dec 16, 2015 at 4:44 PM, Michał Nowak <minowack@wp.pl> wrote:
> [...]
--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin
Hi Michal,
You could also try pg_ctl, as shown below:
pg_ctl kill TERM <stuck pid>
Regards,
Sagar(DBA)
Shreeyansh Technologies.
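The options mentioned in this thread differ mainly in which signal reaches the backend. As a hedged illustration (not from the original posts): pg_cancel_backend() sends SIGINT, pg_terminate_backend() and `pg_ctl kill TERM` send SIGTERM, and `kill -11` sends signal number 11. The sketch below uses Python's standard `signal` module to translate numbers into names; the `METHODS` table and the `signal_name` helper are made up for this example, and the numeric values assume Linux.

```python
import signal

# Signal delivered by each method discussed in the thread.
# "kill -11" is given numerically in the post, so it is resolved by number.
METHODS = {
    "pg_cancel_backend()": signal.SIGINT,
    "pg_terminate_backend()": signal.SIGTERM,
    "pg_ctl kill TERM <pid>": signal.SIGTERM,
    "kill -11 <pid>": signal.Signals(11),
}


def signal_name(sig):
    """Return the symbolic name of a signal given as a number or enum member."""
    return signal.Signals(sig).name


for method, sig in METHODS.items():
    print(f"{method:26} -> {signal_name(sig)} ({int(sig)})")
```

On Linux, signal 11 resolves to SIGSEGV. PostgreSQL treats a backend killed that way as a crash and resets all other sessions, which is why SIGTERM-based methods are normally tried first.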
On Wed, Dec 16, 2015 at 5:20 PM, Shreeyansh Dba <shreeyansh2014@gmail.com> wrote:
> [...]
Michal Nowak wrote:
> There is no zombi nor defunct process. Now i have 3 stucked processes.
Can you take a stack trace of the stuck processes?
https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD
Yours,
Laurenz Albe