Re: Help troubleshooting SubtransControlLock problems

Поиск
Список
Период
Сортировка
От Scott Frazer
Тема Re: Help troubleshooting SubtransControlLock problems
Дата
Msg-id CA+ey=a=gw5MW8aK+UZVFKzYD3xoHZf2-qZzj+CCGAw9AgTWODA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Help troubleshooting SubtransControlLock problems  (Rene Romero Benavides <rene.romero.b@gmail.com>)
Список pgsql-general
These don't seem like normal locks. Nothing shows up in a  "SELECT relation::regclass, * FROM pg_locks WHERE NOT GRANTED;"

These processes are all active but the wait_event and wait_event_type fields indicate they are waiting on (I believe) shared memory locks.

  pid   | usesysid |   usename    |              application_name               |         state_change          | wait_e

vent_type |     wait_event      | state  

--------+----------+--------------+---------------------------------------------+-------------------------------+-------

----------+---------------------+--------

 165024 |    16392 | content_user | application | nymapp01.nym | 6192  | 2018-03-07 09:19:09.770155-06 | LWLock

Named     | SubtransControlLock | active





On Tue, Mar 6, 2018 at 11:43 PM, Rene Romero Benavides <rene.romero.b@gmail.com> wrote:
For such issues, I find this view very useful (the first one):

Examine blocking_pid's ,  and tell us what kind of operation is blocking the other processes . Also, are there many long running transactions in your server?


2018-03-06 21:24 GMT-06:00 Scott Frazer <sfrazer@couponcabin.com>:
Hi, we have a Postgres 9.6 setup using replication that has recently started seeing a lot of processes stuck in "SubtransControlLock" as a wait_event on the read-replicas. Like this, only usually about 300-800 of them:


 179706 | LWLockNamed     | SubtransControlLock

 186602 | LWLockNamed     | SubtransControlLock

 186606 | LWLockNamed     | SubtransControlLock

 180947 | LWLockNamed     | SubtransControlLock

 186621 | LWLockNamed     | SubtransControlLock
The server then begins to crawl, with some queries just never finishing until I finally shut the server down.


Searching for that particular combo of wait_event_type and wait_event only seems to turn up the page about statistics collection, but no helpful information on troubleshooting this lock.

Restarting the replica server clears the locks and allows us to start working again, but it's happened twice now in 12 hours and I'm worried it will happen again.

Does anyone have any advice on where to start looking?

Thanks,
Scott




--
El genio es 1% inspiración y 99% transpiración.
Thomas Alva Edison
http://pglearn.blogspot.mx/


В списке pgsql-general по дате отправления:

Предыдущее
От: Stephen Frost
Дата:
Сообщение: Re: Authentication?
Следующее
От: Benedict Holland
Дата:
Сообщение: Re: Authentication?