Add LWLock blocker(s) information

Поиск
Список
Период
Сортировка
От Drouvot, Bertrand
Тема Add LWLock blocker(s) information
Дата
Msg-id 71779d05-3a9b-b897-f54d-16ffedcc896b@amazon.com
обсуждение исходный текст
Ответы Re: Add LWLock blocker(s) information  ("Drouvot, Bertrand" <bdrouvot@amazon.com>)
Re: Add LWLock blocker(s) information  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-hackers

Hi hackers,


I've attached a patch to add blocker(s) information for LW Locks.
The motive behind is to be able to get some blocker(s) information (if any) in the context of LW Locks.

Motivation:

We have seen some cases with heavy contention on some LW Locks (large number of backends waiting on the same LW Lock).

Adding some blocker information would make the investigations easier, it could help answering questions like:

  • how many PIDs are holding the LWLock (could be more than one in case of LW_SHARED)?
  • Is the blocking PID changing?
  • Is the number of blocking PIDs changing?
  • What is the blocking PID doing?
  • Is the blocking PID waiting?
  • In which mode request is the blocked PID?
  • in which mode is the blocker PID holding the lock?

Technical context and proposal: There is 2 points in this patch:

  • Add the instrumentation:
    • the patch adds into the LWLock struct:

                    last_holding_pid: last pid owner of the lock
                    last_mode: last holding mode of the last pid owner of the lock
                    nholders: number of holders (could be >1 in case of LW_SHARED)

    • the patch adds into the PGPROC struct:

                    lwLastHoldingPid: last holder of the LW lock the PID is waiting for
                    lwHolderMode;  LW lock mode of last holder of the LW lock the PID is waiting for
                    lwNbHolders: number of holders of the LW lock the PID is waiting for

            and what is necessary to update this new information.

  • Provide a way to display the information: the patch also adds a function pg_lwlock_blocking_pid to display this new information.

Outcome Example:

# select * from pg_lwlock_blocking_pid(10259);
 requested_mode | last_holder_pid | last_holder_mode | nb_holders
----------------+-----------------+------------------+------------
 LW_EXCLUSIVE   |           10232 | LW_EXCLUSIVE     |          1
(1 row)
 
 # select query,pid,state,wait_event,wait_event_type,pg_lwlock_blocking_pid(pid),pg_blocking_pids(pid) from pg_stat_activity where state='active' and pid != pg_backend_pid();
             query              |  pid  | state  |  wait_event   | wait_event_type |          pg_lwlock_blocking_pid           | pg_blocking_pids
--------------------------------+-------+--------+---------------+-----------------+-------------------------------------------+------------------
 insert into bdtlwa values (1); | 10232 | active |               |                 | (,,,)                                     | {}
 insert into bdtlwb values (1); | 10254 | active | WALInsert     | LWLock          | (LW_WAIT_UNTIL_FREE,10232,LW_EXCLUSIVE,1) | {}
 create table bdtwt (a int);    | 10256 | active | WALInsert     | LWLock          | (LW_WAIT_UNTIL_FREE,10232,LW_EXCLUSIVE,1) | {}
 insert into bdtlwa values (2); | 10259 | active | BufferContent | LWLock          | (LW_EXCLUSIVE,10232,LW_EXCLUSIVE,1)       | {}
 drop table bdtlwd;             | 10261 | active | WALInsert     | LWLock          | (LW_WAIT_UNTIL_FREE,10232,LW_EXCLUSIVE,1) | {}
(5 rows)


So, should a PID being blocked on a LWLock we could see:

  • in which mode request it is waiting
  • the last pid holding the lock
  • the mode of the last PID holding the lock
  • the number of PID(s) holding the lock

Remarks:

I did a few benchmarks so far and did not observe notable performance degradation (can share more details if needed).

I did some quick attempts to get an exhaustive list of blockers (in case of LW_SHARED holders), but I think that would be challenging as:

  • There is about 40 000 calls to LWLockInitialize and all my attempts to init a list here produced “ FATAL: out of shared memory” or similar.
  • One way to get rid of using a list in LWLock could be to use proc_list (with proclist_head in LWLock and proclist_node in PGPROC). This is the current implementation for the “waiters” list. But this would not work for the blockers as one PGPROC can hold multiples LW locks so it could mean having a list of about 40K proclist_node per PGPROC.
  • I also have concerns about possible performance impact by using such a huge list in this context.

Those are the reasons why this patch does not provide an exhaustive list of blockers.

While this patch does not provide an exhaustive list of blockers (in case of LW_SHARED holders), the information it delivers could already be useful to get insights during LWLock contention scenario.

I will add this patch to the next commitfest. I look forward to your feedback about the idea and/or implementation.

Regards, Bertrand

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Kapila
Дата:
Сообщение: Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions
Следующее
От: Andy Fan
Дата:
Сообщение: Re: Index Skip Scan