Re: Configurable FP_LOCK_SLOTS_PER_BACKEND

From: Matt Smiley
Subject: Re: Configurable FP_LOCK_SLOTS_PER_BACKEND
Date:
Msg-id: CA+eRB3qn8crRpykquMd4VO-LdKcqEPQRW6k_XWih7N0CeODfvw@mail.gmail.com
In reply to: Re: Configurable FP_LOCK_SLOTS_PER_BACKEND  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses: Re: Configurable FP_LOCK_SLOTS_PER_BACKEND
           Re: Configurable FP_LOCK_SLOTS_PER_BACKEND
List: pgsql-hackers
I thought it might be helpful to share some more details from one of the case studies behind Nik's suggestion.

Bursty contention on lock_manager lwlocks recently became a recurring cause of query throughput drops for GitLab.com, and we got to study the behavior via USDT and uprobe instrumentation along with more conventional observations (see https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/2301).  This turned up some interesting findings, summarized below.

Results so far suggest that increasing FP_LOCK_SLOTS_PER_BACKEND would have a much larger positive impact than any other mitigation strategy we have evaluated.  Rather than reducing hold duration or collision rate, adding fastpath slots reduces the frequency of even having to acquire those lock_manager lwlocks.  I suspect this would be helpful for many other workloads, particularly those having high frequency queries whose tables and indexes collectively number more than about 16.
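
For anyone following along in the source tree, the "16" here is the compile-time constant this thread proposes making configurable, and the per-backend fastpath state lives directly in PGPROC.  Paraphrasing from memory (roughly src/include/storage/proc.h; exact field names vary a bit across versions):

    #define FP_LOCK_SLOTS_PER_BACKEND  16

    struct PGPROC
    {
        ...
        LWLock   fpInfoLock;   /* protects this backend's fast-path state */
        uint64   fpLockBits;   /* lock modes held, a few bits per slot */
        Oid      fpRelId[FP_LOCK_SLOTS_PER_BACKEND];   /* one relation OID per slot */
        ...
    };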

Lowering the lock_manager lwlock acquisition rate means lowering its contention rate (and probably also its contention duration, since exclusive mode forces concurrent lockers to queue).

I'm confident this would help our workload, and I strongly suspect it would be generally helpful by letting queries use fastpath locking more often.

> However, the lmgr/README says this is meant to alleviate contention on
> the lmgr partition locks. Wouldn't it be better to increase the number
> of those locks, without touching the PGPROC stuff?

That was my first thought too, but growing the lock_manager lwlock tranche isn't nearly as helpful.

On the slowpath, each relation's lock tag deterministically hashes onto a specific lock_manager lwlock, so growing the number of lock_manager lwlocks just makes it less likely for two or more frequently locked relations to hash onto the same lock_manager.
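
To make that concrete, the mapping is just a modulo over a small fixed tranche.  Paraphrasing from memory (roughly src/include/storage/lock.h and lwlock.h), where hashcode is LockTagHashCode() of the relation's lock tag:

    #define LOG2_NUM_LOCK_PARTITIONS  4
    #define NUM_LOCK_PARTITIONS       (1 << LOG2_NUM_LOCK_PARTITIONS)   /* 16 */

    #define LockHashPartition(hashcode) \
        ((hashcode) % NUM_LOCK_PARTITIONS)
    #define LockHashPartitionLock(hashcode) \
        (&MainLWLockArray[LOCK_MANAGER_LWLOCK_OFFSET + \
                          LockHashPartition(hashcode)].lock)

Increasing NUM_LOCK_PARTITIONS only changes which relations end up sharing a partition; each individual relation still funnels all of its slowpath acquisitions onto a single lock_manager lwlock.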

In contrast, growing the number of fastpath slots completely avoids calls to the slowpath (i.e. no need to acquire a lock_manager lwlock).
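
Concretely, the gate at the top of LockAcquireExtended() looks roughly like the following (again paraphrasing lock.c from memory): only "weak" relation locks in the current database are eligible, and only while a free slot remains.

    #define EligibleForRelationFastPath(locktag, mode) \
        ((locktag)->locktag_lockmethodid == DEFAULT_LOCKMETHOD && \
         (locktag)->locktag_type == LOCKTAG_RELATION && \
         (locktag)->locktag_field1 == MyDatabaseId && \
         MyDatabaseId != InvalidOid && \
         (mode) < ShareUpdateExclusiveLock)

    if (EligibleForRelationFastPath(locktag, lockmode) &&
        FastPathLocalUseCount < FP_LOCK_SLOTS_PER_BACKEND)
    {
        /*
         * Try to record the lock in MyProc->fpRelId under the backend's own
         * fpInfoLock.  If that succeeds, the shared lock_manager partition
         * lwlock is never touched for this acquisition.
         */
        ...
    }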

The saturation condition we'd like to solve is heavy contention on one or more of the lock_manager lwlocks.  Since that is driven by the slowpath acquisition rate of heavyweight locks, avoiding the slowpath is better than just moderately reducing the contention on the slowpath.

To be fair, increasing the number of lock_manager locks definitely can help to a certain extent, but it doesn't cover an important general case.  As a thought experiment, suppose we increase the lock_manager tranche to some arbitrarily large size, larger than the number of relations in the db.  This unrealistically large size means we have the best case for avoiding collisions -- each relation maps uniquely onto its own lock_manager lwlock.  That helps a lot in the case where the workload is spread among many non-overlapping sets of relations.  But it doesn't help a workload where any one table is accessed frequently via slowpath locking.

Continuing the thought experiment, if that frequently queried table has 16 or more indexes, or if it is joined to other tables that collectively add up to over 16 relations, then each of those queries is guaranteed to have to use the slowpath and acquire the deterministically associated lock_manager lwlocks.

So growing the tranche of lock_manager lwlocks would help for some workloads, while other workloads would not be helped much at all.  (As a concrete example, a workload at GitLab has several frequently queried tables with over 16 indexes that consequently always use at least some slowpath locks.)

For additional context:

https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/2301#what-influences-lock_manager-lwlock-acquisition-rate
Summarizes the pathology and its current mitigations.

https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/2301#note_1357834678
Documents the supporting research methodology.

https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/2301#note_1365370510
What code paths require an exclusive mode lwlock for lock_manager?

https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/2301#note_1365595142
Comparison of fastpath vs. slowpath locking, including quantifying the rate difference.

https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/2301#note_1365630726
Confirms the acquisition rate of lock_manager locks is not uniform.  The sampled workload has a 3x difference in the most vs. least frequently acquired lock_manager lock, corresponding to the workload's most frequently accessed relations.

> Well, that has a cost too, as it makes PGPROC larger, right? At the
> moment that struct is already ~880B / 14 cachelines, adding 48 XIDs
> would make it +192B / +3 cachelines. I doubt that won't impact other
> common workloads ...

That's true; growing the data structure may affect L2/L3 cache hit rates when touching PGPROC.  Is that cost worth the benefit of using fastpath for a higher percentage of table locks?  The answer may be workload- and platform-specific.  Exposing this as a GUC gives the admin a way to make a different choice if our default (currently 16) is bad for them.
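
For anyone double-checking that arithmetic, here is the back-of-the-envelope as a trivial standalone program (my own sketch; it assumes 4-byte entries, matching sizeof(Oid) and sizeof(TransactionId), and 64-byte cachelines):

    #include <stdio.h>

    int main(void)
    {
        const int entry_size     = 4;    /* sizeof(Oid) == sizeof(TransactionId) */
        const int cacheline_size = 64;
        const int added_slots    = 48;   /* e.g. growing from 16 to 64 slots */

        int added_bytes = added_slots * entry_size;       /* 192 bytes */
        int added_lines = added_bytes / cacheline_size;   /* 3 cachelines */

        printf("+%d bytes, +%d cachelines on top of ~880 B / 14 cachelines\n",
               added_bytes, added_lines);
        return 0;
    }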

I share your reluctance to add another low-level tunable, but like many other GUCs, having a generally reasonable default that can be adjusted is better than forcing folks to fork postgres to adjust a compile-time constant.  And unfortunately I don't see a better way to solve this problem.  Growing the lock_manager lwlock tranche isn't as effective, because it doesn't help workloads where one or more relations are locked frequently enough to hit this saturation point.

Handling a larger percentage of heavyweight lock acquisitions via fastpath instead of slowpath seems likely to help many high-throughput workloads, since it avoids having to exclusively acquire an lwlock.  It seems like the least intrusive general-purpose solution we've come up with so far.  That's why we wanted to solicit feedback or new ideas from the community.  Currently, the only options folks have to solve this class of saturation are some combination of schema changes, application changes, vertical scaling, and spreading the query rate among more postgres instances.  Those options are not always feasible or efficient.  Lacking a better solution, exposing a GUC that rarely needs tuning seems reasonable to me.
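
To make the shape of the proposal concrete, here is a purely hypothetical sketch of the knob as an integer GUC entry (guc_tables.c style).  The name, variable, and bounds are placeholders I made up, and since the slot array is embedded in PGPROC it would presumably need to be PGC_POSTMASTER, i.e. fixed at server start:

    /* Hypothetical sketch only -- no such GUC exists today. */
    {
        {"fastpath_lock_slots_per_backend", PGC_POSTMASTER, LOCK_MANAGEMENT,
            gettext_noop("Sets the number of fast-path relation lock slots "
                         "reserved in each backend's PGPROC entry."),
            NULL
        },
        &fastpath_lock_slots_per_backend,   /* placeholder variable name */
        16, 16, 1024,                       /* boot, min, max: placeholders */
        NULL, NULL, NULL
    },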

Anyway, hopefully the extra context is helpful!  Please do share your thoughts.

--
Matt Smiley | Staff Site Reliability Engineer at GitLab
