Re: Add progressive backoff to XactLockTableWait functions
From:        Xuneng Zhou
Subject:     Re: Add progressive backoff to XactLockTableWait functions
Date:        
Msg-id:      CABPTF7W9CqbisVWT7DwfcnmUW7iUJXd2h9VFd20HmDBD6Ni45A@mail.gmail.com
In reply to: Re: Add progressive backoff to XactLockTableWait functions  (Xuneng Zhou <xunengzhou@gmail.com>)
List:        pgsql-hackers
Hi,

On Wed, Jul 30, 2025 at 11:42 PM Xuneng Zhou <xunengzhou@gmail.com> wrote:
>
> Hi,
>
> On Mon, Jul 28, 2025 at 7:14 PM Xuneng Zhou <xunengzhou@gmail.com> wrote:
> >
> > Hi,
> >
> > On Thu, Jul 17, 2025 at 10:54 PM Xuneng Zhou <xunengzhou@gmail.com> wrote:
> > >
> > > Hi,
> > >
> > > After studying procarray and lmgr more closely, I have found several
> > > critical issues in the two patches and underestimated their
> > > complexity and subtlety. Sorry for posting immature patches that may
> > > have created noise.
> > >
> > > I now realize that making the lock acquisition in XactLockTableWait,
> > > i.e. (void) LockAcquire(&tag, ShareLock, false, false), work on a
> > > standby, following Andres' first suggestion, may be simpler and
> > > require fewer new helpers.
> > >
> > > XactLockTableWait fails on a standby simply because we never call
> > > XactLockTableInsert there. I am considering invoking
> > > XactLockTableInsert when an XID is added to KnownAssignedXids, and
> > > XactLockTableDelete (which on the primary is currently not used for
> > > the main XID) when it is removed. A preliminary implementation and
> > > test show that this approach appears to work, but I still need to
> > > confirm that it is safe on a standby and does not break other things.
> >
> > Adding an xact lock for every tracked assigned XID on a standby, by
> > enabling XactLockTableInsert and XactLockTableDelete in the
> > KnownAssignedXids infrastructure, appears less attractive than I first
> > thought. The main concern is that the actual rate of xact-lock usage
> > may not justify the added overhead, even if that overhead is modest.
> >
> > 1) Usage
> >
> > On a primary server, an xact lock is taken for every assigned XID, and
> > the cost is justified because xact locks are referenced in many code
> > paths. On a standby, however, especially in versions where logical
> > decoding is disabled, xact locks are used far less, if at all. The
> > current call stacks for XactLockTableWait on a standby (HEAD) are
> > listed in [1]. The only high-frequency caller is the logical walsender
> > on a cascading standby; all other callers are infrequent. Does that
> > single use case warrant creating an xact lock for every known assigned
> > XID? I don't think so, given that typical configurations have few
> > cascading standbys. In practice, most xact locks may never be touched
> > after they are created.
> >
> > 2) Cost
> >
> > Low usage would be fine if the additional cost were negligible, but
> > that does not appear to be the case AFAICS. Quoting procarray.c:
> >
> >  * There are 5 main uses of the KnownAssignedXids data structure:
> >  *
> >  *  * backends taking snapshots - all valid XIDs need to be copied out
> >  *  * backends seeking to determine presence of a specific XID
> >  *  * startup process adding new known-assigned XIDs
> >  *  * startup process removing specific XIDs as transactions end
> >  *  * startup process pruning array when special WAL records arrive
> >  *
> >  * This data structure is known to be a hot spot during Hot Standby,
> >  * so we go to some lengths to make these operations as efficient and
> >  * as concurrent as possible.
> >
> > KnownAssignedXids is a hot spot, as the comment states. The existing
> > design replaces a hash table with a single array and minimizes
> > exclusive locks and memory barriers to preserve concurrency. To
> > respect that design, calls to XactLockTableInsert and
> > XactLockTableDelete would need to occur outside any exclusive lock.
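[Interjecting for archive readers: below is a condensed sketch of the
ordering discussed above. The wrapper name is made up and the locking is
simplified; KnownAssignedXidsAdd and XactLockTableInsert are the real
routines, written as if this sat inside procarray.c. It is meant only to
show where the race window opens, not what any patch actually does.]

    static void
    KnownAssignedXidsAddWithXactLocks(TransactionId from_xid,
                                      TransactionId to_xid)
    {
        TransactionId xid;

        /* Keep the exclusive section short, as the module's design intends. */
        LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
        KnownAssignedXidsAdd(from_xid, to_xid, true);   /* cheap array append */
        LWLockRelease(ProcArrayLock);

        /*
         * Race window: until the loop below finishes, these XIDs are
         * already visible as in-progress, but their xact locks do not
         * exist yet, so XactLockTableWait() has nothing to block on and
         * must poll.
         */
        for (xid = from_xid;
             TransactionIdPrecedesOrEquals(xid, to_xid);
             TransactionIdAdvance(xid))
            XactLockTableInsert(xid);   /* lock-manager hash insert, slower */
    }

Moving the loop inside the ProcArrayLock section would close the window
but lengthen the exclusive hold, which is exactly the trade-off the
quoted text turns to next.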
> > Unfortunately, that re-introduces the polling we are trying to
> > eliminate in XactLockTableWait: there is a window between registering
> > the XIDs and adding their xact locks. During that window, a backend
> > calling XactLockTableWait may need to poll, because
> > TransactionIdIsInProgress shows the transaction as running while the
> > xact lock is still missing. If the window were short, this might be
> > acceptable, but it can be lengthy because XIDs are inserted into and
> > deleted from the array in bulk. Some unlucky backends will therefore
> > spin before they can actually wait.
> >
> > Placing the XactLockTableInsert/Delete calls inside the exclusive
> > lock removes the race window but hurts concurrency, the very thing
> > this module strives to preserve. Operations on the shared-memory hash
> > table in lmgr are slower and hold the lock longer than simple array
> > updates, so backends invoking GetSnapshotData could experience longer
> > wait times.
> >
> > 3) Alternative
> >
> > Adopt the hash-table-plus-condition-variable design from patch v6,
> > with additional fixes and polish. Backends that call
> > XactLockTableWait() sleep on the condition variable, and the startup
> > process broadcasts a wake-up as the XID is removed from
> > KnownAssignedXids. The broadcast happens after releasing the
> > exclusive lock. This may be safe: sleeping backends remain blocked
> > until the CV signal arrives, so no extra polling occurs even though
> > the XID has already vanished from the array. For consistency, all
> > removal paths (including those used during shutdown or promotion,
> > where contention is minimal) follow the same pattern. The hash table
> > is sized at 2 × MaxBackends, which is already conservative for
> > today's workloads. If more concurrent use cases appear in the future,
> > it could be enlarged, e.g. to TOTAL_MAX_CACHED_SUBXIDS; even that
> > would consume only a few megabytes.
> >
> > Feedback welcome.
> >
> > [1] https://www.postgresql.org/message-id/CABPTF7Wbp7MRPGsqd9NA4GbcSzUcNz1ymgWfir=Yf+N0oDRbjA@mail.gmail.com
>
> Because xact locks are not maintained on a standby, and the current
> waiting approach does not rely on them either, I added assertions to
> both functions to prevent their use on a standby and introduced an
> explicit primary-versus-standby branch in SnapBuildWaitSnapshot in v8.
>
> In procarray.c, two corrections were made compared with v7:
> - Removed the seemingly redundant "initialized" field from XidWaitEntry.
> - Removed the erroneous InHotStandby checks in the wait and wake helpers.
>
> With these changes applied, here are some perf stats for v8; they are
> about one order of magnitude smaller than those of HEAD:
>
>  Performance counter stats for process id '3273840':
>
>        351,183,876      cycles
>        126,586,090      instructions      #  0.36 insn per cycle
>         16,670,633      cache-misses
>             10,030      context-switches
>
>      100.001419856 seconds time elapsed

V9 replaces the original partitioned xid-wait htab with a single,
unified one, reflecting the modest entry count and the rarity of
contention while waiting. To prevent possible races when multiple
backends wait on the same XID for the first time in XidWaitOnStandby, a
dedicated lock has been added to protect the hash table.

Best,
Xuneng
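P.S. For anyone skimming the archive, here is a condensed sketch of the
wait/wake protocol described above. The shapes echo the patch
(XidWaitEntry, XidWaitOnStandby, a dedicated hash lock), but the bodies
are illustrative only: locking is simplified, entry cleanup is omitted,
and PG_WAIT_LOCK merely stands in for a patch-specific wait event.

    /* One shared-memory entry per XID currently being waited on. */
    typedef struct XidWaitEntry
    {
        TransactionId xid;          /* hash key */
        ConditionVariable cv;       /* waiters sleep here */
    } XidWaitEntry;

    static HTAB   *XidWaitHash;     /* sized at ~2 × MaxBackends */
    static LWLock *XidWaitHashLock; /* the dedicated lock added in v9 */

    /* Waiter side: a backend on the standby. */
    void
    XidWaitOnStandby(TransactionId xid)
    {
        XidWaitEntry *entry;
        bool          found;

        /* The dedicated lock serializes first-time creation of an entry. */
        LWLockAcquire(XidWaitHashLock, LW_EXCLUSIVE);
        entry = hash_search(XidWaitHash, &xid, HASH_ENTER, &found);
        if (!found)
            ConditionVariableInit(&entry->cv);
        LWLockRelease(XidWaitHashLock);

        /*
         * Prepare before testing the condition, then recheck between
         * sleeps, so a broadcast firing just before we sleep is not lost.
         */
        ConditionVariablePrepareToSleep(&entry->cv);
        while (TransactionIdIsInProgress(xid))
            ConditionVariableSleep(&entry->cv, PG_WAIT_LOCK);
        ConditionVariableCancelSleep();
    }

    /* Wake side: the startup process, after ProcArrayLock is released. */
    void
    XidWakeOnStandby(TransactionId xid)
    {
        XidWaitEntry *entry;
        bool          found;

        LWLockAcquire(XidWaitHashLock, LW_SHARED);
        entry = hash_search(XidWaitHash, &xid, HASH_FIND, &found);
        LWLockRelease(XidWaitHashLock);

        if (found)
            ConditionVariableBroadcast(&entry->cv);
    }

Because the broadcast only happens after the XID has left
KnownAssignedXids, the TransactionIdIsInProgress() recheck in the sleep
loop is what makes the after-lock-release ordering safe.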
Attachments