Hi, hackers!
*** The problem ***
I'm investigating several cases of reduced database performance caused by MultiXactOffsetLock contention (wait events:
80% MultiXactOffsetLock, 20% IO DataFileRead).
The problem manifested itself during index repack and constraint validation, both of which are effectively full table scans.
The database workload contains a lot of SELECT FOR SHARE / SELECT FOR UPDATE queries. I've tried to construct a synthetic
workload generator but could not reproduce a similar lock profile: I see a lot of different locks in wait events,
particularly many more MultiXactMemberLocks. But in my experiments with the synthetic workload, contention on
MultiXactOffsetLock can be reduced by increasing NUM_MXACTOFFSET_BUFFERS from its current value of 8 to a bigger number.
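For a sense of scale, here is a back-of-envelope sketch of how many MultiXact IDs the offset buffers cover. It assumes the default 8 kB block size and a 4-byte MultiXactOffset; all names with the sketch_ prefix are hypothetical and exist only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumptions: default BLCKSZ of 8192 bytes and a 4-byte MultiXactOffset. */
#define SKETCH_BLCKSZ 8192
typedef uint32_t SketchMultiXactOffset;

/* Number of offsets stored on a single SLRU page. */
static size_t
sketch_offsets_per_page(void)
{
    return SKETCH_BLCKSZ / sizeof(SketchMultiXactOffset);
}

/* Number of consecutive MultiXact IDs whose offsets fit in nbuffers cached pages. */
static size_t
sketch_multixacts_covered(size_t nbuffers)
{
    return nbuffers * sketch_offsets_per_page();
}

/* Shared memory needed for the page data alone (ignores per-slot bookkeeping). */
static size_t
sketch_buffer_bytes(size_t nbuffers)
{
    return nbuffers * SKETCH_BLCKSZ;
}
```

Under these assumptions, 8 buffers cover 16384 consecutive MultiXact IDs and take 64 kB of page data, so a full table scan that touches tuples with widely spread multis can easily cycle through more pages than fit in the cache.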
*** Question 1 ***
Is it safe to increase the number of buffers for MultiXact (or for all SLRUs), recompile, and run the database as usual?
I cannot experiment much in production, but I'm fairly sure that bigger buffers will solve the problem.
*** Question 2 ***
Could we perhaps add GUCs for SLRU sizes? Are there any reasons not to make them configurable? I think multis, clog,
subtransactions and others would benefit from bigger buffers. But perhaps too many knobs would be confusing.
*** Question 3 ***
MultiXactOffsetLock is always taken in exclusive mode, which makes the MultiXact offset subsystem effectively
single-threaded. If someone has a good idea how to make it more concurrency-friendly, I'm willing to put some effort into this.
Perhaps I could just add LWLocks for each offset buffer page. Is that worth doing? Or are there any hidden
caveats and difficulties?
Thanks!
Best regards, Andrey Borodin.