[PATCH] Optimize ProcSignal to avoid redundant SIGUSR1 signals
From | Joel Jacobson |
---|---|
Subject | [PATCH] Optimize ProcSignal to avoid redundant SIGUSR1 signals |
Date | |
Msg-id | a0b12a70-8200-4bd4-9e24-56796314bdce@app.fastmail.com |
Responses | Re: [PATCH] Optimize ProcSignal to avoid redundant SIGUSR1 signals |
List | pgsql-hackers |
Hi hackers,

While trying to optimize async.c, I came across what seems to be surprisingly low-hanging fruit in procsignal.c: $subject.

This optimization improves not only LISTEN/NOTIFY but procsignal.c in general, for all ProcSignalReasons, by not sending redundant signals when the target backend hasn't yet received and handled the earlier ones.

--- PATCH ---

Previously, ProcSignal used an array of volatile sig_atomic_t flags, one per signal reason. A sender would set a flag and then unconditionally send a SIGUSR1 to the target process. This could result in a storm of redundant signals if multiple processes signaled the same target before it had a chance to run its signal handler.

Change this to use a single pg_atomic_uint32 as a bitmask of pending signals. When sending, use pg_atomic_fetch_or_u32 to set the appropriate signal bit and inspect the prior state of the flags word. Then issue a SIGUSR1 only if the previous flags state was zero.

This works safely because the receiving backend's signal handler atomically resets the entire bitmask upon receipt, thus processing all pending signals at once. Consequently, subsequent senders that see a nonzero prior state know a signal is already in flight, which significantly reduces redundant kill(pid, SIGUSR1) system calls under heavy contention.

On the receiving end, the SIGUSR1 handler now atomically fetches and clears the entire bitmask with a single pg_atomic_exchange_u32, then calls the appropriate sub-handlers.

The further optimization of checking only whether the old flags word was zero is due to Andreas Karlsson.

--- BENCHMARK ---

The attached benchmark script does LISTEN on one connection, and then uses pgbench to send NOTIFY on a varying number of connections and jobs, to cause a high procsignal load.

I've run the benchmark on my MacBook Pro M3 Max, 10 seconds per run, 3 runs.
 Connections=Jobs | TPS (master) | TPS (patch) | Relative Diff (%) | StdDev (master) | StdDev (patch)
------------------+--------------+-------------+-------------------+-----------------+----------------
                1 |       118833 |      118902 |             0.06% |             484 |            520
                2 |       156005 |      194873 |            24.91% |            3145 |            631
                4 |       177351 |      190672 |             7.51% |            4305 |           1439
                8 |       116597 |      124793 |             7.03% |            1549 |           1011
               16 |        40835 |      113312 |           177.49% |            2695 |           1155
               32 |        37940 |      108469 |           185.90% |            2533 |            487
               64 |        35495 |      104994 |           195.80% |            1837 |            318
              128 |        40193 |      100246 |           149.41% |            2254 |            393
(8 rows)

Raw Data Summary:

 Version  | Connections | Runs | Min TPS | Max TPS | Avg TPS
----------+-------------+------+---------+---------+---------
 master   |           1 |    3 |  118274 |  119119 |  118833
 master   |           2 |    3 |  152803 |  159090 |  156005
 master   |           4 |    3 |  174381 |  182288 |  177351
 master   |           8 |    3 |  115021 |  118117 |  116597
 master   |          16 |    3 |   39048 |   43935 |   40835
 master   |          32 |    3 |   35754 |   40716 |   37940
 master   |          64 |    3 |   33417 |   36906 |   35495
 master   |         128 |    3 |   37925 |   42433 |   40193
 patch-v1 |           1 |    3 |  118589 |  119503 |  118902
 patch-v1 |           2 |    3 |  194204 |  195457 |  194873
 patch-v1 |           4 |    3 |  189771 |  192332 |  190672
 patch-v1 |           8 |    3 |  123929 |  125904 |  124793
 patch-v1 |          16 |    3 |  112328 |  114584 |  113312
 patch-v1 |          32 |    3 |  107975 |  108949 |  108469
 patch-v1 |          64 |    3 |  104649 |  105275 |  104994
 patch-v1 |         128 |    3 |   99792 |  100479 |  100246
(16 rows)

/Joel
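
For readers who want the mechanism from the PATCH section in compact form, here is a minimal sketch. It is not the patch itself: it uses C11 <stdatomic.h> in place of PostgreSQL's pg_atomic_* API, replaces kill(pid, SIGUSR1) with a counter, and every name in it (Slot, Reason, send_reason, handle_signal, fake_kill) is hypothetical and chosen only for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical reason codes; the real ProcSignalReason enum differs. */
typedef enum
{
    REASON_NOTIFY = 0,
    REASON_CATCHUP = 1,
    REASON_BARRIER = 2
} Reason;

/* One slot per target backend: a single word of pending-reason bits. */
typedef struct
{
    _Atomic uint32_t pending;
} Slot;

/* Counts how many times the simulated signal was "sent". */
static atomic_int signals_sent;

/* Stand-in for kill(pid, SIGUSR1); an assumption for this sketch. */
static void
fake_kill(void)
{
    atomic_fetch_add(&signals_sent, 1);
}

/*
 * Sender side: set our bit and look at the previous value of the word.
 * Only the sender that moves the word from zero to nonzero signals;
 * everyone else can assume a signal is already in flight.
 * Returns true if a signal was actually sent.
 */
static bool
send_reason(Slot *slot, Reason reason)
{
    uint32_t old = atomic_fetch_or(&slot->pending, UINT32_C(1) << reason);

    if (old == 0)
    {
        fake_kill();
        return true;
    }
    return false;
}

/*
 * Receiver side: fetch and clear all pending bits in one atomic step.
 * The caller would then dispatch one sub-handler per bit that is set.
 */
static uint32_t
handle_signal(Slot *slot)
{
    return atomic_exchange(&slot->pending, 0);
}
```

In this sketch, several send_reason calls made before handle_signal runs call fake_kill only once, and the next sender after handle_signal has cleared the word signals again.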