Re: Optimize LISTEN/NOTIFY
| От | Chao Li |
|---|---|
| Тема | Re: Optimize LISTEN/NOTIFY |
| Дата | |
| Msg-id | 0BCA1C2D-B92C-459E-B1A6-6D06BA4C62CF@gmail.com обсуждение исходный текст |
| Ответ на | Re: Optimize LISTEN/NOTIFY (Arseniy Mukhin <arseniy.mukhin.dev@gmail.com>) |
| Ответы |
Re: Optimize LISTEN/NOTIFY
|
| Список | pgsql-hackers |
> On Oct 23, 2025, at 18:02, Arseniy Mukhin <arseniy.mukhin.dev@gmail.com> wrote: > > Hi, > > On Thu, Oct 23, 2025 at 11:17 AM Chao Li <li.evan.chao@gmail.com> wrote: >> >> >> >>> On Oct 21, 2025, at 00:43, Arseniy Mukhin <arseniy.mukhin.dev@gmail.com> wrote: >>> >>> >>> I managed to reproduce the race with v20-alt3. I tried to write a TAP >>> test reproducing the issue, so it was easier to validate changes. >>> Please find the attached TAP test. I added it to some random package >>> for simplicity. >>> >> >> With alt3, as we have acquired the notification lock after reading every message to update the POS, I think we can doa little bit more optimization: >> >> The notifier: in SignalBackend() >> * Now we check if a listener’s pos equals to beforeWritePos, then we do “directly advancement” >> * We can change to if a listener’s pos is between beforeWritePos and afterWritePos, then we can do the advancement. >> >> The listener: in asyncQueueReadAllNotifications(): >> * With alt3, we only lock and update pos >> * We can do more. If current pos in shared memory is after that local pos, then meaning some notifier has done an advancement,so it can stop reading. >> > > I think this would be a reasonable optimization if it weren't for the > race condition mentioned above. The problem is that if the local pos > lags behind the shared memory pos, it could point to a truncated queue > segment, so we shouldn't allow that. > I figured out a way to resolve the race condition for alt3: * add an awakening flag for every listener, this flag is only set by listeners * add an advisory pos for every listener, similar to alt1 * if a listener is awaken, notify only updates the listener’s advisory pos; otherwise directly advance its position. * If a running listener see current pos is behind advisory pos, then stop reading See more details in attach patch file, I added code comments for my changes. Now the TAP test won’t hit the race condition. ``` # +++ tap check in src/test/authentication +++ t/008_listen-pos-race.pl .. skipped: Injection points not supported by this build Files=1, Tests=0, 0 wallclock secs ( 0.00 usr 0.00 sys + 0.03 cusr 0.01 csys = 0.04 CPU) Result: NOTESTS ``` And with my solution, listeners longer will still use local pos, so that no longer need to acquire notification lock in everyloop. The patch stack is: v20 patch -> alt3 patch -> tap patch -> my patch. Please see if my solution works. I also made a tiny change in the TAP script to allow it to terminate gracefully. Best regards, -- Chao Li (Evan) HighGo Software Co., Ltd. https://www.highgo.com/
Вложения
В списке pgsql-hackers по дате отправления: