Обсуждение: NOTIFY queue is at 66% and climbing...
Hi all,
On 10.15 I'm getting the following on a logically replicated server.
From the CSV logs:
2021-10-13 18:49:39.792 EDT,,,213601,,6143c257.34261,64243,,2021-09-16 18:16:55 EDT,4/3914851,60709901,WARNING,01000,"NOTIFY queue is 66% full",,,,,,,,,""
2021-10-13 18:49:46.058 EDT,,,213601,,6143c257.34261,64244,,2021-09-16 18:16:55 EDT,4/3914855,60709905,WARNING,01000,"NOTIFY queue is 66% full",,,,,,,,,""
2021-10-13 18:49:51.934 EDT,,,213601,,6143c257.34261,64245,,2021-09-16 18:16:55 EDT,4/3914862,60709915,WARNING,01000,"NOTIFY queue is 66% full",,,,,,,,,""
2021-10-13 18:50:00.516 EDT,,,213601,,6143c257.34261,64246,,2021-09-16 18:16:55 EDT,4/3914864,60709917,WARNING,01000,"NOTIFY queue is 66% full",,,,,,,,,""
2021-10-13 18:50:08.003 EDT,,,213601,,6143c257.34261,64247,,2021-09-16 18:16:55 EDT,4/3914871,60709926,WARNING,01000,"NOTIFY queue is 66% full",,,,,,,,,""
The very fine docs for 10 say
"There is a queue that holds notifications that have been sent but not yet processed by all listening sessions. If this queue becomes full, transactions calling
We do have a client that has access to this logically replicated server and I have no idea what they have done regards LISTEN/NOTIFY, and my query to them about this has not yielded a response.
In the CSV logs above what part points to "the session that is preventing cleanup" so that I can kill it? pg_stat_activity yields no clues.
Thanks,
Jeff
On 10.15 I'm getting the following on a logically replicated server.
From the CSV logs:
2021-10-13 18:49:39.792 EDT,,,213601,,6143c257.34261,64243,,2021-09-16 18:16:55 EDT,4/3914851,60709901,WARNING,01000,"NOTIFY queue is 66% full",,,,,,,,,""
2021-10-13 18:49:46.058 EDT,,,213601,,6143c257.34261,64244,,2021-09-16 18:16:55 EDT,4/3914855,60709905,WARNING,01000,"NOTIFY queue is 66% full",,,,,,,,,""
2021-10-13 18:49:51.934 EDT,,,213601,,6143c257.34261,64245,,2021-09-16 18:16:55 EDT,4/3914862,60709915,WARNING,01000,"NOTIFY queue is 66% full",,,,,,,,,""
2021-10-13 18:50:00.516 EDT,,,213601,,6143c257.34261,64246,,2021-09-16 18:16:55 EDT,4/3914864,60709917,WARNING,01000,"NOTIFY queue is 66% full",,,,,,,,,""
2021-10-13 18:50:08.003 EDT,,,213601,,6143c257.34261,64247,,2021-09-16 18:16:55 EDT,4/3914871,60709926,WARNING,01000,"NOTIFY queue is 66% full",,,,,,,,,""
The very fine docs for 10 say
"There is a queue that holds notifications that have been sent but not yet processed by all listening sessions. If this queue becomes full, transactions calling
NOTIFY
will fail at commit. The queue is quite large (8GB in a standard installation) and should be sufficiently sized for almost every use case. However, no cleanup can take place if a session executes LISTEN
and then enters a transaction for a very long time. Once the queue is half full you will see warnings in the log file pointing you to the session that is preventing cleanup. In this case you should make sure that this session ends its current transaction so that cleanup can proceed."We do have a client that has access to this logically replicated server and I have no idea what they have done regards LISTEN/NOTIFY, and my query to them about this has not yielded a response.
In the CSV logs above what part points to "the session that is preventing cleanup" so that I can kill it? pg_stat_activity yields no clues.
Thanks,
Jeff
Jeff Ross <jross@openvistas.net> writes: > On 10.15 I'm getting the following on a logically replicated server. > 2021-10-13 18:49:39.792 EDT,,,213601,,6143c257.34261,64243,,2021-09-16 > 18:16:55 EDT,4/3914851,60709901,WARNING,01000,"NOTIFY queue is 66% > full",,,,,,,,,"" > In the CSV logs above what part points to "the session that is > preventing cleanup" so that I can kill it? Normally there's a DETAIL entry citing the session's PID. Looking at the code, the reason for the lack of any such entry must be that there is no session whose current notify queue position exactly matches the supposed global minimum position. This corresponds to a known bug that was fixed in 10.16, so I'd suggest upgrading. As a temporary workaround you could restart that server, but likely the problem would recur after awhile. regards, tom lane
On 10/13/21 5:50 PM, Tom Lane wrote: > Jeff Ross <jross@openvistas.net> writes: >> On 10.15 I'm getting the following on a logically replicated server. >> 2021-10-13 18:49:39.792 EDT,,,213601,,6143c257.34261,64243,,2021-09-16 >> 18:16:55 EDT,4/3914851,60709901,WARNING,01000,"NOTIFY queue is 66% >> full",,,,,,,,,"" >> In the CSV logs above what part points to "the session that is >> preventing cleanup" so that I can kill it? > Normally there's a DETAIL entry citing the session's PID. Looking > at the code, the reason for the lack of any such entry must be that > there is no session whose current notify queue position exactly > matches the supposed global minimum position. This corresponds to > a known bug that was fixed in 10.16, so I'd suggest upgrading. > As a temporary workaround you could restart that server, but > likely the problem would recur after awhile. > > regards, tom lane > Thanks as always, Tom. I'll schedule the upgrade to 10.18 and restart for this weekend. Jeff