Re: weird issue with occasional stuck queries

From: Adam Scott
Subject: Re: weird issue with occasional stuck queries
Msg-id: CA+s62-MExP8HqTsdddbsSXNLHBRD0ABR81fJQ8zJnDMeQyVMug@mail.gmail.com
In reply to: weird issue with occasional stuck queries (spiral <spiral@spiral.sh>)
Replies: Re: weird issue with occasional stuck queries (spiral <spiral@spiral.sh>)
List: pgsql-general

If you get a chance, the `top` output might be useful as well. For instance, if you are low on memory, that can slow down the allocation of buffers. Another thing to check is `iostat -x -y` and its %util column. This is an indicator, though not a definitive one, of how much disk access is going on. It may be that your drives are simply saturated, although the IOWait in your attachment looks OK.
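
As a rough illustration of the %util check, here is a small Python sketch that picks out busy devices from the text of one `iostat -x` report. The function name `busy_devices` and the 80% threshold are my own assumptions, not anything iostat defines; the %util column is found by its header name because the column set varies between sysstat versions.

```python
def busy_devices(iostat_text, threshold=80.0):
    """Return {device: %util} for devices at or above `threshold`.

    `iostat_text` is the captured output of `iostat -x`. The 80% default
    is an arbitrary illustrative cut-off, not an official limit.
    """
    busy = {}
    util_col = None
    for line in iostat_text.splitlines():
        fields = line.split()
        if not fields:
            util_col = None  # a blank line ends the current device table
            continue
        # Older sysstat prints "Device:", newer versions print "Device".
        if fields[0].rstrip(":") == "Device" and "%util" in fields:
            util_col = fields.index("%util")
            continue
        if util_col is not None and len(fields) > util_col:
            try:
                # Some locales print a decimal comma.
                util = float(fields[util_col].replace(",", "."))
            except ValueError:
                continue
            if util >= threshold:
                busy[fields[0]] = util
    return busy
```

You could feed it the output of something like `iostat -x -y 5 1` captured during one of the stalls. A high %util only suggests saturation, as noted above.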

According to the documentation, that wait event means "Waiting to access the multixact member SLRU cache."  SLRU = simple least-recently-used cache.

Do you have a `SELECT ... FOR UPDATE` query running somewhere?
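
One way to see which sessions are waiting on the lock, and what they are running, is to sample `pg_stat_activity` while queries are stuck. The sketch below is only an illustration: the `pg_stat_activity` view and its columns are standard, but the names `WAITING_SQL`, `count_wait_events` and `sample_waits` are made up here, and the cursor is assumed to be any Python DB-API cursor (psycopg2, for example).

```python
from collections import Counter

# Sessions currently waiting on an LWLock, oldest query first.
# The 80-character truncation of the query text is an arbitrary choice.
WAITING_SQL = """
SELECT pid,
       wait_event,
       state,
       now() - query_start AS running_for,
       left(query, 80)     AS query
FROM pg_stat_activity
WHERE wait_event_type = 'LWLock'
ORDER BY query_start
"""


def count_wait_events(rows):
    """Count sessions per wait event (second column of WAITING_SQL)."""
    return Counter(row[1] for row in rows)


def sample_waits(cursor):
    """Run WAITING_SQL on a DB-API cursor and summarize the wait events."""
    cursor.execute(WAITING_SQL)
    return count_wait_events(cursor.fetchall())
```

Running this every few seconds during a stall should show whether the waiters are all on the same MultiXact SLRU lock, and the `query` column may reveal a row-locking statement among them.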

If your disk is low on space (check with `df -h`), that might explain the issue.

Is there an "ERROR: multixact ..." message in your Postgres log?
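
If the log is large, a quick filter can help. This is a minimal sketch, assuming the log uses the usual severity prefixes such as `ERROR:`; the function name `multixact_messages` is hypothetical, and the location of your log file depends on your configuration.

```python
import re

# Match log lines whose severity prefix is followed by a mention of multixact.
# Case-insensitive so that "MultiXactId ..." messages are caught as well.
MULTIXACT_PATTERN = re.compile(
    r"\b(ERROR|FATAL|PANIC|WARNING):.*multixact", re.IGNORECASE
)


def multixact_messages(lines):
    """Return the lines (without trailing newline) that match the pattern."""
    return [line.rstrip("\n") for line in lines if MULTIXACT_PATTERN.search(line)]
```

Pass it an open log file (or any iterable of lines) and print whatever comes back.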

Adam






On Fri, Apr 1, 2022 at 6:28 AM spiral <spiral@spiral.sh> wrote:
Hey,

I'm having a weird issue where, a few times a day, any query that hits a
specific index (a `unique` column index) gets stuck for
anywhere between 1 and 15 minutes on an LWLock (mostly
MultiXactOffsetSLRU; I'm not sure what that is, and I couldn't find anything
about it except for a pgsql-hackers list thread that I didn't really
understand).
Checking netdata history, these stuck queries coincide with massive
disk reads; we average ~2 MiB/s of disk reads, and it reached 40 MiB/s earlier
today.

These queries used to get stuck for ~15 minutes at worst, but I turned
down the query timeout. I assume the numbers above would be worse if I
let the queries run for as long as they need, but I don't have any logs
from before that change and I don't really want to try that again as it
would impact production.

I asked on IRC a few days ago and got the suggestion to increase
shared_buffers, but that doesn't seem to have helped at all. I also
tried dropping and recreating the index, but that seems to have changed
nothing either.

Any suggestions are appreciated since I'm really not sure how to debug
this further. I'm also attaching a couple screenshots that might be
useful.

spiral
