Re: Auto-vacuum timing out and preventing connections

From Masahiko Sawada
Subject Re: Auto-vacuum timing out and preventing connections
Date
Msg-id CAD21AoDL6eAY7+JY6wJ3-N2DY04GCXtWZ+tZNF9OYJeUnPkQEQ@mail.gmail.com
In response to Re: Auto-vacuum timing out and preventing connections  (David Johansen <davejohansen@gmail.com>)
List pgsql-bugs
On Wed, Jun 29, 2022 at 5:05 AM David Johansen <davejohansen@gmail.com> wrote:
>
> On Tue, Jun 28, 2022 at 1:31 PM Jeff Janes <jeff.janes@gmail.com> wrote:
>>
>> On Mon, Jun 27, 2022 at 4:38 PM David Johansen <davejohansen@gmail.com> wrote:
>>>
>>> We're running into an issue where the database can't be connected to. It appears that the auto-vacuum is timing out and then that prevents new connections from happening. This assumption is based on these logs showing up in the logs:
>>> WARNING:  worker took too long to start; canceled
>>> The log appears about every 5 minutes and eventually nothing can connect to it and it has to be rebooted.
>>
>>
>> As Julien suggested, this sounds like another victim, not the cause.  Is there anything else in the log files?
>
>
> That's the only thing in the logs for the 12-24 hours before the database becomes inaccessible.
>
>>
>> What version are you using?
>
>
> 13.6
>
>>>
>>> These are the most similarly related previous posts, but the CPU usage isn't high when this happens, so I don't believe that's the problem
>>> https://www.postgresql.org/message-id/20081105185206.GS4114%40alvh.no-ip.org
>>> https://www.postgresql.org/message-id/AANLkTinsGLeRc26RT5Kb4_HEhow5e97p0ZBveg=p9xqS@mail.gmail.com
>>
>>
>> But, I don't see high CPU described as a symptom in either of those threads.
>
>
> I was referring to the "I've seen this happen under heavy load" statement. Not sure that's the cause or related in those posts, but it doesn't appear to be the issue here.
>
>>
>> If you can't reproduce the problem locally, there probably isn't much we can do.  Maybe ask Amazon to look into it, since they are the only ones with sufficient access to do so.
>
>
> We've opened a support case, but I was trying to be proactive and seeing what we could dig into on our end. Is there a way to tell which table the auto-vacuum is trying to run on and timing out with?

Autovacuum workers are launched per database. The warning "worker
took too long to start; canceled" is emitted when an autovacuum worker
for a particular database takes too long in its startup phase
(initialization, setting parameters, etc.). There is no way to tell
which table, or even which database, was involved.

If the problem is reproducible, it would help the investigation if you
could collect the contents of pg_stat_activity while it is happening,
to see whether another process is waiting.
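
As a rough illustration of that kind of collection, here is a minimal
sketch in Python. It is only one possible approach, not an official
procedure. It assumes the third-party psycopg2 driver is installed; the
helper names (summarize_waits, poll), the DSN passed to poll and the
60-second interval are hypothetical. Since new connections reportedly
fail once the problem starts, the sketch also assumes the session is
opened beforehand and kept open.

```python
import time
from collections import Counter

# Columns below all exist in pg_stat_activity on PostgreSQL 13.
SNAPSHOT_SQL = """
SELECT pid, datname, backend_type, state,
       wait_event_type, wait_event,
       backend_start, query_start,
       left(query, 200) AS query
FROM pg_stat_activity
ORDER BY backend_start
"""


def summarize_waits(rows):
    """Count rows per (backend_type, wait_event_type, wait_event).

    Rows whose wait_event is NULL (None) are skipped. The result is a
    list of (key, count) pairs, most frequent first.
    """
    counts = Counter()
    for row in rows:
        if row["wait_event"] is None:
            continue
        key = (row["backend_type"], row["wait_event_type"], row["wait_event"])
        counts[key] += 1
    return counts.most_common()


def poll(dsn, interval_s=60):
    """Hypothetical helper: print a wait summary every interval_s seconds.

    Uses one long-lived connection so that snapshots can still be taken
    after new connections start failing. Runs until interrupted.
    """
    import psycopg2  # third-party driver, assumed to be installed
    import psycopg2.extras

    conn = psycopg2.connect(dsn)
    conn.autocommit = True  # avoid holding a transaction open between polls
    with conn.cursor(cursor_factory=psycopg2.extras.RealDictCursor) as cur:
        while True:
            cur.execute(SNAPSHOT_SQL)
            rows = cur.fetchall()
            print(time.strftime("%Y-%m-%d %H:%M:%S"), summarize_waits(rows))
            time.sleep(interval_s)
```

Many backends piling up on the same wait event around the time the
warning appears would be the kind of evidence worth sending along with
the support case.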

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/


