Re: confusion about this commit "Revert "Skip redundant anti-wraparound vacuums""

Поиск
Список
Период
Сортировка
От Robert Haas
Тема Re: confusion about this commit "Revert "Skip redundant anti-wraparound vacuums""
Дата
Msg-id CA+TgmoYPjEqZ7bi-oOQq=qRCw_Z+nxvwZSR6+Z=Xe60-NLTF=w@mail.gmail.com
обсуждение исходный текст
Ответ на Re: confusion about this commit "Revert "Skip redundant anti-wraparound vacuums""  (jiye <jiye_sw@126.com>)
Ответы Re: confusion about this commit "Revert "Skip redundant anti-wraparound vacuums""  (Julien Rouhaud <rjuju123@gmail.com>)
Список pgsql-hackers
On Mon, Jun 5, 2023 at 1:50 AM jiye <jiye_sw@126.com> wrote:
we can not get determinate test case as this issue reproduce only once, and currently autovaccum can works as we using vacuum freeze for each tables of each database.

our client's application is real online bank business, and have serveral customer database, do a majority of update opertaion as  result trigger some table dead_tup_ratio nealy 100%, but can not find any autovacuum process work for a very long time before we do vacuum freeze manally.

I tend to doubt that this is caused by the commit you're blaming, because that commit purports to skip autovacuum operations only if some other vacuum has already done the work. Here you are saying that you see no autovacuum tasks at all.

The screenshot that you posted of XID ages exceeding 200 million is not evidence of a problem. It's pretty normal for some table XID ages to temporarily exceed autovacuum_freeze_max_age, especially if you have a lot of tables with about the same XID age, as seems to be the case here. When a table's XID age reaches autovacuum_freeze_max_age, the system will start trying harder to reduce the XID age, but that process isn't instantaneous.

On the other hand, your statement that you have very high numbers of dead tuples *is* evidence of a problem. It's very likely caused by vacuum not running aggressively enough. Remember that autovacuum is limited by the number of workers (autovacuum_max_workers) but even more importantly by the cost delay system. It's *extremely* common to need to raise vacuum_cost_limit on large or busy database systems, often by large multiples (e.g. 10x or more).

I'd strongly suggest that you carefully monitor how many autovacuum processes are running and what they are doing. If I were a betting man, I'd bet that you'd find that in the situation where you had this problem, the number of running processes was always 3 -- which is the configured maximum -- and if you looked at the wait event in pg_stat_activity I bet you would see VacuumDelay showing up a lot. If so, raise vacuum_cost_limit considerably and over time the problem should get better. It won't be instantaneous.

Or maybe I'm wrong and you'd see something else, but whatever you did see would probably give a hint as to what the problem here is.

--

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Joe Conway
Дата:
Сообщение: Re: pg_collation.collversion for C.UTF-8
Следующее
От: Robert Haas
Дата:
Сообщение: Re: Order changes in PG16 since ICU introduction