Re: What specific circumstances trigger Autovacuum wraparound

Поиск
Список
Период
Сортировка
От Daniel Lagerman
Тема Re: What specific circumstances trigger Autovacuum wraparound
Дата
Msg-id CAAw8gvhNvsfRN15XeN1+OSyZOWatjwtc3m=EAF+euSKCqozdKA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: What specific circumstances trigger Autovacuum wraparound  ("David G. Johnston" <david.g.johnston@gmail.com>)
Список pgsql-general
Hello David!

Thanks for taking the time to respond. Upgrading is not in scope right now for this server even though that is certainly something I will do down the line.
I talked to a colleague of mine and he dug up something very interesting and probably answers the behavior I'm seeing, from PG archives. So I thought I would post it here so that if this question comes up again its more easy to find it in the future for PG 9.4 specifically.

There is another setting that controls Autovacuum called autovacuum_multixact_freeze_max_age, in PG 9.5 it is easy to get the correct value for the table to see if you hit the default limit of 400 million, however in PG 9.4 and earlier the function mxid_age() does not exist, so you can't query this value directly, hence this reference from this PG Archive post is crucial:

And in that thread we can see Alvaro Herrera write the following paragraph:

"A better way not involving mxid_age() would be to use pg_controldata to
extract the current value of the mxid counter, then subtract the current
relminmxid from that value."

Given that statement we took a look at this server and the table in question. We could see that NextMultiXactId:  2640838929 and the relminmxid for that specific table was 1791982183. Now if you take 2640838929 - 1791982183 = 848856746 you can see that this value is the double of the default value of 400 million and would explain why vacuum is running:

"As a safety device, a whole-table vacuum scan will occur for any table whose multixact-age is greater than autovacuum_multixact_freeze_max_age. Whole-table vacuum scans will also occur progressively for all tables, starting with those that have the oldest multixact-age, if the amount of used member storage space exceeds the amount 50% of the addressible storage space. Both of these kinds of whole-table scans will occur even if autovacuum is nominally disabled."

This calculation is also confirmed by looking at the git commit of mxid_age() this is exactly what this function returns and even though the 5 other tables are not over this value we suspect that due to that we are using more than 50% of the addressable storage space Autovacuum takes to "opportunity" to run for these tables also.

Now you could of course increase this value also, not sure what the max is, but I rather want to complete vacuum. So the plan for me now is that during the weekend, where there is no activity at all except for a backup, is to increase the work memory from relatively low 1024 MB to use almost all memory I have on this server, set the server to use one worker and then see if Autovacuum by itself can solve the cleanup by itself or at least close the gap so that are confident that we will catch up.

Best regards

Daniel


On Mon, Jun 4, 2018 at 11:47 PM, David G. Johnston <david.g.johnston@gmail.com> wrote:
On Monday, June 4, 2018, Daniel Lagerman <spjheruur@gmail.com> wrote:
 I have a pg 9.4.3 server that has one table with calculate age of about 320 million, this is the oldest table in this database and therefore the age of the DB is the same as the table.

It would be advisable to simply upgrade to 9.4.18 and see if any of the various vacuum related bug fixes solves your problem.  That's about the extent of my useful advice for this...

David J.

В списке pgsql-general по дате отправления:

Предыдущее
От: Adrian Klaver
Дата:
Сообщение: Re: Code of Conduct plan
Следующее
От: Jan Claeys
Дата:
Сообщение: Re: Code of Conduct plan