Discussion: Managing autovacuum freezing
PG 12.4
We're observing I/O contention (including long COMMIT times) that coincides with anti-wraparound autovacuums of a particularly busy table. This table sees roughly 100K updates per hour, a handful (~500) of inserts per hour, and rarely any deletes at all; it is very update-heavy. The table has only around 5 million rows, so it's not really very big at all, but there are around 10 million dead rows as well, so there's bloat involved.
Autovacuum has been disabled on this table for a couple years now (before my time here) and nightly VACUUM ANALYZE jobs are run via crontab. However every 2-3 days we'll see the anti-wraparound autovacuum run for this table and really tank I/O.
My understanding is that these manual VACUUM ANALYZE jobs are not freezing rows that regular autovacuuming would otherwise be freezing, which builds up to the big anti-wraparound job.
We haven't changed any of the vacuum/autovacuum GUC parameters (although yesterday I noticed our cookbook still sets autovacuum_vacuum_cost_delay = 20ms, the old default; PG 12 lowered it to 2ms). I'm assuming we'll need to do some table-specific tuning for this very active table if we re-enable autovacuum on it, aiming for more frequent (and less intensive) autovacuums throughout the day.
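For reference, per-table tuning like this is done with storage parameters. A sketch of what we might try (the table name and threshold values are hypothetical starting points, not recommendations from this thread):

```sql
-- Hypothetical table name; values are illustrative only.
-- Lower the scale factor so autovacuum triggers after ~1% of rows are
-- dead (the default is 20%), and use the PG 12 cost-delay default of
-- 2ms instead of the old 20ms our cookbook still sets.
ALTER TABLE busy_table SET (
    autovacuum_enabled = true,
    autovacuum_vacuum_scale_factor = 0.01,
    autovacuum_vacuum_cost_delay = 2
);
```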
My first concern is whether autovacuum would be able to keep up throughout the day, no matter how well it is tuned. I'm not sure how to gauge that before pulling the trigger, though.
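One way to gauge it before committing is to watch the standard statistics views while autovacuum runs. A sketch (the table name is hypothetical):

```sql
-- Dead-tuple counts and last (auto)vacuum times for the hot table.
SELECT relname, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'busy_table';

-- Progress of any vacuum currently running (available since PG 9.6).
SELECT pid, relid::regclass, phase, heap_blks_scanned, heap_blks_total
FROM pg_stat_progress_vacuum;

-- How close the table is to an anti-wraparound vacuum: compare this
-- against autovacuum_freeze_max_age (default 200 million).
SELECT relname, age(relfrozenxid)
FROM pg_class
WHERE relname = 'busy_table';
```

If n_dead_tup keeps climbing between autovacuum runs, autovacuum isn't keeping up and the thresholds (or cost limits) need another look.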
If we exhaust all tuning options, we could look at turning those nightly VACUUM ANALYZE jobs into VACUUM FREEZE ANALYZE jobs, so that at least we take the I/O hit during off-peak hours.
Anyway, I'm very interested in what others on the list may have experienced in this area and what solutions you came up with. Thanks!
Don.
--
PS - Here are some stats I got yesterday in 5 minute samples:
now | inserts | updates | deletes | live_tuples | dead_tuples
-------------------------------+---------+------------+---------+-------------+-------------
2021-02-10 17:01:30.625908+00 | 1058911 | 3076183381 | 0 | 4950291 | 10479817
2021-02-10 17:06:30.940216+00 | 1058956 | 3076277221 | 0 | 4950336 | 10543123
2021-02-10 17:11:31.246943+00 | 1058995 | 3076370779 | 0 | 4950375 | 10604466
2021-02-10 17:16:31.556107+00 | 1059038 | 3076465333 | 0 | 4950418 | 10666903
2021-02-10 17:21:31.872876+00 | 1059084 | 3076558998 | 0 | 4950464 | 10729782
2021-02-10 17:26:32.184989+00 | 1059118 | 3076652179 | 0 | 4950498 | 10790894
2021-02-10 17:31:32.504895+00 | 1059158 | 3076744046 | 0 | 4950538 | 10851880
2021-02-10 17:36:32.812956+00 | 1059199 | 3076837208 | 0 | 4950579 | 10913580
2021-02-10 17:41:33.126065+00 | 1059238 | 3076928165 | 0 | 4950618 | 10973362
2021-02-10 17:46:33.439005+00 | 1059287 | 3077017101 | 0 | 4950667 | 11032461
2021-02-10 17:51:33.749222+00 | 1059327 | 3077109811 | 0 | 4950707 | 11092947
2021-02-10 17:56:34.064872+00 | 1059371 | 3077199779 | 0 | 4950751 | 11151997
2021-02-10 18:01:34.376887+00 | 1059414 | 3077287739 | 2 | 4950794 | 11209508
Don Seiler
www.seiler.us
On Thu, Feb 11, 2021 at 11:49 AM Peter Geoghegan <pg@bowt.ie> wrote:
On Thu, Feb 11, 2021 at 9:13 AM Don Seiler <don@seiler.us> wrote:
> My understanding is that these manual VACUUM ANALYZE jobs are not freezing rows that regular autovacuuming would otherwise be doing, which leads up to the big anti-wraparound job.
They will freeze rows, but not aggressively. The antiwraparound vacuum
might block on acquiring buffer pins, low level stuff like that.
Perhaps you should change the vacuum_index_cleanup reloption to 'off'
for the table, but make the scripted overnight vacuums directly
specify INDEX_CLEANUP=on. That way index cleanup would still be
performed for the vacuums that run overnight, though not for the
antiwraparound vacuums, where the overhead may be a real issue.
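Concretely, Peter's suggestion might look like this (table name hypothetical; the vacuum_index_cleanup reloption and the INDEX_CLEANUP vacuum option both require PG 12+):

```sql
-- Skip index vacuuming by default, so an anti-wraparound autovacuum
-- doesn't pay the cost of scanning the table's indexes.
ALTER TABLE busy_table SET (vacuum_index_cleanup = off);

-- The nightly cron job overrides that explicitly, so index cleanup
-- still happens once a day during off-peak hours.
VACUUM (INDEX_CLEANUP ON, ANALYZE) busy_table;
```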
Thanks for the response, Peter. This table *does* have 14 indexes on it as well, including one GIN index (the rest are btree; some are partial indexes). I've had a separate task on the back burner to try to identify any redundant ones.
In the scenario you describe, would we re-enable routine autovacuuming? I'm assuming so, but wanted to make sure.
Cheers,
Don.
Don Seiler
www.seiler.us
On Thu, Feb 11, 2021 at 1:21 PM Peter Geoghegan <pg@bowt.ie> wrote:
On Thu, Feb 11, 2021 at 11:06 AM Don Seiler <don@seiler.us> wrote:
> Thanks for the response, Peter. This table *does* have 14 indexes on it as well, including on GIN index (rest are btree, some are partial indexes). I've had a separate task on the back burner to try to identify any redundant ones.
>
> In the scenario you describe, would we re-enable the routine autovacuuming? I'm assuming so but wanted to make it clear.
I'm not sure that you should re-enable av, actually -- you should at
least be careful with combining it with vacuum_index_cleanup=off. The
problem with the vacuum_index_cleanup table storage param that
controls this behavior is that it will apply generally -- unless you
override it using the VACUUM option each time. I strongly doubt that
it could ever make sense to completely avoid index vacuuming forever
here, so you certainly don't want to let that happen. The
vacuum_index_cleanup table param makes that extreme approach a
possibility, at least on Postgres 12+, but it's probably only
something that makes sense with an append-only table.
It might well not have made sense to disable AV here (it's often not a
good idea I find), though running VACUUM at night time probably was a
good idea. But vacuum_index_cleanup doesn't have granular options
about when and how skipping indexes applies as a matter of policy,
which makes it a bit tricky.
Right. I was thinking we'd still do the nightly manual vacuum, adding the INDEX_CLEANUP option after disabling the attribute at the table level. But I was wondering whether we should still consider enabling autovacuum on that table, to hopefully lessen the work needed by the wraparound-prevention aggressive autovacuum.
Don.
Don Seiler
www.seiler.us
On Thu, Feb 11, 2021 at 1:32 PM Peter Geoghegan <pg@bowt.ie> wrote:
On Thu, Feb 11, 2021 at 11:23 AM Don Seiler <don@seiler.us> wrote:
> Right. I was thinking we'd still do the nightly manual vacuum, adding in the INDEX_CLEANUP option after disabling the attribute at the table level. But was wondering if we should still consider enabling autovacuum on that table to hopefully lessen the work needed by the wraparound-prevention aggressive autovac.
Regular autovacuum (any VACUUM) will freeze tuples, so that certainly
makes sense to me. Sometimes it can seem like it's skipping the work
done in an anti-wraparound vacuum entirely, because only the latter
insists on doing cleanup of every heap page, if necessary by waiting
to get a buffer pin -- and that can be harder during a busy period.
And also because the visibility bits in the VM can allow a regular
vacuum to skip a lot more work than an anti-wraparound vacuum, which
can only skip pages with the all-frozen bit set.
An anti-wraparound vacuum is actually less different from a regular
autovacuum than most users intuit. Though I get why it seems that way
-- the practical difference can be really noticeable in production,
even though the theoretical differences seem rather small to me.
-- Peter Geoghegan
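To see how much of the table is already skippable, the pg_visibility contrib extension can summarize the visibility-map bits Peter mentions (the extension must be installed; table name hypothetical):

```sql
CREATE EXTENSION IF NOT EXISTS pg_visibility;

-- Counts of heap pages marked all-visible and all-frozen in the
-- visibility map. Pages that are already all-frozen can be skipped
-- even by an aggressive (anti-wraparound) vacuum.
SELECT * FROM pg_visibility_map_summary('busy_table');
```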
Is there anything I could/should do to get my nightly manual vacuum jobs to do more freezing? It doesn't look like I can set vacuum_freeze_min_age to a per-table value; there is only the autovacuum_freeze_min_age reloption, which the name suggests applies only to autovacuum. Am I wrong in thinking that regular autovacuuming will do proactive freezing that my nightly vacuuming doesn't?
Would a nightly "vacuum freeze" be overkill/unreasonable as an alternative? I know it would be a lot more I/O but at least we'd limit it to off-peak hours.
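For what it's worth, vacuum_freeze_min_age is a regular GUC, so the nightly job's session could lower it rather than jumping straight to FREEZE. A sketch of both approaches (table name and the 1M value are hypothetical):

```sql
-- Option 1: freeze tuples older than 1M transactions during the
-- nightly vacuum, instead of the default 50M, while still skipping
-- pages the visibility map says need no work.
SET vacuum_freeze_min_age = 1000000;
VACUUM (ANALYZE) busy_table;

-- Option 2: the heavier hammer -- freeze all eligible tuples,
-- scanning every page not already marked all-frozen.
VACUUM (FREEZE, ANALYZE) busy_table;
```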
Don.
Don Seiler
www.seiler.us