Re: Berserk Autovacuum (let's save next Mandrill)

From Justin Pryzby
Subject Re: Berserk Autovacuum (let's save next Mandrill)
Date
Msg-id 20200316210757.GO26184@telsasoft.com
In reply to Re: Berserk Autovacuum (let's save next Mandrill)  (Laurenz Albe <laurenz.albe@cybertec.at>)
Responses Re: Berserk Autovacuum (let's save next Mandrill)  (Laurenz Albe <laurenz.albe@cybertec.at>)
List pgsql-hackers
On Mon, Mar 16, 2020 at 08:49:43PM +0100, Laurenz Albe wrote:
> On Mon, 2020-03-16 at 07:47 -0500, Justin Pryzby wrote:
> > It seems to me that the easy thing to do is to implement this initially without
> > FREEZE (which is controlled by vacuum_freeze_table_age), and defer until
> > July/v14 further discussion and implementation of another GUC/relopt for
> > autovacuum freezing to be controlled by insert thresholds (or ratio).
> 
> Freezing tuples is the point of this patch.
> As I have said, if you have a table where you insert many rows in few
> transactions, you would trigger an autovacuum that then ends up doing nothing
> because none of the rows have reached vacuum_freeze_table_age yet.
> 
> Then some time later you will get a really large vacuum run.

Best practice is to vacuum following a bulk load.  I don't think this patch is
going to change that.  Bulk-loaded tuples will be autovacuumed, which is nice,
but I don't think it'll be ideal if large bulk loads trigger an autovacuum with
cost delays, which (ISTM) will take even longer if it runs with FREEZE.
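
(For context, the usual post-load step looks something like the following;
"measurements" is just a placeholder table name, and FREEZE is the part that
avoids the later anti-wraparound work:)

    -- bulk load, then vacuum/freeze right away instead of waiting for autovacuum
    COPY measurements FROM '/path/to/data.csv' WITH (FORMAT csv);
    VACUUM (FREEZE, ANALYZE) measurements;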

If it's a bulk load, then I think it's okay to assume it was vacuumed, or
otherwise that it'll eventually be hit by autovac at some later date.

If it's not a "bulk load" but normal runtime activity, and the table continues
to receive inserts/deletes, then eventually it'll hit a vacuum threshold and
tuples can be frozen.
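
(As a sketch, using the stock defaults of autovacuum_vacuum_threshold = 50 and
autovacuum_vacuum_scale_factor = 0.2, and ignoring per-table reloptions, the
existing dead-tuple trigger can be eyeballed like this:)

    -- tables whose dead tuples already exceed threshold + scale_factor * reltuples
    SELECT s.relname, s.n_dead_tup, c.reltuples
    FROM pg_stat_user_tables s
    JOIN pg_class c ON c.oid = s.relid
    WHERE s.n_dead_tup > 50 + 0.2 * c.reltuples;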

If it receives a bunch of activity which then stops (like a partition of a
table of timeseries data), then maybe it doesn't hit a vacuum threshold until
wraparound vacuum.  I think in that case it's not catastrophic, since the
partition wasn't big enough to hit any threshold.  If, every day, autovacuum
kicks in and does a wraparound vacuum on a table with data from (say) 100 days
ago, I think that's reasonable.
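
(To see which tables are next in line for that, something like the following
works; autovacuum_freeze_max_age defaults to 200 million:)

    -- tables ordered by how close they are to a forced anti-wraparound vacuum
    SELECT relname, age(relfrozenxid) AS xid_age
    FROM pg_class
    WHERE relkind = 'r'
    ORDER BY age(relfrozenxid) DESC
    LIMIT 10;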

One case which would suck is if the insert_threshold were 1e6, and you restore
a DB with 1000 tables of historic data (which are no longer being inserted
into) which have 9e5 rows each (just below the threshold).  Then autovacuum
will hit them all at once.  The solution to that is a manual vacuum after the
bulk load, same as today.  As a practical matter, some of the tables are likely
to hit the autovacuum insert threshold, and some are likely to be pruned (or
updated) before wraparound vacuum, so the patch usually does improve that case.
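
(Sketch of that workaround: a single database-wide pass right after the
restore, or equivalently "vacuumdb --freeze --jobs=N mydb" from the shell,
where "mydb" stands in for whatever database you restored into:)

    -- one sweep after the restore so nothing is left sitting just under the threshold
    VACUUM (FREEZE, ANALYZE);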

-- 
Justin


