Re: vacuumdb --jobs deadlock: how to avoid pg_catalog ?

Поиск
Список
Период
Сортировка
От Adrian Klaver
Тема Re: vacuumdb --jobs deadlock: how to avoid pg_catalog ?
Дата
Msg-id 0ca712da-c661-d0e5-073a-557aa3a1aadf@aklaver.com
обсуждение исходный текст
Ответ на vacuumdb --jobs deadlock: how to avoid pg_catalog ?  (Eduard Rozenberg <edrozenberg@gmail.com>)
Список pgsql-general
On 5/14/20 12:03 PM, Eduard Rozenberg wrote:
> @Adrian thanks again.
> 
> I read the postgres docs the same way - that previously used space is marked as available and therefore no need for
vacuumfull. Previously used = now available space, which gets re-used, in theory.
 
> 
> And yet this same DB is growing at 30-50 GB weekly, despite the fact that 2.4 TB of previously used space is clearly
available("clearly available" as proven by vacuum full shrinking the DB space usage by 2.4 TB). I did verify
postgresql.confhas always been properly configured re: autovacuum:  'autovacuum = on'and 'track_counts = on'
 

Well if I'm counting zeros right 50GB on 4.4TB database is ~1.14%. Does 
that sound right for new data being added?


One place to look to see how aggressively the autovacuum  is being done 
here:

https://www.postgresql.org/docs/12/monitoring-stats.html#PG-STAT-ALL-TABLES-VIEW

What you find there may mean tweaking the values as explained here:

https://www.postgresql.org/docs/12/routine-vacuuming.html#AUTOVACUUM

Could be there are just a few tables that account for most of the churn 
and a manual VACUUM on them is needed. Say if there are regularly 
scheduled large UPDATEs to tables, incorporate a VACUUM after.

> 
> I'm not planning on running VACUUM FULL regularly, just "this one time". And I was trying to to parallelize VACUUM
FULLand minimize downtime of the production DB caused by table locking. And then I found the option of using "vacuumdb
--jobs"which sounded like the perfect solution except for "well you can't actually use --jobs because you'll run into a
deadlockand everybody knows that and nobody has a (good) solution for it" :).
 
> 
> --Ed
> 
>> On May 14, 2020, at 11:46, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>>
>> On 5/14/20 11:27 AM, Eduard Rozenberg wrote:
>>> @Adrian thanks.
>>> I did a test vacuum full the "normal" non-parallel way (VACUUM FULL sql) previously on a test db copy and saw the
DBsize (postgres 9.6) shrink from 6.8 TB to 4.4 TB of actual, real disk space usage ($ df -d 1).
 
>>> I don't know the reason so much space was "locked up" (other than there is a lot of data "churn" from daily
updates).But I definitely do need to do the vac full on the production db to get down to the smaller size - cannot
affordthe 2.4 TB of "wasted" space on an ongoing basis.
 
>>
>> It may not be wasted space. A regular VACUUM marks space within a table available for reuse(and not returned to OS)
whenit removes unneeded tuples. It then fills that space up with new tuples, roughly speaking. So if the vacuuming is
workingproperly you will reach a steady state where space is reused and the database on disk size grows slowly as
reusablespace is occupied. I would monitor the database size on a regular basis. My guess is that the VACUUM FULL is
droppingthe OS used space and then it fills up again as the database does those updates.
 
>>
>>> Based on your email it sounds like the vacuuming of those pg_catalog.* tables is done regardless, as a normal part
ofdoing vac full on my own database.
 
>>> Unfortunately I still don't see an ideal way to run vacuum full in parallel via vacuumdb without running into the
expectedand documented deadlock. Only method I'm aware of is to list each table individually with "-t table1 -t
table2..."to "vacuum db --jobs" which is not pleasant and not exceedingly beautiful.
 
>>> Thanks.
>>
>>
>> -- 
>> Adrian Klaver
>> adrian.klaver@aklaver.com
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com



В списке pgsql-general по дате отправления:

Предыдущее
От: Eduard Rozenberg
Дата:
Сообщение: Re: vacuumdb --jobs deadlock: how to avoid pg_catalog ?
Следующее
От: ktm@rice.edu
Дата:
Сообщение: Re: surprisingly slow creation of gist index used in excludeconstraint