Re: [PATCH] Speed up of vac_update_datfrozenxid.
| От | Heikki Linnakangas | 
|---|---|
| Тема | Re: [PATCH] Speed up of vac_update_datfrozenxid. | 
| Дата | |
| Msg-id | 8ea10508-f353-43e2-883e-f41d961d8a44@iki.fi обсуждение исходный текст  | 
		
| Ответ на | [PATCH] Speed up of vac_update_datfrozenxid. (Rustam Khamidullin <rstm.khamidullin@gmail.com>) | 
| Список | pgsql-hackers | 
On 15/07/2025 14:02, Rustam Khamidullin wrote: > Hi hackers, > > While testing behavior of 'vacuum' functionality under high load > we've noticed that the exclusive lock with 'LOCKTAG_DATABASE_FROZEN_ID' > tag could be held for a prolonged time, thus causing contention with > other sessions. > This lock is obtained by the 'vac_update_datfrozenxid' function at the > end of vacuum > process to update value of 'datfrozenxid' in the 'pg_database' relation. > This function currently performs two operations under the > mentioned lock: first scanning the 'pg_class' to find minimal > 'relfrozenxid' and 'relminmxid' and then update corresponding values > in the 'pg_database' relation. Update of the 'pg_database' table > includes locking the required tuple and then deciding whether row > need to be updated or not. While looking at the > 'vac_update_datfrozenxid' content, we've noticed following comment: > > /* > * Fetch a copy of the tuple to scribble on. We could check the syscache > * tuple first. If that concluded !dirty, we'd avoid waiting on > * concurrent heap_update() and would avoid exclusive-locking the buffer. > * For now, don't optimize that. > */ > > The optimization described in this comment (i.e. scanning and > locking tuple only if we really intend to update it) seems to be a > valid one, so I've created a patch to test it (attached to this > message). It should be applicable on top of current 'master' branch. > The code now tries to obtain data from the SysCache first and checks > whether it's necessary to call 'systable_inplace_update_begin()' to > actually update xid values. > > To test the effect of the patch, a synthetic test was written using > bpftrace. > Bpftrace script measures the execution time of 'vac_update_datfrozenxid'. > The script is attached to the email. > > The test scenario: > > 1. Create an empty database cluster > 2. Set autovacuum_naptime = '1s' (so that 'vac_update_datfrozenxid' is > called more often) > 3. Run the bpftrace script (it outputs the function execution time in > ns to stdout) > > It is not necessary to apply the load. The autovacuum worker will call > 'vac_update_datfrozenxid' regardless. > As a result, execution time of 'vac_update_datfrozenxid()' with the > patch is 30% faster > No patch Patch Speed up > Mean: 73058 ns 51444 ns 30% > Median: 71846 ns 48650 ns 33% > > It is worth noting that due to the fact thatbefore updating the > 'datfrozenxid',a full 'pg_class' scan is performed, the performance > improvementbecomes less significant as the number of > relationsincreases(more time will be spent on sequentialscan of > 'pg_class').The previous measurement was performedon an empty > clusterwith nouser relations.To test the patch in a more realistic case, > I generated 1000 tables in the database and repeated the measurement. No > patch Patch Speed up > Mean:97113 ns 73839 ns 24% Median:98328 ns 63541 ns 29% As we can see, > the speed gain has decreased, but it is still significant. How significant is that speedup in the grand scheme of things? What fraction is vac_update_datfrozenxid of the overall work that autovacuum does? Did this patch help with the lock contention that you mentioned at the top? - Heikki
В списке pgsql-hackers по дате отправления: