Re: Vacuum statistics
| От | Alena Rybakina |
|---|---|
| Тема | Re: Vacuum statistics |
| Дата | |
| Msg-id | 9d98562b-fa97-476b-9315-6fa2d736ab6b@yandex.ru обсуждение |
| Ответ на | Re: Vacuum statistics (Alena Rybakina <lena.ribackina@yandex.ru>) |
| Список | pgsql-hackers |
On 28.04.2026 05:16, Alena Rybakina wrote:
I noticed CI's complaints during extension installation and fixed it.Hi, all!
I have updated the core patch that implements the machinery for collecting extended vacuum statistics (I didn't touch the first patch that is ready for commit, only patches that are related to extension), and rebased the ext_vacuum_statistics extension on top of it. The split is intentional: the core only gathers metrics and hands them out, while the actual storage and SQL-level access to the statistics live entirely in the extension. If the extension is not loaded, the overhead is essentially zero - we only fill a small struct on the stack and do a NULL check on the hook.
What was updated in the core
The core gains the machinery and the hook through which the extension receives metrics after each vacuum.
The hook. A new hook has been added in pgstat - set_report_vacuum_hook. It is fired once per vacuumed table and once per vacuumed index, plus when forming the per-database aggregate. The extension registers its handler in _PG_init and by default the hook is NULL, so without an extension the core behaves exactly as before.
The set of statistics is the same as before. Common to tables, indexes and the database - hits and misses in shared buffers, number of dirtied and written pages, WAL volume, buffer read and write times, sleep time spent in delay points, total wall-clock vacuum time (including I/O and lock waits), counter of emergency anti-wraparound vacuums, number of interrupts and removed tuples. Tables additionally report frozen tuples, pages marked all-frozen / all-visible in the visibility map, number of scanned and removed pages, number of index passes, etc. Indexes report freed pages.
The least obvious part of the implementation is subtracting index statistics from the table statistics. This is the bit worth highlighting. The thing is that indexes are vacuumed before the heap, and the buffer and WAL statistics that we capture at the heap level by the end of the heap vacuum already include everything that was spent on the indexes. If we simply expose the diff of pgBufferUsage/pgWalUsage between start and end, the table ends up with double-counted pages/WAL: once in its own report, and a second time inside the reports of its indexes. This is especially noticeable with parallel index vacuum: workers accumulate their usage in the leader only after they finish, so without subtraction the heap report would receive the combined cost of all workers as a "bonus".
To handle this, as each index finishes vacuuming, its counters are accumulated into the state of the current operation, and at the moment the heap report is built these sums are subtracted out. As a result, the extension receives clean numbers: "this is what was actually spent on the table itself", and separately "this is what was actually spent on each index". The behaviour is idempotent for both serial and parallel vacuum.The ext_vacuum_statistics extension
The extension registers the hook handler and stores the received data through the pgstat custom statistics infrastructure. That is, vacuum counters are kept not in the extension's own files, but together with the regular cumulative statistics - they survive a restart and are reset together with pg_stat_reset_*. Access is provided through three views: one for tables, one for indexes, and one with the per-database aggregate.
Filtering
This is where the main flexibility lives - the extension does not force "collect everything", but lets you choose both what to track and which metrics to keep.
By object type. You can limit collection to databases only (without per-table detail), to tables only, or collect both. Among tables, you can additionally filter system / user / all.
By an explicit list. An alternative to "by type" is a whitelist: you turn the corresponding mode on, and the extension starts collecting statistics only for the databases and tables that were explicitly registered via add_track_database / add_track_relation (with matching remove_* for removal). When the lists are off, the type filter is in effect; when they are on, only the list applies. This is convenient when you are interested in monitoring specific "hot" tables and do not want to spend memory on statistics for everything else.
This list is persisted to disk, and there is one more non-trivial part here. List changes are concurrent - multiple sessions may call add_track_* simultaneously, plus there is an object-access hook that cleans the entry on DROP. To avoid ending up with a torn file, access to the list is serialized via a dedicated LWLock tranche (requested from a shmem_request_hook), and the file itself is written atomically: first into a temporary file, then fflush + pg_fsync + durable_rename. All I/O return codes are checked; on error the temporary file is removed and the real one is left untouched; PG_TRY/PG_CATCH guarantees cleanup on ereport(ERROR). Reading the list takes the same lock in shared mode, so a concurrent write cannot tear the load.
By metric category. There is also a GUC that takes a list and turns on the categories of interest - buffers, WAL, general counters, timings (or all). Unwanted categories are simply skipped on the hook handler side and never make it into the pgstat entry, which reduces the overhead of the handler itself. This is useful when, for example, only timings are needed - in that case the extension does not waste time copying the buffer and WAL fields.
Privileges. The add_track_* / remove_track_* functions require superuser or pg_read_all_stats. At the SQL level, EXECUTE is revoked from PUBLIC and granted only to pg_read_all_stats, so a regular user has no access to mutating the list. The views are unrestricted, like regular statistics.What is in the patches
0002-Machinery-for-grabbing-extended-vacuum-statistics.patch - the machinery in the core plus the hook.
0003-ext_vacuum_statistics-...patch - the extension itself, filtering, views, tests.
-- ----------- Best regards, Alena Rybakina Yandex Cloud
Вложения
В списке pgsql-hackers по дате отправления: