[Proposal] Adding callback support for custom statistics kinds
| From | Sami Imseih |
|---|---|
| Subject | [Proposal] Adding callback support for custom statistics kinds |
| Date | |
| Msg-id | CAA5RZ0s9SDOu+Z6veoJCHWk+kDeTktAtC-KY9fQ9Z6BJdDUirQ@mail.gmail.com |
| Responses | Re: [Proposal] Adding callback support for custom statistics kinds |
| List | pgsql-hackers |
Hi,

I'd like to propose $SUBJECT to serialize additional per-entry data beyond the standard statistics entries. Currently, custom statistics kinds can store their standard entry data in the main "pgstat.stat" file, but there is no mechanism for extensions to persist extra data stored in the entry. A common use case is an extension that registers a custom kind and, besides standard counters, needs to track variable-length data stored in a dsa_pointer.

This proposal adds optional "to_serialized_extra" and "from_serialized_extra" callbacks to "PgStat_KindInfo" that allow custom kinds to write and read extra data in separate files (pgstat.<kind>.stat). The callbacks give extensions direct access to the file pointer so they can read and write data in any format, while the core "pgstat" infrastructure manages opening, closing, renaming, and cleanup, just as it does with "pgstat.stat".

A concrete use case is pg_stat_statements. If it were to use custom stats kinds to track statement counters, it could also track query text stored in DSA. The callbacks allow saving the query text referenced by the dsa_pointer and restoring it after a clean shutdown. Since DSA (and more specifically DSM) cannot be attached by the postmaster, an extension cannot use "on_shmem_exit" or "shmem_startup_hook" to serialize or restore this data. This is why pgstat handles serialization during checkpointer shutdown and startup, allowing a single backend to manage it safely.

I considered adding hooks to the existing pgstat code paths (pgstat_before_server_shutdown, pgstat_discard_stats, and pgstat_restore_stats), but that felt too unrestricted. Using per-kind callbacks provides more control. There are already "to_serialized_name" and "from_serialized_name" callbacks used to store and read entries by "name" instead of "PgStat_HashKey", currently used by replication slot stats. Those remain unchanged, as they serve a separate purpose.

Other design points:

1. Filenames use "pgstat.<kind>.stat" based on the numeric kind ID. This avoids requiring extensions to provide names and prevents issues with spaces or special characters.

2. Both callbacks must be registered together. Serializing without deserializing would leave orphaned files behind, and I cannot think of a reason to allow this.

3. "write_chunk", "read_chunk", "write_chunk_s", and "read_chunk_s" are renamed to "pgstat_write_chunk", etc., and moved to "pgstat_internal.h" so extensions can use them without re-implementing these functions.

4. These callbacks are valid only for custom, variable-numbered statistics kinds. Custom fixed kinds may not benefit, but could be considered in the future.

Attached 0001 is the proposed change, still in POC form. The second patch contains tests in "injection_points" to demonstrate this proposal, and is not necessarily intended for commit.

Looking forward to your feedback!

--
Sami Imseih
Amazon Web Services (AWS)
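
For readers who want a feel for the mechanics described above, here is a minimal standalone sketch in plain C. It does not use the PostgreSQL headers or the attached patch: the struct "ExtraCallbacks", the callback signatures, and the helpers "extra_stats_filename", "callbacks_valid", "write_text_chunk", and "read_text_chunk" are all hypothetical names made up for illustration, and the length-prefixed layout is just one format an extension might choose. It models three ideas from the proposal: a filename derived from the numeric kind ID, the rule that both callbacks are registered together, and writing and reading variable-length text through a file pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Arbitrary sanity cap for this sketch; not taken from the proposal. */
#define MAX_TEXT_LEN (1024 * 1024)

/* Hypothetical stand-in for the two optional per-kind callbacks. */
typedef struct ExtraCallbacks
{
	bool		(*to_serialized_extra) (FILE *fp, const void *entry);
	bool		(*from_serialized_extra) (FILE *fp, void *entry);
} ExtraCallbacks;

/* Design point 2: either both callbacks are set, or neither is. */
static bool
callbacks_valid(const ExtraCallbacks *cb)
{
	return (cb->to_serialized_extra == NULL) ==
		(cb->from_serialized_extra == NULL);
}

/* Design point 1: build "pgstat.<kind>.stat" from the numeric kind ID. */
static void
extra_stats_filename(char *buf, size_t buflen, unsigned int kind)
{
	snprintf(buf, buflen, "pgstat.%u.stat", kind);
}

/* Write a length-prefixed string; returns false on a short write. */
static bool
write_text_chunk(FILE *fp, const char *text)
{
	size_t		len = strlen(text);

	assert(fp != NULL);
	return fwrite(&len, sizeof(len), 1, fp) == 1 &&
		fwrite(text, 1, len, fp) == len;
}

/*
 * Read a length-prefixed string into a malloc'd, NUL-terminated buffer.
 * Returns NULL on a short read, an implausible length, or out of memory.
 */
static char *
read_text_chunk(FILE *fp)
{
	size_t		len;
	char	   *text;

	if (fread(&len, sizeof(len), 1, fp) != 1 || len > MAX_TEXT_LEN)
		return NULL;

	text = malloc(len + 1);
	if (text == NULL)
		return NULL;

	if (fread(text, 1, len, fp) != len)
	{
		free(text);
		return NULL;
	}
	text[len] = '\0';
	return text;
}
```

In an extension built along the lines of the proposal, the write side would presumably run from the "to_serialized_extra" callback and the read side from "from_serialized_extra", with the text copied out of and back into DSA, and with the renamed "pgstat_write_chunk" family of helpers used in place of the raw fwrite/fread calls shown here.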