Обсуждение: Pluggable cumulative statistics
Hi all, While looking at ways to make pg_stat_statements more scalable and dynamically manageable (no more PGC_POSTMASTER for the max number of entries), which came out as using a dshash, Andres has mentioned me off-list (on twitter/X) that we'd better plug in it to the shmem pgstats facility, moving the text file that holds the query strings into memory (with size restrictions for the query strings, for example). This has challenges on its own (query ID is 8 bytes incompatible with the dboid/objid hash key used by pgstats, discard of entries when maximum). Anyway, this won't happen if we don't do one of these two things: 1) Move pg_stat_statements into core, adapting pgstats for its requirements. 2) Make the shmem pgstats pluggable so as it is possible for extensions to register their own stats kinds. 1) may have its advantages, still I am not sure if we want to do that. And 2) is actually something that can be used for more things than just pg_stat_statements, because people love extensions and statistics (spoiler: I do). The idea is simple: any extension defining a custom stats kind would be able to rely on all the in-core facilities we use for the existing in-core kinds: a) Snapshotting and caching of the stats, via stats_fetch_consistency. b) Native handling and persistency of the custom stats data. c) Reuse stats after a crash, pointing at this comment in xlog.c: * TODO: With a bit of extra work we could just start with a pgstat file * associated with the checkpoint redo location we're starting from. This means that we always remove the stats after a crash. That's something I have a patch for, not for this thread, but the idea is that custom stats would also benefit from this property. The implementation is based on the following ideas: * A structure in shared memory that tracks the IDs of the custom stats kinds with their names. These are incremented starting from PGSTAT_KIND_LAST. * Processes use a local array cache that keeps tracks of all the custom PgStat_KindInfos, indexed by (kind_id - PGSTAT_KIND_LAST). * The kind IDs may change across restarts, meaning that any stats data associated to a custom kind is stored with the *name* of the custom stats kind. Depending on the discussion happening here, I'd be open to use the same concept as custom RMGRs, where custom kind IDs are "reserved", fixed in time, and tracked in the Postgres wiki. It is cheaper to store the stats this way, as well, while managing conflicts across extensions available in the community ecosystem. * Custom stats can be added without shared_preload_libraries, loading them from a shmem startup hook with shared_preload_libraries is also possible. * The shmem pgstats defines two types of statistics: the ones in a dshash and what's called a "fixed" type like for archiver, WAL, etc. pointing to areas of shared memory. All the fixed types are linked to structures for snapshotting and shmem tracking. As a matter of simplification and because I could not really see a case where I'd want to plug in a fixed stats kind, the patch forbids this case. This case could be allowed, but I'd rather refactor the structures of pgstat_internal.h so as we don't have traces of the "fixed" stats structures in so many areas. * Making custom stats data persistent is an interesting problem, and there are a couple of approaches I've considered: ** Allow custom kinds to define callbacks to read and write data from a source they'd want, like their own file through a fd. This has the disadvantage to remove the benefit of c) above. ** Store everything in the existing stats file, adding one type of entry like 'S' and 'N' with a "custom" type, where the *name* of the custom stats kind is stored instead of its ID computed from shared memory. A mix of both? The patch attached has used the second approach. If the process reading/writing the stats does not know about the custom stats data, the data is discarded. * pgstat.c has a big array called pgstat_kind_infos to define all the existing stats kinds. Perhaps the code should be refactored to use this new API? That would make the code more consistent with what we do for resource managers, for one, while moving the KindInfos into their own file. With that in mind, storing the kind ID in KindInfos feels intuitive. While thinking about a use case to show what these APIs can do, I have decided to add statistics to the existing module injection_points rather than implement a new test module, gathering data about them and have tests that could use this data (like tracking the number of times a point is taken). This is simple enough that it can be used as a template, as well. There is a TAP test checking the data persistence across restarts, so I did not mess up this part much, hopefully. Please find attached a patch set implementing these ideas: - 0001 switches PgStat_Kind from an enum to a uint32, for the internal counters. - 0002 is some cleanup for the hardcoded S, N and E in pgstat.c. - 0003 introduces the backend-side APIs, with the shmem table counter and the routine to give code paths a way to register their own stats kind (see pgstat_add_kind). - 0004 implements an example of how to use these APIs, see injection_stats.c in src/test/modules/injection_points/. - 0005 adds some docs. - 0006 is an idea of how to make this custom stats data persistent. This will hopefully spark a discussion, and I was looking for answers regarding these questions: - Should the pgstat_kind_infos array in pgstat.c be refactored to use something similar to pgstat_add_kind? - How should the persistence of the custom stats be achieved? Callbacks to give custom stats kinds a way to write/read their data, push everything into a single file, or support both? - Should this do like custom RMGRs and assign to each stats kinds ID that are set in stone rather than dynamic ones? Thanks for reading. -- Michael
Вложения
- 0001-Switch-PgStat_Kind-from-enum-to-uint32.patch
- 0002-Replace-hardcoded-identifiers-in-pgstats-file-by-var.patch
- 0003-Introduce-pluggable-APIs-for-Cumulative-Statistics.patch
- 0004-injection_points-Add-statistics-for-custom-points.patch
- 0005-doc-Add-section-for-Custom-Cumulative-Statistics-API.patch
- 0006-Extend-custom-cumulative-stats-to-be-persistent.patch
- signature.asc
On Thu, Jun 13, 2024 at 04:59:50PM +0900, Michael Paquier wrote: > - How should the persistence of the custom stats be achieved? > Callbacks to give custom stats kinds a way to write/read their data, > push everything into a single file, or support both? > - Should this do like custom RMGRs and assign to each stats kinds ID > that are set in stone rather than dynamic ones? These two questions have been itching me in terms of how it would work for extension developers, after noticing that custom RMGRs are used more than I thought: https://wiki.postgresql.org/wiki/CustomWALResourceManagers The result is proving to be nicer, shorter by 300 lines in total and much simpler when it comes to think about the way stats are flushed because it is possible to achieve the same result as the first patch set without manipulating any of the code paths doing the read and write of the pgstats file. In terms of implementation, pgstat.c's KindInfo data is divided into two parts, for efficiency: - The exiting in-core stats with designated initializers, renamed as built-in stats kinds. - The custom stats kinds are saved in TopMemoryContext, and can only be registered with shared_preload_libraries. The patch reserves a set of 128 harcoded slots for all the custom kinds making the lookups for the KindInfos quite cheap. Upon registration, a custom stats kind needs to assign a unique ID, with uniqueness on the names and IDs checked at registration. The backend code does ID -> information lookups in the hotter paths, meaning that the code only checks if an ID is built-in or custom, then redirects to the correct array where the information is stored. There is one code path that does a name -> information lookup for the undocumented SQL function pg_stat_have_stats() used in the tests, which is a bit less efficient now, but that does not strike me as an issue. modules/injection_points/ works as previously as a template to show how to use these APIs, with tests for the whole. With that in mind, the patch set is more pleasant to the eye, and the attached v2 consists of: - 0001 and 0002 are some cleanups, same as previously to prepare for the backend-side APIs. - 0003 adds the backend support to plug-in custom stats. - 0004 includes documentation. - 0005 is an example of how to use them, with a TAP test providing coverage. Note that the patch I've proposed to make stats persistent at checkpoint so as we don't discard everything after a crash is able to work with the custom stats proposed on this thread: https://commitfest.postgresql.org/48/5047/ -- Michael
Вложения
- v2-0004-doc-Add-section-for-Custom-Cumulative-Statistics-.patch
- v2-0002-Replace-hardcoded-identifiers-in-pgstats-file-by-.patch
- v2-0003-Introduce-pluggable-APIs-for-Cumulative-Statistic.patch
- v2-0001-Switch-PgStat_Kind-from-enum-to-uint32.patch
- v2-0005-injection_points-Add-statistics-for-custom-points.patch
- signature.asc
Hi, On Thu, Jun 13, 2024 at 04:59:50PM +0900, Michael Paquier wrote: > Hi all, > > 2) Make the shmem pgstats pluggable so as it is possible for extensions > to register their own stats kinds. Thanks for the patch! I like the idea of having custom stats (it has also been somehow mentioned in [1]). > 2) is actually something that can be used for more things than > just pg_stat_statements, because people love extensions and > statistics (spoiler: I do). +1 > * Making custom stats data persistent is an interesting problem, and > there are a couple of approaches I've considered: > ** Allow custom kinds to define callbacks to read and write data from > a source they'd want, like their own file through a fd. This has the > disadvantage to remove the benefit of c) above. > ** Store everything in the existing stats file, adding one type of > entry like 'S' and 'N' with a "custom" type, where the *name* of the > custom stats kind is stored instead of its ID computed from shared > memory. What about having 2 files? - One is the existing stats file - One "predefined" for all the custom stats (so what you've done minus the in-core stats). This one would not be configurable and the extensions will not need to know about it. Would that remove the benefit from c) that you mentioned up-thread? [1]: https://www.postgresql.org/message-id/20220818195124.c7ipzf6c5v7vxymc%40awork3.anarazel.de Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Hi, On Thu, Jun 20, 2024 at 09:46:30AM +0900, Michael Paquier wrote: > On Thu, Jun 13, 2024 at 04:59:50PM +0900, Michael Paquier wrote: > > - How should the persistence of the custom stats be achieved? > > Callbacks to give custom stats kinds a way to write/read their data, > > push everything into a single file, or support both? > > - Should this do like custom RMGRs and assign to each stats kinds ID > > that are set in stone rather than dynamic ones? > These two questions have been itching me in terms of how it would work > for extension developers, after noticing that custom RMGRs are used > more than I thought: > https://wiki.postgresql.org/wiki/CustomWALResourceManagers > > The result is proving to be nicer, shorter by 300 lines in total and > much simpler when it comes to think about the way stats are flushed > because it is possible to achieve the same result as the first patch > set without manipulating any of the code paths doing the read and > write of the pgstats file. I think it makes sense to follow the same "behavior" as the custom wal resource managers. That, indeed, looks much more simpler than v1. > In terms of implementation, pgstat.c's KindInfo data is divided into > two parts, for efficiency: > - The exiting in-core stats with designated initializers, renamed as > built-in stats kinds. > - The custom stats kinds are saved in TopMemoryContext, Agree that a backend lifetime memory area is fine for that purpose. > and can only > be registered with shared_preload_libraries. The patch reserves a set > of 128 harcoded slots for all the custom kinds making the lookups for > the KindInfos quite cheap. + MemoryContextAllocZero(TopMemoryContext, + sizeof(PgStat_KindInfo *) * PGSTAT_KIND_CUSTOM_SIZE); and that's only 8 * PGSTAT_KIND_CUSTOM_SIZE bytes in total. I had a quick look at the patches (have in mind to do more): > With that in mind, the patch set is more pleasant to the eye, and the > attached v2 consists of: > - 0001 and 0002 are some cleanups, same as previously to prepare for > the backend-side APIs. 0001 and 0002 look pretty straightforward at a quick look. > - 0003 adds the backend support to plug-in custom stats. 1 === It looks to me that there is a mix of "in core" and "built-in" to name the non custom stats. Maybe it's worth to just use one? As I can see (and as you said above) this is mainly inspired by the custom resource manager and 2 === and 3 === are probably copy/paste consequences. 2 === + if (pgstat_kind_custom_infos[idx] != NULL && + pgstat_kind_custom_infos[idx]->name != NULL) + ereport(ERROR, + (errmsg("failed to register custom cumulative statistics \"%s\" with ID %u", kind_info->name,kind), + errdetail("Custom resource manager \"%s\" already registered with the same ID.", + pgstat_kind_custom_infos[idx]->name))); s/Custom resource manager/Custom cumulative statistics/ 3 === + ereport(LOG, + (errmsg("registered custom resource manager \"%s\" with ID %u", + kind_info->name, kind))); s/custom resource manager/custom cumulative statistics/ > - 0004 includes documentation. Did not look yet. > - 0005 is an example of how to use them, with a TAP test providing > coverage. Did not look yet. As I said, I've in mind to do a more in depth review. I've noted the above while doing a quick read of the patches so thought it makes sense to share them now while at it. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On Thu, Jun 20, 2024 at 01:05:42PM +0000, Bertrand Drouvot wrote: > On Thu, Jun 13, 2024 at 04:59:50PM +0900, Michael Paquier wrote: >> * Making custom stats data persistent is an interesting problem, and >> there are a couple of approaches I've considered: >> ** Allow custom kinds to define callbacks to read and write data from >> a source they'd want, like their own file through a fd. This has the >> disadvantage to remove the benefit of c) above. >> ** Store everything in the existing stats file, adding one type of >> entry like 'S' and 'N' with a "custom" type, where the *name* of the >> custom stats kind is stored instead of its ID computed from shared >> memory. > > What about having 2 files? > > - One is the existing stats file > - One "predefined" for all the custom stats (so what you've done minus the > in-core stats). This one would not be configurable and the extensions will > not need to know about it. Another thing that can be done here is to add a few callbacks to control how an entry should be written out when the dshash is scanned or read when the dshash is populated depending on the KindInfo. That's not really complicated to do as the populate part could have a cleanup phase if an error is found. I just did not do it yet because this patch set is already covering a lot, just to get the basics in. > Would that remove the benefit from c) that you mentioned up-thread? Yes, that can be slightly annoying. Splitting the stats across multiple files would mean that each stats file would have to store the redo LSN. That's not really complicated to implement, but really easy to miss. Perhaps folks implementing their own stats kinds would be aware anyway because we are going to need a callback to initialize the file to write if we do that, and the redo LSN should be provided in input of it. Giving more control to extension developers here would be OK for me, especially since they could use their own format for their output file(s). -- Michael
Вложения
On Thu, Jun 20, 2024 at 02:27:14PM +0000, Bertrand Drouvot wrote: > On Thu, Jun 20, 2024 at 09:46:30AM +0900, Michael Paquier wrote: > I think it makes sense to follow the same "behavior" as the custom > wal resource managers. That, indeed, looks much more simpler than v1. Thanks for the feedback. >> and can only >> be registered with shared_preload_libraries. The patch reserves a set >> of 128 harcoded slots for all the custom kinds making the lookups for >> the KindInfos quite cheap. > > + MemoryContextAllocZero(TopMemoryContext, > + sizeof(PgStat_KindInfo *) * PGSTAT_KIND_CUSTOM_SIZE); > > and that's only 8 * PGSTAT_KIND_CUSTOM_SIZE bytes in total. Enlarging that does not worry me much. Just not too much. >> With that in mind, the patch set is more pleasant to the eye, and the >> attached v2 consists of: >> - 0001 and 0002 are some cleanups, same as previously to prepare for >> the backend-side APIs. > > 0001 and 0002 look pretty straightforward at a quick look. 0002 is quite independentn. Still, 0001 depends a bit on the rest. Anyway, the Kind is already 4 bytes and it cleans up some APIs that used int for the Kind, so enforcing signedness is just cleaner IMO. >> - 0003 adds the backend support to plug-in custom stats. > > 1 === > > It looks to me that there is a mix of "in core" and "built-in" to name the > non custom stats. Maybe it's worth to just use one? Right. Perhaps better to remove "in core" and stick to "builtin", as I've used the latter for the variables and such. > As I can see (and as you said above) this is mainly inspired by the custom > resource manager and 2 === and 3 === are probably copy/paste consequences. > > 2 === > > + if (pgstat_kind_custom_infos[idx] != NULL && > + pgstat_kind_custom_infos[idx]->name != NULL) > + ereport(ERROR, > + (errmsg("failed to register custom cumulative statistics \"%s\" with ID %u", kind_info->name,kind), > + errdetail("Custom resource manager \"%s\" already registered with the same ID.", > + pgstat_kind_custom_infos[idx]->name))); > > s/Custom resource manager/Custom cumulative statistics/ > > 3 === > > + ereport(LOG, > + (errmsg("registered custom resource manager \"%s\" with ID %u", > + kind_info->name, kind))); > > s/custom resource manager/custom cumulative statistics/ Oops. Will fix. -- Michael
Вложения
At Thu, 13 Jun 2024 16:59:50 +0900, Michael Paquier <michael@paquier.xyz> wrote in > * The kind IDs may change across restarts, meaning that any stats data > associated to a custom kind is stored with the *name* of the custom > stats kind. Depending on the discussion happening here, I'd be open > to use the same concept as custom RMGRs, where custom kind IDs are > "reserved", fixed in time, and tracked in the Postgres wiki. It is > cheaper to store the stats this way, as well, while managing conflicts > across extensions available in the community ecosystem. I prefer to avoid having a central database if possible. If we don't intend to move stats data alone out of a cluster for use in another one, can't we store the relationship between stats names and numeric IDs (or index numbers) in a separate file, which is loaded just before and synced just after extension preloading finishes? regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Fri, Jun 21, 2024 at 01:09:10PM +0900, Kyotaro Horiguchi wrote: > At Thu, 13 Jun 2024 16:59:50 +0900, Michael Paquier <michael@paquier.xyz> wrote in >> * The kind IDs may change across restarts, meaning that any stats data >> associated to a custom kind is stored with the *name* of the custom >> stats kind. Depending on the discussion happening here, I'd be open >> to use the same concept as custom RMGRs, where custom kind IDs are >> "reserved", fixed in time, and tracked in the Postgres wiki. It is >> cheaper to store the stats this way, as well, while managing conflicts >> across extensions available in the community ecosystem. > > I prefer to avoid having a central database if possible. I was thinking the same originally, but the experience with custom RMGRs has made me change my mind. There are more of these than I thought originally: https://wiki.postgresql.org/wiki/CustomWALResourceManagers > If we don't intend to move stats data alone out of a cluster for use > in another one, can't we store the relationship between stats names > and numeric IDs (or index numbers) in a separate file, which is loaded > just before and synced just after extension preloading finishes? Yeah, I've implemented a prototype that does exactly something like that with a restriction on the stats name to NAMEDATALEN, except that I've added the kind ID <-> kind name mapping at the beginning of the main stats file. At the end, it still felt weird and over-engineered to me, like the v1 prototype of upthread, because we finish with a strange mix when reloading the dshash where the builtin ID are handled with fixed values, with more code paths required when doing the serialize callback dance for stats kinds like replication slots, because the custom kinds need to update their hash keys to the new values based on the ID/name mapping stored at the beginning of the file itself. The equation complicates itself a bit more once you'd try to add more ways to write some stats kinds to other places, depending on what a custom kind wants to achieve. I can see the benefits of both approaches, still fixing the IDs in time leads to a lot of simplicity in this infra, which is very appealing on its own before tackling the next issues where I would rely on the proposed APIs. -- Michael