Re: Report bytes and transactions actually sent downtream
От | Ashutosh Bapat |
---|---|
Тема | Re: Report bytes and transactions actually sent downtream |
Дата | |
Msg-id | CAExHW5tfVHABuv1moL_shp7oPrWmg8ha7T8CqwZxiMrKror7iw@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: Report bytes and transactions actually sent downtream (Amit Kapila <amit.kapila16@gmail.com>) |
Список | pgsql-hackers |
Hi Amit, On Mon, Jul 14, 2025 at 3:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, Jul 14, 2025 at 10:55 AM Ashutosh Bapat > <ashutosh.bapat.oss@gmail.com> wrote: > > > > On Sun, Jul 13, 2025 at 4:34 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > I think we don't want to make it mandatory for plugins to implement > > > these stats, so instead of throwing ERROR, the view should show that > > > the plugin doesn't provide stats. How about having OutputPluginStats > > > similar to OutputPluginCallbacks and OutputPluginOptions members in > > > LogicalDecodingContext? It will have members like stats_available, > > > txns_sent or txns_skipped, txns_filtered, etc. > > > > Not making mandatory looks useful. I can try your suggestion. Rather > > than having stats_available as a member of OutputPluginStats, it's > > better to have a NULL value for the corresponding member in > > LogicalDecodingContext. We don't want an output plugin to reset > > stats_available once set. Will that work? > > > > We can try that. > > > > I am thinking it will > > > be better to provide this information in a separate view like > > > pg_stat_plugin_stats or something like that, here we can report > > > slot_name, plugin_name, then the other stats we want to implement part > > > of OutputPluginStats. > > > > As you have previously pointed out, the view should make it explicit > > that the new stats are maintained by the plugin and not core. I agree > > with that intention. However, already have three views > > pg_replication_slots (which has slot name and plugin name), then > > pg_replication_stats which is about stats maintained by a WAL sender > > or running replication and then pg_stat_replication_slots, which is > > about accumulated statistics for a replication through a given > > replication slot. It's already a bit hard to keep track of who's who > > when debugging an issue. Adding one more view will add to confusion. > > > > Instead of adding a new view how about > > a. name the columns as plugin_sent_txns, plugin_sent_bytes, > > plugin_filtered_change_bytes to make it clear that these columns are > > maintained by plugin > > b. report these NULL if stats_available = false OR OutputPluginStats > > is not set in LogicalDecodingContext > > c. Document that NULL value for these columns indicates that the > > plugin is not maintaining/reporting these stats > > d. adding plugin name to pg_stat_replication_slots, that will make it > > easy for users to know which plugin they should look at in case of > > dubious or unavailable stats > > > > Sounds reasonable. Here's the next patch which considers all the discussion so far. It adds four fields to pg_stat_replication_slots. - plugin - name of the output plugin - plugin_filtered_bytes - reports the amount of changes filtered out by the output plugin - plugin_sent_txns - the amount of transactions sent downstream by the output plugin - plugin_sent_bytes - the amount of data sent downstream by the outputplugin. There are some points up for a discussion: 1. pg_stat_reset_replication_slot() zeroes out the statistics entry by calling pgstat_reset() or pgstat_reset_of_kind() which don't know about the contents of the entry. So PgStat_StatReplSlotEntry::plugin_has_stats is set to false and plugin stats are reported as NULL, instead of zero, immediately after reset. This is the same case when the stats is queried immediately after the statistics is initialized and before any stats are reported. We could instead make it report zero, if we save the plugin_has_stats and restore it after reset. But doing that in pgstat_reset_of_kind() seems like an extra overhead + we will need to write a function to find all replication slot entries. Resetting the stats would be a rare event in practice. Trying to report 0 instead of NULL in that rare case doesn't seem to be worth the efforts and code. Given that the core code doesn't know whether a given plugin reports stats or not, I think this behaviour is appropriate as long as we document it. Please let me know if the documentation in the patch is clear enough. 2. There's also a bit of asymmetry in the way sent_bytes is handled. The code which actually sends the logical changes to the downstream is part of the core code but the format of the change and hence the number of bytes sent is decided by the plugin. It's a stat related to plugin but maintained by the core code. The patch implements it as a plugin stat (so the corresponding column has "plugin" prefix and is also reported as NULL upon reset etc.), but we may want to reconsider how to report and maintain it. 3. The names of new columns have the prefix "plugin_" but the internal variables tracking those don't for the sake of brevity. If you prefer to have the same prefix for the internal variables, I can change that. I think I have covered all the cases where filteredbytes should be incremented, but please let me know if I have missed any. -- Best Wishes, Ashutosh Bapat
Вложения
В списке pgsql-hackers по дате отправления: