Обсуждение: Add new option 'all' to pg_stat_reset_shared()
Hi, After 96f052613f3, we have below 6 types of parameter for pg_stat_reset_shared(). "archiver", "bgwriter", "checkpointer", "io", "recovery_prefetch", "wal" How about adding a new option 'all' to delete all targets above? I imagine there are cases where people want to initialize all of them at the same time in addition to initializing one at a time. -- Regards, -- Atsushi Torikoshi NTT DATA Group Corporation
At Mon, 30 Oct 2023 16:35:19 +0900, torikoshia <torikoshia@oss.nttdata.com> wrote in > Hi, > > After 96f052613f3, we have below 6 types of parameter for > pg_stat_reset_shared(). > > "archiver", "bgwriter", "checkpointer", "io", "recovery_prefetch", > "wal" > > How about adding a new option 'all' to delete all targets above? > > I imagine there are cases where people want to initialize all of them > at the same time in addition to initializing one at a time. FWIW, I fairly often wanted it, but forgot about that:p regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Mon, Oct 30, 2023 at 1:47 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > > At Mon, 30 Oct 2023 16:35:19 +0900, torikoshia <torikoshia@oss.nttdata.com> wrote in > > Hi, > > > > After 96f052613f3, we have below 6 types of parameter for > > pg_stat_reset_shared(). > > > > "archiver", "bgwriter", "checkpointer", "io", "recovery_prefetch", > > "wal" > > > > How about adding a new option 'all' to delete all targets above? > > > > I imagine there are cases where people want to initialize all of them > > at the same time in addition to initializing one at a time. > > FWIW, I fairly often wanted it, but forgot about that:p Isn't calling pg_stat_reset_shared() for all stats types helping you out? Is there any problem with it? Can you be more specific about the use-case? IMV, I don't see any point for adding another pseudo (rather non-existent) shared stats target which might confuse users - it's easy to specify pg_stat_reset_shared('all'); to clear things out when someone actually doesn't want to reset all - an accidental usage of the 'all' option will reset all shared memory stats. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
At Mon, 30 Oct 2023 14:15:53 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in > > > I imagine there are cases where people want to initialize all of them > > > at the same time in addition to initializing one at a time. > > > > FWIW, I fairly often wanted it, but forgot about that:p > > Isn't calling pg_stat_reset_shared() for all stats types helping you > out? Is there any problem with it? Can you be more specific about the > use-case? Oh. Sorry, it's my bad. pg_stat_reset_shared() is sufficient. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Mon, Oct 30, 2023 at 5:46 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: Thanks for the comments! > Isn't calling pg_stat_reset_shared() for all stats types helping you > out? Is there any problem with it? Can you be more specific about the > use-case? Yes, calling pg_stat_reset_shared() for all stats types can do what I wanted to do. But calling it with 6 different parameters seems tiresome and I thought it would be convenient to have a parameter to delete all cluster-wide statistics at once. I may be wrong, but I imagine that it's more common to want to delete all of the statistics for an entire cluster rather than just a portion of it. > IMV, I don't see any point for adding another pseudo (rather > non-existent) shared stats target which might confuse users - it's > easy to specify pg_stat_reset_shared('all'); to clear things out when > someone actually doesn't want to reset all - an accidental usage of > the 'all' option will reset all shared memory stats. I once considered changing the pg_stat_reset_shared() to delete all stats when called without parameters like pg_stat_statements_reset(), but gave it up since it can confuse users as you described. I was hoping that the need to specify 'all' would remind users that the target can be specified individually. -- Regards, -- Atsushi Torikoshi NTT DATA Group Corporation
On Tue, Oct 31, 2023 at 04:26:18PM +0900, torikoshia wrote: > Yes, calling pg_stat_reset_shared() for all stats types can do what I wanted > to do. > But calling it with 6 different parameters seems tiresome and I thought it > would be convenient to have a parameter to delete all cluster-wide > statistics at once. > > I may be wrong, but I imagine that it's more common to want to delete all of > the statistics for an entire cluster rather than just a portion of it. If more flexibility is wanted in this function, could it be an option to consider a flavor like pg_stat_reset_shared(text[]), where it is possible to specify a list of shared stats types to reset? Perhaps there are no real use cases for it, just wanted to mention it anyway regarding the fact that it could have benefits to refactor this code to use a bitwise operator for its internals with bit flags for each type. -- Michael
Вложения
On Wed, Nov 1, 2023 at 4:24 AM Michael Paquier <michael@paquier.xyz> wrote: > > On Tue, Oct 31, 2023 at 04:26:18PM +0900, torikoshia wrote: > > Yes, calling pg_stat_reset_shared() for all stats types can do what I wanted > > to do. > > But calling it with 6 different parameters seems tiresome and I thought it > > would be convenient to have a parameter to delete all cluster-wide > > statistics at once. > > > > I may be wrong, but I imagine that it's more common to want to delete all of > > the statistics for an entire cluster rather than just a portion of it. > > If more flexibility is wanted in this function, could it be an option > to consider a flavor like pg_stat_reset_shared(text[]), where it is > possible to specify a list of shared stats types to reset? Perhaps > there are no real use cases for it, just wanted to mention it anyway > regarding the fact that it could have benefits to refactor this code > to use a bitwise operator for its internals with bit flags for each > type. I don't see a strong reason to introduce yet-another API when someone can just call things in a loop. I could recollect a recent analogy - a proposal to have a way to define multiple custom wait events with a single function call instead of callers defining in a loop didn't draw much interest. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On Thu, 2 Nov 2023 at 20:26, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Wed, Nov 1, 2023 at 4:24 AM Michael Paquier <michael@paquier.xyz> wrote: > > > > On Tue, Oct 31, 2023 at 04:26:18PM +0900, torikoshia wrote: > > > Yes, calling pg_stat_reset_shared() for all stats types can do what I wanted > > > to do. > > > But calling it with 6 different parameters seems tiresome and I thought it > > > would be convenient to have a parameter to delete all cluster-wide > > > statistics at once. > > > > > > I may be wrong, but I imagine that it's more common to want to delete all of > > > the statistics for an entire cluster rather than just a portion of it. > > > > If more flexibility is wanted in this function, could it be an option > > to consider a flavor like pg_stat_reset_shared(text[]), where it is > > possible to specify a list of shared stats types to reset? Perhaps > > there are no real use cases for it, just wanted to mention it anyway > > regarding the fact that it could have benefits to refactor this code > > to use a bitwise operator for its internals with bit flags for each > > type. > > I don't see a strong reason to introduce yet-another API when someone > can just call things in a loop. I could recollect a recent analogy - a > proposal to have a way to define multiple custom wait events with a > single function call instead of callers defining in a loop didn't draw > much interest. Knowing that your metrics have a shared starting point can be quite valuable, as it allows you to do some math that would otherwise be much less accurate when working with stats over a short amount of time. I've not used these stats systems much myself, but skew between metrics caused by different reset points can be difficult to detect and debug, so I think an atomic call to reset all these stats could be worth implementing. Kind regards, Matthias van de Meent Neon (https://neon.tech)
Hi, On 2023-11-03 00:55:00 +0530, Bharath Rupireddy wrote: > On Wed, Nov 1, 2023 at 4:24 AM Michael Paquier <michael@paquier.xyz> wrote: > > > > On Tue, Oct 31, 2023 at 04:26:18PM +0900, torikoshia wrote: > > > Yes, calling pg_stat_reset_shared() for all stats types can do what I wanted > > > to do. > > > But calling it with 6 different parameters seems tiresome and I thought it > > > would be convenient to have a parameter to delete all cluster-wide > > > statistics at once. > > > I may be wrong, but I imagine that it's more common to want to delete all of > > > the statistics for an entire cluster rather than just a portion of it. Yes, agreed. However, I'd suggest adding pg_stat_reset_shared(), instead of pg_stat_reset_shared('all') for this purpose. > > If more flexibility is wanted in this function, could it be an option > > to consider a flavor like pg_stat_reset_shared(text[]), where it is > > possible to specify a list of shared stats types to reset? Perhaps > > there are no real use cases for it, just wanted to mention it anyway > > regarding the fact that it could have benefits to refactor this code > > to use a bitwise operator for its internals with bit flags for each > > type. I don't think there is much point in such an API - if the caller actually wants to delete individual stats, it's not too hard to loop. But most of the time resetting individual stats doesn't make sense. IMO it was a mistake to ever add the ability. But that ship has sailed. > I don't see a strong reason to introduce yet-another API when someone > can just call things in a loop. I don't agree at all. That requires callers to know the set of possible values that stats need to be reset for - which has grown over time. But nearly all the time the right thing to do is to reset *all* shared stats, not just some. > I could recollect a recent analogy - a > proposal to have a way to define multiple custom wait events with a > single function call instead of callers defining in a loop didn't draw > much interest. That's not analoguous - in your example the caller by definition knows the set of wait events it wants to create. Introducing a batch API wouldn't change that. But in case of resetting all stats the caller does *not* inherently know the set of stats types. Greetings, Andres Freund
Thanks all for the comments! On Fri, Nov 3, 2023 at 5:17 AM Matthias van de Meent <boekewurm+postgres@gmail.com> wrote: > Knowing that your metrics have a shared starting point can be quite > valuable, as it allows you to do some math that would otherwise be > much less accurate when working with stats over a short amount of > time. I've not used these stats systems much myself, but skew between > metrics caused by different reset points can be difficult to detect > and debug, so I think an atomic call to reset all these stats could be > worth implementing. Since each stats, except wal_prefetch was reset acquiring LWLock, attached PoC patch makes the call atomic by using these LWlocks. If this is the right direction, I'll try to make wal_prefetch also take LWLock. On 2023-11-04 10:49, Andres Freund wrote: > Yes, agreed. However, I'd suggest adding pg_stat_reset_shared(), > instead of > pg_stat_reset_shared('all') for this purpose. In the attached PoC patch the shared statistics are reset by calling pg_stat_reset_shared() not pg_stat_reset_shared('all'). -- Regards, -- Atsushi Torikoshi NTT DATA Group Corporation
Вложения
On Sat, Nov 4, 2023 at 7:19 AM Andres Freund <andres@anarazel.de> wrote: > > On 2023-11-03 00:55:00 +0530, Bharath Rupireddy wrote: > > On Wed, Nov 1, 2023 at 4:24 AM Michael Paquier <michael@paquier.xyz> wrote: > > > > > > On Tue, Oct 31, 2023 at 04:26:18PM +0900, torikoshia wrote: > > > > Yes, calling pg_stat_reset_shared() for all stats types can do what I wanted > > > > to do. > > > > But calling it with 6 different parameters seems tiresome and I thought it > > > > would be convenient to have a parameter to delete all cluster-wide > > > > statistics at once. > > > > I may be wrong, but I imagine that it's more common to want to delete all of > > > > the statistics for an entire cluster rather than just a portion of it. > > Yes, agreed. However, I'd suggest adding pg_stat_reset_shared(), instead of > pg_stat_reset_shared('all') for this purpose. An overloaded function seems a better choice than another target 'all'. I'm all +1 for it as others seem to concur with the idea of having something to reset all shared stats. > > > If more flexibility is wanted in this function, could it be an option > > > to consider a flavor like pg_stat_reset_shared(text[]), where it is > > > possible to specify a list of shared stats types to reset? Perhaps > > > there are no real use cases for it, just wanted to mention it anyway > > > regarding the fact that it could have benefits to refactor this code > > > to use a bitwise operator for its internals with bit flags for each > > > type. > > I don't think there is much point in such an API - if the caller actually > wants to delete individual stats, it's not too hard to loop. -1 for text[] version. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On Mon, Nov 6, 2023 at 11:39 AM torikoshia <torikoshia@oss.nttdata.com> wrote: > > Thanks all for the comments! > > On Fri, Nov 3, 2023 at 5:17 AM Matthias van de Meent > <boekewurm+postgres@gmail.com> wrote: > > Knowing that your metrics have a shared starting point can be quite > > valuable, as it allows you to do some math that would otherwise be > > much less accurate when working with stats over a short amount of > > time. I've not used these stats systems much myself, but skew between > > metrics caused by different reset points can be difficult to detect > > and debug, so I think an atomic call to reset all these stats could be > > worth implementing. > > Since each stats, except wal_prefetch was reset acquiring LWLock, > attached PoC patch makes the call atomic by using these LWlocks. > > If this is the right direction, I'll try to make wal_prefetch also take > LWLock. + // Acquire LWLocks + LWLock *locks[] = {&stats_archiver->lock, &stats_bgwriter->lock, + &stats_checkpointer->lock, &stats_wal->lock}; + + for (int i = 0; i < STATS_SHARED_NUM_LWLOCK; i++) + LWLockAcquire(locks[i], LW_EXCLUSIVE); + + for (int i = 0; i < BACKEND_NUM_TYPES; i++) + { + LWLock *bktype_lock = &stats_io->locks[i]; + LWLockAcquire(bktype_lock, LW_EXCLUSIVE); + } Well, that's a total of ~17 LWLocks this new function takes to make the stats reset atomic. I'm not sure if this atomicity is worth the effort which can easily be misused - what if someone runs something like SELECT pg_stat_reset_shared() FROM generate_series(1, 100000....n); to cause heavy lock acquisition and release cycles? IMV, atomicity is not something that applies for the stats reset operation because stats are approximate numbers by nature after all. If the pg_stat_reset_shared() resets stats for only a bunch of stats types and fails, it's the basic application programming style that when a query fails it's the application that needs to have a retry mechanism. FWIW, the atomicity doesn't apply today if someone wants to reset stats in a loop for all stats types. > On 2023-11-04 10:49, Andres Freund wrote: > > > Yes, agreed. However, I'd suggest adding pg_stat_reset_shared(), > > instead of > > pg_stat_reset_shared('all') for this purpose. > > In the attached PoC patch the shared statistics are reset by calling > pg_stat_reset_shared() not pg_stat_reset_shared('all'). Some quick comments: 1. +/* +pg_stat_reset_shared_all(PG_FUNCTION_ARGS) +{ + pgstat_reset_shared_all(); + PG_RETURN_VOID(); +} IMO, simpler way is to move the existing code in pg_stat_reset_shared() to a common internal function like pgstat_reset_shared(char *target) and the pg_stat_reset_shared_all() can just loop over all the stats targets. 2. +{ oid => '8000', + descr => 'statistics: reset collected statistics shared across the cluster', + proname => 'pg_stat_reset_shared', provolatile => 'v', prorettype => 'void', + proargtypes => '', prosrc => 'pg_stat_reset_shared_all' }, Why a new function consuming the oid? Why can't we just do the trick of proisstrict => 'f' and if (PG_ARGISNULL(0)) { reset all stats} else {reset specified stats kind} like the pg_stat_reset_slru()? 3. I think the new reset all stats function must also consider resetting all SLRU stats, no? /* stats for fixed-numbered objects */ PGSTAT_KIND_ARCHIVER, PGSTAT_KIND_BGWRITER, PGSTAT_KIND_CHECKPOINTER, PGSTAT_KIND_IO, PGSTAT_KIND_SLRU, PGSTAT_KIND_WAL, 4. I think the new reset all stats function must also consider resetting recovery_prefetch. 5. + If no argument is specified, reset all these views at once. + Note current patch is WIP and pg_stat_recovery_prefetch is not reset. </para> How about "If the argument is NULL, all counters shown in all of these views are reset."? 6. Add a test case to cover the code in stats.sql. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
At Mon, 6 Nov 2023 14:00:13 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in > On Mon, Nov 6, 2023 at 11:39 AM torikoshia <torikoshia@oss.nttdata.com> wrote: > > Since each stats, except wal_prefetch was reset acquiring LWLock, > > attached PoC patch makes the call atomic by using these LWlocks. > > > > If this is the right direction, I'll try to make wal_prefetch also take > > LWLock. I must say, I have reservations about this approach. The main concern is the duplication of reset code, which has been efficiently encapsulated for individual targets, into this location. This practice degrades the maintainability and clarity of the code. > Well, that's a total of ~17 LWLocks this new function takes to make > the stats reset atomic. I'm not sure if this atomicity is worth the > effort which can easily be misused - what if someone runs something > like SELECT pg_stat_reset_shared() FROM generate_series(1, > 100000....n); to cause heavy lock acquisition and release cycles? ... > IMV, atomicity is not something that applies for the stats reset > operation because stats are approximate numbers by nature after all. > If the pg_stat_reset_shared() resets stats for only a bunch of stats > types and fails, it's the basic application programming style that > when a query fails it's the application that needs to have a retry > mechanism. FWIW, the atomicity doesn't apply today if someone wants to > reset stats in a loop for all stats types. The infrequent use of this feature, coupled with the fact that there is no inherent need for these counters to be reset simultaneoulsy, leads me to think that there is little point in cnetralizing the locks. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Wed, Nov 08, 2023 at 10:08:42AM +0900, Kyotaro Horiguchi wrote: > At Mon, 6 Nov 2023 14:00:13 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in > I must say, I have reservations about this approach. The main concern > is the duplication of reset code, which has been efficiently > encapsulated for individual targets, into this location. This practice > degrades the maintainability and clarity of the code. +{ oid => '8000', This OID pick had me smile. >> IMV, atomicity is not something that applies for the stats reset >> operation because stats are approximate numbers by nature after all. >> If the pg_stat_reset_shared() resets stats for only a bunch of stats >> types and fails, it's the basic application programming style that >> when a query fails it's the application that needs to have a retry >> mechanism. FWIW, the atomicity doesn't apply today if someone wants to >> reset stats in a loop for all stats types. > > The infrequent use of this feature, coupled with the fact that there > is no inherent need for these counters to be reset simultaneoulsy, > leads me to think that there is little point in centralizing the > locks. Each stat listed with fixed_amount has meaning taken in isolation, so I don't see why this patch has to be that complicated. I'd expect one code path that just calls a series of pgstat_reset_of_kind(). There could be an argument for a new routine in pgstat.c that loops over the pgstat_kind_infos and triggers the callbacks where .fixed_amount is set, but that's less transparent than the other approach. The reset time should be consistent across all the calls as we rely on GetCurrentTimestamp(). -- Michael
Вложения
Hi, On 2023-11-06 14:00:13 +0530, Bharath Rupireddy wrote: > On Mon, Nov 6, 2023 at 11:39 AM torikoshia <torikoshia@oss.nttdata.com> wrote: > > > > Thanks all for the comments! > > > > On Fri, Nov 3, 2023 at 5:17 AM Matthias van de Meent > > <boekewurm+postgres@gmail.com> wrote: > > > Knowing that your metrics have a shared starting point can be quite > > > valuable, as it allows you to do some math that would otherwise be > > > much less accurate when working with stats over a short amount of > > > time. I've not used these stats systems much myself, but skew between > > > metrics caused by different reset points can be difficult to detect > > > and debug, so I think an atomic call to reset all these stats could be > > > worth implementing. > > > > Since each stats, except wal_prefetch was reset acquiring LWLock, > > attached PoC patch makes the call atomic by using these LWlocks. > > > > If this is the right direction, I'll try to make wal_prefetch also take > > LWLock. > > + // Acquire LWLocks > + LWLock *locks[] = {&stats_archiver->lock, &stats_bgwriter->lock, > + &stats_checkpointer->lock, &stats_wal->lock}; > + > + for (int i = 0; i < STATS_SHARED_NUM_LWLOCK; i++) > + LWLockAcquire(locks[i], LW_EXCLUSIVE); > + > + for (int i = 0; i < BACKEND_NUM_TYPES; i++) > + { > + LWLock *bktype_lock = &stats_io->locks[i]; > + LWLockAcquire(bktype_lock, LW_EXCLUSIVE); > + } > > Well, that's a total of ~17 LWLocks this new function takes to make > the stats reset atomic. I'm not sure if this atomicity is worth the > effort which can easily be misused - what if someone runs something > like SELECT pg_stat_reset_shared() FROM generate_series(1, > 100000....n); to cause heavy lock acquisition and release cycles? Yea, this seems like an *extremely* bad idea to me. Without careful analysis it could very well cause deadlocks. > IMV, atomicity is not something that applies for the stats reset > operation because stats are approximate numbers by nature after all. > If the pg_stat_reset_shared() resets stats for only a bunch of stats > types and fails, it's the basic application programming style that > when a query fails it's the application that needs to have a retry > mechanism. FWIW, the atomicity doesn't apply today if someone wants to > reset stats in a loop for all stats types. Yea. Additionally it's not really atomic regardless of the lwlocks, due to various processes all accumulating in local counters first, and only occasionally updating the shared data. So even after holding all the locks at the same time, the shared stats would still not actually represent a truly atomic state. > 2. > +{ oid => '8000', > + descr => 'statistics: reset collected statistics shared across the cluster', > + proname => 'pg_stat_reset_shared', provolatile => 'v', prorettype => 'void', > + proargtypes => '', prosrc => 'pg_stat_reset_shared_all' }, > > Why a new function consuming the oid? Why can't we just do the trick > of proisstrict => 'f' and if (PG_ARGISNULL(0)) { reset all stats} else > {reset specified stats kind} like the pg_stat_reset_slru()? It's not like oids are a precious resource. It's a more confusing API to have to have to specify a NULL as an argument than not having to do so. If we really want to avoid a separate oid, a more sensible path would be to add a default argument to pg_stat_reset_slru() (by doing a CREATE OR REPLACE in system_functions.sql). Greetings, Andres Freund
Hi, On 2023-11-08 10:08:42 +0900, Kyotaro Horiguchi wrote: > At Mon, 6 Nov 2023 14:00:13 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in > > On Mon, Nov 6, 2023 at 11:39 AM torikoshia <torikoshia@oss.nttdata.com> wrote: > > > Since each stats, except wal_prefetch was reset acquiring LWLock, > > > attached PoC patch makes the call atomic by using these LWlocks. > > > > > > If this is the right direction, I'll try to make wal_prefetch also take > > > LWLock. > > I must say, I have reservations about this approach. The main concern > is the duplication of reset code, which has been efficiently > encapsulated for individual targets, into this location. This practice > degrades the maintainability and clarity of the code. Yes, as-is this seems to evolve the code in precisely the wrong direction. We want less central awareness of different types of stats, not more. The proposed new code is far longer than the current pg_stat_reset(), despite doing something conceptually simpler. Greetings, Andres Freund
On Wed, Nov 8, 2023 at 9:43 AM Andres Freund <andres@anarazel.de> wrote: > > > 2. > > +{ oid => '8000', > > + descr => 'statistics: reset collected statistics shared across the cluster', > > + proname => 'pg_stat_reset_shared', provolatile => 'v', prorettype => 'void', > > + proargtypes => '', prosrc => 'pg_stat_reset_shared_all' }, > > > > Why a new function consuming the oid? Why can't we just do the trick > > of proisstrict => 'f' and if (PG_ARGISNULL(0)) { reset all stats} else > > {reset specified stats kind} like the pg_stat_reset_slru()? > > It's not like oids are a precious resource. It's a more confusing API to have > to have to specify a NULL as an argument than not having to do so. If we > really want to avoid a separate oid, a more sensible path would be to add a > default argument to pg_stat_reset_slru() (by doing a CREATE OR REPLACE in > system_functions.sql). +1. Attached the patch. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Вложения
On Wed, 8 Nov 2023 at 05:13, Andres Freund <andres@anarazel.de> wrote: > > Hi, > > On 2023-11-06 14:00:13 +0530, Bharath Rupireddy wrote: > > Well, that's a total of ~17 LWLocks this new function takes to make > > the stats reset atomic. I'm not sure if this atomicity is worth the > > effort which can easily be misused - what if someone runs something > > like SELECT pg_stat_reset_shared() FROM generate_series(1, > > 100000....n); to cause heavy lock acquisition and release cycles? > > Yea, this seems like an *extremely* bad idea to me. Without careful analysis > it could very well cause deadlocks. I didn't realize that it'd take 17 LwLocks to reset those stats; I thought it was one shared system using the same lock, or a very limited set of locks. Aquiring 17 locks is quite likely not worth the chance of having to wait for some stats lock or another and thus generating 'bubbles' in other stats gathering pipelines. > > IMV, atomicity is not something that applies for the stats reset > > operation because stats are approximate numbers by nature after all. > > If the pg_stat_reset_shared() resets stats for only a bunch of stats > > types and fails, it's the basic application programming style that > > when a query fails it's the application that needs to have a retry > > mechanism. FWIW, the atomicity doesn't apply today if someone wants to > > reset stats in a loop for all stats types. > > Yea. Additionally it's not really atomic regardless of the lwlocks, due to > various processes all accumulating in local counters first, and only > occasionally updating the shared data. So even after holding all the locks at > the same time, the shared stats would still not actually represent a truly > atomic state. Good points that I hadn't thought much about yet. I agree that atomic reset isn't worth implementing in this stats system - it's too costly and complex to do it correctly. Kind regards, Matthias van de Meent Neon (https://neon.tech)
On Wed, Nov 08, 2023 at 02:15:24PM +0530, Bharath Rupireddy wrote: > On Wed, Nov 8, 2023 at 9:43 AM Andres Freund <andres@anarazel.de> wrote: >> It's not like oids are a precious resource. It's a more confusing API to have >> to have to specify a NULL as an argument than not having to do so. If we >> really want to avoid a separate oid, a more sensible path would be to add a >> default argument to pg_stat_reset_slru() (by doing a CREATE OR REPLACE in >> system_functions.sql). > > +1. Attached the patch. > > -- Test that multiple SLRUs are reset when no specific SLRU provided to reset function > -SELECT pg_stat_reset_slru(NULL); > +SELECT pg_stat_reset_slru(); For the SLRU part, why not. Hmm. What's the final plan for pg_stat_reset_shared(), then? An equivalent that calls a series of pgstat_reset_of_kind()? -- Michael
Вложения
On 2023-11-09 08:58, Michael Paquier wrote: > On Wed, Nov 08, 2023 at 02:15:24PM +0530, Bharath Rupireddy wrote: >> On Wed, Nov 8, 2023 at 9:43 AM Andres Freund <andres@anarazel.de> >> wrote: >>> It's not like oids are a precious resource. It's a more confusing API >>> to have >>> to have to specify a NULL as an argument than not having to do so. If >>> we >>> really want to avoid a separate oid, a more sensible path would be to >>> add a >>> default argument to pg_stat_reset_slru() (by doing a CREATE OR >>> REPLACE in >>> system_functions.sql). >> >> +1. Attached the patch. >> >> -- Test that multiple SLRUs are reset when no specific SLRU provided >> to reset function >> -SELECT pg_stat_reset_slru(NULL); >> +SELECT pg_stat_reset_slru(); > > For the SLRU part, why not. > > Hmm. What's the final plan for pg_stat_reset_shared(), then? An > equivalent that calls a series of pgstat_reset_of_kind()? Sorry for late reply and thanks for the feedbacks everyone. As your 1st suggestion, I think "calls a series of pgstat_reset_of_kind()" would be enough. I am a little concerned about that the reset time is not the same and that GetCurrentTimestamp() is called multiple times, but I think it would be acceptable because the function is probably not used that often and the reset time is not atomic in practice. I'll attach the patch. -- Regards, -- Atsushi Torikoshi NTT DATA Group Corporation
On Thu, Nov 09, 2023 at 10:10:39AM +0900, torikoshia wrote: > I am a little concerned about that the reset time is not the same and that > GetCurrentTimestamp() is called multiple times, but I think it would be > acceptable because the function is probably not used that often and the > reset time is not atomic in practice. Arf, right. I misremembered that this is just a clock_timestamp() so that's not transaction-resilient. Anyway, my take is that this is not a big deal in practice compared to the usability of the wrapper. -- Michael
Вложения
Hi, On 2023-11-09 10:25:18 +0900, Michael Paquier wrote: > On Thu, Nov 09, 2023 at 10:10:39AM +0900, torikoshia wrote: > > I am a little concerned about that the reset time is not the same and that > > GetCurrentTimestamp() is called multiple times, but I think it would be > > acceptable because the function is probably not used that often and the > > reset time is not atomic in practice. > > Arf, right. I misremembered that this is just a clock_timestamp() so > that's not transaction-resilient. Anyway, my take is that this is not > a big deal in practice compared to the usability of the wrapper. It seems inconsequential cost-wise. Resetting stats is way more expensive that a few timestamp determinations. Correctness wise it actually seems *better* to record the timestamps more granularly, after all, that moves them closer to the time the individual kind of stats is reset. Greetings, Andres Freund
On Thu, Nov 09, 2023 at 10:10:39AM +0900, torikoshia wrote: > I'll attach the patch. Attached. On Mon, Nov 6, 2023 at 5:30 PM Bharath Rupireddy > 3. I think the new reset all stats function must also consider > resetting all SLRU stats, no? > /* stats for fixed-numbered objects */ > PGSTAT_KIND_ARCHIVER, > PGSTAT_KIND_BGWRITER, > PGSTAT_KIND_CHECKPOINTER, > PGSTAT_KIND_IO, > PGSTAT_KIND_SLRU, > PGSTAT_KIND_WAL, PGSTAT_KIND_SLRU cannot be reset by pg_stat_reset_shared(), so I feel uncomfortable to delete it all together. It might be better after pg_stat_reset_shared() has been modified to take 'slru' as an argument, though. On Wed, Nov 8, 2023 at 1:13 PM Andres Freund <andres@anarazel.de> wrote: > It's not like oids are a precious resource. It's a more confusing API > to have > to have to specify a NULL as an argument than not having to do so. If > we > really want to avoid a separate oid, a more sensible path would be to > add a > default argument to pg_stat_reset_slru() (by doing a CREATE OR REPLACE > in > system_functions.sql). Currently proisstrict is true and pg_stat_reset_shared() returns null without doing any work. I thought it would be better to reset statistics even when null is specified so that users are not confused with the behavior of pg_stat_reset_slru(). Attached patch added pg_stat_reset_shared() in system_functions.sql mainly for this reason. -- Regards, -- Atsushi Torikoshi NTT DATA Group Corporation
Вложения
On Thu, Nov 09, 2023 at 01:50:34PM +0900, torikoshia wrote: > PGSTAT_KIND_SLRU cannot be reset by pg_stat_reset_shared(), so I feel > uncomfortable to delete it all together. > It might be better after pg_stat_reset_shared() has been modified to take > 'slru' as an argument, though. Not sure how to feel about that, TBH, but I would not include SLRUs here if we have already a separate function. > I thought it would be better to reset statistics even when null is specified > so that users are not confused with the behavior of pg_stat_reset_slru(). > Attached patch added pg_stat_reset_shared() in system_functions.sql mainly > for this reason. I'm OK with doing what your patch does, aka do the work if the value is NULL or if there's no argument given. - Resets some cluster-wide statistics counters to zero, depending on the + Resets cluster-wide statistics counters to zero, depending on the This does not need to change, aka SLRUs are for example still global and not included here. + If the argument is NULL or not specified, all counters shown in all + of these views are reset. Missing a <literal> markup around NULL. I know, we're not consistent about that, either, but if we are tweaking the area let's be right at least. Perhaps "all the counters from the views listed above are reset"? -- Michael
Вложения
On 2023-11-09 16:28, Michael Paquier wrote: Thanks for your review. Attached v2 patch. > On Thu, Nov 09, 2023 at 01:50:34PM +0900, torikoshia wrote: >> PGSTAT_KIND_SLRU cannot be reset by pg_stat_reset_shared(), so I feel >> uncomfortable to delete it all together. >> It might be better after pg_stat_reset_shared() has been modified to >> take >> 'slru' as an argument, though. > > Not sure how to feel about that, TBH, but I would not include SLRUs > here if we have already a separate function. IMHO I agree with you. >> I thought it would be better to reset statistics even when null is >> specified >> so that users are not confused with the behavior of >> pg_stat_reset_slru(). >> Attached patch added pg_stat_reset_shared() in system_functions.sql >> mainly >> for this reason. > > I'm OK with doing what your patch does, aka do the work if the value > is NULL or if there's no argument given. > > - Resets some cluster-wide statistics counters to zero, > depending on the > + Resets cluster-wide statistics counters to zero, depending on > the > > This does not need to change, aka SLRUs are for example still global > and not included here. > > + If the argument is NULL or not specified, all counters shown > in all > + of these views are reset. > > Missing a <literal> markup around NULL. I know, we're not consistent > about that, either, but if we are tweaking the area let's be right at > least. Perhaps "all the counters from the views listed above are > reset"? > -- > Michael -- Regards, -- Atsushi Torikoshi NTT DATA Group Corporation
Вложения
On Fri, Nov 10, 2023 at 12:33:50PM +0900, torikoshia wrote: > On 2023-11-09 16:28, Michael Paquier wrote: >> Not sure how to feel about that, TBH, but I would not include SLRUs >> here if we have already a separate function. > > IMHO I agree with you. The comments added could be better grammatically, but basically LGTM. I'll take care of that if there are no objections. -- Michael
Вложения
Hi, On November 8, 2023 11:28:08 PM PST, Michael Paquier <michael@paquier.xyz> wrote: >On Thu, Nov 09, 2023 at 01:50:34PM +0900, torikoshia wrote: >> PGSTAT_KIND_SLRU cannot be reset by pg_stat_reset_shared(), so I feel >> uncomfortable to delete it all together. >> It might be better after pg_stat_reset_shared() has been modified to take >> 'slru' as an argument, though. > >Not sure how to feel about that, TBH, but I would not include SLRUs >here if we have already a separate function. > >> I thought it would be better to reset statistics even when null is specified >> so that users are not confused with the behavior of pg_stat_reset_slru(). >> Attached patch added pg_stat_reset_shared() in system_functions.sql mainly >> for this reason. > >I'm OK with doing what your patch does, aka do the work if the value >is NULL or if there's no argument given. > >- Resets some cluster-wide statistics counters to zero, depending on the >+ Resets cluster-wide statistics counters to zero, depending on the > >This does not need to change, aka SLRUs are for example still global >and not included here. > >+ If the argument is NULL or not specified, all counters shown in all >+ of these views are reset. > >Missing a <literal> markup around NULL. I know, we're not consistent >about that, either, but if we are tweaking the area let's be right at >least. Perhaps "all the counters from the views listed above are >reset"? I see no reason to not include slrus. We should never have added the ability to reset them individually, particularly notwithout a use case - I couldn't find one skimming some discussion. And what's the point in not allowing to reset themvia pg_stat_reset_shared()? Greetings, Andres -- Sent from my Android device with K-9 Mail. Please excuse my brevity.
On 2023-11-10 13:18, Andres Freund wrote: > Hi, > > On November 8, 2023 11:28:08 PM PST, Michael Paquier > <michael@paquier.xyz> wrote: >> On Thu, Nov 09, 2023 at 01:50:34PM +0900, torikoshia wrote: >>> PGSTAT_KIND_SLRU cannot be reset by pg_stat_reset_shared(), so I feel >>> uncomfortable to delete it all together. >>> It might be better after pg_stat_reset_shared() has been modified to >>> take >>> 'slru' as an argument, though. >> >> Not sure how to feel about that, TBH, but I would not include SLRUs >> here if we have already a separate function. >> >>> I thought it would be better to reset statistics even when null is >>> specified >>> so that users are not confused with the behavior of >>> pg_stat_reset_slru(). >>> Attached patch added pg_stat_reset_shared() in system_functions.sql >>> mainly >>> for this reason. >> >> I'm OK with doing what your patch does, aka do the work if the value >> is NULL or if there's no argument given. >> >> - Resets some cluster-wide statistics counters to zero, >> depending on the >> + Resets cluster-wide statistics counters to zero, depending on >> the >> >> This does not need to change, aka SLRUs are for example still global >> and not included here. >> >> + If the argument is NULL or not specified, all counters shown >> in all >> + of these views are reset. >> >> Missing a <literal> markup around NULL. I know, we're not consistent >> about that, either, but if we are tweaking the area let's be right at >> least. Perhaps "all the counters from the views listed above are >> reset"? > > I see no reason to not include slrus. We should never have added the > ability to reset them individually, particularly not without a use > case - I couldn't find one skimming some discussion. And what's the > point in not allowing to reset them via pg_stat_reset_shared()? When including SLRUs, do you think it's better to add 'slrus' argument which enables pg_stat_reset_shared() to reset all SLRUs? As described above, since SLRUs cannot be reset by pg_stat_reset_shared(), I feel a bit uncomfortable to delete it all together. -- Regards, -- Atsushi Torikoshi NTT DATA Group Corporation
On Fri, Nov 10, 2023 at 01:15:50PM +0900, Michael Paquier wrote: > The comments added could be better grammatically, but basically LGTM. > I'll take care of that if there are no objections. The documentation also needed a few tweaks (for DEFAULT and the argument name), so I have fixed the whole and adapted the new part of the docs to that, with few little tweaks. -- Michael
Вложения
On Fri, Nov 10, 2023 at 08:32:34PM +0900, torikoshia wrote: > On 2023-11-10 13:18, Andres Freund wrote: >> I see no reason to not include slrus. We should never have added the >> ability to reset them individually, particularly not without a use >> case - I couldn't find one skimming some discussion. And what's the >> point in not allowing to reset them via pg_stat_reset_shared()? > > When including SLRUs, do you think it's better to add 'slrus' argument which > enables pg_stat_reset_shared() to reset all SLRUs? I understand that Andres says that he'd be OK with a addition of a 'slru' option in pg_stat_reset_shared(), as well as including SLRUs in the resets if everything should be wiped. 28cac71bd368 is around since 13~, so changing pg_stat_reset_slru() or removing it could impact existing applications, so there's little benefit in changing it at this stage. Let it be itself. > As described above, since SLRUs cannot be reset by pg_stat_reset_shared(), I > feel a bit uncomfortable to delete it all together. That would be only effective if NULL is given to the function to reset everything, which is OK IMO, because this is a shared stats. -- Michael
Вложения
On 2023-11-12 16:46, Michael Paquier wrote: > On Fri, Nov 10, 2023 at 01:15:50PM +0900, Michael Paquier wrote: >> The comments added could be better grammatically, but basically LGTM. >> I'll take care of that if there are no objections. > > The documentation also needed a few tweaks (for DEFAULT and the > argument name), so I have fixed the whole and adapted the new part of > the docs to that, with few little tweaks. Thanks! I assume you have already taken this into account, but I think we should add the same documentation to the below patch for pg_stat_reset_slru(): https://www.postgresql.org/message-id/CALj2ACW4Fqc_m%2BOaavrOMEivZ5aBa24pVKvoXRTmuFECsNBfAg%40mail.gmail.com On 2023-11-12 16:54, Michael Paquier wrote: > On Fri, Nov 10, 2023 at 08:32:34PM +0900, torikoshia wrote: >> On 2023-11-10 13:18, Andres Freund wrote: >>> I see no reason to not include slrus. We should never have added the >>> ability to reset them individually, particularly not without a use >>> case - I couldn't find one skimming some discussion. And what's the >>> point in not allowing to reset them via pg_stat_reset_shared()? >> >> When including SLRUs, do you think it's better to add 'slrus' argument >> which >> enables pg_stat_reset_shared() to reset all SLRUs? > > I understand that Andres says that he'd be OK with a addition of a > 'slru' option in pg_stat_reset_shared(), as well as including SLRUs in > the resets if everything should be wiped. Thanks, I'll make the patch. > 28cac71bd368 is around since 13~, so changing pg_stat_reset_slru() or > removing it could impact existing applications, so there's little > benefit in changing it at this stage. Let it be itself. +1. >> As described above, since SLRUs cannot be reset by >> pg_stat_reset_shared(), I >> feel a bit uncomfortable to delete it all together. > > That would be only effective if NULL is given to the function to reset > everything, which is OK IMO, because this is a shared stats. > -- > Michael -- Regards, -- Atsushi Torikoshi NTT DATA Group Corporation
On Mon, Nov 13, 2023 at 01:15:14PM +0900, torikoshia wrote: > I assume you have already taken this into account, but I think we should add > the same documentation to the below patch for pg_stat_reset_slru(): > > https://www.postgresql.org/message-id/CALj2ACW4Fqc_m%2BOaavrOMEivZ5aBa24pVKvoXRTmuFECsNBfAg%40mail.gmail.com Yep, the DEFAULT value and the argument name should be documented in monitoring.sgml. -- Michael
Вложения
On Mon, Nov 13, 2023 at 9:45 AM torikoshia <torikoshia@oss.nttdata.com> wrote: > > On 2023-11-12 16:46, Michael Paquier wrote: > > On Fri, Nov 10, 2023 at 01:15:50PM +0900, Michael Paquier wrote: > >> The comments added could be better grammatically, but basically LGTM. > >> I'll take care of that if there are no objections. > > > > The documentation also needed a few tweaks (for DEFAULT and the > > argument name), so I have fixed the whole and adapted the new part of > > the docs to that, with few little tweaks. > > Thanks! > > I assume you have already taken this into account, but I think we should > add the same documentation to the below patch for pg_stat_reset_slru(): > > https://www.postgresql.org/message-id/CALj2ACW4Fqc_m%2BOaavrOMEivZ5aBa24pVKvoXRTmuFECsNBfAg%40mail.gmail.com Modified the docs for pg_stat_reset_slru to match with that of pg_stat_reset_shared. PSA v2 patch. I noticed that the commit 23c8c0c8 missed to add proargnames => '{target}' in .dat file for pg_stat_reset_shared, is it intentional? Naming the argument in system_funtion.sql is enough to be able to pass in named arguments like SELECT pg_stat_reset_shared(target := 'io');, but is it needed in .dat file as well to keep it consistent? -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Вложения
On Mon, Nov 13, 2023 at 02:07:21PM +0530, Bharath Rupireddy wrote: > Modified the docs for pg_stat_reset_slru to match with that of > pg_stat_reset_shared. PSA v2 patch. That feels consistent. Thanks. > I noticed that the commit 23c8c0c8 missed to add proargnames => > '{target}' in .dat file for pg_stat_reset_shared, is it intentional? > Naming the argument in system_funtion.sql is enough to be able to pass > in named arguments like SELECT pg_stat_reset_shared(target := 'io');, > but is it needed in .dat file as well to keep it consistent? I don't see a need to do that because, as you say, the functions are redefined for their default values, meaning that they'll also have argument names consistent with the docs. There are quite a few like that in pg_proc.dat like pg_promote, pg_backup_start, json_populate_record, etc. -- Michael
Вложения
On Mon, Nov 13, 2023 at 07:31:41PM +0900, Michael Paquier wrote: > On Mon, Nov 13, 2023 at 02:07:21PM +0530, Bharath Rupireddy wrote: >> Modified the docs for pg_stat_reset_slru to match with that of >> pg_stat_reset_shared. PSA v2 patch. > > That feels consistent. Thanks. And applied this one. -- Michael
Вложения
On 2023-11-13 13:15, torikoshia wrote: > On 2023-11-12 16:46, Michael Paquier wrote: >> On Fri, Nov 10, 2023 at 01:15:50PM +0900, Michael Paquier wrote: >>> The comments added could be better grammatically, but basically LGTM. >>> I'll take care of that if there are no objections. >> >> The documentation also needed a few tweaks (for DEFAULT and the >> argument name), so I have fixed the whole and adapted the new part of >> the docs to that, with few little tweaks. > > Thanks! > > I assume you have already taken this into account, but I think we > should add the same documentation to the below patch for > pg_stat_reset_slru(): > > > https://www.postgresql.org/message-id/CALj2ACW4Fqc_m%2BOaavrOMEivZ5aBa24pVKvoXRTmuFECsNBfAg%40mail.gmail.com > > On 2023-11-12 16:54, Michael Paquier wrote: >> On Fri, Nov 10, 2023 at 08:32:34PM +0900, torikoshia wrote: >>> On 2023-11-10 13:18, Andres Freund wrote: >>>> I see no reason to not include slrus. We should never have added the >>>> ability to reset them individually, particularly not without a use >>>> case - I couldn't find one skimming some discussion. And what's the >>>> point in not allowing to reset them via pg_stat_reset_shared()? >>> >>> When including SLRUs, do you think it's better to add 'slrus' >>> argument which >>> enables pg_stat_reset_shared() to reset all SLRUs? >> >> I understand that Andres says that he'd be OK with a addition of a >> 'slru' option in pg_stat_reset_shared(), as well as including SLRUs in >> the resets if everything should be wiped. > > Thanks, I'll make the patch. Attached patch. BTW currently the documentation explains all the arguments of pg_stat_reset_shared() in one line and I feel it's a bit hard to read. Attached patch uses <itemizedlist>. What do you think? -- Regards, -- Atsushi Torikoshi NTT DATA Group Corporation
Вложения
On Tue, Nov 14, 2023 at 10:02:32PM +0900, torikoshia wrote: > Attached patch. You have forgotten to update the errhint at the end of pg_stat_reset_shared(), where "slru" needs to be listed :) > BTW currently the documentation explains all the arguments of > pg_stat_reset_shared() in one line and I feel it's a bit hard to read. > Attached patch uses <itemizedlist>. Yes, this one is a good idea because each target works on a different system view so it becomes easier to understand what a target affects, so I've applied this bit, without the slru addition. -- Michael
Вложения
On 2023-11-15 09:47, Michael Paquier wrote: > On Tue, Nov 14, 2023 at 10:02:32PM +0900, torikoshia wrote: >> Attached patch. > > You have forgotten to update the errhint at the end of > pg_stat_reset_shared(), where "slru" needs to be listed :) Oops, attached v2 patch. >> BTW currently the documentation explains all the arguments of >> pg_stat_reset_shared() in one line and I feel it's a bit hard to read. >> Attached patch uses <itemizedlist>. > > Yes, this one is a good idea because each target works on a different > system view so it becomes easier to understand what a target affects, > so I've applied this bit, without the slru addition. Thanks! > -- > Michael -- Regards, -- Atsushi Torikoshi NTT DATA Group Corporation
Вложения
On Wed, Nov 15, 2023 at 11:58:38AM +0900, torikoshia wrote: > On 2023-11-15 09:47, Michael Paquier wrote: >> You have forgotten to update the errhint at the end of >> pg_stat_reset_shared(), where "slru" needs to be listed :) > > Oops, attached v2 patch. +SELECT stats_reset > :'slru_reset_ts'::timestamptz FROM pg_stat_slru; A problem with these two queries is that they depend on the number of SLRUs set in the system while only returning a single 't' without the cache names each tuple is linked to. To keep things simple, you could just LIMIT 1 or aggregate through the whole set. Other than that, it looks OK. -- Michael
Вложения
On Wed, Nov 15, 2023 at 04:25:14PM +0900, Michael Paquier wrote: > Other than that, it looks OK. Tweaked the queries of this one slightly, and applied. So I think that we are now good for this thread. Thanks, all! -- Michael
Вложения
On 2023-11-16 16:48, Michael Paquier wrote: > On Wed, Nov 15, 2023 at 04:25:14PM +0900, Michael Paquier wrote: >> Other than that, it looks OK. > > Tweaked the queries of this one slightly, and applied. So I think > that we are now good for this thread. Thanks, all! Thanks for the modification and apply! > -- > Michael -- Regards, -- Atsushi Torikoshi NTT DATA Group Corporation