Обсуждение: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
"Drouvot, Bertrand"
Дата:
Hi hackers, Please find attached a patch proposal for $SUBJECT. The idea has been raised in [1] by Andres: it would allow to simplify even more the work done to generate pg_stat_get_xact*() functions with Macros. Indeed, with the reconciliation done in find_tabstat_entry() then all the pg_stat_get_xact*() functions (making use of find_tabstat_entry()) now "look the same" (should they take into account live subtransactions or not). Looking forward to your feedback, Regards -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com [1]: https://www.postgresql.org/message-id/20230111225907.6el6c5j3hukizqxc%40awork3.anarazel.de
Вложения
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
Kyotaro Horiguchi
Дата:
At Thu, 9 Feb 2023 11:38:18 +0100, "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com> wrote in > Hi hackers, > > Please find attached a patch proposal for $SUBJECT. > > The idea has been raised in [1] by Andres: it would allow to simplify > even more the work done to > generate pg_stat_get_xact*() functions with Macros. > > Indeed, with the reconciliation done in find_tabstat_entry() then all > the pg_stat_get_xact*() functions > (making use of find_tabstat_entry()) now "look the same" (should they > take into account live subtransactions or not). > > Looking forward to your feedback, I like that direction. Don't we rename PgStat_FunctionCounts to PgStat_FuncStatus, unifying neighboring functions? Why does find_tabstat_entry() copies the whole pending data and performs subxaction summarization? The summarization is needed only by few callers but now that cost is imposed to the all callers along with additional palloc()/pfree() calls. That doesn't seem reasonable. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
"Drouvot, Bertrand"
Дата:
Hi, On 2/10/23 3:32 AM, Kyotaro Horiguchi wrote: > At Thu, 9 Feb 2023 11:38:18 +0100, "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com> wrote in >> Hi hackers, >> >> Please find attached a patch proposal for $SUBJECT. >> >> The idea has been raised in [1] by Andres: it would allow to simplify >> even more the work done to >> generate pg_stat_get_xact*() functions with Macros. >> >> Indeed, with the reconciliation done in find_tabstat_entry() then all >> the pg_stat_get_xact*() functions >> (making use of find_tabstat_entry()) now "look the same" (should they >> take into account live subtransactions or not). >> >> Looking forward to your feedback, > > I like that direction. > Thanks for looking at it! > Don't we rename PgStat_FunctionCounts to PgStat_FuncStatus, unifying > neighboring functions? > Not sure, I think it's the counter part of PgStat_TableCounts for example. > Why does find_tabstat_entry() copies the whole pending data and > performs subxaction summarization? It copies the pending data to not increment it's counters while doing the summarization. The summarization was done here to avoid the pg_stat_get_xact*() functions to do the computation so that all the pg_stat_get_xact*() functions look the same but.... > The summarization is needed only by > few callers but now that cost is imposed to the all callers along with > additional palloc()/pfree() calls. That doesn't seem reasonable. > I agree that's not the best approach..... Let me come back with another proposal (thinking to increment reconciled counters in pgstat_count_heap_insert(), pgstat_count_heap_delete() and pgstat_count_heap_update()). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
Andres Freund
Дата:
: Andres Freund <andres@anarazel.de> References: <b9e1f543-ee93-8168-d530-d961708ad9d3@gmail.com> <20230210.113242.699878230551547182.horikyota.ntt@gmail.com> <5420b28c-d33f-d25d-9f47-b06b8a2372ba@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <5420b28c-d33f-d25d-9f47-b06b8a2372ba@gmail.com> Hi, On 2023-02-10 16:50:32 +0100, Drouvot, Bertrand wrote: > On 2/10/23 3:32 AM, Kyotaro Horiguchi wrote: > > The summarization is needed only by > > few callers but now that cost is imposed to the all callers along with > > additional palloc()/pfree() calls. That doesn't seem reasonable. > > > > I agree that's not the best approach..... I think it's completely fine to do unnecessary reconciliation for the _xact_ functions. They're not that commonly used, and very rarely is there a huge number of relations with lots of pending data across lots of subtransactions. > Let me come back with another proposal (thinking to increment reconciled > counters in pgstat_count_heap_insert(), pgstat_count_heap_delete() and > pgstat_count_heap_update()). Those are the performance crucial functions, we shouldn't do any additional work there if we can avoid it. Shifting cost from the "looking at transactional stats" side to the collecting stats side is the opposite of what we should. Greetings, Andres Freund
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
Andres Freund
Дата:
Hi, On 2023-02-09 11:38:18 +0100, Drouvot, Bertrand wrote: > Please find attached a patch proposal for $SUBJECT. > > The idea has been raised in [1] by Andres: it would allow to simplify even more the work done to > generate pg_stat_get_xact*() functions with Macros. Thanks! I think this is useful beyond being able to generate those functions with macros. The fact that we had to deal with transactional code in pgstatfuncs.c meant that a lot of the relevant itnernals had to be exposed "outside" pgstat, which seems architecturally not great. > Indeed, with the reconciliation done in find_tabstat_entry() then all the pg_stat_get_xact*() functions > (making use of find_tabstat_entry()) now "look the same" (should they take into account live subtransactions or not). I'm not bothered by making all of pg_stat_get_xact* functions more expensive, they're not a hot code path. But if we need to, we could just add a parameter to find_tabstat_entry() indicating whether we need to reconcile or not. > /* save stats for this function, later used to compensate for recursion */ > - fcu->save_f_total_time = pending->f_counts.f_total_time; > + fcu->save_f_total_time = pending->f_total_time; > > /* save current backend-wide total time */ > fcu->save_total = total_func_time; The diff is noisy due to all the mechanical changes like the above. Could that be split into a separate commit? > find_tabstat_entry(Oid rel_id) > { > PgStat_EntryRef *entry_ref; > + PgStat_TableStatus *tablestatus = NULL; > > entry_ref = pgstat_fetch_pending_entry(PGSTAT_KIND_RELATION, MyDatabaseId, rel_id); > if (!entry_ref) > entry_ref = pgstat_fetch_pending_entry(PGSTAT_KIND_RELATION, InvalidOid, rel_id); > > if (entry_ref) > - return entry_ref->pending; > - return NULL; > + { > + PgStat_TableStatus *tabentry = (PgStat_TableStatus *) entry_ref->pending; I'd add an early return for the !entry_ref case, that way you don't need to indent the bulk of the function. > + PgStat_TableXactStatus *trans; > + > + tablestatus = palloc(sizeof(PgStat_TableStatus)); > + memcpy(tablestatus, tabentry, sizeof(PgStat_TableStatus)); For things like this I'd use *tablestatus = *tabentry; that way the compiler will warn you about mismatching types, and you don't need the sizeof(). > + /* live subtransactions' counts aren't in t_counts yet */ > + for (trans = tabentry->trans; trans != NULL; trans = trans->upper) > + { > + tablestatus->t_counts.t_tuples_inserted += trans->tuples_inserted; > + tablestatus->t_counts.t_tuples_updated += trans->tuples_updated; > + tablestatus->t_counts.t_tuples_deleted += trans->tuples_deleted; > + } > + } Hm, why do we end uup with t_counts still being used here, but removed other places? > diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c > index 6737493402..40a6fbf871 100644 > --- a/src/backend/utils/adt/pgstatfuncs.c > +++ b/src/backend/utils/adt/pgstatfuncs.c > @@ -1366,7 +1366,10 @@ pg_stat_get_xact_numscans(PG_FUNCTION_ARGS) > if ((tabentry = find_tabstat_entry(relid)) == NULL) > result = 0; > else > + { > result = (int64) (tabentry->t_counts.t_numscans); > + pfree(tabentry); > + } > > PG_RETURN_INT64(result); > } I don't think we need to bother with individual pfrees in this path. The caller will call the function in a dedicated memory context, that'll be reset very soon after this. Greetings, Andres Freund
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
"Drouvot, Bertrand"
Дата:
Hi, On 2/10/23 10:46 PM, Andres Freund wrote: > Hi, > > On 2023-02-09 11:38:18 +0100, Drouvot, Bertrand wrote: >> Please find attached a patch proposal for $SUBJECT. >> >> The idea has been raised in [1] by Andres: it would allow to simplify even more the work done to >> generate pg_stat_get_xact*() functions with Macros. > > Thanks! > Thanks for looking at it! > I think this is useful beyond being able to generate those functions with > macros. The fact that we had to deal with transactional code in pgstatfuncs.c > meant that a lot of the relevant itnernals had to be exposed "outside" pgstat, > which seems architecturally not great. > Right, good point. >> Indeed, with the reconciliation done in find_tabstat_entry() then all the pg_stat_get_xact*() functions >> (making use of find_tabstat_entry()) now "look the same" (should they take into account live subtransactions or not). > > I'm not bothered by making all of pg_stat_get_xact* functions more expensive, > they're not a hot code path. But if we need to, we could just add a parameter > to find_tabstat_entry() indicating whether we need to reconcile or not. > I think that's a good idea to avoid doing extra work if not needed. V2 adds such a bool. >> /* save stats for this function, later used to compensate for recursion */ >> - fcu->save_f_total_time = pending->f_counts.f_total_time; >> + fcu->save_f_total_time = pending->f_total_time; >> >> /* save current backend-wide total time */ >> fcu->save_total = total_func_time; > > The diff is noisy due to all the mechanical changes like the above. Could that > be split into a separate commit? > Fully agree, the PgStat_BackendFunctionEntry stuff will be done in a separate patch. > >> find_tabstat_entry(Oid rel_id) >> { >> PgStat_EntryRef *entry_ref; >> + PgStat_TableStatus *tablestatus = NULL; >> >> entry_ref = pgstat_fetch_pending_entry(PGSTAT_KIND_RELATION, MyDatabaseId, rel_id); >> if (!entry_ref) >> entry_ref = pgstat_fetch_pending_entry(PGSTAT_KIND_RELATION, InvalidOid, rel_id); >> >> if (entry_ref) >> - return entry_ref->pending; >> - return NULL; >> + { >> + PgStat_TableStatus *tabentry = (PgStat_TableStatus *) entry_ref->pending; > > I'd add an early return for the !entry_ref case, that way you don't need to > indent the bulk of the function. > Good point, done in V2. > >> + PgStat_TableXactStatus *trans; >> + >> + tablestatus = palloc(sizeof(PgStat_TableStatus)); >> + memcpy(tablestatus, tabentry, sizeof(PgStat_TableStatus)); > > For things like this I'd use > *tablestatus = *tabentry; > > that way the compiler will warn you about mismatching types, and you don't > need the sizeof(). > > Good point, done in V2. >> + /* live subtransactions' counts aren't in t_counts yet */ >> + for (trans = tabentry->trans; trans != NULL; trans = trans->upper) >> + { >> + tablestatus->t_counts.t_tuples_inserted += trans->tuples_inserted; >> + tablestatus->t_counts.t_tuples_updated += trans->tuples_updated; >> + tablestatus->t_counts.t_tuples_deleted += trans->tuples_deleted; >> + } >> + } > > Hm, why do we end uup with t_counts still being used here, but removed other > places? > t_counts are not removed, maybe you are confused with the "f_counts" that were removed in V1 due to the PgStat_BackendFunctionEntry related changes? > >> diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c >> index 6737493402..40a6fbf871 100644 >> --- a/src/backend/utils/adt/pgstatfuncs.c >> +++ b/src/backend/utils/adt/pgstatfuncs.c >> @@ -1366,7 +1366,10 @@ pg_stat_get_xact_numscans(PG_FUNCTION_ARGS) >> if ((tabentry = find_tabstat_entry(relid)) == NULL) >> result = 0; >> else >> + { >> result = (int64) (tabentry->t_counts.t_numscans); >> + pfree(tabentry); >> + } >> >> PG_RETURN_INT64(result); >> } > > I don't think we need to bother with individual pfrees in this path. The > caller will call the function in a dedicated memory context, that'll be reset > very soon after this. Oh right, the palloc is done in the ExprContext memory context that is reset soon after. Removing the pfrees in V2 attached. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Вложения
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
Kyotaro Horiguchi
Дата:
At Mon, 13 Feb 2023 08:09:50 +0100, "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com> wrote in >> I think this is useful beyond being able to generate those functions >> with >> macros. The fact that we had to deal with transactional code in >> pgstatfuncs.c >> meant that a lot of the relevant itnernals had to be exposed "outside" >> pgstat, >> which seems architecturally not great. >> > Right, good point. Agreed. > Removing the pfrees in V2 attached. Ah, that sound good. if (!entry_ref) + { entry_ref = pgstat_fetch_pending_entry(PGSTAT_KIND_RELATION, InvalidOid, rel_id); + return tablestatus; + } We should return something if the call returns a non-null value? So, since we want to hide the internal from pgstatfuncs, the additional flag should be gone. If the additional cost doesn't bother anyone, I don't mind to remove the flag. The patch will get far simpler by that. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
"Drouvot, Bertrand"
Дата:
Hi, On 2/13/23 8:40 AM, Kyotaro Horiguchi wrote: > At Mon, 13 Feb 2023 08:09:50 +0100, "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com> wrote in >>> I think this is useful beyond being able to generate those functions >>> with >>> macros. The fact that we had to deal with transactional code in >>> pgstatfuncs.c >>> meant that a lot of the relevant itnernals had to be exposed "outside" >>> pgstat, >>> which seems architecturally not great. >>> >> Right, good point. > > Agreed. > >> Removing the pfrees in V2 attached. > > Ah, that sound good. > > if (!entry_ref) > + { > entry_ref = pgstat_fetch_pending_entry(PGSTAT_KIND_RELATION, InvalidOid, rel_id); > + return tablestatus; > + } > > We should return something if the call returns a non-null value? What we do is: if entry_ref is NULL then we return NULL (so that the caller returns 0). If entry_ref is not NULL then we return a copy of entry_ref->pending (with or without subtrans). > > So, since we want to hide the internal from pgstatfuncs, the > additional flag should be gone. I think there is pros and cons for both but I don't have a strong opinion about that. So also proposing V3 attached without the flag in case this is the preferred approach. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Вложения
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
Kyotaro Horiguchi
Дата:
Thanks for the new version. At Mon, 13 Feb 2023 09:58:52 +0100, "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com> wrote in > Hi, > > On 2/13/23 8:40 AM, Kyotaro Horiguchi wrote: > > At Mon, 13 Feb 2023 08:09:50 +0100, "Drouvot, Bertrand" > > <bertranddrouvot.pg@gmail.com> wrote in > >>> I think this is useful beyond being able to generate those functions > >>> with > >>> macros. The fact that we had to deal with transactional code in > >>> pgstatfuncs.c > >>> meant that a lot of the relevant itnernals had to be exposed "outside" > >>> pgstat, > >>> which seems architecturally not great. > >>> > >> Right, good point. > > Agreed. > > > >> Removing the pfrees in V2 attached. > > Ah, that sound good. > > if (!entry_ref) > > + { > > entry_ref = pgstat_fetch_pending_entry(PGSTAT_KIND_RELATION, > > InvalidOid, rel_id); > > + return tablestatus; > > + } > > We should return something if the call returns a non-null value? > > What we do is: if entry_ref is NULL then we return NULL (so that the > caller returns 0). > > If entry_ref is not NULL then we return a copy of entry_ref->pending > (with or without subtrans). Isn't it ignoring the second call to pgstat_fetch_pending_entry? What the code did is: if entry_ref is NULL for MyDatabaseId then we *retry* fetching an global (not database-wise) entry. If any global entry is found, return it (correctly entry_ref->pending) to the caller. The current patch returns NULL when a glboal entry is found. I thought that we might be able to return entry_ref->pending since the callers don't call pfree on the returned pointer, but it is not great that we don't inform the callers if the returned memory can be safely pfreed or not. Thus what I have in mind is the following. > if (!entry_ref) > + { > entry_ref = pgstat_fetch_pending_entry(PGSTAT_KIND_RELATION, > InvalidOid, rel_id); > + if (!entry_ref) > + return NULL; > + } > > So, since we want to hide the internal from pgstatfuncs, the > > additional flag should be gone. > > I think there is pros and cons for both but I don't have a strong > opinion about that. > > So also proposing V3 attached without the flag in case this is the > preferred approach. That part looks good to me. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
"Drouvot, Bertrand"
Дата:
Hi, On 2/14/23 7:11 AM, Kyotaro Horiguchi wrote: > > Isn't it ignoring the second call to pgstat_fetch_pending_entry? > Oh right, my bad (the issue has been introduced in V2). Fixed in V4. > I thought that we might be able to return entry_ref->pending since the > callers don't call pfree on the returned pointer, but it is not great > that we don't inform the callers if the returned memory can be safely > pfreed or not. > > Thus what I have in mind is the following. > >> if (!entry_ref) >> + { >> entry_ref = pgstat_fetch_pending_entry(PGSTAT_KIND_RELATION, >> InvalidOid, rel_id); >> + if (!entry_ref) >> + return NULL; >> + } LGTM, done that way in V4. > > > >>> So, since we want to hide the internal from pgstatfuncs, the >>> additional flag should be gone. >> >> I think there is pros and cons for both but I don't have a strong >> opinion about that. >> >> So also proposing V3 attached without the flag in case this is the >> preferred approach. > > That part looks good to me. > Thanks! Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Вложения
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
Kyotaro Horiguchi
Дата:
At Tue, 14 Feb 2023 15:43:26 +0100, "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com> wrote in > Oh right, my bad (the issue has been introduced in V2). > Fixed in V4. Great! > > I thought that we might be able to return entry_ref->pending since the > > callers don't call pfree on the returned pointer, but it is not great > > that we don't inform the callers if the returned memory can be safely > > pfreed or not. > > Thus what I have in mind is the following. > > > >> if (!entry_ref) > >> + { > >> entry_ref = pgstat_fetch_pending_entry(PGSTAT_KIND_RELATION, > >> InvalidOid, rel_id); > >> + if (!entry_ref) > >> + return NULL; > >> + } > > LGTM, done that way in V4. That part looks good to me, thanks! I was going through v4 and it seems to me that the comment for find_tabstat_entry may not be quite right. > * find any existing PgStat_TableStatus entry for rel > * > * Find any existing PgStat_TableStatus entry for rel_id in the current > * database. If not found, try finding from shared tables. > * > * If no entry found, return NULL, don't create a new one The comment assumed that the function directly returns an entry from shared memory, but now it copies the entry's contents into a palloc'ed memory and stores the sums of some counters for the current transaction in it. Do you think we should update the comment to reflect this change? regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
"Drouvot, Bertrand"
Дата:
Hi, On 2/15/23 1:56 AM, Kyotaro Horiguchi wrote: > At Tue, 14 Feb 2023 15:43:26 +0100, "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com> wrote in > > The comment assumed that the function directly returns an entry from > shared memory, but now it copies the entry's contents into a palloc'ed > memory and stores the sums of some counters for the current > transaction in it. Do you think we should update the comment to > reflect this change? > Good point, thanks! Yeah, definitively, done in V5 attached. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Вложения
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
Andres Freund
Дата:
Hi, On 2023-02-15 09:21:48 +0100, Drouvot, Bertrand wrote: > diff --git a/src/backend/utils/activity/pgstat_relation.c b/src/backend/utils/activity/pgstat_relation.c > index f793ac1516..b26e2a5a7a 100644 > --- a/src/backend/utils/activity/pgstat_relation.c > +++ b/src/backend/utils/activity/pgstat_relation.c > @@ -471,20 +471,46 @@ pgstat_fetch_stat_tabentry_ext(bool shared, Oid reloid) > * Find any existing PgStat_TableStatus entry for rel_id in the current > * database. If not found, try finding from shared tables. > * > + * If an entry is found, copy it and increment the copy's counters with their > + * subtransactions counterparts. Then return the copy. There is no need for the > + * caller to pfree the copy as the MemoryContext will be reset soon after. > + * The "There is no need" bit seems a bit off. Yes, that's true for the current callers, but who says that it has to stay that way? Otherwise this looks ready, on a casual scan. Greetings, Andres Freund
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
"Drouvot, Bertrand"
Дата:
Hi, On 2/16/23 10:21 PM, Andres Freund wrote: > Hi, > > On 2023-02-15 09:21:48 +0100, Drouvot, Bertrand wrote: >> diff --git a/src/backend/utils/activity/pgstat_relation.c b/src/backend/utils/activity/pgstat_relation.c >> index f793ac1516..b26e2a5a7a 100644 >> --- a/src/backend/utils/activity/pgstat_relation.c >> +++ b/src/backend/utils/activity/pgstat_relation.c >> @@ -471,20 +471,46 @@ pgstat_fetch_stat_tabentry_ext(bool shared, Oid reloid) >> * Find any existing PgStat_TableStatus entry for rel_id in the current >> * database. If not found, try finding from shared tables. >> * >> + * If an entry is found, copy it and increment the copy's counters with their >> + * subtransactions counterparts. Then return the copy. There is no need for the >> + * caller to pfree the copy as the MemoryContext will be reset soon after. >> + * > > The "There is no need" bit seems a bit off. Yes, that's true for the current > callers, but who says that it has to stay that way? > Good point. Wording has been changed in V6 attached. > Otherwise this looks ready, on a casual scan. > Thanks for having looked at it! Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Вложения
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
Michael Paquier
Дата:
On Mon, Mar 06, 2023 at 08:33:15AM +0100, Drouvot, Bertrand wrote: > Thanks for having looked at it! Looking at that, I have a few comments. + tabentry = (PgStat_TableStatus *) entry_ref->pending; + tablestatus = palloc(sizeof(PgStat_TableStatus)); + *tablestatus = *tabentry; + [...] + for (trans = tabentry->trans; trans != NULL; trans = trans->upper) + { + tablestatus->t_counts.t_tuples_inserted += trans->tuples_inserted; + tablestatus->t_counts.t_tuples_updated += trans->tuples_updated; + tablestatus->t_counts.t_tuples_deleted += trans->tuples_deleted; + } - if (entry_ref) - return entry_ref->pending; - return NULL; + return tablestatus; From what I get with this change, the number of tuples changed by DMLs have their computations done a bit earlier, meaning that it would make all the callers of find_tabstat_entry() pay the computation cost. Still it is not really going to matter, because we will just do the computation once when looking at any pending changes of pg_stat_xact_all_tables for each entry. There are 9 callers of find_tabstat_entry, with 7 being used for pg_stat_xact_all_tables. How much do we need to care about the remaining two callers pg_stat_get_xact_blocks_fetched() and pg_stat_get_xact_blocks_hit()? Could it be a problem if these two also pay the extra computation cost if a transaction with many subtransactions (aka )needs to look at their data? These two are used nowhere, they have pg_proc entries and they are undocumented, so it is hard to say the impact of this change on them.. Second question: once the data from the subtransactions is copied, would it be cleaner to set trans to NULL after the data copy is done? It would feel a bit safer to me to document that find_tabstat_entry() is currently only used for this xact system view.. The extra computation could lead to surprises, actually, if this routine is used outside this context? Perhaps that's OK, but it does not give me a warm feeling, just to reshape three functions of pgstatfuncs.c with macros. -- Michael
Вложения
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
"Drouvot, Bertrand"
Дата:
Hi, On 3/16/23 7:29 AM, Michael Paquier wrote: > On Mon, Mar 06, 2023 at 08:33:15AM +0100, Drouvot, Bertrand wrote: >> Thanks for having looked at it! > > Looking at that, I have a few comments. > > + tabentry = (PgStat_TableStatus *) entry_ref->pending; > + tablestatus = palloc(sizeof(PgStat_TableStatus)); > + *tablestatus = *tabentry; > + > [...] > + for (trans = tabentry->trans; trans != NULL; trans = trans->upper) > + { > + tablestatus->t_counts.t_tuples_inserted += trans->tuples_inserted; > + tablestatus->t_counts.t_tuples_updated += trans->tuples_updated; > + tablestatus->t_counts.t_tuples_deleted += trans->tuples_deleted; > + } > > - if (entry_ref) > - return entry_ref->pending; > - return NULL; > + return tablestatus; > > From what I get with this change, the number of tuples changed by DMLs > have their computations done a bit earlier, Thanks for looking at it! Right, but note this is in a dedicated new tablestatus (created within find_tabstat_entry()). > meaning that it would make > all the callers of find_tabstat_entry() pay the computation cost. Right. Another suggested approach was to add a flag but then we'd not really hide the internal from pgstatfuncs. > Still it is not really going to matter, because we will just do the > computation once when looking at any pending changes of > pg_stat_xact_all_tables for each entry. Yes. > There are 9 callers of > find_tabstat_entry, with 7 being used for pg_stat_xact_all_tables. Right, those are: pg_stat_get_xact_numscans() pg_stat_get_xact_tuples_returned() pg_stat_get_xact_tuples_fetched() pg_stat_get_xact_tuples_inserted() pg_stat_get_xact_tuples_updated() pg_stat_get_xact_tuples_deleted() pg_stat_get_xact_tuples_hot_updated() > How much do we need to care about the remaining two callers > pg_stat_get_xact_blocks_fetched() and pg_stat_get_xact_blocks_hit()? Regarding pg_stat_get_xact_blocks_fetched() and pg_stat_get_xact_blocks_hit() the callers (if any) are outside of the core PG (as from what I can see they are not used at all). I don't think we should pay any particular attention to those 2 ones as anyway nothing prevent the 7 others to be called outside of the pg_stat_xact_all_tables view. > Could it be a problem if these two also pay the extra computation cost > if a transaction with many subtransactions (aka )needs to look at their > data? These two are used nowhere, they have pg_proc entries and they > are undocumented, so it is hard to say the impact of this change on > them.. > Right, and that's the same for the 7 others as nothing prevent them to be called outside of the pg_stat_xact_all_tables view. Do you think it would be better to add the extra flag then? > Second question: once the data from the subtransactions is copied, > would it be cleaner to set trans to NULL after the data copy is done? > That would not hurt but I'm not sure it's worth it (well, it's currently not done in pg_stat_get_xact_tuples_inserted() for example). > It would feel a bit safer to me to document that find_tabstat_entry() > is currently only used for this xact system view.. The extra > computation could lead to surprises, actually, if this routine is used > outside this context? Perhaps that's OK, but it does not give me a > warm feeling, just to reshape three functions of pgstatfuncs.c with > macros. That's a fair point. On the other hand those 9 functions (which can all be used outside of the pg_stat_xact_all_tables view) are not documented, so I'm not sure this is that much of a concern (and if we think it is we still gave the option to add an extra flag to indicate whether or not the extra computation is needed.) Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
Michael Paquier
Дата:
On Thu, Mar 16, 2023 at 11:32:56AM +0100, Drouvot, Bertrand wrote: > On 3/16/23 7:29 AM, Michael Paquier wrote: >> From what I get with this change, the number of tuples changed by DMLs >> have their computations done a bit earlier, > > Thanks for looking at it! > > Right, but note this is in a dedicated new tablestatus (created > within find_tabstat_entry()). Sure, however it copies the pointer of the PgStat_TableXactStatus from tabentry, isn't it? This means that it keeps a reference of the chain of subtransactions. It does not matter for the functions but it could for out-of-core callers of find_tabstat_entry(), no? Perhaps you are right and that's not worth worrying, still I don't feel particularly confident that this is the best approach we can take. >> How much do we need to care about the remaining two callers >> pg_stat_get_xact_blocks_fetched() and pg_stat_get_xact_blocks_hit()? > > Regarding pg_stat_get_xact_blocks_fetched() and pg_stat_get_xact_blocks_hit() > the callers (if any) are outside of the core PG (as from what I can > see they are not used at all). > > I don't think we should pay any particular attention to those 2 ones > as anyway nothing prevent the 7 others to be called outside of the > pg_stat_xact_all_tables view. I am not quite sure, TBH. Did you look at the difference with a long chain of subtrans, like savepoints? The ODBC driver "loves" producing a lot of savepoints, for example. >> It would feel a bit safer to me to document that find_tabstat_entry() >> is currently only used for this xact system view.. The extra >> computation could lead to surprises, actually, if this routine is used >> outside this context? Perhaps that's OK, but it does not give me a >> warm feeling, just to reshape three functions of pgstatfuncs.c with >> macros. > > That's a fair point. On the other hand those 9 functions (which can > all be used outside of the pg_stat_xact_all_tables view) are not > documented, so I'm not sure this is that much of a concern (and if > we think it is we still gave the option to add an extra flag to > indicate whether or not the extra computation is needed.) That's not quite exact, I think. The first 7 functions are used in a system catalog that is documented. Still we have a problem here. I can actually see a few projects relying on these two functions while looking a bit around, so they are used. And the issue comes from ddfc2d9, that has removed these functions from the documentation ignoring that they are used in no system catalogs. I think that we should fix that and re-add the two missing functions with a proper description in the docs, at least? There is no trace of them. Perhaps the ones exposted through pg_stat_xact_all_tables are fine if not listed. -- Michael
Вложения
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
"Drouvot, Bertrand"
Дата:
On 3/16/23 12:46 PM, Michael Paquier wrote: > On Thu, Mar 16, 2023 at 11:32:56AM +0100, Drouvot, Bertrand wrote: >> On 3/16/23 7:29 AM, Michael Paquier wrote: >>> From what I get with this change, the number of tuples changed by DMLs >>> have their computations done a bit earlier, >> >> Thanks for looking at it! >> >> Right, but note this is in a dedicated new tablestatus (created >> within find_tabstat_entry()). > > Sure, however it copies the pointer of the PgStat_TableXactStatus from > tabentry, isn't it? Oh I see what you mean, yeah, the pointer is copied. > This means that it keeps a reference of the chain > of subtransactions. It does not matter for the functions but it could > for out-of-core callers of find_tabstat_entry(), no? Yeah, maybe. > Perhaps you are > right and that's not worth worrying, still I don't feel particularly > confident that this is the best approach we can take. > due to what potential out-of-core callers could do with it? >>> How much do we need to care about the remaining two callers >>> pg_stat_get_xact_blocks_fetched() and pg_stat_get_xact_blocks_hit()? >> >> Regarding pg_stat_get_xact_blocks_fetched() and pg_stat_get_xact_blocks_hit() >> the callers (if any) are outside of the core PG (as from what I can >> see they are not used at all). >> >> I don't think we should pay any particular attention to those 2 ones >> as anyway nothing prevent the 7 others to be called outside of the >> pg_stat_xact_all_tables view. > > I am not quite sure, TBH. Did you look at the difference with a long > chain of subtrans, like savepoints? The ODBC driver "loves" producing > a lot of savepoints, for example. > No, I did not measure the impact. >>> It would feel a bit safer to me to document that find_tabstat_entry() >>> is currently only used for this xact system view.. The extra >>> computation could lead to surprises, actually, if this routine is used >>> outside this context? Perhaps that's OK, but it does not give me a >>> warm feeling, just to reshape three functions of pgstatfuncs.c with >>> macros. >> >> That's a fair point. On the other hand those 9 functions (which can >> all be used outside of the pg_stat_xact_all_tables view) are not >> documented, so I'm not sure this is that much of a concern (and if >> we think it is we still gave the option to add an extra flag to >> indicate whether or not the extra computation is needed.) > > That's not quite exact, I think. The first 7 functions are used in a > system catalog that is documented. Right. > Still we have a problem here. I > can actually see a few projects relying on these two functions while > looking a bit around, so they are used. And the issue comes from > ddfc2d9, that has removed these functions from the documentation > ignoring that they are used in no system catalogs. I think that we > should fix that and re-add the two missing functions with a proper > description in the docs, at least? As they could be/are used outside of the xact view, yes I think the same. > There is no trace of them. > Perhaps the ones exposted through pg_stat_xact_all_tables are fine if > not listed. I'd be tempted to add documentation for all of them, I can look at it. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
Michael Paquier
Дата:
On Thu, Mar 16, 2023 at 02:13:45PM +0100, Drouvot, Bertrand wrote: > On 3/16/23 12:46 PM, Michael Paquier wrote: >> There is no trace of them. >> Perhaps the ones exposted through pg_stat_xact_all_tables are fine if >> not listed. > > I'd be tempted to add documentation for all of them, I can look at it. I am not sure that there is any need to completely undo ddfc2d9, later simplified by 5f2b089, so my opinion would be to just add documentation for the functions that can be used but are used in none of the system functions. Anyway, double-checking, I only see an inconsistency for these two, confirming my first impression: - pg_stat_get_xact_blocks_fetched - pg_stat_get_xact_blocks_hit There may be a point in having them in some of the system views, but the non-xact flavors are only used in the statio views, which don't really need xact versions AFAIK. I am not sure that it makes much sense to add them in pg_stat_xact_all_tables, either. Another view is just remove them, though some could rely on them externally. At the end, documenting both still sounds like the best move to me. -- Michael
Вложения
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
"Drouvot, Bertrand"
Дата:
Hi, On 3/20/23 12:43 AM, Michael Paquier wrote: > At the > end, documenting both still sounds like the best move to me. Agree. Please find attached v1-0001-pg_stat_get_xact_blocks_fetched-and_hit-doc.patch doing so. I did not put the exact same wording as the one being removed in ddfc2d9, as: - For pg_stat_get_xact_blocks_hit(), I think it's better to be closer to say the pg_statio_all_tables.heap_blks_hit definition. - For pg_stat_get_xact_blocks_fetched(), I think that using "buffer" is better (than block) as at the end it's related to pgstat_count_buffer_read(). At the end there is a choice to be made for both for the wording between block and buffer. Indeed their counters get incremented in "buffer" macros while retrieved in those "blocks" functions. "Buffer" sounds more appropriate to me, so the attached has been done that way. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Вложения
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
Michael Paquier
Дата:
On Mon, Mar 20, 2023 at 11:57:31AM +0100, Drouvot, Bertrand wrote: > "Buffer" sounds more appropriate to me, so the attached has been done that way. This choice is OK for me. > + <indexterm> > + <primary>pg_stat_get_xact_blocks_fetched</primary> > + </indexterm> > + <function>pg_stat_get_xact_blocks_fetched</function> ( <type>oid</type> ) > + <returnvalue>bigint</returnvalue> > + </para> > + <para> > + Returns the number of buffer fetches for table or index, in the current transaction This should be "number of buffer fetched", no? > + </indexterm> > + <function>pg_stat_get_xact_blocks_hit</function> ( <type>oid</type> ) > + <returnvalue>bigint</returnvalue> > + </para> > + <para> > + Returns the number of buffer hits for table or index, in the current transaction > + </para></entry> This one looks OK to me too. -- Michael
Вложения
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
Kyotaro Horiguchi
Дата:
At Wed, 22 Mar 2023 10:16:12 +0900, Michael Paquier <michael@paquier.xyz> wrote in > On Mon, Mar 20, 2023 at 11:57:31AM +0100, Drouvot, Bertrand wrote: > > "Buffer" sounds more appropriate to me, so the attached has been done that way. > > This choice is OK for me. > > > + <indexterm> > > + <primary>pg_stat_get_xact_blocks_fetched</primary> > > + </indexterm> > > + <function>pg_stat_get_xact_blocks_fetched</function> ( <type>oid</type> ) > > + <returnvalue>bigint</returnvalue> > > + </para> > > + <para> > > + Returns the number of buffer fetches for table or index, in the current transaction > > This should be "number of buffer fetched", no? In the original description, "buffer fetches" appears to be a plural form of a compound noun and correct, similar to "buffer hits" mentioned later. If we reword it, I think it should be "number of buffers fetched". > > + </indexterm> > > + <function>pg_stat_get_xact_blocks_hit</function> ( <type>oid</type> ) > > + <returnvalue>bigint</returnvalue> > > + </para> > > + <para> > > + Returns the number of buffer hits for table or index, in the current transaction > > + </para></entry> > > This one looks OK to me too. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
Michael Paquier
Дата:
On Wed, Mar 22, 2023 at 11:37:03AM +0900, Kyotaro Horiguchi wrote: > In the original description, "buffer fetches" appears to be a plural > form of a compound noun and correct, similar to "buffer hits" > mentioned later. If we reword it, I think it should be "number of > buffers fetched". Using the plural makes sense, yes. -- Michael
Вложения
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
"Drouvot, Bertrand"
Дата:
Hi, On 3/22/23 5:45 AM, Michael Paquier wrote: > On Wed, Mar 22, 2023 at 11:37:03AM +0900, Kyotaro Horiguchi wrote: >> In the original description, "buffer fetches" appears to be a plural >> form of a compound noun and correct, similar to "buffer hits" >> mentioned later. If we reword it, I think it should be "number of >> buffers fetched". > > Using the plural makes sense, yes. Yeah, "buffer fetches" is similar to "buffer hits". For consistency, ISTM than renaming it to "buffers fetched" would also mean renaming "buffer hits" to "buffers hit". But then it would not be consistent with the documentation for things like pg_statio_all_tables.heap_blks_hit, idx_blks_hit, toast_blks_hit, tidx_blks_hit, pg_statio_all_indexes.idx_blks_hit, pg_statio_all_sequences.blks_hit where "Number of buffer hits" is used). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
"Drouvot, Bertrand"
Дата:
Hi, On 3/22/23 7:44 AM, Drouvot, Bertrand wrote: > Hi, > > On 3/22/23 5:45 AM, Michael Paquier wrote: >> On Wed, Mar 22, 2023 at 11:37:03AM +0900, Kyotaro Horiguchi wrote: >>> In the original description, "buffer fetches" appears to be a plural >>> form of a compound noun and correct, similar to "buffer hits" >>> mentioned later. If we reword it, I think it should be "number of >>> buffers fetched". >> >> Using the plural makes sense, yes. > > Yeah, "buffer fetches" is similar to "buffer hits". > > For consistency, ISTM than renaming it to "buffers fetched" would also mean > renaming "buffer hits" to "buffers hit". But then it would not be consistent > with the documentation for things like pg_statio_all_tables.heap_blks_hit, idx_blks_hit, toast_blks_hit, > tidx_blks_hit, pg_statio_all_indexes.idx_blks_hit, pg_statio_all_sequences.blks_hit > where "Number of buffer hits" is used). > That said, please find enclosed V2 with "buffers fetched" suggested above (and no changes to "buffer hits" to keep consistency with the other part of the documentation mentioned up-thread). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Вложения
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
Michael Paquier
Дата:
On Wed, Mar 22, 2023 at 09:20:25AM +0100, Drouvot, Bertrand wrote: > That said, please find enclosed V2 with "buffers fetched" suggested > above (and no changes to "buffer hits" to keep consistency with the > other part of the documentation mentioned up-thread). Thanks. Applied and backpatched that. In 12 and 11, the style of the table for these functions was a bit different. -- Michael
Вложения
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
Melanie Plageman
Дата:
On Mon, Mar 20, 2023 at 6:58 AM Drouvot, Bertrand <bertranddrouvot.pg@gmail.com> wrote: > > Hi, > > On 3/20/23 12:43 AM, Michael Paquier wrote: > > At the > > end, documenting both still sounds like the best move to me. > > Agree. > > Please find attached v1-0001-pg_stat_get_xact_blocks_fetched-and_hit-doc.patch doing so. > > I did not put the exact same wording as the one being removed in ddfc2d9, as: > > - For pg_stat_get_xact_blocks_hit(), I think it's better to be closer to say the > pg_statio_all_tables.heap_blks_hit definition. > > - For pg_stat_get_xact_blocks_fetched(), I think that using "buffer" is better (than block) as at the > end it's related to pgstat_count_buffer_read(). > > At the end there is a choice to be made for both for the wording between block and buffer. Indeed their > counters get incremented in "buffer" macros while retrieved in those "blocks" functions. > > "Buffer" sounds more appropriate to me, so the attached has been done that way. Apologies as I know this docs update has already been committed, but buffers fetched and blocks fetched both feel weird to me. If you have a cache hit, you don't end up really "fetching" anything at all (since pgstat_count_buffer_read() is called before ReadBuffer_common() and we don't know if it is a hit or miss yet). And, I would normally associate fetching with fetching a block into a buffer. It seems like this counter is really reflecting the number of buffers acquired or used. tuples_fetched makes more sense because a tuple is "fetched" into a slot. This isn't really the fault of this patch since that member was already called blocks_fetched. - Melanie
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
Michael Paquier
Дата:
On Wed, Mar 22, 2023 at 02:21:12PM -0400, Melanie Plageman wrote: > Apologies as I know this docs update has already been committed, but > buffers fetched and blocks fetched both feel weird to me. If you have a > cache hit, you don't end up really "fetching" anything at all (since > pgstat_count_buffer_read() is called before ReadBuffer_common() and we > don't know if it is a hit or miss yet). And, I would normally associate > fetching with fetching a block into a buffer. It seems like this counter > is really reflecting the number of buffers acquired or used. Well, it is the number of times we've requested a block read, though it may not actually be a read if the block was in the cache already. > This isn't really the fault of this patch since that member was already > called blocks_fetched. The original documentation of these functions added by 46aa77c refers to "block fetch requests" and "block requests found in cache", so that would not be right either based on your opinion here. If you find "fetch" to be incorrect in this context, here is another idea: - "block read requests" for blocks_fetched(). - "block read requested but actually found in cache" for blocks_hit(). All the system views care only about the difference between both counters to count the number of physical reads really done. -- Michael
Вложения
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
Michael Paquier
Дата:
On Thu, Mar 16, 2023 at 02:13:45PM +0100, Drouvot, Bertrand wrote: > On 3/16/23 12:46 PM, Michael Paquier wrote: >>> I don't think we should pay any particular attention to those 2 ones >>> as anyway nothing prevent the 7 others to be called outside of the >>> pg_stat_xact_all_tables view. >> >> I am not quite sure, TBH. Did you look at the difference with a long >> chain of subtrans, like savepoints? The ODBC driver "loves" producing >> a lot of savepoints, for example. > > No, I did not measure the impact. I have been thinking again about this particular point, and I would be fine with an additional boolean flag to compute the subtrans data in the global counter when only necessary. This would not make the macros for the stat functions that much more complicated, while being sure that we don't do unnecessary computations when we know that we don't need them.. -- Michael
Вложения
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
Andres Freund
Дата:
Hi, On 2023-03-25 11:56:22 +0900, Michael Paquier wrote: > On Thu, Mar 16, 2023 at 02:13:45PM +0100, Drouvot, Bertrand wrote: > > On 3/16/23 12:46 PM, Michael Paquier wrote: > >>> I don't think we should pay any particular attention to those 2 ones > >>> as anyway nothing prevent the 7 others to be called outside of the > >>> pg_stat_xact_all_tables view. > >> > >> I am not quite sure, TBH. Did you look at the difference with a long > >> chain of subtrans, like savepoints? The ODBC driver "loves" producing > >> a lot of savepoints, for example. > > > > No, I did not measure the impact. > > I have been thinking again about this particular point, and I would be > fine with an additional boolean flag to compute the subtrans data in > the global counter when only necessary. This would not make the > macros for the stat functions that much more complicated, while being > sure that we don't do unnecessary computations when we know that we > don't need them.. I don't understand what we're optimizing for here. These functions are very very very far from being a hot path. The xact functions are barely ever used. Compared to the cost of query evaluation the cost of iterating throught he subxacts is neglegible. Greetings, Andres Freund
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
Michael Paquier
Дата:
On Fri, Mar 24, 2023 at 08:00:44PM -0700, Andres Freund wrote: > I don't understand what we're optimizing for here. These functions are very > very very far from being a hot path. The xact functions are barely ever > used. Compared to the cost of query evaluation the cost of iterating throught > he subxacts is neglegible. I was wondering about that, and I see why I'm wrong. I have quickly gone up to 10k subtransactions, and while I was seeing what looks like difference of 8~10% in runtime when looking at pg_stat_xact_all_tables, the overval runtime was still close enough (5.8ms vs 6.4ms). At this scale, possible that it was some noise, these seemed repeatable still not to worry about. Anyway, I was looking at this patch, and I still feel that it is a bit incorrect to have the copy of PgStat_TableStatus returned by find_tabstat_entry() to point to the same list of subtransaction data as the pending entry found, while the counters are incremented. This could lead to mistakes if the copy from find_tabstat_entry() is used in an unexpected way in the future. The current callers are OK, but this does not give me a warm feeling :/ -- Michael
Вложения
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
Melanie Plageman
Дата:
On Wed, Mar 22, 2023 at 6:42 PM Michael Paquier <michael@paquier.xyz> wrote: > > On Wed, Mar 22, 2023 at 02:21:12PM -0400, Melanie Plageman wrote: > > Apologies as I know this docs update has already been committed, but > > buffers fetched and blocks fetched both feel weird to me. If you have a > > cache hit, you don't end up really "fetching" anything at all (since > > pgstat_count_buffer_read() is called before ReadBuffer_common() and we > > don't know if it is a hit or miss yet). And, I would normally associate > > fetching with fetching a block into a buffer. It seems like this counter > > is really reflecting the number of buffers acquired or used. > > Well, it is the number of times we've requested a block read, though > it may not actually be a read if the block was in the cache already. > > > This isn't really the fault of this patch since that member was already > > called blocks_fetched. > > The original documentation of these functions added by 46aa77c refers > to "block fetch requests" and "block requests found in cache", so that > would not be right either based on your opinion here. If you find > "fetch" to be incorrect in this context, here is another idea: > - "block read requests" for blocks_fetched(). > - "block read requested but actually found in cache" for blocks_hit(). I do like/prefer "block read requests" and "blocks requested found in cache" Though, now I fear my initial complaint may have been a bit pedantic. - Melanie
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
Michael Paquier
Дата:
On Mon, Mar 27, 2023 at 10:24:46AM -0400, Melanie Plageman wrote: > I do like/prefer "block read requests" and > "blocks requested found in cache" > Though, now I fear my initial complaint may have been a bit pedantic. That's fine. Let's ask for extra opinions, then. So, have others an opinion to share here? -- Michael
Вложения
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
Kyotaro Horiguchi
Дата:
At Tue, 28 Mar 2023 07:38:25 +0900, Michael Paquier <michael@paquier.xyz> wrote in > On Mon, Mar 27, 2023 at 10:24:46AM -0400, Melanie Plageman wrote: > > I do like/prefer "block read requests" and > > "blocks requested found in cache" > > Though, now I fear my initial complaint may have been a bit pedantic. > > That's fine. Let's ask for extra opinions, then. > > So, have others an opinion to share here? I do not have a strong preference for the wording, but consistency is important. IMHO simply swapping out a few words won't really improve things. I found that commit ddfc2d9a37 removed the descriptions for pg_stat_get_blocks_fetched and pg_stat_get_blocks_hit. Right before that commit, monitoring.sgml had these lines: - <function>pg_stat_get_blocks_fetched</function> minus - <function>pg_stat_get_blocks_hit</function> gives the number of kernel - <function>read()</> calls issued for the table, index, or - database; the number of actual physical reads is usually - lower due to kernel-level buffering. The <literal>*_blks_read</> - statistics columns use this subtraction, i.e., fetched minus hit. The commit then added the following sentence to the description for pg_statio_all_tables.heap_blks_read. > <entry>Number of disk blocks read from this table. > This value can also be returned by directly calling > the <function>pg_stat_get_blocks_fetched</function> and > <function>pg_stat_get_blocks_hit</function> functions and > subtracting the results.</entry> Later, in 5f2b089387 it twas revised as: + <entry>Number of disk blocks read in this database</entry> This revision lost the explantion regarding the relationship among fetch, hit and read, as it became hidden in the views' definitions. As the result, in the current state, it doesn't make sense to just add a description for pg_stat_get_xact_blocks_fetched(). The confusion stems from the inconsistency between the views and underlying functions related to block reads and hits. If we add descriptions for the two functions, we should also explain their relationship. Otherwise, it might be better to add the functions pg_stat_*_blocks_read() instead. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
Michael Paquier
Дата:
On Tue, Mar 28, 2023 at 12:36:15PM +0900, Kyotaro Horiguchi wrote: > I found that commit ddfc2d9a37 removed the descriptions for > pg_stat_get_blocks_fetched and pg_stat_get_blocks_hit. Right before > that commit, monitoring.sgml had these lines: > > - <function>pg_stat_get_blocks_fetched</function> minus > - <function>pg_stat_get_blocks_hit</function> gives the number of kernel > - <function>read()</> calls issued for the table, index, or > - database; the number of actual physical reads is usually > - lower due to kernel-level buffering. The <literal>*_blks_read</> > - statistics columns use this subtraction, i.e., fetched minus hit. > > The commit then added the following sentence to the description for > pg_statio_all_tables.heap_blks_read. > > Later, in 5f2b089387 it twas revised as: > + <entry>Number of disk blocks read in this database</entry> Yeah, maybe adding something like that at the bottom of the table for stat functions, telling that the difference is the number of read() calls, may help. Perhaps also adding a mention that these are used in none of the existing system views.. > The confusion stems from the inconsistency between the views and > underlying functions related to block reads and hits. If we add > descriptions for the two functions, we should also explain their > relationship. Otherwise, it might be better to add the functions > pg_stat_*_blocks_read() instead. I am not sure that we really need to get down to that as this holds the same meaning as the current system views showing read as the difference between fetched and hit. -- Michael
Вложения
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
"Drouvot, Bertrand"
Дата:
Hi, On 3/28/23 7:23 AM, Michael Paquier wrote: > On Tue, Mar 28, 2023 at 12:36:15PM +0900, Kyotaro Horiguchi wrote: >> I found that commit ddfc2d9a37 removed the descriptions for >> pg_stat_get_blocks_fetched and pg_stat_get_blocks_hit. Right before >> that commit, monitoring.sgml had these lines: >> >> - <function>pg_stat_get_blocks_fetched</function> minus >> - <function>pg_stat_get_blocks_hit</function> gives the number of kernel >> - <function>read()</> calls issued for the table, index, or >> - database; the number of actual physical reads is usually >> - lower due to kernel-level buffering. The <literal>*_blks_read</> >> - statistics columns use this subtraction, i.e., fetched minus hit. >> >> The commit then added the following sentence to the description for >> pg_statio_all_tables.heap_blks_read. >> >> Later, in 5f2b089387 it twas revised as: >> + <entry>Number of disk blocks read in this database</entry> > > Yeah, maybe adding something like that at the bottom of the table for > stat functions, telling that the difference is the number of read() > calls, may help. Perhaps also adding a mention that these are used in > none of the existing system views.. > >> The confusion stems from the inconsistency between the views and >> underlying functions related to block reads and hits. If we add >> descriptions for the two functions, we should also explain their >> relationship. I agree that adding more explanation would help and avoid confusion. What about something like? for pg_stat_get_xact_blocks_fetched(): "block read requests for table or index, in the current transaction. This number minus pg_stat_get_xact_blocks_hit() gives the number of kernel read() calls." pg_stat_get_xact_blocks_hit(): "block read requests for table or index, in the current transaction, found in cache (not triggering kernel read() calls)". Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
Michael Paquier
Дата:
On Tue, Mar 28, 2023 at 07:49:45AM +0200, Drouvot, Bertrand wrote: > What about something like? > > for pg_stat_get_xact_blocks_fetched(): "block read requests for table or index, in the current > transaction. This number minus pg_stat_get_xact_blocks_hit() gives the number of kernel > read() calls." > > pg_stat_get_xact_blocks_hit(): "block read requests for table or index, in the current > transaction, found in cache (not triggering kernel read() calls)". Something among these lines within the table would be also OK by me. Horiguchi-san or Melanie-san, perhaps you have counter-proposals? -- Michael
Вложения
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
Kyotaro Horiguchi
Дата:
At Tue, 28 Mar 2023 15:16:36 +0900, Michael Paquier <michael@paquier.xyz> wrote in > On Tue, Mar 28, 2023 at 07:49:45AM +0200, Drouvot, Bertrand wrote: > > What about something like? > > > > for pg_stat_get_xact_blocks_fetched(): "block read requests for table or index, in the current > > transaction. This number minus pg_stat_get_xact_blocks_hit() gives the number of kernel > > read() calls." > > > > pg_stat_get_xact_blocks_hit(): "block read requests for table or index, in the current > > transaction, found in cache (not triggering kernel read() calls)". > > Something among these lines within the table would be also OK by me. > Horiguchi-san or Melanie-san, perhaps you have counter-proposals? No. Fine by me, except that "block read requests" seems to suggest kernel read() calls, maybe because it's not clear whether "block" refers to our buffer blocks or file blocks to me.. If it is generally clear, I'm fine with the proposal. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
Michael Paquier
Дата:
On Tue, Mar 28, 2023 at 05:43:26PM +0900, Kyotaro Horiguchi wrote: > No. Fine by me, except that "block read requests" seems to suggest > kernel read() calls, maybe because it's not clear whether "block" > refers to our buffer blocks or file blocks to me.. If it is generally > clear, I'm fine with the proposal. Okay. Would somebody like to draft a patch? -- Michael
Вложения
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
"Drouvot, Bertrand"
Дата:
Hi, On 3/29/23 2:09 AM, Michael Paquier wrote: > On Tue, Mar 28, 2023 at 05:43:26PM +0900, Kyotaro Horiguchi wrote: >> No. Fine by me, except that "block read requests" seems to suggest >> kernel read() calls, maybe because it's not clear whether "block" >> refers to our buffer blocks or file blocks to me.. If it is generally >> clear, I'm fine with the proposal. > > Okay. Would somebody like to draft a patch? Please find a draft attached. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Вложения
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
Michael Paquier
Дата:
On Wed, Mar 29, 2023 at 07:44:20AM +0200, Drouvot, Bertrand wrote: > Please find a draft attached. This addition looks OK for me. Thanks for the patch! -- Michael
Вложения
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
Michael Paquier
Дата:
On Tue, Apr 04, 2023 at 09:04:34PM +0900, Michael Paquier wrote: > This addition looks OK for me. Thanks for the patch! Okay, finally done. One part that was still not complete to me in light of the information ddfc2d9 has removed is that the number of physical reads could be lower than the reported number depending on what the kernel cache holds. So I've added this sentence, while on it. -- Michael
Вложения
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
"Drouvot, Bertrand"
Дата:
Hi, On 3/27/23 8:35 AM, Michael Paquier wrote: > On Fri, Mar 24, 2023 at 08:00:44PM -0700, Andres Freund wrote: >> I don't understand what we're optimizing for here. These functions are very >> very very far from being a hot path. The xact functions are barely ever >> used. Compared to the cost of query evaluation the cost of iterating throught >> he subxacts is neglegible. > > I was wondering about that, and I see why I'm wrong. I have quickly > gone up to 10k subtransactions, and while I was seeing what looks like > difference of 8~10% in runtime when looking at > pg_stat_xact_all_tables, the overval runtime was still close enough > (5.8ms vs 6.4ms). At this scale, possible that it was some noise, > these seemed repeatable still not to worry about. > > Anyway, I was looking at this patch, and I still feel that it is a bit > incorrect to have the copy of PgStat_TableStatus returned by > find_tabstat_entry() to point to the same list of subtransaction data > as the pending entry found, while the counters are incremented. This > could lead to mistakes if the copy from find_tabstat_entry() is used > in an unexpected way in the future. The current callers are OK, but > this does not give me a warm feeling :/ FWIW, please find attached V7 (mandatory rebase). It would allow to also define: - pg_stat_get_xact_tuples_inserted - pg_stat_get_xact_tuples_updated - pg_stat_get_xact_tuples_deleted as macros, joining others pg_stat_get_xact_*() that are already defined as macros. The concern you raised above has not been addressed, meaning that find_tabstat_entry() still return a copy of PgStat_TableStatus. By "used in an unexpected way in the future", what do you mean exactly? Do you mean the caller forgetting it is working on a copy and then could work with "stale" counters? Trying to understand to see if I should invest time to try to address your concern or leave those 3 functions as they are currently before moving back to the "Split index and table statistics into different types of stats" work [1]. [1]: https://www.postgresql.org/message-id/flat/f572abe7-a1bb-e13b-48c7-2ca150546822@gmail.com Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Вложения
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
Michael Paquier
Дата:
On Thu, Oct 26, 2023 at 10:04:25AM +0200, Drouvot, Bertrand wrote: > By "used in an unexpected way in the future", what do you mean exactly? Do you mean > the caller forgetting it is working on a copy and then could work with > "stale" counters? (Be careful about the code indentation.) The part that I found disturbing is here: + tabentry = (PgStat_TableStatus *) entry_ref->pending; + tablestatus = palloc(sizeof(PgStat_TableStatus)); + *tablestatus = *tabentry; This causes tablestatus->trans to point to the same location as tabentry->trans, but wouldn't it be better to set tablestatus->trans to NULL instead for the copy returned to the caller? -- Michael
Вложения
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
"Drouvot, Bertrand"
Дата:
Hi, On 10/27/23 8:07 AM, Michael Paquier wrote: > > The part that I found disturbing is here: > + tabentry = (PgStat_TableStatus *) entry_ref->pending; > + tablestatus = palloc(sizeof(PgStat_TableStatus)); > + *tablestatus = *tabentry; > > This causes tablestatus->trans to point to the same location as > tabentry->trans, but wouldn't it be better to set tablestatus->trans > to NULL instead for the copy returned to the caller? Oh I see, yeah I do agree to set tablestatus->trans to NULL to avoid any undesired interference with tabentry->trans. Done in V8 attached (pgindent has been run on pgstatfuncs.c and pgstat_relation.c). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Вложения
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
Michael Paquier
Дата:
On Fri, Oct 27, 2023 at 09:45:52AM +0200, Drouvot, Bertrand wrote: > Oh I see, yeah I do agree to set tablestatus->trans to NULL to avoid > any undesired interference with tabentry->trans. > > Done in V8 attached (pgindent has been run on pgstatfuncs.c and > pgstat_relation.c). LGTM. -- Michael
Вложения
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
Michael Paquier
Дата:
On Fri, Oct 27, 2023 at 09:45:52AM +0200, Drouvot, Bertrand wrote: > Done in V8 attached (pgindent has been run on pgstatfuncs.c and > pgstat_relation.c). And applied that after editing a bit the comments. -- Michael
Вложения
Re: Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry
От
Andres Freund
Дата:
On 2023-10-30 08:25:17 +0900, Michael Paquier wrote: > On Fri, Oct 27, 2023 at 09:45:52AM +0200, Drouvot, Bertrand wrote: > > Done in V8 attached (pgindent has been run on pgstatfuncs.c and > > pgstat_relation.c). > > And applied that after editing a bit the comments. Thank you both!