Обсуждение: Add connection active, idle time to pg_stat_activity

Поиск
Список
Период
Сортировка

Add connection active, idle time to pg_stat_activity

От
Rafia Sabih
Дата:
Hello there hackers,

We at Zalando have faced some issues around long running idle
transactions and were thinking about increasing the visibility of
pg_stat_* views to capture them easily. What I found is that currently
in pg_stat_activity there is a lot of good information about the
current state of the process, but it is lacking the cumulative
information on how much time the connection spent being idle, idle in
transaction or active, we would like to see cumulative values for each
of these per connection. I believe it would be helpful for us and more
people out there if we could have total connection active and idle
time displayed in pg_stat_activity.

To provide this information I was digging into how the statistics
collector is working and found out there is already information like
total time that a connection is active as well as idle computed in
pgstat_report_activity[1]. Ideally, this would be the values we would
like to see per process in pg_stat_activity.

Curious to know your thoughts on this.


[1]https://github.com/postgres/postgres/blob/cd3f429d9565b2e5caf0980ea7c707e37bc3b317/src/backend/utils/activity/backend_status.c#L593



-- 
Regards,
Rafia Sabih



Re: Add connection active, idle time to pg_stat_activity

От
Rafia Sabih
Дата:
On Fri, 22 Oct 2021 at 10:22, Rafia Sabih <rafia.pghackers@gmail.com> wrote:
>
> Hello there hackers,
>
> We at Zalando have faced some issues around long running idle
> transactions and were thinking about increasing the visibility of
> pg_stat_* views to capture them easily. What I found is that currently
> in pg_stat_activity there is a lot of good information about the
> current state of the process, but it is lacking the cumulative
> information on how much time the connection spent being idle, idle in
> transaction or active, we would like to see cumulative values for each
> of these per connection. I believe it would be helpful for us and more
> people out there if we could have total connection active and idle
> time displayed in pg_stat_activity.
>
> To provide this information I was digging into how the statistics
> collector is working and found out there is already information like
> total time that a connection is active as well as idle computed in
> pgstat_report_activity[1]. Ideally, this would be the values we would
> like to see per process in pg_stat_activity.
>
> Curious to know your thoughts on this.
>
>
[1]https://github.com/postgres/postgres/blob/cd3f429d9565b2e5caf0980ea7c707e37bc3b317/src/backend/utils/activity/backend_status.c#L593
>
>
>
> --
> Regards,
> Rafia Sabih

Please find the attached patch for the idea of our intentions.
It basically adds three attributes for idle, idle_in_transaction, and
active time respectively.
Please let me know your views on this and I shall add this to the
upcoming commitfest for better tracking.


--
Regards,
Rafia Sabih

Вложения

Re: Add connection active, idle time to pg_stat_activity

От
Dilip Kumar
Дата:
On Tue, Oct 26, 2021 at 5:17 PM Rafia Sabih <rafia.pghackers@gmail.com> wrote:
>
> >
> > To provide this information I was digging into how the statistics
> > collector is working and found out there is already information like
> > total time that a connection is active as well as idle computed in
> > pgstat_report_activity[1]. Ideally, this would be the values we would
> > like to see per process in pg_stat_activity.
> >
> > Curious to know your thoughts on this.

+1 for the idea

> Please find the attached patch for the idea of our intentions.
> It basically adds three attributes for idle, idle_in_transaction, and
> active time respectively.
> Please let me know your views on this and I shall add this to the
> upcoming commitfest for better tracking.

About the patch, IIUC earlier all the idle time was accumulated in the
"pgStatTransactionIdleTime" counter, now with your patch you have
introduced one more counter which specifically tracks the
STATE_IDLEINTRANSACTION state.  But my concern is that the
STATE_IDLEINTRANSACTION_ABORTED is still computed under STATE_IDLE and
that looks odd to me.  Either STATE_IDLEINTRANSACTION_ABORTED should
be accumulated in the "pgStatTransactionIdleInTxnTime" counter or
there should be a separate counter for that.  But after your patch we
can not accumulate this in the "pgStatTransactionIdleTime" counter.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: Add connection active, idle time to pg_stat_activity

От
Rafia Sabih
Дата:
On Tue, 2 Nov 2021 at 09:00, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, Oct 26, 2021 at 5:17 PM Rafia Sabih <rafia.pghackers@gmail.com> wrote:
> >
> > >
> > > To provide this information I was digging into how the statistics
> > > collector is working and found out there is already information like
> > > total time that a connection is active as well as idle computed in
> > > pgstat_report_activity[1]. Ideally, this would be the values we would
> > > like to see per process in pg_stat_activity.
> > >
> > > Curious to know your thoughts on this.
>
> +1 for the idea
>
Thanks!

> > Please find the attached patch for the idea of our intentions.
> > It basically adds three attributes for idle, idle_in_transaction, and
> > active time respectively.
> > Please let me know your views on this and I shall add this to the
> > upcoming commitfest for better tracking.
>
> About the patch, IIUC earlier all the idle time was accumulated in the
> "pgStatTransactionIdleTime" counter, now with your patch you have
> introduced one more counter which specifically tracks the
> STATE_IDLEINTRANSACTION state.  But my concern is that the
> STATE_IDLEINTRANSACTION_ABORTED is still computed under STATE_IDLE and
> that looks odd to me.  Either STATE_IDLEINTRANSACTION_ABORTED should
> be accumulated in the "pgStatTransactionIdleInTxnTime" counter or
> there should be a separate counter for that.  But after your patch we
> can not accumulate this in the "pgStatTransactionIdleTime" counter.
>
As per your comments I have added it in pgStatTransactionIdleInTxnTime.
Please let me know if there are any further comments.

-- 
Regards,
Rafia Sabih

Вложения

Re: Add connection active, idle time to pg_stat_activity

От
Dilip Kumar
Дата:
On Tue, Nov 9, 2021 at 8:28 PM Rafia Sabih <rafia.pghackers@gmail.com> wrote:
>
> On Tue, 2 Nov 2021 at 09:00, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >

> > About the patch, IIUC earlier all the idle time was accumulated in the
> > "pgStatTransactionIdleTime" counter, now with your patch you have
> > introduced one more counter which specifically tracks the
> > STATE_IDLEINTRANSACTION state.  But my concern is that the
> > STATE_IDLEINTRANSACTION_ABORTED is still computed under STATE_IDLE and
> > that looks odd to me.  Either STATE_IDLEINTRANSACTION_ABORTED should
> > be accumulated in the "pgStatTransactionIdleInTxnTime" counter or
> > there should be a separate counter for that.  But after your patch we
> > can not accumulate this in the "pgStatTransactionIdleTime" counter.
> >
> As per your comments I have added it in pgStatTransactionIdleInTxnTime.
> Please let me know if there are any further comments.

I have a few comments,

             nulls[29] = true;
+            values[30] = true;
+            values[31] = true;
+            values[32] = true;

This looks wrong, this should be nulls[] = true not values[]=true.

if ((beentry->st_state == STATE_RUNNING ||
  beentry->st_state == STATE_FASTPATH ||
  beentry->st_state == STATE_IDLEINTRANSACTION ||
  beentry->st_state == STATE_IDLEINTRANSACTION_ABORTED) &&
  state != beentry->st_state)
{
if (beentry->st_state == STATE_RUNNING ||
beentry->st_state == STATE_FASTPATH)
{
   pgstat_count_conn_active_time((PgStat_Counter) secs * 1000000 + usecs);
  beentry->st_active_time = pgStatActiveTime;
}
else if (beentry->st_state == STATE_IDLEINTRANSACTION ||
   beentry->st_state == STATE_IDLEINTRANSACTION_ABORTED)
{
  pgstat_count_conn_txn_idle_in_txn_time((PgStat_Counter) secs *
1000000 + usecs);
  beentry->st_idle_in_transaction_time = pgStatTransactionIdleInTxnTime;
}
else
{
  pgstat_count_conn_txn_idle_time((PgStat_Counter) secs * 1000000 + usecs);
  beentry->st_idle_time = pgStatTransactionIdleTime;
}

It seems that in beentry->st_idle_time, you want to compute the
STATE_IDLE, but that state is not handled in the outer "if", that
means whenever it comes out of the
STATE_IDLE, it will not enter inside this if check.  You can run and
test, I am sure that with this patch the "idle_time" will always
remain 0.


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: Add connection active, idle time to pg_stat_activity

От
Rafia Sabih
Дата:
On Wed, 10 Nov 2021 at 09:05, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, Nov 9, 2021 at 8:28 PM Rafia Sabih <rafia.pghackers@gmail.com> wrote:
> >
> > On Tue, 2 Nov 2021 at 09:00, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
>
> > > About the patch, IIUC earlier all the idle time was accumulated in the
> > > "pgStatTransactionIdleTime" counter, now with your patch you have
> > > introduced one more counter which specifically tracks the
> > > STATE_IDLEINTRANSACTION state.  But my concern is that the
> > > STATE_IDLEINTRANSACTION_ABORTED is still computed under STATE_IDLE and
> > > that looks odd to me.  Either STATE_IDLEINTRANSACTION_ABORTED should
> > > be accumulated in the "pgStatTransactionIdleInTxnTime" counter or
> > > there should be a separate counter for that.  But after your patch we
> > > can not accumulate this in the "pgStatTransactionIdleTime" counter.
> > >
> > As per your comments I have added it in pgStatTransactionIdleInTxnTime.
> > Please let me know if there are any further comments.
>
> I have a few comments,
>
>              nulls[29] = true;
> +            values[30] = true;
> +            values[31] = true;
> +            values[32] = true;
>
> This looks wrong, this should be nulls[] = true not values[]=true.
>
> if ((beentry->st_state == STATE_RUNNING ||
>   beentry->st_state == STATE_FASTPATH ||
>   beentry->st_state == STATE_IDLEINTRANSACTION ||
>   beentry->st_state == STATE_IDLEINTRANSACTION_ABORTED) &&
>   state != beentry->st_state)
> {
> if (beentry->st_state == STATE_RUNNING ||
> beentry->st_state == STATE_FASTPATH)
> {
>    pgstat_count_conn_active_time((PgStat_Counter) secs * 1000000 + usecs);
>   beentry->st_active_time = pgStatActiveTime;
> }
> else if (beentry->st_state == STATE_IDLEINTRANSACTION ||
>    beentry->st_state == STATE_IDLEINTRANSACTION_ABORTED)
> {
>   pgstat_count_conn_txn_idle_in_txn_time((PgStat_Counter) secs *
> 1000000 + usecs);
>   beentry->st_idle_in_transaction_time = pgStatTransactionIdleInTxnTime;
> }
> else
> {
>   pgstat_count_conn_txn_idle_time((PgStat_Counter) secs * 1000000 + usecs);
>   beentry->st_idle_time = pgStatTransactionIdleTime;
> }
>
> It seems that in beentry->st_idle_time, you want to compute the
> STATE_IDLE, but that state is not handled in the outer "if", that
> means whenever it comes out of the
> STATE_IDLE, it will not enter inside this if check.  You can run and
> test, I am sure that with this patch the "idle_time" will always
> remain 0.
>
Thank you Dilip for your time on this.
And yes you are right in both your observations.
Please find the attached patch for the updated version.

-- 
Regards,
Rafia Sabih

Вложения

Re: Add connection active, idle time to pg_stat_activity

От
Dilip Kumar
Дата:
On Wed, Nov 10, 2021 at 1:47 PM Rafia Sabih <rafia.pghackers@gmail.com> wrote:
>
> > It seems that in beentry->st_idle_time, you want to compute the
> > STATE_IDLE, but that state is not handled in the outer "if", that
> > means whenever it comes out of the
> > STATE_IDLE, it will not enter inside this if check.  You can run and
> > test, I am sure that with this patch the "idle_time" will always
> > remain 0.
> >
> Thank you Dilip for your time on this.
> And yes you are right in both your observations.
> Please find the attached patch for the updated version.

Looks fine now except these variable names,

 PgStat_Counter pgStatTransactionIdleTime = 0;
+PgStat_Counter pgStatTransactionIdleInTxnTime = 0;

Now, pgStatTransactionIdleTime is collecting just the Idle time so
pgStatTransactionIdleTime should be renamed to "pgStatIdleTime" and
pgStatTransactionIdleInTxnTime should be renamed to
"pgStatTransactionIdleTime"

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: Add connection active, idle time to pg_stat_activity

От
Rafia Sabih
Дата:
On Mon, 15 Nov 2021 at 10:24, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Wed, Nov 10, 2021 at 1:47 PM Rafia Sabih <rafia.pghackers@gmail.com> wrote:
> >
> > > It seems that in beentry->st_idle_time, you want to compute the
> > > STATE_IDLE, but that state is not handled in the outer "if", that
> > > means whenever it comes out of the
> > > STATE_IDLE, it will not enter inside this if check.  You can run and
> > > test, I am sure that with this patch the "idle_time" will always
> > > remain 0.
> > >
> > Thank you Dilip for your time on this.
> > And yes you are right in both your observations.
> > Please find the attached patch for the updated version.
>
> Looks fine now except these variable names,
>
>  PgStat_Counter pgStatTransactionIdleTime = 0;
> +PgStat_Counter pgStatTransactionIdleInTxnTime = 0;
>
> Now, pgStatTransactionIdleTime is collecting just the Idle time so
> pgStatTransactionIdleTime should be renamed to "pgStatIdleTime" and
> pgStatTransactionIdleInTxnTime should be renamed to
> "pgStatTransactionIdleTime"
>
Good point!
Done.


-- 
Regards,
Rafia Sabih

Вложения

Re: Add connection active, idle time to pg_stat_activity

От
Dilip Kumar
Дата:
On Mon, Nov 15, 2021 at 4:46 PM Rafia Sabih <rafia.pghackers@gmail.com> wrote:
>
> On Mon, 15 Nov 2021 at 10:24, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Wed, Nov 10, 2021 at 1:47 PM Rafia Sabih <rafia.pghackers@gmail.com> wrote:
> > >
> > > > It seems that in beentry->st_idle_time, you want to compute the
> > > > STATE_IDLE, but that state is not handled in the outer "if", that
> > > > means whenever it comes out of the
> > > > STATE_IDLE, it will not enter inside this if check.  You can run and
> > > > test, I am sure that with this patch the "idle_time" will always
> > > > remain 0.
> > > >
> > > Thank you Dilip for your time on this.
> > > And yes you are right in both your observations.
> > > Please find the attached patch for the updated version.
> >
> > Looks fine now except these variable names,
> >
> >  PgStat_Counter pgStatTransactionIdleTime = 0;
> > +PgStat_Counter pgStatTransactionIdleInTxnTime = 0;
> >
> > Now, pgStatTransactionIdleTime is collecting just the Idle time so
> > pgStatTransactionIdleTime should be renamed to "pgStatIdleTime" and
> > pgStatTransactionIdleInTxnTime should be renamed to
> > "pgStatTransactionIdleTime"
> >
> Good point!
> Done.

@@ -1018,7 +1019,7 @@ pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg,
TimestampTz now)
  pgLastSessionReportTime = now;
  tsmsg->m_session_time = (PgStat_Counter) secs * 1000000 + usecs;
  tsmsg->m_active_time = pgStatActiveTime;
- tsmsg->m_idle_in_xact_time = pgStatTransactionIdleTime;
+ tsmsg->m_idle_in_xact_time = pgStatIdleTime;

I think this change is wrong,  basically, "tsmsg->m_idle_in_xact_time"
is used for counting the database level idle in transaction count, you
can check "pg_stat_get_db_idle_in_transaction_time" function for that.
So "pgStatTransactionIdleTime" is the variable counting the idle in
transaction time, pgStatIdleTime is just counting the idle time
outside the transaction so if we make this change we are changing the
meaning of tsmsg->m_idle_in_xact_time.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: Add connection active, idle time to pg_stat_activity

От
Rafia Sabih
Дата:
On Mon, 15 Nov 2021 at 12:40, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Mon, Nov 15, 2021 at 4:46 PM Rafia Sabih <rafia.pghackers@gmail.com> wrote:
> >
> > On Mon, 15 Nov 2021 at 10:24, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Wed, Nov 10, 2021 at 1:47 PM Rafia Sabih <rafia.pghackers@gmail.com> wrote:
> > > >
> > > > > It seems that in beentry->st_idle_time, you want to compute the
> > > > > STATE_IDLE, but that state is not handled in the outer "if", that
> > > > > means whenever it comes out of the
> > > > > STATE_IDLE, it will not enter inside this if check.  You can run and
> > > > > test, I am sure that with this patch the "idle_time" will always
> > > > > remain 0.
> > > > >
> > > > Thank you Dilip for your time on this.
> > > > And yes you are right in both your observations.
> > > > Please find the attached patch for the updated version.
> > >
> > > Looks fine now except these variable names,
> > >
> > >  PgStat_Counter pgStatTransactionIdleTime = 0;
> > > +PgStat_Counter pgStatTransactionIdleInTxnTime = 0;
> > >
> > > Now, pgStatTransactionIdleTime is collecting just the Idle time so
> > > pgStatTransactionIdleTime should be renamed to "pgStatIdleTime" and
> > > pgStatTransactionIdleInTxnTime should be renamed to
> > > "pgStatTransactionIdleTime"
> > >
> > Good point!
> > Done.
>
> @@ -1018,7 +1019,7 @@ pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg,
> TimestampTz now)
>   pgLastSessionReportTime = now;
>   tsmsg->m_session_time = (PgStat_Counter) secs * 1000000 + usecs;
>   tsmsg->m_active_time = pgStatActiveTime;
> - tsmsg->m_idle_in_xact_time = pgStatTransactionIdleTime;
> + tsmsg->m_idle_in_xact_time = pgStatIdleTime;
>
> I think this change is wrong,  basically, "tsmsg->m_idle_in_xact_time"
> is used for counting the database level idle in transaction count, you
> can check "pg_stat_get_db_idle_in_transaction_time" function for that.
> So "pgStatTransactionIdleTime" is the variable counting the idle in
> transaction time, pgStatIdleTime is just counting the idle time
> outside the transaction so if we make this change we are changing the
> meaning of tsmsg->m_idle_in_xact_time.

Got it.
Updated

-- 
Regards,
Rafia Sabih

Вложения

Re: Add connection active, idle time to pg_stat_activity

От
Dilip Kumar
Дата:
On Tue, Nov 16, 2021 at 5:06 PM Rafia Sabih <rafia.pghackers@gmail.com> wrote:
> > I think this change is wrong,  basically, "tsmsg->m_idle_in_xact_time"
> > is used for counting the database level idle in transaction count, you
> > can check "pg_stat_get_db_idle_in_transaction_time" function for that.
> > So "pgStatTransactionIdleTime" is the variable counting the idle in
> > transaction time, pgStatIdleTime is just counting the idle time
> > outside the transaction so if we make this change we are changing the
> > meaning of tsmsg->m_idle_in_xact_time.
>
> Got it.
> Updated

Okay, thanks, I will look into it one more time early next week and if
I see no issues then I will move it to RFC.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: Add connection active, idle time to pg_stat_activity

От
Bharath Rupireddy
Дата:
On Tue, Nov 16, 2021 at 5:06 PM Rafia Sabih <rafia.pghackers@gmail.com> wrote:
> Got it.
> Updated

Thanks for the patch. +1 for adding the idle/idle_in_txn_time/active
time. I believe these are the total times a backend in its lifetime
accumulates. For instance, if a backend runs 100 txns, then these new
columns show the total time that the backend spent during these 100
txns, right?

Few comments on the patch:

1) Patch is missing a commit message. It is good to have a commit
message describing the high-level of the feature.
2) This patch needs to bump the catalog version, at the end of the
commit message, we usually keep a note "Bump the catalog version".
3) It looks like the documentation is missing [1], for the new columns.
4) When will these backend variables be reset? Is it at the backend
startup? Or some other? If these variables are reset only at the
backend startup and do they keep growing during the entire life of the
backend process? If yes, what happens for a long running backend/user
session, don't they get overflowed?

+
+ int64 st_active_time;
+ int64 st_transaction_idle_time;
+ int64 st_idle_time;
 } PgBackendStatus;

5) Is there any way you can get them tested?
6) What will be entries of st_active_time, st_transaction_idle_time,
st_idle_time for non-backend processes, like bg writer, checkpointer,
parallel worker, bg worker, logical replication launcher, stats
collector, sys logger etc?

[1] https://www.postgresql.org/docs/devel/monitoring-stats.html#MONITORING-PG-STAT-ACTIVITY-VIEW

Regards,
Bharath Rupireddy.



Re: Add connection active, idle time to pg_stat_activity

От
Dilip Kumar
Дата:
On Sat, Nov 27, 2021 at 8:00 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Tue, Nov 16, 2021 at 5:06 PM Rafia Sabih <rafia.pghackers@gmail.com> wrote:
> > Got it.
> > Updated
>
> Thanks for the patch. +1 for adding the idle/idle_in_txn_time/active
> time. I believe these are the total times a backend in its lifetime
> accumulates. For instance, if a backend runs 100 txns, then these new
> columns show the total time that the backend spent during these 100
> txns, right?
>
> Few comments on the patch:
>
> 1) Patch is missing a commit message. It is good to have a commit
> message describing the high-level of the feature.
> 2) This patch needs to bump the catalog version, at the end of the
> commit message, we usually keep a note "Bump the catalog version".
> 3) It looks like the documentation is missing [1], for the new columns.
> 4) When will these backend variables be reset? Is it at the backend
> startup? Or some other? If these variables are reset only at the
> backend startup and do they keep growing during the entire life of the
> backend process? If yes, what happens for a long running backend/user
> session, don't they get overflowed?

This is a 64-bit variable so I am not sure do we really need to worry
about overflow? I mean if we are storing microseconds then also this
will be able to last for ~300,000 years no?

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: Add connection active, idle time to pg_stat_activity

От
Kuntal Ghosh
Дата:
On Fri, Oct 22, 2021 at 1:53 PM Rafia Sabih <rafia.pghackers@gmail.com> wrote:
> To provide this information I was digging into how the statistics
> collector is working and found out there is already information like
> total time that a connection is active as well as idle computed in
> pgstat_report_activity[1]. Ideally, this would be the values we would
> like to see per process in pg_stat_activity.
It's definitely useful to know how much time a backend has spent for
query executions. Once you've this info, you can easily calculate the
idle time using this information: (now() - backend_start) -
active_time. But, I'm wondering why you need to distinguish between
idle and idle in transactions - what's the usage? Either the backend
is doing some work or it sits idle. Another useful information would
be when the last query execution was ended. From this information, you
can figure out whether a backend is idle for a long time since the
last execution and the execution time of the last query (query_end -
query_start).

You also need to update the documentation.

-- 
Thanks & Regards,
Kuntal Ghosh



Re: Add connection active, idle time to pg_stat_activity

От
Julien Rouhaud
Дата:
Hi,

On Mon, Nov 29, 2021 at 11:04 PM Kuntal Ghosh
<kuntalghosh.2007@gmail.com> wrote:
>
> You also need to update the documentation.

You also need to update rules.sql: https://cirrus-ci.com/task/6145265819189248



Re: Add connection active, idle time to pg_stat_activity

От
Julien Rouhaud
Дата:
Hi,

On Wed, Jan 12, 2022 at 02:16:35PM +0800, Julien Rouhaud wrote:
> 
> On Mon, Nov 29, 2021 at 11:04 PM Kuntal Ghosh
> <kuntalghosh.2007@gmail.com> wrote:
> >
> > You also need to update the documentation.
> 
> You also need to update rules.sql: https://cirrus-ci.com/task/6145265819189248

There has been multiple comments in the last two months that weren't addressed
since, and also the patch doesn't pass the regression tests anymore.

Rafia, do you plan to send a new version soon?  Without update in the next few
days this patch will be closed as Returned with Feedback, per the commitfest
rules.



Re: Add connection active, idle time to pg_stat_activity

От
Sergey Dudoladov
Дата:
Hello,

>   Without update in the next few
> days this patch will be closed as Returned with Feedback,

Thank you for the reminder, Julien.

Per agreement with Rafia I have reworked the patch in the past days.
The new version 6 is now ready for review.

Regards,
Sergey Dudoladov

Вложения

Re: Add connection active, idle time to pg_stat_activity

От
Julien Rouhaud
Дата:
Hi,

On Thu, Jan 27, 2022 at 11:43:26AM +0100, Sergey Dudoladov wrote:
> 
> Per agreement with Rafia I have reworked the patch in the past days.
> The new version 6 is now ready for review.

Great, thanks a lot Sergey!

The cfbot is happy with this new version:
https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest/36/3405



Re: Add connection active, idle time to pg_stat_activity

От
Kyotaro Horiguchi
Дата:
Hi.

At Thu, 27 Jan 2022 20:36:56 +0800, Julien Rouhaud <rjuju123@gmail.com> wrote in
> On Thu, Jan 27, 2022 at 11:43:26AM +0100, Sergey Dudoladov wrote:
> > 
> > Per agreement with Rafia I have reworked the patch in the past days.
> > The new version 6 is now ready for review.
> 
> Great, thanks a lot Sergey!
> 
> The cfbot is happy with this new version:
> https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest/36/3405

I think we can easily add the duration of the current state to the two
in pg_stat_get_activity and it would offer better information.


         if (beentry->st_state == STATE_RUNNING ||
             beentry->st_state == STATE_FASTPATH)
-            pgstat_count_conn_active_time((PgStat_Counter) secs * 1000000 + usecs);
+        {
+            pgstat_count_conn_active_time((PgStat_Counter) usecs_diff);
+            beentry->st_total_active_time += usecs_diff;
+        }

The two lines operates exactly the same way on variables with slightly
different behavior. pgStatActiveTime is reported at transaction end
and reset at every tabstat reporting. st_total_active_time is reported
immediately and reset at session end. Since we do the latter, the
first can be omitted by remembering the last values for the local
variables at every reporting.  This needs additional two exporting
function in pgstatfuncs like pgstat_get_my_queryid so others might
think differently.

The write operation to beentry needs to be enclosed by
PGSTAT_BEGIN/END_WRITE_ACTIVITY().  In that perspective, it would be
better to move that writes to the PGSTAT_WRITE_ACTIVITY section just
below.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Add connection active, idle time to pg_stat_activity

От
Kyotaro Horiguchi
Дата:
At Mon, 29 Nov 2021 20:34:14 +0530, Kuntal Ghosh <kuntalghosh.2007@gmail.com> wrote in 
> active_time. But, I'm wondering why you need to distinguish between
> idle and idle in transactions - what's the usage? Either the backend
> is doing some work or it sits idle. Another useful information would

I believe many people suffer from mysterious long idle in
transactions, which harm server performance many ways.  In many cases
transactions with unexpectedly long idle time is an omen or a cause of
trouble.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Add connection active, idle time to pg_stat_activity

От
Kyotaro Horiguchi
Дата:
At Fri, 28 Jan 2022 14:36:31 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in 
> Hi.
> 
> At Thu, 27 Jan 2022 20:36:56 +0800, Julien Rouhaud <rjuju123@gmail.com> wrote in
> > On Thu, Jan 27, 2022 at 11:43:26AM +0100, Sergey Dudoladov wrote:
> > > 
> > > Per agreement with Rafia I have reworked the patch in the past days.
> > > The new version 6 is now ready for review.
> > 
> > Great, thanks a lot Sergey!
> > 
> > The cfbot is happy with this new version:
> > https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest/36/3405
> 
> I think we can easily add the duration of the current state to the two
> in pg_stat_get_activity and it would offer better information.
> 
> 
>          if (beentry->st_state == STATE_RUNNING ||
>              beentry->st_state == STATE_FASTPATH)
> -            pgstat_count_conn_active_time((PgStat_Counter) secs * 1000000 + usecs);
> +        {
> +            pgstat_count_conn_active_time((PgStat_Counter) usecs_diff);
> +            beentry->st_total_active_time += usecs_diff;
> +        }
> 
> The two lines operates exactly the same way on variables with slightly
> different behavior. pgStatActiveTime is reported at transaction end
> and reset at every tabstat reporting. st_total_active_time is reported
> immediately and reset at session end. Since we do the latter, the

> first can be omitted by remembering the last values for the local
> variables at every reporting.  This needs additional two exporting

Of course it's typo(?) of "values of the shared variables".
Sorry for the mistake.

> function in pgstatfuncs like pgstat_get_my_queryid so others might
> think differently.
> 
> The write operation to beentry needs to be enclosed by
> PGSTAT_BEGIN/END_WRITE_ACTIVITY().  In that perspective, it would be
> better to move that writes to the PGSTAT_WRITE_ACTIVITY section just
> below.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Add connection active, idle time to pg_stat_activity

От
Sergey Dudoladov
Дата:
Hi,

Thank you for the reviews.

> > The write operation to beentry needs to be enclosed by
> > PGSTAT_BEGIN/END_WRITE_ACTIVITY().  In that perspective, it would be
> > better to move that writes to the PGSTAT_WRITE_ACTIVITY section just
> > below.

I have fixed it in the new version.

> >               if (beentry->st_state == STATE_RUNNING ||
> >                       beentry->st_state == STATE_FASTPATH)
> > -                     pgstat_count_conn_active_time((PgStat_Counter) secs * 1000000 + usecs);
> > +             {
> > +                     pgstat_count_conn_active_time((PgStat_Counter) usecs_diff);
> > +                     beentry->st_total_active_time += usecs_diff;
> > +             }
> >
> > The two lines operates exactly the same way on variables with slightly
> > different behavior. pgStatActiveTime is reported at transaction end
> > and reset at every tabstat reporting. st_total_active_time is reported
> > immediately and reset at session end. Since we do the latter, the
> > first can be omitted by remembering the last values for the local
> > variables at every reporting.  This needs additional two exporting
>
> Of course it's typo(?) of "values of the shared variables".

Could you please elaborate on this idea ?
So we have pgStatActiveTime and pgStatIdleInTransactionTime ultimately
used to report respective metrics in pg_stat_database.
Now beentry's st_total_active_time / st_total_transaction_idle_time
duplicates this info, so one may get rid of  pgStat*Time counters. Is
the idea to report  instead of them at every tabstat reporting the
difference between the last memorized value of  st_total_*_time and
its current value ?

> > This needs additional two exporting
> > function in pgstatfuncs like pgstat_get_my_queryid so others might
> > think differently.

What would be example functions to look at ?

Regards,
Sergey

Вложения

Re: Add connection active, idle time to pg_stat_activity

От
Kyotaro Horiguchi
Дата:
At Mon, 31 Jan 2022 15:11:56 +0100, Sergey Dudoladov <sergey.dudoladov@gmail.com> wrote in 
> > >               if (beentry->st_state == STATE_RUNNING ||
> > >                       beentry->st_state == STATE_FASTPATH)
> > > -                     pgstat_count_conn_active_time((PgStat_Counter) secs * 1000000 + usecs);
> > > +             {
> > > +                     pgstat_count_conn_active_time((PgStat_Counter) usecs_diff);
> > > +                     beentry->st_total_active_time += usecs_diff;
> > > +             }
> > >
> > > The two lines operates exactly the same way on variables with slightly
> > > different behavior. pgStatActiveTime is reported at transaction end
> > > and reset at every tabstat reporting. st_total_active_time is reported
> > > immediately and reset at session end. Since we do the latter, the
> > > first can be omitted by remembering the last values for the local
> > > variables at every reporting.  This needs additional two exporting
> >
> > Of course it's typo(?) of "values of the shared variables".
> 
> Could you please elaborate on this idea ?
> So we have pgStatActiveTime and pgStatIdleInTransactionTime ultimately
> used to report respective metrics in pg_stat_database.
> Now beentry's st_total_active_time / st_total_transaction_idle_time
> duplicates this info, so one may get rid of  pgStat*Time counters. Is
> the idea to report  instead of them at every tabstat reporting the
> difference between the last memorized value of  st_total_*_time and
> its current value ?

Exactly. The attached first diff is the schetch of that.

> > > This needs additional two exporting
> > > function in pgstatfuncs like pgstat_get_my_queryid so others might
> > > think differently.
> 
> What would be example functions to look at ?

pgstat_get_my_queryid..


And, it seems like I forgot to mention this, but as Kuntal suggested
(in a different context and objective, though) upthraed, I think that
we can show realtime values in the two time fields by adding the time
of the current state.  See the attached second diff.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 0646f53098..27419c1851 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -249,8 +249,8 @@ static int    pgStatXactRollback = 0;
 PgStat_Counter pgStatBlockReadTime = 0;
 PgStat_Counter pgStatBlockWriteTime = 0;
 static PgStat_Counter pgLastSessionReportTime = 0;
-PgStat_Counter pgStatActiveTime = 0;
-PgStat_Counter pgStatTransactionIdleTime = 0;
+PgStat_Counter pgStatLastActiveTime = 0;
+PgStat_Counter pgStatLastTransactionIdleTime = 0;
 SessionEndType pgStatSessionEndCause = DISCONNECT_NORMAL;
 
 /* Record that's written to 2PC state file when pgstat state is persisted */
@@ -1026,8 +1026,13 @@ pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg, TimestampTz now)
             TimestampDifference(pgLastSessionReportTime, now, &secs, &usecs);
             pgLastSessionReportTime = now;
             tsmsg->m_session_time = (PgStat_Counter) secs * 1000000 + usecs;
-            tsmsg->m_active_time = pgStatActiveTime;
-            tsmsg->m_idle_in_xact_time = pgStatTransactionIdleTime;
+
+            /* send the difference since the last report */
+            tsmsg->m_active_time =
+                pgstat_get_my_active_time() - pgStatLastActiveTime;
+            tsmsg->m_idle_in_xact_time =
+                pgstat_get_my_transaction_idle_time() -
+                pgStatLastTransactionIdleTime;
         }
         else
         {
@@ -1039,8 +1044,8 @@ pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg, TimestampTz now)
         pgStatXactRollback = 0;
         pgStatBlockReadTime = 0;
         pgStatBlockWriteTime = 0;
-        pgStatActiveTime = 0;
-        pgStatTransactionIdleTime = 0;
+        pgStatLastActiveTime = pgstat_get_my_active_time();
+        pgStatLastTransactionIdleTime =  pgstat_get_my_transaction_idle_time();
     }
     else
     {
diff --git a/src/backend/utils/activity/backend_status.c b/src/backend/utils/activity/backend_status.c
index 5f15dcdc05..8b6836a662 100644
--- a/src/backend/utils/activity/backend_status.c
+++ b/src/backend/utils/activity/backend_status.c
@@ -613,15 +613,9 @@ pgstat_report_activity(BackendState state, const char *cmd_str)
          */
         if (beentry->st_state == STATE_RUNNING ||
             beentry->st_state == STATE_FASTPATH)
-        {
-            pgstat_count_conn_active_time((PgStat_Counter) usecs_diff);
             active_time_diff = usecs_diff;
-        }
         else
-        {
-            pgstat_count_conn_txn_idle_time((PgStat_Counter) usecs_diff);
             transaction_idle_time_diff = usecs_diff;
-        }
     }
 
     /*
@@ -1078,6 +1072,48 @@ pgstat_get_my_query_id(void)
 }
 
 
+/* ----------
+ * pgstat_get_my_active_time() -
+ *
+ * Return current backend's accumulated active time.
+ */
+uint64
+pgstat_get_my_active_time(void)
+{
+    if (!MyBEEntry)
+        return 0;
+
+    /*
+     * There's no need for a lock around pgstat_begin_read_activity /
+     * pgstat_end_read_activity here as it's only called from
+     * pg_stat_get_activity which is already protected, or from the same
+     * backend which means that there won't be concurrent writes.
+     */
+    return MyBEEntry->st_total_active_time;
+}
+
+
+/* ----------
+ * pgstat_get_my_transaction_idle_time() -
+ *
+ * Return current backend's accumulated in-transaction idel time.
+ */
+uint64
+pgstat_get_my_transaction_idle_time(void)
+{
+    if (!MyBEEntry)
+        return 0;
+
+    /*
+     * There's no need for a lock around pgstat_begin_read_activity /
+     * pgstat_end_read_activity here as it's only called from
+     * pg_stat_get_activity which is already protected, or from the same
+     * backend which means that there won't be concurrent writes.
+     */
+    return MyBEEntry->st_total_transaction_idle_time;
+}
+
+
 /* ----------
  * pgstat_fetch_stat_beentry() -
  *
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index e10d20222a..382d7202c1 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -1185,10 +1185,6 @@ extern void pgstat_initstats(Relation rel);
     (pgStatBlockReadTime += (n))
 #define pgstat_count_buffer_write_time(n)                            \
     (pgStatBlockWriteTime += (n))
-#define pgstat_count_conn_active_time(n)                            \
-    (pgStatActiveTime += (n))
-#define pgstat_count_conn_txn_idle_time(n)                            \
-    (pgStatTransactionIdleTime += (n))
 
 extern void pgstat_count_heap_insert(Relation rel, PgStat_Counter n);
 extern void pgstat_count_heap_update(Relation rel, bool hot);
diff --git a/src/include/utils/backend_status.h b/src/include/utils/backend_status.h
index 96d432ce49..1791dd6842 100644
--- a/src/include/utils/backend_status.h
+++ b/src/include/utils/backend_status.h
@@ -309,6 +309,8 @@ extern const char *pgstat_get_backend_current_activity(int pid, bool checkUser);
 extern const char *pgstat_get_crashed_backend_activity(int pid, char *buffer,
                                                        int buflen);
 extern uint64 pgstat_get_my_query_id(void);
+extern uint64 pgstat_get_my_active_time(void);
+extern uint64 pgstat_get_my_transaction_idle_time(void);
 
 
 /* ----------
diff --git a/src/backend/utils/activity/backend_status.c b/src/backend/utils/activity/backend_status.c
index 8b6836a662..996f4e88d7 100644
--- a/src/backend/utils/activity/backend_status.c
+++ b/src/backend/utils/activity/backend_status.c
@@ -587,10 +587,7 @@ pgstat_report_activity(BackendState state, const char *cmd_str)
      * If the state has changed from "active" or "idle in transaction",
      * calculate the duration.
      */
-    if ((beentry->st_state == STATE_RUNNING ||
-         beentry->st_state == STATE_FASTPATH ||
-         beentry->st_state == STATE_IDLEINTRANSACTION ||
-         beentry->st_state == STATE_IDLEINTRANSACTION_ABORTED) &&
+    if ((PGSTAT_IS_ACTIVE(beentry) || PGSTAT_IS_IDLEINTRANSACTION(beentry)) &&
         state != beentry->st_state)
     {
         long        secs;
@@ -611,8 +608,7 @@ pgstat_report_activity(BackendState state, const char *cmd_str)
          * 2. The latter values are reset to 0 once the data has been sent
          * to the statistics collector.
          */
-        if (beentry->st_state == STATE_RUNNING ||
-            beentry->st_state == STATE_FASTPATH)
+        if (PGSTAT_IS_ACTIVE(beentry))
             active_time_diff = usecs_diff;
         else
             transaction_idle_time_diff = usecs_diff;
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 7c2776c14c..48c0ffa33a 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -675,6 +675,7 @@ pg_stat_get_activity(PG_FUNCTION_ARGS)
         {
             SockAddr    zero_clientaddr;
             char       *clipped_activity;
+            int64        tmp_time;
 
             switch (beentry->st_state)
             {
@@ -917,9 +918,25 @@ pg_stat_get_activity(PG_FUNCTION_ARGS)
             else
                 values[29] = UInt64GetDatum(beentry->st_query_id);
 
-            /* convert to msec for display */
-            values[30] = Float8GetDatum(beentry->st_total_active_time / 1000.0) ;
-            values[31] = Float8GetDatum(beentry->st_total_transaction_idle_time / 1000.0);
+            tmp_time = beentry->st_total_active_time;
+
+            /* add the realtime value to the counter if needed */
+            if (PGSTAT_IS_ACTIVE(beentry))
+                tmp_time +=
+                    GetCurrentTimestamp() - beentry->st_state_start_timestamp;
+
+            /* convert it to msec */
+            values[30] = Float8GetDatum(tmp_time / 1000.0) ;
+
+            tmp_time = beentry->st_total_transaction_idle_time;
+
+            /* add the realtime value to the counter if needed */
+            if (PGSTAT_IS_IDLEINTRANSACTION(beentry))
+                tmp_time +=
+                    GetCurrentTimestamp() - beentry->st_state_start_timestamp;
+
+            /* convert it to msec */
+            values[31] = Float8GetDatum(tmp_time);
         }
         else
         {
diff --git a/src/include/utils/backend_status.h b/src/include/utils/backend_status.h
index 1791dd6842..a03225c4f0 100644
--- a/src/include/utils/backend_status.h
+++ b/src/include/utils/backend_status.h
@@ -235,6 +235,12 @@ typedef struct PgBackendStatus
     ((before_changecount) == (after_changecount) && \
      ((before_changecount) & 1) == 0)
 
+/* macros to identify the states for time accounting */
+#define PGSTAT_IS_ACTIVE(s)                                                \
+    ((s)->st_state == STATE_RUNNING || (s)->st_state == STATE_FASTPATH)
+#define PGSTAT_IS_IDLEINTRANSACTION(s)             \
+    ((s)->st_state == STATE_IDLEINTRANSACTION ||        \
+     (s)->st_state == STATE_IDLEINTRANSACTION_ABORTED)
 
 /* ----------
  * LocalPgBackendStatus

Re: Add connection active, idle time to pg_stat_activity

От
Sergey Dudoladov
Дата:
Hi,

> > Could you please elaborate on this idea ?
> > So we have pgStatActiveTime and pgStatIdleInTransactionTime ultimately
> > used to report respective metrics in pg_stat_database.
> > Now beentry's st_total_active_time / st_total_transaction_idle_time
> > duplicates this info, so one may get rid of  pgStat*Time counters. Is
> > the idea to report  instead of them at every tabstat reporting the
> > difference between the last memorized value of  st_total_*_time and
> > its current value ?
>
> Exactly. The attached first diff is the schetch of that.

This diff actually adds more code than it removes and somewhat bloats the patch.
I decided  to incorporate it anyway because the diff explicitly shows
that time differences since the last report
are send to the statistics collector,which is not immediately evident
from the existing PgStat*Time counters.
That point may be worth further discussion though.


> And, it seems like I forgot to mention this, but as Kuntal suggested
> (in a different context and objective, though) upthraed, I think that
> we can show realtime values in the two time fields by adding the time
> of the current state.  See the attached second diff.

That is exactly what we need in our infra, also included into the patch.


@Kyotaro Horiguchi
Thank you for the contribution. I included both of your diffs with
minor changes.
Should I add you to the authors of the patch given that now half of it
is basically your code ?

Regards,
Sergey

Вложения

Re: Add connection active, idle time to pg_stat_activity

От
Andres Freund
Дата:
Hi,

On 2022-02-04 10:58:24 +0100, Sergey Dudoladov wrote:
> Thank you for the contribution. I included both of your diffs with
> minor changes.

This currently doesn't apply: http://cfbot.cputube.org/patch_37_3405.log

Could you rebase? Marking as waiting on author for now.

- Andres



Re: Add connection active, idle time to pg_stat_activity

От
Sergey Dudoladov
Дата:
Hello,

I've updated the patch in preparation for the upcoming commitfest.

Regards,
Sergey.

Вложения

Re: Add connection active, idle time to pg_stat_activity

От
"Drouvot, Bertrand"
Дата:
Hi,

On 6/13/22 4:51 PM, Sergey Dudoladov wrote:
> Hello,
>
> I've updated the patch in preparation for the upcoming commitfest.

I really like the idea of adding additional information like the ones in 
this patch, so +1 for the patch.

As far the patch:

@@ -864,7 +864,9 @@ CREATE VIEW pg_stat_activity AS
              s.backend_xmin,
              S.query_id,
              S.query,
-            S.backend_type
+            S.backend_type,
+            S.active_time,
+            S.idle_in_transaction_time

what about using total_active_time and total_idle_in_transaction_time?

I think that would avoid any confusion and "total_" is also already used 
in other pg_stat_* views when appropriate.

@@ -468,6 +468,13 @@ pgstat_beshutdown_hook(int code, Datum arg)

         beentry->st_procpid = 0;        /* mark invalid */

+       /*
+        * Reset per-backend counters so that accumulated values for the 
current
+        * backend are not used for future backends.
+        */
+       beentry->st_total_active_time = 0;
+       beentry->st_total_transaction_idle_time = 0;

shouldn't that be in pgstat_bestart() instead? (and just let 
pgstat_beshutdown_hook() set st_procpid to 0)

         /* so that functions can check if backend_status.c is up via 
MyBEEntry */
@@ -524,6 +531,8 @@ pgstat_report_activity(BackendState state, const 
char *cmd_str)
         TimestampTz start_timestamp;
         TimestampTz current_timestamp;
         int                     len = 0;
+       int64           active_time_diff = 0;
+       int64           transaction_idle_time_diff = 0;

I think here we can use only a single variable say "state_time_diff" for 
example, as later only one of those two is incremented anyway.

+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -539,7 +539,7 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
  Datum
  pg_stat_get_activity(PG_FUNCTION_ARGS)
  {
-#define PG_STAT_GET_ACTIVITY_COLS      30
+#define PG_STAT_GET_ACTIVITY_COLS      32
         int                     num_backends = 
pgstat_fetch_stat_numbackends();
         int                     curr_backend;
         int                     pid = PG_ARGISNULL(0) ? -1 : 
PG_GETARG_INT32(0);
@@ -621,6 +621,7 @@ pg_stat_get_activity(PG_FUNCTION_ARGS)
                 {
                         SockAddr        zero_clientaddr;
                         char       *clipped_activity;
+                       int64           time_to_report;

what about total_time_to_report instead?

Also, maybe not for this patch but I think that would be also useful to 
get the total time waited (so that we would get more inside of what the 
"active" time was made of).

Regards,

-- 

Bertrand Drouvot
Amazon Web Services:https://aws.amazon.com




Re: Add connection active, idle time to pg_stat_activity

От
Sergey Dudoladov
Дата:
Hello,

thanks for the helpful review. I have incorporated most of the
suggestions into the patch. I have also rebased and tested the patch
on top of the current master (2cd2569c72b89200).

> +       int64           active_time_diff = 0;
> +       int64           transaction_idle_time_diff = 0;
>
> I think here we can use only a single variable say "state_time_diff" for
> example, as later only one of those two is incremented anyway.

I have written it this way to avoid cluttering the critical section
between PGSTAT_(BEGIN|END)_WRITE_ACTIVITY.
With two variable one can leave only actual increments in the section
and check conditions / call TimestampDifference outside of it.

Regards,
Sergey

Вложения

Re: Add connection active, idle time to pg_stat_activity

От
Aleksander Alekseev
Дата:
Rafia, Sergey,

Many thanks for working on this!

> I have incorporated most of the suggestions into the patch. I have also rebased and tested the patch on top of the
currentmaster
 

I noticed that this patch is marked as "Needs Review" and decided to
take a look.

I believe there is a bug in the implementation. Here is what I did:

```
57033 (master) =# select * from pg_stat_activity where pid = 57033;
...
total_active_time              | 9.128
total_idle_in_transaction_time | 0

57033 (master) =# select * from pg_stat_activity where pid = 57033;
...
total_active_time              | 10.626
total_idle_in_transaction_time | 0

57033 (master) =# BEGIN;
57033 (master) =# select * from pg_stat_activity where pid = 57033;
...
total_active_time              | 17.443
total_idle_in_transaction_time | 2314.703

57033 (master) =# select * from pg_stat_activity where pid = 57033;
...
total_active_time              | 2514.635
total_idle_in_transaction_time | 2314.703

57033 (master) =# COMMIT;
57033 (master) =# select * from pg_stat_activity where pid = 57033;
...
total_active_time              | 22.048
total_idle_in_transaction_time | 7300.911
```

So it looks like total_active_time tracks seconds when a user executes
single expressions and milliseconds when running a transaction. It
should always track milliseconds.

Please use `git format-patch` for the next patch and provide a commit
message, as it was previously pointed out by Bharath. Please specify
the list of the authors and reviewers and add a note about
incrementing the catalog version.

-- 
Best regards,
Aleksander Alekseev



Re: Add connection active, idle time to pg_stat_activity

От
Aleksander Alekseev
Дата:
Hi again,

> 57033 (master) =# select * from pg_stat_activity where pid = 57033;
> ...
> total_active_time              | 2514.635
> total_idle_in_transaction_time | 2314.703
>
> 57033 (master) =# COMMIT;
> 57033 (master) =# select * from pg_stat_activity where pid = 57033;
> ...
> total_active_time              | 22.048
> total_idle_in_transaction_time | 7300.911
> ```

My previous message was wrong, total_active_time doesn't track
seconds. I got confused by the name of this column. Still I'm pretty
confident it shouldn't decrease.

-- 
Best regards,
Aleksander Alekseev



Re: Add connection active, idle time to pg_stat_activity

От
torikoshia
Дата:
Rafia, Sergey,

+1 for adding the total_active_time and total_idle_in_transaction_time 
to pg_stat_activity.

I reviewed the patch and here are some comments.

+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>total_active_time</structfield> <type>double 
precision</type>
+      </para>
+      <para>
+       Time in milliseconds this backend spent in 
<literal>active</literal> and
+       <literal>fastpath</literal> states.

Is 'fastpath' an abbreviation of 'fastpath function call'?
If so, I feel it's clearer '<literal>fastpath function call</literal>' 
than '<literal>fastpath</literal>'.


+extern uint64 pgstat_get_my_active_time(void);
+extern uint64 pgstat_get_my_transaction_idle_time(void);

Are these functions necessary?
It seems they are not called from anywhere, doesn't it?

-- 
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION



Re: Add connection active, idle time to pg_stat_activity

От
Sergey Dudoladov
Дата:
Hello,

I have addressed the reviews.

@Aleksander Alekseev thanks for reporting the issue. I have altered
the patch to respect the behavior of pg_stat_activity, specifically
[1]

> Another important point is that when a server process is asked to display any of these statistics,
> it first fetches the most recent report emitted by the collector process and then continues to use this snapshot
> for all statistical views and functions until the end of its current transaction.
> So the statistics will show static information as long as you continue the current transaction.

For the patch it means no computing of real-time values of
total_*_time. Here is an example to illustrate the new behavior:

=# begin;

=*# select total_active_time, total_idle_in_transaction_time from
pg_stat_activity where pid = pg_backend_pid();
 total_active_time | total_idle_in_transaction_time
-------------------+--------------------------------
             0.124 |                      10505.098

postgres=*# select pg_sleep(10);

postgres=*# select total_active_time, total_idle_in_transaction_time
from pg_stat_activity where pid = pg_backend_pid();
 total_active_time | total_idle_in_transaction_time
-------------------+--------------------------------
             0.124 |                      10505.098

postgres=*# commit;

postgres=# select total_active_time, total_idle_in_transaction_time
from pg_stat_activity where pid = pg_backend_pid();
 total_active_time | total_idle_in_transaction_time
-------------------+--------------------------------
         10015.796 |                      29322.831


[1] https://www.postgresql.org/docs/14/monitoring-stats.html#MONITORING-STATS-VIEWS

Regards,
Sergey

Вложения

Re: Add connection active, idle time to pg_stat_activity

От
Aleksander Alekseev
Дата:
Hi Sergey,

> @Aleksander Alekseev thanks for reporting the issue. I have altered
> the patch to respect the behavior of pg_stat_activity, specifically
> [1]
>
> > Another important point is that when a server process is asked to display any of these statistics,
> > it first fetches the most recent report emitted by the collector process and then continues to use this snapshot
> > for all statistical views and functions until the end of its current transaction.
> > So the statistics will show static information as long as you continue the current transaction.
>
> For the patch it means no computing of real-time values of
> total_*_time. Here is an example to illustrate the new behavior:
>
> =# begin;
>
> =*# select total_active_time, total_idle_in_transaction_time from
> pg_stat_activity where pid = pg_backend_pid();
>  total_active_time | total_idle_in_transaction_time
> -------------------+--------------------------------
>              0.124 |                      10505.098
>
> postgres=*# select pg_sleep(10);
>
> postgres=*# select total_active_time, total_idle_in_transaction_time
> from pg_stat_activity where pid = pg_backend_pid();
>  total_active_time | total_idle_in_transaction_time
> -------------------+--------------------------------
>              0.124 |                      10505.098
>
> postgres=*# commit;
>
> postgres=# select total_active_time, total_idle_in_transaction_time
> from pg_stat_activity where pid = pg_backend_pid();
>  total_active_time | total_idle_in_transaction_time
> -------------------+--------------------------------
>          10015.796 |                      29322.831
>
>
> [1] https://www.postgresql.org/docs/14/monitoring-stats.html#MONITORING-STATS-VIEWS

This looks reasonable.

What concerns me though is the fact that total_idle_in_transaction_time for given session doesn't seem to updated from the perspective of another session:

```
session1 (78376) =# BEGIN;
session1 (78376) =# select * from pg_stat_activity where pid = 78376;
...
total_active_time              | 40.057
total_idle_in_transaction_time | 34322.171

session1 (78376) =# select * from pg_stat_activity where pid = 78376;
...
total_active_time              | 40.057
total_idle_in_transaction_time | 34322.171

session2 (78382) =# select * from pg_stat_activity where pid = 78376;
...
total_active_time              | 46.908
total_idle_in_transaction_time | 96933.518

session2 (78382) =# select * from pg_stat_activity where pid = 78376;
...
total_active_time              | 46.908
total_idle_in_transaction_time | 96933.518 <--- doesn't change!

session1 (78376) =# COMMIT;
session1 (78376) =# select * from pg_stat_activity where pid = 78376;
...
total_active_time              | 47.16
total_idle_in_transaction_time | 218422.143

session2 (78382) =# select * from pg_stat_activity where pid = 78376;
total_active_time              | 50.631
total_idle_in_transaction_time | 218422.143
```

This is consistent with the current documentation:

> Each individual server process transmits new statistical counts to the collector just before going idle; so a query or transaction still in progress does not affect the displayed totals.

But it makes me wonder if there will be a lot of use of total_idle_in_transaction_time and if the patch should actually alter this behavior.

Thoughts?

--
Best regards,
Aleksander Alekseev

Re: Add connection active, idle time to pg_stat_activity

От
Aleksander Alekseev
Дата:
Hi hackers,

All in all the patch seems to be in good shape.

> This is consistent with the current documentation:
>
> > Each individual server process transmits new statistical counts to the collector just before going idle; so a query
ortransaction still in progress does not affect the displayed totals.
 
>
> But it makes me wonder if there will be a lot of use of total_idle_in_transaction_time and if the patch should
actuallyalter this behavior.
 
>
> Thoughts?

On second thought, this is arguably out of scope of this particular
patch and this particular discussion. In any case, having some stats
is better than none.

I'm going to change the status of the patch to "Ready for Committer"
in a short time unless anyone has a second opinion.

-- 
Best regards,
Aleksander Alekseev



Re: Add connection active, idle time to pg_stat_activity

От
Sergey Dudoladov
Дата:
Hello hackers,

Is there anything we can do to facilitate merging of this patch ?
It has been in the "ready-for-commiter" state for 3 commitfests in a row now.

We would appreciate if the patch makes it to version 16: the need to
monitor idle-in-transaction connections is very real for us.

Regards,
Sergey Dudoladov



Re: Add connection active, idle time to pg_stat_activity

От
Andres Freund
Дата:
Hi,

On 2022-07-21 18:22:51 +0200, Sergey Dudoladov wrote:
> From b5298301a3f5223bd78c519ddcddbd1bec9cf000 Mon Sep 17 00:00:00 2001
> From: Sergey Dudoladov <sergey.dudoladov@gmail.com>
> Date: Wed, 20 Apr 2022 23:47:37 +0200
> Subject: [PATCH] pg_stat_activity: add 'total_active_time' and
>  'total_idle_in_transaction_time'
> 
> catversion bump because of the change in the contents of pg_stat_activity
> 
> Author: Sergey Dudoladov, based on the initial version by Rafia Sabih
> 
> Reviewed-by: Aleksander Alekseev, Bertrand Drouvot, and Atsushi Torikoshi
> 
> Discussion:
https://www.postgresql.org/message-id/flat/CA%2BFpmFcJF0vwi-SWW0wYO-c-FbhyawLq4tCpRDCJJ8Bq%3Dja-gA%40mail.gmail.com

Isn't this patch breaking pg_stat_database? You removed
pgstat_count_conn_active_time() etc and the declaration for pgStatActiveTime /
pgStatTransactionIdleTime (but left the definition in pgstat_database.c), but
didn't replace it with anything afaics.


Separately from that, I'm a bit worried about starting to add accumulative
counters to pg_stat_activity. It's already gotten hard to use interactively
due to the number of columns - and why stop with the columns you suggest? Why
not show e.g. the total number of reads/writes, tuples inserted / deleted,
etc. as well?

I wonder if we shouldn't add a pg_stat_session or such for per-connection
counters that show not the current state, but accumulated per-session state.

Greetings,

Andres Freund



Re: Add connection active, idle time to pg_stat_activity

От
"David G. Johnston"
Дата:
On Tue, Nov 8, 2022 at 6:56 PM Andres Freund <andres@anarazel.de> wrote:

Separately from that, I'm a bit worried about starting to add accumulative
counters to pg_stat_activity. It's already gotten hard to use interactively
due to the number of columns - and why stop with the columns you suggest? Why
not show e.g. the total number of reads/writes, tuples inserted / deleted,
etc. as well?

I wonder if we shouldn't add a pg_stat_session or such for per-connection
counters that show not the current state, but accumulated per-session state.


I would much rather go down this route than make the existing table wider.

pg_stat_activity_state_duration (this patch) [the table - for a given backend - would be empty if track_activities is off]
pg_stat_activity_bandwidth_usage (if someone feels like implementing the other items you mention)


I'm not really buying into the idea of having multiple states sum their times together.  I would expect one column per state.  Actually two, because I also suggest that not only is the duration recorded, but a counter be incremented each time a given state becomes the currently active state.  Seems like having access to a divisor of some form may be useful.

So 10 columns of data plus pid to join back to pg_stat_activity proper.

David J.

Re: Add connection active, idle time to pg_stat_activity

От
Andres Freund
Дата:
On 2022-11-08 19:25:27 -0700, David G. Johnston wrote:
> Actually two, because I also suggest that not only is the duration recorded,
> but a counter be incremented each time a given state becomes the currently
> active state.  Seems like having access to a divisor of some form may be
> useful.

What for?



Re: Add connection active, idle time to pg_stat_activity

От
"David G. Johnston"
Дата:
On Tue, Nov 8, 2022 at 7:37 PM Andres Freund <andres@anarazel.de> wrote:
On 2022-11-08 19:25:27 -0700, David G. Johnston wrote:
> Actually two, because I also suggest that not only is the duration recorded,
> but a counter be incremented each time a given state becomes the currently
> active state.  Seems like having access to a divisor of some form may be
> useful.

What for?

Because 5 hours of idle-in-transaction time in a single block means something different than the same 5 hours accumulated across 300 mini-idles.

David J.

Re: Add connection active, idle time to pg_stat_activity

От
Sergey Dudoladov
Дата:
Hello hackers,

I've sketched the first version of a patch to add pg_stat_session.
Please review this early version.

Regards,
Sergey.

Вложения

Re: Add connection active, idle time to pg_stat_activity

От
Andrey Borodin
Дата:
On Wed, Feb 1, 2023 at 12:46 PM Sergey Dudoladov
<sergey.dudoladov@gmail.com> wrote:
>
> I've sketched the first version of a patch to add pg_stat_session.
> Please review this early version.

Hi Sergey!

I've taken a look into the patch and got some notes.
1. It is hard to understand what fastpath backend state is. What do
fastpath metrics mean for a user?
2. Anyway, the path "if (PGSTAT_IS_FASTPATH(beentry))" seems
unreachable to me. I'm a bit surprised that compilers do not produce
warnings about it. Maybe I'm just wrong.
3. Tests do not check any incrementation logic. I think we can have
some test that verifies delta for select some_counter from
pg_stat_session where pid = pg_backend_pid();
4. Macroses like PGSTAT_IS_RUNNING do not look like net win in code
readability and PGSTAT prefix have no semantic load.


That's all I've found so far. Thank you!

Best regards, Andrey Borodin.

PS. We were doing on-air review session [0], I hope Nik will chime-in
with "usability part of a review".

[0] https://youtu.be/vTV8XhWf3mo?t=2404



Re: Add connection active, idle time to pg_stat_activity

От
Sergey Dudoladov
Дата:
Hello hackers,

Andrey and Nik, thank you for selecting this patch for review in
Postgres Hacking 101: I've modified the patch based both on your email
and the video.

1. Session statistics is now collected only for client backends. PG
internal processes like wal sender seem to stop sending statistics
after they have entered their respective main loops.
2. Fastpath state now counts towards the running state. I think this
special case does not justify tracking two extra numbers for every
client backend.
3. I've added a small test for pg_stat_session similar to other tests
in src/test/regress/sql/sysviews.sql
4.  Here are the pb_bench results requested in the video review:

Conditions: no assertions, number of transactions = 1000
The query: SELECT generate_series(1, 10000000) OFFSET 10000000;
With pg_stat_session:
    latency average = 324.480 ms
    tps = 3.081857 (without initial connection time)

Without pg_stat_session:
    latency average = 327.370 ms
    tps = 3.054651 (without initial connection time)


Regards,
Sergey

Вложения

Re: Add connection active, idle time to pg_stat_activity

От
Andrei Zubkov
Дата:
Hi Sergey,

I've done a review of this patch. I found the patch idea very useful,
thank you for the patch. I've noted something observing this patch:
1. Patch can't be applied on the current master. My review is based on
   application of this patch over ac68323a878
2. Being applied over ac68323a878 patch works as expected.
3. Field names seems quite long to me (and they should be uniformly
   named with the same statistics in other views. For example
   "running" term is called "active" in pg_stat_database)
4. Meaningless spaces at the end of line:
   - backend_status.c:586
   - monitoring.sgml:5857
5. Patch adds

     usecs_diff = secs * 1000000 + usecs;

   at backend_status.c:pgstat_report_activity() to optimize
   calculations. But

     pgstat_count_conn_active_time((PgStat_Counter) secs * 1000000 +
usecs);
   and
     pgstat_count_conn_txn_idle_time((PgStat_Counter) secs * 1000000 +
usecs);

   are left in place after that.
6. I'm not sure that I can understand the comment
     /* Keep statistics for pg_stat_database intact */
   at backend_status.c:600 correctly. Can you please explain it a
   little?
7. Tests seems incomplete. It looks like we can check increments in
   all fields playing with transactions in tests.

Also, I have a thought about other possible improvements fitting to
this patch.

The view pg_stat_session is really needed in Postgres but I think it
should have much more statistics. I mean all resource statistics
related to sessions. Every backend has instrumentation that tracks
resource consumption. Data of this instrumentation goes to the
cumulative statistics system and is used in monitoring extensions
(like pg_stat_statements). I think pg_stat_session view is able to add
one more dimension of monitoring - a dimension of sessions. In my
opinion this view should provide resource consumption statistics of
current sessions in two cumulative sets of statistics - since backend
start and since transaction start.  Such view will be really useful in
monitoring of long running sessions and transactions providing
resource consumption information besides timing statistics.

regards, Andrei Zubkov
Postgres Professional



Re: Add connection active, idle time to pg_stat_activity

От
Aleksander Alekseev
Дата:
Hi,

> I've done a review of this patch. I found the patch idea very useful,
> thank you for the patch. I've noted something observing this patch:
> 1. Patch can't be applied on the current master. My review is based on
>    application of this patch over ac68323a878

On top of that not sure if I see the patch on the November commitfest
[1]. Please make sure it's there so that cfbot will check the patch.

[1]: https://commitfest.postgresql.org/45/

-- 
Best regards,
Aleksander Alekseev



Re: Add connection active, idle time to pg_stat_activity

От
Andrei Zubkov
Дата:
Hi Aleksander,

On Wed, 2023-10-25 at 16:17 +0300, Aleksander Alekseev wrote:
> On top of that not sure if I see the patch on the November commitfest
> [1]. Please make sure it's there so that cfbot will check the patch.

Yes, this patch is listed on the November commitfest. cfbot says rebase
needed since 2023-08-21.

regards, Andrei Zubkov




Re: Add connection active, idle time to pg_stat_activity

От
Aleksander Alekseev
Дата:
Hi,

> On Wed, 2023-10-25 at 16:17 +0300, Aleksander Alekseev wrote:
> > On top of that not sure if I see the patch on the November commitfest
> > [1]. Please make sure it's there so that cfbot will check the patch.
>
> Yes, this patch is listed on the November commitfest. cfbot says rebase
> needed since 2023-08-21.

You are right, I missed the corresponding entry [1].

[1]: https://commitfest.postgresql.org/45/3405/


-- 
Best regards,
Aleksander Alekseev



Re: Add connection active, idle time to pg_stat_activity

От
vignesh C
Дата:
On Wed, 25 Oct 2023 at 19:06, Andrei Zubkov <zubkov@moonset.ru> wrote:
>
> Hi Aleksander,
>
> On Wed, 2023-10-25 at 16:17 +0300, Aleksander Alekseev wrote:
> > On top of that not sure if I see the patch on the November commitfest
> > [1]. Please make sure it's there so that cfbot will check the patch.
>
> Yes, this patch is listed on the November commitfest. cfbot says rebase
> needed since 2023-08-21.

I have changed the status of commitfest entry to "Returned with
Feedback" as Andrei Zubkov's comments have not yet been resolved.
Please feel free to post an updated version of the patch and update
the commitfest entry accordingly.

Regards,
Vignesh



Re: Add connection active, idle time to pg_stat_activity

От
Sergey Dudoladov
Дата:
Hi all,

@Andrei Zubkov 
I've modify the patch to address most of your comments.

>  I have a thought about other possible improvements fitting to this patch.
> I think pg_stat_session view is able to add one more dimension of monitoring - a dimension of sessions

I would like to remind here about the initial scope of this patch. The main goal of it was to ease tracking "idle in transactions" connections, a feature that would really help in my work. The "pg_stat_session" came into play only because the "pg_stat_activity" was seen as an unsuitable place for the relevant counters. With that I still would like to maintaint the focus on committing the "idle in transactions" part of pg_stat_session first.

Regards,
Sergey
Вложения

Re: Add connection active, idle time to pg_stat_activity

От
Sergey Dudoladov
Дата:

Hi again,

> It looks like we can check increments in all fields playing with transactions in tests.

I've added such tests.

Regards,
Sergey
Вложения

Re: Add connection active, idle time to pg_stat_activity

От
Andrei Zubkov
Дата:
Hi Sergei,

> I still would like to maintaint the focus on committing the "idle in
transactions" part of pg_stat_session first.

Agreed.

I've done a review of version 0004. This version is applied successful
over ce571434ae7, installcheck passed. The behavior of pg_stat_session
view and corresponding function looks correct. I've didn't found any
issues in the code.

Notes about the current state of a patch:

Naming
the view and function names 'pg_stat_session' seems correct for this
particular scope of a patch. However possible future resource
consumption statistics are valid for all backends (vacuum for example).
Right now it is not clear for me if we can get resource statistics from
those backends while those are listed in the pg_stat_activity view but
renaming to something like 'pg_stat_backend' seems reasonable to me.

Docs
1. session states referenced in monitoring.sgml is not uniform with
those of the pg_stat_activity view.
monitoring.sgml:4635
monitoring.sgml:4644
+       Time in milliseconds this backend spent in the
<literal>running</literal> or <literal>fastpath</literal> state.
I think those states should be referenced uniformly with
pg_stat_activity.

2. Description of the 'pg_stat_get_session()' function might be as
follows:

  Returns a row, showing statistics about the client backend with the
  specified process ID, or one row per client backend if
  <literal>NULL</literal> is specified. The fields returned are the
  same as those of <structname>pg_stat_session</structname> view.

The main thought here is to get rid of 'each active backend' because
'active backend' looks like backend in the 'active' state.

Tests
Call to a non-existing function is depend on non-existence of a
function, which can't be guaranteed absolutely. How about to do some
kind of obvious error here? Couple examples follows:

SELECT 0/0;

- or -

DO $$
BEGIN
RAISE 'test error';
END;
$$ LANGUAGE plpgsql;

My personal choice would be the last one.

--
regards, Andrei Zubkov
Postgres Professional