Discussion: [PROPOSAL] timestamp informations to pg_stat_statements
Hi hackers!
Following is a proposal to add timestamp information to `pg_stat_statements`.
# Use case
- If we want to gather the list and stats of queries executed at least once in the last hour, we would have had to reset the stats an hour ago. There is no way to get them if we didn't.
- If we find a strange query in `pg_stat_statements`, we might want to identify when it first ran.
If we have timestamps of the first and last execution, we can easily gather this information, and there are tons of other use cases.
# Implementation
Firstly, I added API version 1.5 to accommodate the additional fields, and I added two fields to the Counters structure. Now it has 25 fields in total.
```
@@ -156,6 +158,8 @@ typedef struct Counters
double blk_read_time; /* time spent reading, in msec */
double blk_write_time; /* time spent writing, in msec */
double usage; /* usage factor */
+ TimestampTz created; /* timestamp of created time */
+ TimestampTz last_updated; /* timestamp of last updated */
} Counters;
```
The `created` field is filled when the entry is first added to the hash table.

```
@@ -1663,6 +1690,8 @@ entry_alloc(pgssHashKey *key, Size query_offset, int query_len, int encoding,
/* reset the statistics */
memset(&entry->counters, 0, sizeof(Counters));
+ /* set the created timestamp */
+ entry->counters.created = GetCurrentTimestamp();
/* set the appropriate initial usage count */
entry->counters.usage = sticky ? pgss->cur_median_usage : USAGE_INIT;
    /* re-initialize the mutex each time ... we assume no one using it */
```
The `last_updated` field will be updated every time `pgss_store()` updates the stats.

```
@@ -1251,6 +1256,7 @@ pgss_store(const char *query, uint32 queryId,
e->counters.blk_read_time += INSTR_TIME_GET_MILLISEC(bufusage->blk_read_time);
e->counters.blk_write_time += INSTR_TIME_GET_MILLISEC(bufusage->blk_write_time);
e->counters.usage += USAGE_EXEC(total_time);
+ e->counters.last_updated = GetCurrentTimestamp();
SpinLockRelease(&e->mutex);
}
```
Attached is my first implementation.
Regards,
Jason Kim.
On Sun, Jul 17, 2016 at 12:22 AM, Jun Cheol Gim <dialogbox@gmail.com> wrote:
> If we have timestamps of the first and last execution, we can easily gather
> this information, and there are tons of other use cases.

-1 from me.

I think that this is the job of a tool that aggregates things from pg_stat_statements. It's unfortunate that there isn't a good third-party tool that does that, but there is nothing that prevents it.

-- 
Peter Geoghegan
Peter Geoghegan <pg@heroku.com> writes:
> On Sun, Jul 17, 2016 at 12:22 AM, Jun Cheol Gim <dialogbox@gmail.com> wrote:
>> If we have timestamps of the first and last execution, we can easily gather
>> this information, and there are tons of other use cases.

> -1 from me.

> I think that this is the job of a tool that aggregates things from
> pg_stat_statements. It's unfortunate that there isn't a good
> third-party tool that does that, but there is nothing that prevents
> it.

The concern I've got about this proposal is that the results get very questionable as soon as we start dropping statement entries for lack of space. last_executed would be okay, perhaps, but first_executed not so much.

			regards, tom lane
On Sun, Jul 17, 2016 at 4:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The concern I've got about this proposal is that the results get very
> questionable as soon as we start dropping statement entries for lack
> of space. last_executed would be okay, perhaps, but first_executed
> not so much.

Agreed.

Also, for what it's worth, I should point out to Jun that GetCurrentTimestamp() should definitely not be called while a spinlock is held like that.

-- 
Peter Geoghegan
On 18/07/2016 01:06, Peter Geoghegan wrote:
> On Sun, Jul 17, 2016 at 12:22 AM, Jun Cheol Gim <dialogbox@gmail.com> wrote:
>> If we have timestamps of the first and last execution, we can easily gather
>> this information, and there are tons of other use cases.
>
> -1 from me.
>
> I think that this is the job of a tool that aggregates things from
> pg_stat_statements. It's unfortunate that there isn't a good
> third-party tool that does that, but there is nothing that prevents
> it.

FWIW there's https://github.com/dalibo/powa-archivist which does that, assuming you're using a 9.4+ server.

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org
Hi guys,

Thank you for the feedback.

> I think that this is the job of a tool that aggregates things from
> pg_stat_statements. It's unfortunate that there isn't a good
> third-party tool that does that, but there is nothing that prevents
> it.

Right. We can do this if we aggregate it frequently enough. However, third parties could do it better if we exposed more information. I think these are fundamental pieces of information on which stronger third-party tools could be built.

> The concern I've got about this proposal is that the results get very
> questionable as soon as we start dropping statement entries for lack
> of space. last_executed would be okay, perhaps, but first_executed
> not so much.

If we set pg_stat_statements.max large enough, there can be long-lived entries and short-lived ones simultaneously. In that case per-call statistics can be accurate, but per-second stats cannot. This is why I named the fields created and last_updated (not first_executed and last_executed): they represent the timestamps at which the stats were created and last updated, so we can calculate per-second stats with simple math.

> Also, for what it's worth, I should point out to Jun that
> GetCurrentTimestamp() should definitely not be called when a spinlock
> is held like that.

I moved it outside of the spinlock.
```
@@ -1204,6 +1209,11 @@ pgss_store(const char *query, uint32 queryId,
	 */
	volatile pgssEntry *e = (volatile pgssEntry *) entry;

+	/*
+	 * Read a timestamp before grabbing the spinlock
+	 */
+	TimestampTz last_updated = GetCurrentTimestamp();
+
	SpinLockAcquire(&e->mutex);

	/* "Unstick" entry if it was previously sticky */
@@ -1251,6 +1261,7 @@ pgss_store(const char *query, uint32 queryId,
	e->counters.blk_read_time += INSTR_TIME_GET_MILLISEC(bufusage->blk_read_time);
	e->counters.blk_write_time += INSTR_TIME_GET_MILLISEC(bufusage->blk_write_time);
	e->counters.usage += USAGE_EXEC(total_time);
+	e->counters.last_updated = last_updated;
	SpinLockRelease(&e->mutex);
}
```

pg_stat_statements_with_timestamp_v2.patch
<http://postgresql.nabble.com/file/n5912461/pg_stat_statements_with_timestamp_v2.patch>

Regards,
Jason Kim.

-- 
View this message in context: http://postgresql.nabble.com/PROPOSAL-timestamp-informations-to-pg-stat-statements-tp5912306p5912461.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
On Sun, Jul 17, 2016 at 7:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Peter Geoghegan <pg@heroku.com> writes:
>> On Sun, Jul 17, 2016 at 12:22 AM, Jun Cheol Gim <dialogbox@gmail.com> wrote:
>>> If we have timestamps of the first and last execution, we can easily gather
>>> this information, and there are tons of other use cases.
>
>> -1 from me.
>
>> I think that this is the job of a tool that aggregates things from
>> pg_stat_statements. It's unfortunate that there isn't a good
>> third-party tool that does that, but there is nothing that prevents
>> it.
>
> The concern I've got about this proposal is that the results get very
> questionable as soon as we start dropping statement entries for lack
> of space. last_executed would be okay, perhaps, but first_executed
> not so much.

ISTM that last_executed is useful - you can then see for yourself which of the statements in the pg_stat_statements output have been issued recently, and which are older. I mean, you could also do that, as Peter says, with an additional tool that takes periodic snapshots of the data and then figures out an approximate last_executed time, but if you had this, you wouldn't need an additional tool, at least not for simple cases. Better yet, the data would be more exact. I dunno what's not to like about that, unless we're worried that tracking it will incur too much overhead.

first_executed doesn't seem as useful as last_executed, but it isn't useless either. It can't properly be read as the first time that statement was ever executed, but it can properly be read as the length of time during which that statement has been executed frequently enough to stay in the hash table, which is something that someone might want to know.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
+1

Having the time of first occurrence of a statement is very useful for troubleshooting; it permits, for example, retrieving the order of operations in some complex cases (and that information isn't captured by any third-party collecting tool, which will only be able to provide a time range of occurrence).

I thought that pgss rows were removed randomly when the max row count was reached; wouldn't having last_executed information permit a kind of LRU removal?

Regards
PAscal

-- 
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
On 03/10/2018 04:43 PM, legrand legrand wrote:
> +1
> Having the time of first occurrence of a statement is very useful
> for troubleshooting; it permits, for example, retrieving the order of
> operations in some complex cases (and that information isn't captured
> by any third-party collecting tool, which will only be able to provide
> a time range of occurrence).

I really don't see how this would be useful in reconstructing the order of operations, particularly in complex cases where I'd expect the queries to be executed repeatedly / interleaved in different ways. Furthermore, it would only work for the very first execution of all statements, so you would probably have to reset the stats over and over - which seems to directly contradict the purpose of pg_stat_statements (aggregation of data over longer periods of time).

So unfortunately this seems rather useless, and log_statements=all seems like a much better / more reliable approach to achieve that.

I also doubt it really allows computation of averages, e.g. queries per second, because practical workloads usually have daily/weekly patterns (and different queries may follow different patterns). So I agree with Peter Geoghegan that tools regularly snapshotting pg_stat_statements and processing that are a better approach. I've seen a bunch of simple scripts doing just that, actually.

> I thought that pgss rows were removed randomly when the max row count
> was reached; wouldn't having last_executed information permit a kind
> of LRU removal?

No, the entries are not removed randomly. We track "usage" for each entry (essentially +1 each time the query gets executed), with a decay factor applied on each eviction (and we evict 5% at a time). It's not immediately obvious why something based on the time of the first/last execution would be better than the current algorithm.

regards

-- 
Tomas Vondra
http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
> No, the entries are not removed randomly. We track "usage" for each
> entry (essentially +1 each time the query gets executed), with a decay
> factor applied on each eviction (and we evict 5% at a time).

OK, I apologize; I hadn't looked at the source code in detail, and indeed the "usage" algorithm, based on execution time and calls, will keep the heaviest queries (and that's a good thing). In some cases (after an application release or a Postgres patch, for example) we may be interested in the least-used ones.

I'm still convinced that adding this kind of information (with a planid, as done in pg_stat_plans) would be an improvement. Maybe that's because I used v$sql's first_load_time, last_active_time and plan_hash_value for (too) many years. It's always important to know when a new plan (good or bad) occurs, and pgss could give this kind of information even without an aggregation collector.

Last point (for the developers): after testing this patch, in most cases when calls=1, the created and last_updated values are identical; they don't even differ to reflect the execution (or planning) duration. Is that a precision or coding problem?

Regards
PAscal