Обсуждение: [PATCH] pg_stat_statements: add last_execution_start column

Поиск
Список
Период
Сортировка

[PATCH] pg_stat_statements: add last_execution_start column

От
"Pavlo Golub"
Дата:
Hi hackers,

This patch adds a `last_execution_start` column to `pg_stat_statements`,
recording the start timestamp of the most recent execution of each
tracked statement.

It supersedes the `stats_last_updated` series discussed here:

https://www.postgresql.org/message-id/flat/CAK7ymc+FxoVswo1ok_xDW-xPG-ZEZ8SAqCUkJ7WF04=0aQDvVQ@mail.gmail.com

The main criticism of that series was performance using
`GetCurrentTimestamp()` inside the stats accumulation.  pgbench testing
confirmed the concern of roughly 5–6% TPS regression on a
short-transaction workload.

This patch takes a different approach.  Instead of calling
`GetCurrentTimestamp()`, it uses `GetCurrentStatementStartTimestamp()`,
which simply is a variable reading.
There is no syscall and no additional work in the hot path.

Benchmark (16-vCPU, pgbench -c8 -j4 -T60, explicit transactions with 15
SELECT statements each):

   master HEAD:  ~4574 TPS (runs: 4636, 4585, 4500)
   patched:      ~4571 TPS (runs: 4577, 4560, 4575)
   difference:   ~0.1%

The column is initialized to the entry allocation time and updated on
every call to `pgss_store()`.  It is reset by
`pg_stat_statements_reset()` but preserved across minmax-only resets,
consistent with `stats_since` semantics.

A monitoring query to find statements that have executed since the last
observation could look like:

   SELECT query, calls, last_execution_start
   FROM pg_stat_statements
   WHERE last_execution_start >= $1  -- e.g. last check timestamp
   ORDER BY last_execution_start DESC;

Patch attached.

Best regards,
Pavlo Golub
Вложения

Re: [PATCH] pg_stat_statements: add last_execution_start column

От
Sami Imseih
Дата:
Hi,

It would be really great to get this field in, but I think the current
implementation still suffers from the same issue that is mentioned
here [1]. We cannot rely on GetCurrentStatementStartTimestamp()
in-line because ExecutorEnd is deferred to the next execution in the
case of extended query protocol. I think we need to add it to track the
start timestamp in queryDesc. What do you think?

```
select pg_stat_statements_reset();

BEGIN;
select now() as now, clock_timestamp() as clock_timestamp,
pg_sleep($1) \bind 10 \g

\! sleep 10

SELECT now() as now, clock_timestamp() as clock_timestamp, $1 \bind 1 \g
END;

select last_execution_start, total_exec_time, substr(query, 1, 150) as
query from pg_stat_statements;
```
Notice how the last_execution_start reflects when the
"SELECT now() as now, clock_timestamp() as clock_timestamp, $1 " runs

--
Sami

[1] [https://www.postgresql.org/message-id/CAA5RZ0sxPWP2xm8fxhscE%2BcUqC2VSFi9UZ9882BdGZ0MbGQNUA%40mail.gmail.com]