Обсуждение: compute_query_id
hi,
I noticed this new param compute_query_id in pg14beta.
it is interesting as I was long wanting to identify a query with a unique id like we have for http requests etc so that we can trace the query all the way to shards via FDW etc.
I noticed this new param compute_query_id in pg14beta.
it is interesting as I was long wanting to identify a query with a unique id like we have for http requests etc so that we can trace the query all the way to shards via FDW etc.
but i cannot see them in the logs even after setting compute_query_id on.
i also read
compute_query_id_query for the same.
how is the compute_query_id actually calculated?
why does it show 0 in logs for random sql queries.
log_line_prefix = '%Q :'
0 :LOG: statement: select * from pg_stat_activity;
i mean pid already was doing the job to identify the query and its children even it logs,
but i know pid will get recycled.
tldr;
how is compute_query_id different from pid to identify some query running ?
can it be passed on to FDW queries ? for tracing etc ?
am i totally getting its use case totally wrong :)
--
Thanks,
Vijay
Mumbai, India
On Thu, Jun 17, 2021 at 08:09:54PM +0530, Vijaykumar Jain wrote: > how is the compute_query_id actually calculated? It's the exact same implementation that was extracted from pg_stat_statements. You have some implementation details at https://www.postgresql.org/docs/current/pgstatstatements.html. > why does it show 0 in logs for random sql queries. > log_line_prefix = '%Q :' > 0 :LOG: statement: select * from pg_stat_activity; It means that you haven't enabled it: 2021-06-17 22:46:16.231 CST [11246] queryid=0 LOG: duration: 4.971 ms statement: select * from pg_stat_activity ; 2021-06-17 22:46:25.383 CST [11246] queryid=0 LOG: duration: 0.284 ms statement: set compute_query_id = on; 2021-06-17 22:46:28.744 CST [11246] queryid=941978042436931562 LOG: duration: 1.725 ms statement: select * from pg_stat_activity; > i mean pid already was doing the job to identify the query and its children > even it logs, > but i know pid will get recycled. I'm not sure that I understand that question. The pid will identify a backend, and that backend can execute 0, 1 or a lot of different queries. The query_id will uniquely identify statements after some normalization and removing the constant parts (so for instance "select 1;" and "Select 2 ;" will have the same identifier). Having only that information in the log can be useful on its own, but you usually get way more benefit using additional modules like pg_stat_statements.
On Thu, 17 Jun 2021 at 20:20, Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> On Thu, Jun 17, 2021 at 08:09:54PM +0530, Vijaykumar Jain wrote:
> > how is the compute_query_id actually calculated?
>
> > why does it show 0 in logs for random sql queries.
> > log_line_prefix = '%Q :'
> > 0 :LOG: statement: select * from pg_stat_activity;
>
> It means that you haven't enabled it:
>
> 2021-06-17 22:46:16.231 CST [11246] queryid=0 LOG: duration: 4.971 ms statement: select * from pg_stat_activity ;
> 2021-06-17 22:46:25.383 CST [11246] queryid=0 LOG: duration: 0.284 ms statement: set compute_query_id = on;
> 2021-06-17 22:46:28.744 CST [11246] queryid=941978042436931562 LOG: duration: 1.725 ms statement: select * from
pg_stat_activity;
>
psql test
psql (14beta1)
Type "help" for help.
test=# show log_line_prefix;
log_line_prefix
--------------------------------
[timestamp=%t] [query_id=%Q] :
(1 row)
test=# show compute_query_id;
compute_query_id
------------------
on
(1 row)
test=# show log_statement;
log_statement
---------------
all
(1 row)
test=# select query_id, query, pid from pg_stat_activity where pid =
pg_backend_pid();
-[ RECORD 1 ]-----------------------------------------------------------------------------
query_id | -4293879703199833131
query | select query_id, query, pid from pg_stat_activity where pid
= pg_backend_pid();
pid | 2640
from the logs, i get the id for some queries, but not all
[timestamp=2021-06-17 20:37:59 IST] [query_id=488416992746849793]
:ERROR: relation "t" already exists
[timestamp=2021-06-17 20:37:59 IST] [query_id=488416992746849793]
:STATEMENT: create table t(id int);
[timestamp=2021-06-17 20:38:22 IST] [query_id=0] :LOG: statement:
prepare qq(int) as select count(1) from t where id = $1;
[timestamp=2021-06-17 20:38:29 IST] [query_id=0] :LOG: statement:
execute qq(1);
[timestamp=2021-06-17 20:38:29 IST] [query_id=0] :DETAIL: prepare:
prepare qq(int) as select count(1) from t where id = $1;
[timestamp=2021-06-17 20:38:32 IST] [query_id=0] :LOG: statement:
execute qq(2);
[timestamp=2021-06-17 20:38:32 IST] [query_id=0] :DETAIL: prepare:
prepare qq(int) as select count(1) from t where id = $1;
[timestamp=2021-06-17 20:39:25 IST] [query_id=0] :LOG: statement:
select query_id, query, pid from pg_stat_activity where pid = 0;
[timestamp=2021-06-17 20:40:36 IST] [query_id=0] :LOG: statement:
select count(1) from t;
[timestamp=2021-06-17 20:40:47 IST] [query_id=0] :LOG: statement:
select count(1) from t where id < 100;
test=# explain (analyze,verbose) select * from t where id <
floor((random() * 100)::int);
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
Append (cost=0.00..3.10 rows=3 width=4) (actual time=0.009..0.014
rows=3 loops=1)
....
Query Identifier: 1051405225525186795
Planning Time: 0.090 ms
Execution Time: 0.030 ms
(13 rows)
test=# select query_id, query, pid from pg_stat_activity where pid =
pg_backend_pid();
query_id | query
| pid
----------------------+---------------------------------------------------------------------------------+------
-4293879703199833131 | select query_id, query, pid from
pg_stat_activity where pid = pg_backend_pid(); | 2671
(1 row)
but in logs
[timestamp=2021-06-17 20:46:47 IST] [query_id=0] :LOG: statement:
explain select query_id, query, pid from pg_stat_activity where pid =
pg_backend_pid();
[timestamp=2021-06-17 20:46:54 IST] [query_id=0] :LOG: statement:
explain select query_id, query, pid from pg_stat_activity where pid >
100;
[timestamp=2021-06-17 20:46:58 IST] [query_id=0] :LOG: statement:
explain analyze select query_id, query, pid from pg_stat_activity
where pid > 100;
[timestamp=2021-06-17 20:47:25 IST] [query_id=0] :LOG: statement:
explain analyze select * from t where id < floor((random() *
100)::int);
[timestamp=2021-06-17 20:48:16 IST] [query_id=0] :LOG: statement:
explain (analyze,verbose) select * from t where id < floor((random()
* 100)::int);
[timestamp=2021-06-17 20:48:38 IST] [query_id=0] :LOG: statement:
select query_id, query, pid from pg_stat_activity where pid =
pg_backend_pid();
not sure, if i am missing something obvious?
> I'm not sure that I understand that question. The pid will identify a backend,
> and that backend can execute 0, 1 or a lot of different queries. The query_id
> will uniquely identify statements after some normalization and removing the
> constant parts (so for instance "select 1;" and "Select 2 ;" will have the
> same identifier). Having only that information in the log can be useful on its
> own, but you usually get way more benefit using additional modules like
> pg_stat_statements.
Thanks, that helps, but I still do not see them in logs.
************************************* corrects myself
..... ok now it works, when i set log_min_duration_statement =0 to log
all statements.
test=# show log_min_duration_statement;
log_min_duration_statement
----------------------------
0
(1 row)
test=# explain (analyze,verbose) select * from t where id <
floor((random() * 100)::int);
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
Append (cost=0.00..3.10 rows=3 width=4) (actual time=0.018..0.022
rows=3 loops=1)
-> Seq Scan on public.t0 t_1 (cost=0.00..1.03 rows=1 width=4)
(actual time=0.018..0.018 rows=1 loops=1)
Output: t_1.id
Filter: ((t_1.id)::double precision < floor((((random() *
'100'::double precision))::integer)::double precision))
-> Seq Scan on public.t1 t_2 (cost=0.00..1.03 rows=1 width=4)
(actual time=0.001..0.002 rows=1 loops=1)
Output: t_2.id
Filter: ((t_2.id)::double precision < floor((((random() *
'100'::double precision))::integer)::double precision))
-> Seq Scan on public.t2 t_3 (cost=0.00..1.03 rows=1 width=4)
(actual time=0.001..0.001 rows=1 loops=1)
Output: t_3.id
Filter: ((t_3.id)::double precision < floor((((random() *
'100'::double precision))::integer)::double precision))
Query Identifier: 1051405225525186795
Planning Time: 0.353 ms
Execution Time: 0.034 ms
(13 rows)
test=# select query_id, query, pid from pg_stat_activity where pid =
pg_backend_pid();
query_id | query
| pid
----------------------+---------------------------------------------------------------------------------+------
-4293879703199833131 | select query_id, query, pid from
pg_stat_activity where pid = pg_backend_pid(); | 2777
(1 row)
[timestamp=2021-06-17 20:54:12 IST] [query_id=-1477439429101745134]
:LOG: duration: 0.040 ms statement: show log_min_duration_statement;
[timestamp=2021-06-17 20:54:16 IST] [query_id=1308619569072270555]
:LOG: duration: 0.882 ms statement: explain (analyze,verbose) select
* from t where id < floor((random() * 100)::int);
[timestamp=2021-06-17 20:54:19 IST] [query_id=-4293879703199833131]
:LOG: duration: 1.201 ms statement: select query_id, query, pid from
pg_stat_activity where pid = pg_backend_pid();
all good, Julien.
--
Thanks,
Vijay
Mumbai, India
On Thu, Jun 17, 2021 at 08:57:02PM +0530, Vijaykumar Jain wrote: > > test=# show log_line_prefix; > log_line_prefix > -------------------------------- > [timestamp=%t] [query_id=%Q] : > (1 row) > > test=# show compute_query_id; > compute_query_id > ------------------ > on > (1 row) > > test=# show log_statement; > log_statement > --------------- > all > (1 row) > > ************************************* corrects myself > ..... ok now it works, when i set log_min_duration_statement =0 to log > all statements. > > test=# show log_min_duration_statement; > log_min_duration_statement > ---------------------------- > 0 > (1 row) Yes, unfortunately log_statements is not compatible with compute_query_id. This is documented at https://www.postgresql.org/docs/devel/runtime-config-logging.html#GUC-LOG-LINE-PREFIX: > The %Q escape always reports a zero identifier for lines output by > log_statement because log_statement generates output before an identifier can > be calculated, including invalid statements for which an identifier cannot be > calculated.