RE: [EXTERNAL] Re: pg_stat_statements

From: Godfrin, Philippe E
Subject: RE: [EXTERNAL] Re: pg_stat_statements
Date:
Msg-id: SA0PR15MB39337CC1622DDFEC7D0E609F82C29@SA0PR15MB3933.namprd15.prod.outlook.com
In reply to: Re: pg_stat_statements  (Julien Rouhaud <rjuju123@gmail.com>)
Responses: Re: [EXTERNAL] Re: pg_stat_statements  (Julien Rouhaud <rjuju123@gmail.com>)
List: pgsql-hackers


From: Julien Rouhaud <rjuju123@gmail.com>
Sent: Wednesday, May 4, 2022 10:24 PM
To: Godfrin, Philippe E <Philippe.Godfrin@nov.com>
Cc: pgsql-hackers@lists.postgresql.org
Subject: [EXTERNAL] Re: pg_stat_statements


Hi,

On Tue, May 03, 2022 at 01:30:32PM +0000, Godfrin, Philippe E wrote:
>
> I wasn't exactly clear about the queries. The values clauses themselves are not long -
> We are using repeated values clauses:
>
> INSERT INTO timeseries.dvc_104 (tag_id, event_ts, bool_val, float_val, int_val, string_val, last_updt_id)
> VALUES ($1,$2,$3,$4,$5,$6,$7),($8,$9,$10,$11,$12,$13,$14),($15,$16,$17,$18,$19,$20,$21),
> ($22,$23,$24,$25,$26,$27,$28),($29,$30,$31,$32,$33,$34,$35),($36,$37,$38,$39,$40,$41,$42),
> ($43,$44,$45,$46,$47,$48,$49),($50,$51,$52,$53,$54,$55,$56),($57,$58,$59,$60,$61,$62,$63),
> ($64,$65,$66,$67,$68,$69,$70),($71,$72,$73,$74,$75,$76,$77),($78,$79,$80,$81,$82,$83,$84),
> ($85,$86,$87,$88,$89,$90,$91),($92,$93,$94,$95,$96,$97,$98)
>
> This one's not long, but some 'load statements' have 10,000 values clauses, others add up to 10,000 more
> in an ON CONFLICT clause. I've checked the external Query file and it's currently not large
> at all. But I will keep an eye on that. When I had the setting at 1000 statements
> the file was indeed over 1GB. For the record, development is reducing those statement
> lengths.
> [...]
> The first observation is how long a simple query took:
>
> # select count(*) from pg_stat_statements;
> count
> -------
> 971
> Time: 6457.985 ms (00:06.458)
>
> MORE than six seconds for a mere 971 rows! Furthermore, when removing the long queries:
> # select count(*) from pg_stat_statements(showtext:=false);
> count
> -------
> 970
> Time: 10.644 ms
>
> Only 10ms...

Well, 10ms is still quite slow.

You're not removing just the long query texts, you're removing all query texts.
I don't know if the overhead comes from processing at least some long
statements or is mostly due to having to retrieve the query file. Do you get
the same times if you run the query twice? Maybe you're short on RAM and have
somewhat slow disks, and the text file has to be read from disk rather than OS
cache?
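
One cheap way to check that, assuming you're in psql: time the same query back to
back, so the second run can only hit the OS cache:

    \timing on
    -- first run: the query text file may have to come from disk
    SELECT count(*) FROM pg_stat_statements;
    -- second run: the file should now be in the OS cache
    SELECT count(*) FROM pg_stat_statements;

If the second run is much faster, the cold read of the text file is the culprit.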

Also I don't know what you mean by "not large at all", so it's hard to compare
or try to reproduce. FWIW on some instance I have around, I have a 140kB file
and querying pg_stat_statements *with* the query text file only takes a few ms.
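
If it helps, you can check the on-disk size of the query text file from psql. This
is only a sketch, assuming the default layout where pg_stat_statements keeps the
texts in pg_stat_tmp/pgss_query_texts.stat relative to the data directory
(pg_stat_file is restricted to superusers by default):

    -- size in bytes and last modification time of the query text file
    SELECT size, modification
    FROM pg_stat_file('pg_stat_tmp/pgss_query_texts.stat');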

You could try to query that view with some unprivileged user. This way you
will still retrieve the query text file but will only emit "<insufficient
privilege>" rather than processing the query texts, this may narrow down the
problem. Or better, if you could run perf [1] to see where the overhead really
is.
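
A minimal sketch of that test (the role name is made up; it needs nothing beyond
LOGIN):

    CREATE ROLE pgss_probe LOGIN;
    -- then, connected as pgss_probe:
    SELECT count(*) FROM pg_stat_statements;
    -- other users' rows show '<insufficient privilege>' as the query
    -- text, so the text file is still read but the texts themselves
    -- are not returned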

> Second, we have Datadog installed. Datadog queries the pg_stat_statements view
> every 10 seconds. The real pain point is that querying pg_stat_statements seems
> to have an impact on running queries, specifically inserts in my case.

I think this is a side effect of having a very low pg_stat_statements.max, if
of course you have more distinct queries than the current value.

If the extra time is due to loading the query text file and if it's loaded
after acquiring the lightweight lock, then you will prevent evicting or
creating new entries for a long time, which means that the query execution for
those queries will be blocked until the query on pg_stat_statements ends.
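
If you're on PostgreSQL 14 or later, the pg_stat_statements_info view shows how
often that eviction actually happens:

    -- dealloc counts how many times entries have been deallocated
    -- (evicted) to make room since the last stats reset
    SELECT dealloc, stats_reset FROM pg_stat_statements_info;

A steadily climbing dealloc means pg_stat_statements.max is too low for your
workload.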

There are unfortunately *a lot* of unknowns here, so I can't do anything apart
from guessing.

> I believe this is an actual impact that needs a solution.

First, if you have an OLTP workload you have to make sure that
pg_stat_statements.max is high enough so that you don't have to evict entries,
or at least not often. Then, I think that querying pg_stat_statements every
10s is *really* aggressive, that's always going to have some noticeable
overhead. For the rest, we need more information to understand where the
slowdown is coming from.
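
On the first point, raising the cap would look like this (10000 is only an
illustrative value; pg_stat_statements.max can only be set at server start, so a
restart is required):

    ALTER SYSTEM SET pg_stat_statements.max = 10000;
    -- restart the server, then confirm:
    SHOW pg_stat_statements.max;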

[1] https://wiki.postgresql.org/wiki/Profiling_with_perf


Hello Julien,

Thanks very much for looking closely at this. To answer your questions:

I misspoke: the query text file at the time of the queries above was around 1GB.

I don’t believe I am short on RAM, although I will re-examine that aspect. I’m running
32GB with a 22GB shared pool, which seems OK to me. The disks are SSD (AWS EBS) and
the disk volumes are the same as the data volumes. If a regular 1GB file on disk took
6 seconds to read, the rest of the system would be in serious degradation.

The impact on running queries was observed when the max was set at 1000. I don’t
quite understand what you keep saying about evictions and other things relative to the
pgss file. Can you refer me to some detailed documentation or a good article that
describes the processes you’re alluding to?

As for querying the stats view every 10 seconds, I think that’s not aggressive enough,
as I want statement monitoring as close to real time as possible.

You are indeed correct about the unknowns – the biggest one for me is that I don’t know
enough about how the stats extension works. As I asked before, more detail on the
internals of the extension would be useful. Is my only choice in that regard to browse
the source code?

Short of running the profiler, which should deal with the unknowns, any other ideas?

Thanks,

philippe

