Re: Adding wait events statistics

From: Bertrand Drouvot
Subject: Re: Adding wait events statistics
Msg-id: aHnj9tIzAd6/2kuK@ip-10-97-1-34.eu-west-3.compute.internal
In reply to: Adding wait events statistics (Bertrand Drouvot <bertranddrouvot.pg@gmail.com>)
List: pgsql-hackers
Hi,

On Tue, Jul 15, 2025 at 04:13:49PM +0200, Bertrand Drouvot wrote:
> Hi,
> 
> On Fri, Jul 11, 2025 at 11:59:39AM -0400, Andres Freund wrote:
> > Hi,
> > 
> > On 2025-07-10 08:21:56 +0000, Bertrand Drouvot wrote:
> > > - Let's define a list of "cheap" and a list of "expensive" wait events
> > > - Change the code path to increment counters only for the "expensive" ones
> > > - Optimize the counter incrementation as much as possible
> > > 
> > > Does that sound reasonable to you?
> > 
> > That does seem like the minimum.
> > 
> > Unfortunately I'm rather doubtful this provides enough value to be worth the
> > cost. But I'm rather willing to be proven wrong.
> > 
> 
> I guess that would (among other things) depend on how many wait events would be
> tracked. I'll do some study to measure the counters' impact on the wait events.

Here is what I’ve done:

1. Added two probes: one in pgstat_report_wait_start() and one in pgstat_report_wait_end(),
to measure wait event durations. Then I ran make check-world and computed the
duration percentiles (in microseconds) per wait event (see attached). A sketch of
the probe sites follows the notes below.

Note that this is without the counters patch; the goal was just to get an idea of
wait event durations on my lab.

Due to the way I ran the eBPF script, I was able to trap 159 wait events
(out of 273) during "make check-world" (I think that’s enough for a first
analysis; see below).
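
For reference, the two probed functions are trivial inline stores (simplified here
from src/include/utils/wait_event.h); roughly speaking, the probes sit at these two
points:

extern uint32 *my_wait_event_info;

static inline void
pgstat_report_wait_start(uint32 wait_event_info)
{
    /* probe here: record a start timestamp keyed by wait_event_info */
    *(volatile uint32 *) my_wait_event_info = wait_event_info;
}

static inline void
pgstat_report_wait_end(void)
{
    /* probe here: emit (now - start) to build the per-event percentiles */
    *(volatile uint32 *) my_wait_event_info = 0;
}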

2. Applied the patch with the counters (one that I have not shared yet), removed
the probes in pgstat_report_wait_start() and pgstat_report_wait_end(), and added
two probes in pgstat_report_wait_end(): one before incrementing the counters (i.e.
before calling waitEventIncrementCounter()) and one after, to measure the duration
of the counter increment.
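
To make that concrete, here is a rough sketch of the instrumented
pgstat_report_wait_end(). The patch itself is not shared yet, so apart from
waitEventIncrementCounter() the details below are only illustrative:

static inline void
pgstat_report_wait_end(void)
{
    uint32  wait_event_info = *my_wait_event_info;

    if (wait_event_info != 0)
    {
        /* probe 1: timestamp just before the increment */
        waitEventIncrementCounter(wait_event_info);
        /* probe 2: timestamp just after; (2 - 1) is the increment duration */
    }

    *(volatile uint32 *) my_wait_event_info = 0;
}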

Then I ran make check-world and computed the counter increment duration percentiles
(in nanoseconds):

P50.00: 1138.000000
P90.00: 2674.000000
P95.00: 3741.000000
P99.00: 7657.000000
P99.90: 16973.881000
P99.99: 55614.830000

Then I compared the P50.00 of the counter increment with the P50.00 of each wait
event and rejected the wait events for which incrementing the counter produces more
than 5% overhead.
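
(With the counter increment P50 at about 1138 ns, a wait event passes the 5% bar
only if its own P50 duration is at least about 1138 / 0.05 ≈ 22.8 µs.)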

-> 59 wait events (out of the 159 observed during make check-world) came in under
5% overhead. They are spread across classes, but among those 59 we have:

All the trapped ones in the Lock class (8 events) (0.7% avg impact):

spectoken
advisory
tuple
relation
transactionid
virtualxid
extend
applytransaction
 
All the trapped ones in the Timeout class (7 events) (0.03% avg impact):

RECOVERY_APPLY_DELAY
RECOVERY_RETRIEVE_RETRY_INTERVAL
CHECKPOINT_WRITE_DELAY
VACUUM_TRUNCATE
BASE_BACKUP_THROTTLE
PG_SLEEP
SPIN_DELAY

And the BufferPin one (about 4%).

Note that I would get the same behavior (just a little more overhead) if I added
the wait event durations (INSTR_*) to the game.
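
INSTR_* refers to the macros in src/include/portability/instr_time.h. Something
along these lines; the struct, the lookup function and the extended signature are
just a sketch, not the actual patch:

#include "postgres.h"
#include "portability/instr_time.h"

/* illustrative per-wait-event accumulator */
typedef struct WaitEventCounter
{
    uint64      calls;          /* number of completed waits */
    instr_time  total_time;     /* accumulated wait duration */
} WaitEventCounter;

/* placeholder for however the per-event slot would be found */
extern WaitEventCounter *waitEventLookup(uint32 wait_event_info);

/*
 * Illustrative variant of waitEventIncrementCounter(), extended with the
 * wait's start timestamp so the duration can be accumulated too.
 */
static void
waitEventIncrementCounter(uint32 wait_event_info, instr_time start)
{
    WaitEventCounter *wec = waitEventLookup(wait_event_info);
    instr_time  duration;

    INSTR_TIME_SET_CURRENT(duration);
    INSTR_TIME_SUBTRACT(duration, start);   /* duration = now - start */

    wec->calls++;
    INSTR_TIME_ADD(wec->total_time, duration);
}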
 
With the above, it looks like it would be possible to add the counters and their
timings for the Lock and Timeout classes without producing a noticeable
performance impact.

I think that:

- That’s easy enough to reason about and kind of expected for those 2 classes
- We would miss other wait events, but I think the simplicity is worth that trade-off
- Having the ability to get the counters and the timings for those 2 wait classes
is better than nothing
- Having those metrics for the Lock class would be a nice addition to pg_locks

Thoughts?

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments
