Re: Adding wait events statistics
From | Bertrand Drouvot
---|---
Subject | Re: Adding wait events statistics
Date |
Msg-id | aHnj9tIzAd6/2kuK@ip-10-97-1-34.eu-west-3.compute.internal
In reply to | Adding wait events statistics (Bertrand Drouvot <bertranddrouvot.pg@gmail.com>)
List | pgsql-hackers
Hi,

On Tue, Jul 15, 2025 at 04:13:49PM +0200, Bertrand Drouvot wrote:
> Hi,
>
> On Fri, Jul 11, 2025 at 11:59:39AM -0400, Andres Freund wrote:
> > Hi,
> >
> > On 2025-07-10 08:21:56 +0000, Bertrand Drouvot wrote:
> > > - Let's define a list of "cheap" and a list of "expensive" wait events
> > > - Change the code path to increment counters only for the "expensive" ones
> > > - Optimize the counter incrementation as much as possible
> > >
> > > Does that sound reasonable to you?
> >
> > That does seem like the minimum.
> >
> > Unfortunately I'm rather doubtful this provides enough value to be worth the
> > cost. But I'm rather willing to be proven wrong.
>
> I guess that would (among other things) depend on how many wait events would
> be tracked. I'll do some study to measure the counters' impact on the wait
> events.

Here is what I've done:

1. Added two probes: one in pgstat_report_wait_start() and one in
pgstat_report_wait_end(), to measure the wait events' duration. Then I ran make
check-world and computed the duration percentiles (in microseconds) per wait
event (see attached). Note that this is without the counters patch, just to get
an idea of wait event durations on my lab. Due to the way I ran the eBPF
script, I was able to trap 159 wait events (out of 273) during "make
check-world" (I think that's enough for a first analysis, see below).

2. Applied the patch with the counters (one that I did not share yet), removed
the probes in pgstat_report_wait_start() and pgstat_report_wait_end(), and
added one probe before incrementing the counters (i.e. before calling
waitEventIncrementCounter()) in pgstat_report_wait_end() and one probe after
incrementing the counters, to measure the counter increment duration. Then I
ran make check-world and computed the counter increment duration percentiles
(in nanoseconds) and got:

P50.00: 1138.000000
P90.00: 2674.000000
P95.00: 3741.000000
P99.00: 7657.000000
P99.90: 16973.881000
P99.99: 55614.830000

Then I compared the P50 for the counter increment with the P50 for each wait
event, and rejected the wait events where incrementing the counter produces
more than 5% overhead (with a counter P50 of about 1138 ns, that means keeping
only the wait events whose own P50 is at least about 23 microseconds).

-> 59 wait events (out of the 159 observed during make check-world) got less
than 5% overhead. They are mixed across classes, but among those 59 we have:

All the trapped ones in the Lock class (8 events) (0.7% avg impact):

spectoken
advisory
tuple
relation
transactionid
virtualxid
extend
applytransaction

All the trapped ones in the Timeout class (7 events) (0.03% avg impact):

RECOVERY_APPLY_DELAY
RECOVERY_RETRIEVE_RETRY_INTERVAL
CHECKPOINT_WRITE_DELAY
VACUUM_TRUNCATE
BASE_BACKUP_THROTTLE
PG_SLEEP
SPIN_DELAY

And the BufferPin one (about 4% impact).

Note that I would get the same behavior (just a little more overhead) if I
added the wait event durations (INSTR_*) to the game.

Given the above, it looks like it would be possible to add the counters and
their timings for the Lock and Timeout classes without producing a noticeable
performance impact. I think that:

- That's easy enough to reason about, and kind of expected for those 2 classes
- We would miss other wait events, but I think the trade-off is worth the
  simplicity
- Having the ability to get the counters and the timings for those 2 wait
  classes is better than nothing
- Having those metrics for the Lock class would be a nice addition to pg_locks

Thoughts?

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
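To make step 1 concrete, here is a minimal sketch (using PostgreSQL's
instr_time macros) of where the two measurement points sit. This is only an
illustration: the numbers above were gathered with external eBPF probes, not by
modifying these functions, and waitStart is a hypothetical per-backend scratch
variable.

/*
 * Sketch only: where the start/end measurement points attach if one were to
 * instrument the functions directly instead of using eBPF probes.
 */
#include "postgres.h"
#include "portability/instr_time.h"

static instr_time waitStart;	/* hypothetical scratch variable */

static inline void
pgstat_report_wait_start(uint32 wait_event_info)
{
	/* existing behavior: advertise the wait event */
	*(volatile uint32 *) my_wait_event_info = wait_event_info;

	/* probe point 1: the wait begins */
	INSTR_TIME_SET_CURRENT(waitStart);
}

static inline void
pgstat_report_wait_end(void)
{
	instr_time	waitEnd;

	/* probe point 2: the wait ends; duration = waitEnd - waitStart */
	INSTR_TIME_SET_CURRENT(waitEnd);
	INSTR_TIME_SUBTRACT(waitEnd, waitStart);
	/* INSTR_TIME_GET_MICROSEC(waitEnd) gives the duration in microseconds */

	*(volatile uint32 *) my_wait_event_info = 0;
}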
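For step 2, a hypothetical sketch of what waitEventIncrementCounter() could
look like. The actual patch has not been shared yet, so the WaitEventCounter
struct, the waitCounters array, and MAX_TRACKED_WAIT_EVENTS below are
assumptions; PG_WAIT_LOCK and PG_WAIT_TIMEOUT are the existing wait class
constants, and wait_event_info packs the class in its high byte and the event
id in its low bits, as in wait_event.h.

/* Hypothetical sketch; the real (unshared) patch may differ. */
#include "postgres.h"
#include "portability/instr_time.h"
#include "utils/wait_event.h"

#define MAX_TRACKED_WAIT_EVENTS 256		/* assumed array size */

typedef struct WaitEventCounter
{
	uint64		calls;			/* number of times the event completed */
	instr_time	total_time;		/* cumulative duration, if timings enabled */
} WaitEventCounter;

static WaitEventCounter waitCounters[MAX_TRACKED_WAIT_EVENTS];

static inline void
waitEventIncrementCounter(uint32 wait_event_info, instr_time duration)
{
	uint32		classId = wait_event_info & 0xFF000000;
	uint32		eventId = wait_event_info & 0x0000FFFF;

	/* per the analysis above, only track the cheap-to-measure classes */
	if (classId != PG_WAIT_LOCK && classId != PG_WAIT_TIMEOUT)
		return;

	waitCounters[eventId].calls++;
	INSTR_TIME_ADD(waitCounters[eventId].total_time, duration);
}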
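And a small worked example of the 5% rejection rule (my arithmetic, not code
from the patch):

#include <stdbool.h>

/*
 * Keep a wait event only if the counter increment's median cost is at most
 * 5% of the event's own median duration.
 */
static bool
keep_wait_event(double event_p50_us, double counter_p50_ns)
{
	double		overhead = counter_p50_ns / (event_p50_us * 1000.0);

	return overhead <= 0.05;
}

/*
 * With the measured counter P50 of 1138 ns:
 *   keep_wait_event(23.0, 1138.0)  -> true  (~4.9% overhead)
 *   keep_wait_event(10.0, 1138.0)  -> false (~11.4% overhead)
 * i.e. only events with a median duration of at least ~23 us survive.
 */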