Re: [Proposal] Add accumulated statistics for wait event
От | Jehan-Guillaume de Rorthais |
---|---|
Тема | Re: [Proposal] Add accumulated statistics for wait event |
Дата | |
Msg-id | 20210614161032.60172b5b@firost обсуждение исходный текст |
Ответ на | Re: [Proposal] Add accumulated statistics for wait event (Andres Freund <andres@anarazel.de>) |
Ответы |
Re: [Proposal] Add accumulated statistics for wait event
(Andres Freund <andres@anarazel.de>)
|
Список | pgsql-hackers |
Hi Andres, Hi all, First, thank you for your feedback! Please find in attachment a patch implementing accumulated wait event stats only from the backend point of view. As I wrote when I reviewed and rebased the existing patch, I was uncomfortable with the global approach. I still volunteer to work/review the original approach is required. See bellow for comments and some more explanations about what I think might be improvements over the previous patch. On Fri, 11 Jun 2021 12:18:07 -0700 Andres Freund <andres@anarazel.de> wrote: > On 2021-06-05 00:53:44 +0200, Jehan-Guillaume de Rorthais wrote: > > From 88c2779679c5c9625ca5348eec0543daab5ccab4 Mon Sep 17 00:00:00 2001 > > From: Jehan-Guillaume de Rorthais <jgdr@dalibo.com> > > Date: Tue, 1 Jun 2021 13:25:57 +0200 > > Subject: [PATCH 1/2] Add pg_stat_waitaccum view. > > > > pg_stat_waitaccum shows counts and duration of each wait events. > > Each backend/backgrounds counts and measures the time of wait event > > in every pgstat_report_wait_start and pgstat_report_wait_end. They > > store those info into their local variables and send to Statistics > > Collector. We can get those info via Statistics Collector. > > > > For reducing overhead, I implemented statistic hash instead of > > dynamic hash. I also implemented track_wait_timing which > > determines wait event duration is collected or not. > > I object to adding this overhead. The whole selling point for wait > events was that they are low overhead. I since spent time reducing the > overhead further, because even just the branches for checking if > track_activity is enabled are measurable (225a22b19ed). Agree. The previous patch I rebased was to review it and reopen this discussion, I even added a small FIXME in pgstat_report_wait_end and pgstat_report_wait_start about your work: //FIXME: recent patch to speed up this call. In the patch in attachment, I tried to fix this by using kind of an internal hook for pgstat_report_wait_start and pgstat_report_wait_end. This allows to "instrument" wait events only when required, on the fly, dynamically. Moreover, I removed the hash structure for a simple static array for faster access. > > From ddb1adc5cd9acc9bc9de16d0cf057124b09fe1e3 Mon Sep 17 00:00:00 2001 > > From: Jehan-Guillaume de Rorthais <jgdr@dalibo.com> > > Date: Fri, 4 Jun 2021 18:14:51 +0200 > > Subject: [PATCH 2/2] [POC] Change measuring method of wait event time from > > INSTR_TIME to rdtsc. > > > > This patch changes measuring method of wait event time from INSTR_TIME > > (which uses gettimeofday or clock_gettime) to rdtsc. This might reduce the > > overhead of measuring overhead. > > > > Any supports like changing clock cycle to actual time or error handling are > > not currently implemented. > > rdtsc is a serializing (*) instruction - that's the expensive part. On linux > clock_gettime() doesn't actually need a syscall. While the vdso call > implies a bit of overhead over a raw rdtsc, it's a relatively small part > of the overhead. See > https://www.postgresql.org/message-id/20200612232810.f46nbqkdhbutzqdg%40alap3.anarazel.de I choose to remove all this rdtsc part from my patch as this wasn't clear how much faster it was compare to simpler vdso functions and how to accurately extract a human time. About my take on $subject, for the sake of simplicity of this PoC, I added instrumentation to log_statement_stats. Despite the query context of the reported log, they are really accumulated stats. The patch updated pg_stat_get_waitaccum() as well to be able to report the accumulated wait events from your interactive or batch session. So using my previous fe-time demo client, you can test it using: PGOPTIONS="--log_statement_stats=on" ./fe-time 100 From logs, I now have (notice the last line): LOG: duration: 3837.194 ms statement: SELECT * FROM pgbench_accounts LOG: QUERY STATISTICS DETAIL: ! system usage stats: ! 0.087444 s user, 0.002106 s system, 3.837202 s elapsed ! [0.087444 s user, 0.003974 s system total] ! 25860 kB max resident size ! 0/0 [0/0] filesystem blocks in/out ! 0/303 [0/697] page faults/reclaims, 0 [0] swaps ! 0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent ! 4/18 [5/18] voluntary/involuntary context switches ! Client/ClientWrite 4 calls, 3747102 us elapsed Using pgbench scale factor 10, the copy query for pgbench_accounts looks like: LOG: duration: 2388.081 ms statement: copy pgbench_accounts from stdin LOG: QUERY STATISTICS DETAIL: ! system usage stats: ! 1.373756 s user, 0.252860 s system, 2.388100 s elapsed ! [1.397015 s user, 0.264951 s system total] ! 37788 kB max resident size ! 0/641584 [0/642056] filesystem blocks in/out ! 194/4147 [195/4728] page faults/reclaims, 0 [0] swaps ! 0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent ! 3263/92 [3390/102] voluntary/involuntary context switches ! LWLock/WALBufMapping 3 calls, 654 us elapsed ! LWLock/WALWrite 3 calls, 60680 us elapsed ! LWLock/CheckpointerComm 3 calls, 318 us elapsed ! Client/ClientRead 1 calls, 151 us elapsed ! IO/DataFileExtend 16397 calls, 94000 us elapsed ! IO/DataFileWrite 14346 calls, 45911 us elapsed ! IO/WALInitSync 6 calls, 334840 us elapsed ! IO/WALInitWrite 6 calls, 48040 us elapsed ! IO/WALSync 17 calls, 353334 us elapsed ! IO/WALWrite 8362 calls, 38401 us elapsed Now, I'm on a fence about the user facing interaction. I'll keep thinking about this and report this week. In the meantime, any feedback about the current implementation backbone is welcome. Thanks! Regards,
Вложения
В списке pgsql-hackers по дате отправления:
Предыдущее
От: Bharath RupireddyДата:
Сообщение: Re: A new function to wait for the backend exit after termination