Re: RFC: replace pg_stat_activity.waiting with something more descriptive

From: Amit Kapila
Subject: Re: RFC: replace pg_stat_activity.waiting with something more descriptive
Date:
Msg-id: CAA4eK1Je=qc0D=kWiLwyYXF4s=vjXuZYFMV_DLXCA5BaKy--9A@mail.gmail.com
In reply to: Re: RFC: replace pg_stat_activity.waiting with something more descriptive  (Ildus Kurbangaliev <i.kurbangaliev@postgrespro.ru>)
Responses: Re: RFC: replace pg_stat_activity.waiting with something more descriptive  (Ildus Kurbangaliev <i.kurbangaliev@postgrespro.ru>)
List: pgsql-hackers
On Mon, Jul 13, 2015 at 3:26 PM, Ildus Kurbangaliev <i.kurbangaliev@postgrespro.ru> wrote:

> On 07/12/2015 06:53 AM, Amit Kapila wrote:
>> To report a duration, I think you need to use gettimeofday or some
>> similar call to calculate the wait time. That will be okay for the
>> cases where the wait time is longer, but it could be problematic
>> when the waits are very short (which could probably be the
>> case for LWLocks).
>
> gettimeofday is already used in our patch and it gives enough accuracy (in microseconds), especially when LWLocks become a problem. We also tested our implementation and it adds less than 1% overhead (see the testing part of http://www.postgresql.org/message-id/559D4729.9080704@postgrespro.ru).

I think that test is quite generic; we should test more combinations
(such as the -M prepared option, which can stress the LWLock machinery
somewhat more) and other types of tests that stress the part
of the code where gettimeofday() is used in the patch.
 
> We need help here with testing on other platforms. I used gettimeofday because the existing "instr_time.h" header already provides cross-platform, tested functions for measuring time, but I'm planning to make a similar implementation with monotonic functions based on clock_gettime for more accuracy.
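To make the overhead question concrete, here is a minimal sketch of how the cost of one monotonic clock read could be estimated on a given platform. It is not taken from the patch: the helper names `monotonic_ns` and `clock_read_cost_ns` are hypothetical, and it assumes a POSIX system that supports `clock_gettime` with `CLOCK_MONOTONIC`.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: current monotonic time in nanoseconds. */
static int64_t
monotonic_ns(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

    assert(rc == 0);
    (void) rc;
    return (int64_t) ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/*
 * Hypothetical helper: average cost, in nanoseconds, of one clock read,
 * estimated by reading the clock n times in a tight loop.
 */
static double
clock_read_cost_ns(int n)
{
    volatile int64_t sink = 0;
    int64_t start;

    assert(n > 0);
    start = monotonic_ns();
    for (int i = 0; i < n; i++)
        sink = monotonic_ns();
    (void) sink;
    return (double) (monotonic_ns() - start) / n;
}
```

If the per-read cost comes out comparable to the length of a typical short LWLock wait, the two clock reads around each wait would dominate what is being measured, which is the concern raised above.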
 
>>> 2) Accumulate per-backend statistics about each wait event type: number of occurrences and total duration. With these statistics the user can identify system bottlenecks, again without sampling.
>>>
>>> Number #2 will be provided as a separate patch.
>>> Number #1 requires a different concurrency model. Ildus will extract it from the "waits monitoring" patch shortly.
>>>
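As an illustration of item 2 in the quoted text, the per-backend accumulation could look roughly like the sketch below. The struct names, the fixed `NUM_WAIT_EVENTS` size, and the microsecond unit are all assumptions made for the example, not the layout used by the patch.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_WAIT_EVENTS 4       /* illustrative size, not from the patch */

/* Counters for one wait event type. */
typedef struct WaitEventStats
{
    uint64_t    count;          /* number of occurrences */
    uint64_t    total_us;       /* total wait duration, microseconds */
} WaitEventStats;

/* One instance per backend, so no locking is needed to update it. */
typedef struct BackendWaitStats
{
    WaitEventStats events[NUM_WAIT_EVENTS];
} BackendWaitStats;

/* Record one completed wait of the given event type. */
static void
record_wait(BackendWaitStats *stats, int event, uint64_t duration_us)
{
    assert(event >= 0 && event < NUM_WAIT_EVENTS);
    stats->events[event].count++;
    stats->events[event].total_us += duration_us;
}

/* Average wait in microseconds; 0.0 if the event never occurred. */
static double
average_wait_us(const BackendWaitStats *stats, int event)
{
    const WaitEventStats *e;

    assert(event >= 0 && event < NUM_WAIT_EVENTS);
    e = &stats->events[event];
    return e->count == 0 ? 0.0 : (double) e->total_us / (double) e->count;
}
```

Because only the count and the running total are stored, a reader can derive averages without sampling, at the cost of losing the distribution of individual waits.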

>> Sure, I think those should be evaluated as separate patches,
>> and I can look into those patches and see if something more
>> can be exposed as part of this patch which can be reused in
>> those patches.
>
> If you agree, I'll make some modifications to your patch so that we can later extend it with our other changes. The main issue is that one variable for all types is not enough. For flexibility in the future we need at least two, class and event: for example class=LWLock, event=ProcArrayLock, or class=Storage, event=READ.
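The two-variable idea can be sketched as a small pair of fields. This is only an illustration of the class/event split described above: the type names, the enum values, and the `format_wait` helper are hypothetical and do not come from either patch.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical wait classes; only the two named in the example above. */
typedef enum WaitClass
{
    WAIT_CLASS_NONE,
    WAIT_CLASS_LWLOCK,
    WAIT_CLASS_STORAGE
} WaitClass;

/* What a backend is waiting on: a coarse class plus a specific event. */
typedef struct WaitInfo
{
    WaitClass   wait_class;
    const char *event;          /* e.g. "ProcArrayLock" or "READ" */
} WaitInfo;

static const char *
wait_class_name(WaitClass c)
{
    switch (c)
    {
        case WAIT_CLASS_LWLOCK:
            return "LWLock";
        case WAIT_CLASS_STORAGE:
            return "Storage";
        default:
            return "None";
    }
}

/* Format as "class=..., event=..."; returns snprintf's result. */
static int
format_wait(const WaitInfo *w, char *buf, size_t len)
{
    assert(w != NULL && buf != NULL && w->event != NULL);
    return snprintf(buf, len, "class=%s, event=%s",
                    wait_class_name(w->wait_class), w->event);
}
```

With two fields, a new event can be added under an existing class without changing how the class is reported, which is the kind of future flexibility being asked for.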


I have already proposed something very similar in this thread [1]
(where I used wait_event_type instead of class), but Robert
doesn't agree with it. So before writing code, it seems
prudent to reach agreement on what kind of user interface
would satisfy the requirement and be extensible in the future as
well. It would help if you could highlight some points about
which kind of user interface is better (more extensible) and the
reasons why.



With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
