Re: RFC: replace pg_stat_activity.waiting with something more descriptive

From Robert Haas
Subject Re: RFC: replace pg_stat_activity.waiting with something more descriptive
Date
Msg-id CA+TgmobDC4fbFpa49J6oK-iKazYVhW71VSSoTpvmf4tdyzRoCQ@mail.gmail.com
In response to Re: RFC: replace pg_stat_activity.waiting with something more descriptive  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
Responses Re: RFC: replace pg_stat_activity.waiting with something more descriptive  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Fri, Jul 10, 2015 at 12:33 PM, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> I can propose following:
>
> 1) Expose more information about current lock to user. For instance, having
> duration of current wait event, user can determine if backend is getting
> stuck on particular event without sampling.

Although this is appealing from a functionality perspective, I predict
that the overhead of it will be quite significant.  We'd need to do
gettimeofday() every time somebody calls pgstat_report_waiting(), and
if we do that every time we (say) initiate a disk write, I think
that's going to be pretty costly even on platforms where
gettimeofday() is fast, let alone those where it's slow.  If somebody
does a sequential scan of a non-cached table, I don't care to add a
gettimeofday() call for every read().
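
To put a rough number on that cost, one could time a tight loop of
gettimeofday() calls.  The following is a minimal sketch, not anything
from the patch under discussion; the helper name avg_gettimeofday_ns is
hypothetical, and the result will vary a great deal by platform and
clock source.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/time.h>

/*
 * Hypothetical helper: return the average cost, in nanoseconds, of one
 * gettimeofday() call, measured over the given number of iterations.
 * gettimeofday() is not monotonic, so a clock adjustment during the run
 * would distort the result.
 */
static double
avg_gettimeofday_ns(long iterations)
{
    struct timeval start;
    struct timeval end;
    struct timeval scratch;
    double      elapsed_us;
    long        i;

    assert(iterations > 0);

    gettimeofday(&start, NULL);
    for (i = 0; i < iterations; i++)
        gettimeofday(&scratch, NULL);
    gettimeofday(&end, NULL);

    /* Convert the elapsed wall-clock time to microseconds. */
    elapsed_us = (end.tv_sec - start.tv_sec) * 1e6 +
        (end.tv_usec - start.tv_usec);

    /* Microseconds to nanoseconds, divided across all calls. */
    return elapsed_us * 1000.0 / iterations;
}
```

Calling avg_gettimeofday_ns(1000000) and printing the result with
printf("%.1f ns\n", ...) gives a per-call figure that can be compared
against the cost of the operation being instrumented, such as a read()
of a block that is already in the OS cache.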

> 2) Accumulate per backend statistics about each wait event type: number of
> occurrences and total duration. With this statistics user can identify
> system bottlenecks again without sampling.

This is even more expensive: now you've got to do TWO gettimeofday()
calls per wait event, one when it starts and one when it ends.  Plus,
you've got to do updates to a backend-local hash table.  It might be
that this is tolerable for wait events that only happen in contended
paths - e.g. when a lock or lwlock acquisition actually blocks, or
when we decide to do a spin-delay - but I suspect it's going to stink
for things that happen frequently even when things are going well,
like reading and writing blocks.  So the result will be either that we
add a lot of performance overhead, or else that we just can't add some
of the wait events that people would like to see.

I really think we should do the simple thing first.  If we make this
complicated and add lots of bells and whistles, it is going to be much
harder to get anything committed, because there will be more things
for somebody to object to.  If we start with something simple, we can
always improve it later, if we are confident that the design for
improving it is good.  The hardest thing about a proposal like this is
going to be getting the overhead down to a level that is acceptable,
and every expansion of the basic design that has been proposed -
gathering more than one byte of information, or gathering times, or
having the backend update a tracking hash - adds *significant*
overhead to the design I proposed.
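
For contrast, a design that records only one byte per wait needs no
clock reads at all.  The sketch below is an illustration under loud
assumptions, not the actual proposal: the event codes and function
names are hypothetical, and a file-level variable stands in for
whatever per-backend shared-memory slot a monitoring process would
actually sample.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical event codes; zero means "not waiting". */
enum
{
    WAIT_NONE = 0,
    WAIT_LWLOCK = 1,
    WAIT_LOCK = 2,
    WAIT_IO = 3
};

/* Stand-in for a one-byte per-backend slot visible to other processes. */
static volatile uint8_t current_wait_event = WAIT_NONE;

/* Entering a wait is a single one-byte store. */
static void
report_wait_start(uint8_t event)
{
    assert(event != WAIT_NONE); /* a wait must name a real event */
    current_wait_event = event;
}

/* Leaving the wait is another single store. */
static void
report_wait_end(void)
{
    current_wait_event = WAIT_NONE;
}

/* What a sampling reader would do: read the byte, nothing more. */
static uint8_t
sample_wait_event(void)
{
    return current_wait_event;
}
```

Durations and counts then have to be estimated by sampling from
outside, which is the trade-off being argued for: the waiting backend
itself does almost no extra work.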

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


