Re: Dynamic LWLock tracing via pg_stat_lwlock (proof of concept)

From: Bruce Momjian
Subject: Re: Dynamic LWLock tracing via pg_stat_lwlock (proof of concept)
Date:
Msg-id: 20141003183911.GI14522@momjian.us
In response to: Re: Dynamic LWLock tracing via pg_stat_lwlock (proof of concept)  (Ilya Kosmodemiansky <ilya.kosmodemiansky@postgresql-consulting.com>)
List: pgsql-hackers
On Fri, Oct  3, 2014 at 05:53:59PM +0200, Ilya Kosmodemiansky wrote:
> > What that gives us is almost zero overhead on backends, high
> > reliability, and the ability of the scan daemon to give higher weights
> > to locks that are held longer.  Basically, if you just stored the locks
> > you held and released, you either have to add timing overhead to the
> > backends, or you have no timing information collected.  By scanning
> > active locks, a short-lived lock might not be seen at all, while a
> > longer-lived lock might be seen by multiple scans.  What that gives us
> > is a weighting of the lock time with almost zero overhead.   If we want
> > finer-grained lock statistics, we just increase the number of scans per
> > second.
> 
> So I could add a function which will accumulate the data in some
> view/table (with weights etc.). How should it be called? From a specific
> process? From some existing maintenance process such as autovacuum?
> Should I implement a GUC, for example lwlock_pull_rate, 0 for off, from 1
> to 10 for 1 to 10 samples per second?

Yes, that's the right approach.  You would implement it as a background
worker process, and a GUC as you described.  I assume it would populate
a view like we already do for the pg_stat_ views, and the counters could
be reset somehow.  I would pattern it after how we handle the pg_stat_
views.
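
A minimal sketch of that shape, for illustration only -- the background-worker
and GUC plumbing below are real APIs, but the lwlock_sampler name, the
scan_and_accumulate_lwlocks() call, and the per-backend shared-memory
instrumentation it would read are hypothetical:

/*
 * lwlock_sampler.c -- rough sketch only.  The shared-memory layout it
 * samples and scan_and_accumulate_lwlocks() are hypothetical; only the
 * background-worker and GUC plumbing is existing API.
 */
#include "postgres.h"

#include "fmgr.h"
#include "miscadmin.h"
#include "postmaster/bgworker.h"
#include "storage/ipc.h"
#include "storage/latch.h"
#include "storage/proc.h"
#include "utils/guc.h"

PG_MODULE_MAGIC;

void		_PG_init(void);

/* GUC: 0 disables sampling, 1..10 scans per second */
static int	lwlock_pull_rate = 0;

static void
lwlock_sampler_main(Datum main_arg)
{
	BackgroundWorkerUnblockSignals();

	for (;;)
	{
		int			rc;
		long		naptime;

		if (lwlock_pull_rate > 0)
		{
			/*
			 * Hypothetical: walk a per-backend "currently held LWLock"
			 * array in shared memory and bump one counter per lock seen.
			 * A lock held longer is seen by more scans, so the counters
			 * come out duration-weighted for free.
			 */
			/* scan_and_accumulate_lwlocks(); */
			naptime = 1000L / lwlock_pull_rate;
		}
		else
			naptime = 1000L;	/* sampling off; just idle */

		rc = WaitLatch(&MyProc->procLatch,
					   WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
					   naptime);
		ResetLatch(&MyProc->procLatch);

		if (rc & WL_POSTMASTER_DEATH)
			proc_exit(1);
	}
}

void
_PG_init(void)
{
	BackgroundWorker worker;

	if (!process_shared_preload_libraries_in_progress)
		return;

	DefineCustomIntVariable("lwlock_sampler.lwlock_pull_rate",
							"LWLock scans per second (0 disables sampling).",
							NULL,
							&lwlock_pull_rate,
							0, 0, 10,
							PGC_SIGHUP,
							0,
							NULL, NULL, NULL);

	memset(&worker, 0, sizeof(worker));
	worker.bgw_flags = BGWORKER_SHMEM_ACCESS;
	worker.bgw_start_time = BgWorkerStart_RecoveryFinished;
	worker.bgw_restart_time = 5;
	worker.bgw_main = lwlock_sampler_main;
	snprintf(worker.bgw_name, BGW_MAXLEN, "lwlock sampler");
	RegisterBackgroundWorker(&worker);
}

The accumulated counters would then back a pg_stat_lwlock view and could be
reset the same way the other pg_stat_ counters are.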

> > I am assuming almost no one cares about the number of locks, but rather
> > they care about cumulative lock durations.
> 
> Oracle and DB2 measure both cumulative durations and counts.

Well, the big question is whether counts are really useful.  You did a
good job of explaining that when you find heavy clog or xlog lock usage
you would adjust your server.  What I am unclear about is why you would
adjust your server based on lock _counts_ and not cumulative lock
duration.  I don't think we want the overhead of accumulating
information that isn't useful.
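
For illustration, with made-up numbers: under the scan-daemon approach the
accumulated counts are already duration-weighted, so the count-vs-duration
distinction mostly disappears:

% A lock observed in k of the scans taken at r scans per second has an
% estimated cumulative hold time of roughly
\hat{T}_{\text{held}} \approx k / r
% e.g. k = 45 sightings at r = 10 scans/second is about 4.5 seconds held.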

> > I am having trouble seeing any other option that has such a good
> > cost/benefit profile.
> 
> At least the cost. The Oracle documentation clearly states that it is all
> about diagnostic convenience; the performance impact is significant.

Oh, we don't want to go there then, and I think this approach is a big
win.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +


