Re: Dynamic LWLock tracing via pg_stat_lwlock (proof of concept)

From: Andres Freund
Subject: Re: Dynamic LWLock tracing via pg_stat_lwlock (proof of concept)
Date: 2014-10-02
Msg-id: 20141002095014.GH7158@awork2.anarazel.de
In reply to: Dynamic LWLock tracing via pg_stat_lwlock (proof of concept)  (Ilya Kosmodemiansky <ilya.kosmodemiansky@postgresql-consulting.com>)
Responses: Re: Dynamic LWLock tracing via pg_stat_lwlock (proof of concept)
           Re: Dynamic LWLock tracing via pg_stat_lwlock (proof of concept)
           Re: Dynamic LWLock tracing via pg_stat_lwlock (proof of concept)
List: pgsql-hackers
On 2014-10-01 18:19:05 +0200, Ilya Kosmodemiansky wrote:
> I have a patch which is actually not commitfest-ready yet, but it
> is always better to start discussing a proof of concept with some
> patch rather than with just an idea.

That's a good way to start work on a topic like this.

> From an Oracle DBA's point of view, we currently lack performance
> diagnostics tools.

Not just from an Oracle DBA's POV ;). Generally.

So I'm happy to see some focus on this!

> Saying that, they principally mean an analogue of the Oracle Wait
> Interface. The basic idea is to have counters or sensors all around
> the database kernel to measure what a particular backend is currently
> waiting for and how long/how often it waits.

Yes, I can see that. I'm not sure whether lwlocks are the primary point
I'd start with though. In many cases you'll wait on so-called
'heavyweight' locks too...

> Suppose we have a PostgreSQL instance under heavy write workload, but
> we do not know any details. We could poll the pg_stat_lwlock function
> from time to time, and it would say that pid n1 is currently in
> WALWriteLock and pid n2 in WALInsertLock. That means we should think
> about write-ahead log tuning. Or pid n1 is in some clog-related
> LWLock, which means we need to move the clog to a ramdisk. This is a
> stupid example, but it shows how useful LWLock tracing could be for
> DBAs. An even better idea is to collect the daily LWLock distribution,
> find the most frequent locks, etc.

I think it's more complicated than that - but I also think it'd be a
great help for DBAs and us postgres hackers.

> The idea of this patch is to trace LWLocks with the lowest possible
> performance impact. We put an integer lwLockID into the procarray:
> when acquiring an LWLock we store its id there, and a function can
> then scan the procarray to see whether a particular pid holds an
> LWLock.

But a backend can hold more than one lwlock at the same time? I don't
think that's something we can ignore.
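
For illustration, a minimal sketch of what per-backend tracking of
multiple held locks could look like, loosely modeled on the private
held_lwlocks bookkeeping lwlock.c already does internally. All names
below (PgStatLWLockSlot, MAX_REPORTED_LWLOCKS, the report_* helpers)
are invented for this sketch, not taken from the patch:

/*
 * Illustrative sketch only: keep a small fixed-size array of the lwlock
 * ids a backend currently holds, instead of a single lwLockID field.
 */
#define MAX_REPORTED_LWLOCKS 16

typedef struct PgStatLWLockSlot
{
	int		pid;							/* backend pid */
	int		nheld;							/* number of lwlocks currently held */
	int		heldLWLocks[MAX_REPORTED_LWLOCKS];
} PgStatLWLockSlot;

/* to be called after a successful LWLockAcquire() */
static void
report_lwlock_acquired(PgStatLWLockSlot *slot, int lwlockid)
{
	if (slot->nheld < MAX_REPORTED_LWLOCKS)
		slot->heldLWLocks[slot->nheld++] = lwlockid;
}

/* to be called from LWLockRelease() */
static void
report_lwlock_released(PgStatLWLockSlot *slot, int lwlockid)
{
	int		i;

	for (i = 0; i < slot->nheld; i++)
	{
		if (slot->heldLWLocks[i] == lwlockid)
		{
			/* compact: move the last entry into the freed position */
			slot->heldLWLocks[i] = slot->heldLWLocks[--slot->nheld];
			break;
		}
	}
}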

> Not
> perfect, but if we sometimes see that somebody consumes a lot of a
> particular LWLock, we could investigate the matter more precisely
> using another tool. Something like that is implemented in the
> attached patch:
> 
> issuing pgbench -c 50 -t 1000 -j 50
> 
> we get something like this:
> 
> postgres=# select now(),* from pg_stat_lwlock ;
>               now              | lwlockid | pid
> -------------------------------+----------+------
>  2014-10-01 15:11:43.848765+02 |       57 | 4257
> (1 row)

Hm. So you just collect the lwlockid and the pid? That doesn't sound
particularly interesting to me. In my opinion, you'd need at least:
* pid
* number of exclusive/shared acquisitions
* number of exclusive/shared acquisitions that had to wait
* total wait time of exclusive/shared acquisitions
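
For concreteness, a per-backend record of those counters might look
roughly like this (field names and layout are just my illustration,
not something from the patch; uint64 as in c.h):

typedef struct LWLockStats
{
	int		pid;				/* backend pid */
	uint64	sh_acquire_count;	/* shared acquisitions */
	uint64	ex_acquire_count;	/* exclusive acquisitions */
	uint64	sh_wait_count;		/* shared acquisitions that had to wait */
	uint64	ex_wait_count;		/* exclusive acquisitions that had to wait */
	uint64	sh_wait_time_us;	/* total shared wait time, in microseconds */
	uint64	ex_wait_time_us;	/* total exclusive wait time, in microseconds */
} LWLockStats;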

> Questions.
> 
> 1. I've decided to put pg_stat_lwlock into the extension pg_stat_lwlock
> (simply for test purposes). Is that OK, or would it be better to
> implement it somewhere inside pg_catalog or in another extension (for
> example pg_stat_statements)?

I personally am doubtful that it makes much sense to move this into an
extension. It'll likely be tightly enough interlinked with the backend code
that I don't see the point. But I'd not be surprised if others feel
differently.

I generally don't think you'll get interesting data without a fair bit
of additional work.

The first problem that comes to my mind about collecting enough data is
that we have a very large number of lwlocks (fixed_number + 2 *
shared_buffers). One 'trivial' way of implementing this is to have a
per-backend array collecting the information, and then a shared one
accumulating data from it over time. But I'm afraid that's not going to
fly :(. Hm. With the above set of stats that'd be ~50MB per backend...
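
For a sense of scale (my own illustrative numbers, not necessarily the
exact assumptions behind that figure): with shared_buffers = 4GB there
are 524,288 buffer pages, hence roughly 2 * 524,288 ≈ 1M buffer-related
lwlocks; six 8-byte counters per lock is 48 bytes per entry, which
already comes to about 50MB for a single per-backend array.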

Perhaps we should somehow encode this differently for individual lwlock
tranches? It's far less problematic to collect all this information for
all but the buffer lwlocks...
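
Purely as an illustration of that direction (reusing the LWLockStats
record sketched above; the constants and the is_buffer_tranche() helper
are invented placeholders, not real lwlock.h definitions): keep one
entry per individual lock for the fixed lwlocks, but only a single
aggregated bucket per tranche for the buffer locks.

#define NUM_FIXED_LWLOCK_STATS	128		/* placeholder: individual fixed lwlocks */
#define NUM_TRANCHE_STATS		64		/* placeholder: one bucket per tranche */

static LWLockStats fixed_lwlock_stats[NUM_FIXED_LWLOCK_STATS];
static LWLockStats tranche_lwlock_stats[NUM_TRANCHE_STATS];

/* placeholder test; a real patch would check the buffer tranche id(s) */
static bool
is_buffer_tranche(int tranche_id)
{
	return tranche_id != 0;
}

static LWLockStats *
stats_entry_for(int tranche_id, int lwlockid)
{
	if (is_buffer_tranche(tranche_id))
		return &tranche_lwlock_stats[tranche_id];	/* coarse: all buffer locks pooled */

	return &fixed_lwlock_stats[lwlockid];			/* fine-grained: one entry per lock */
}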

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


