Waits monitoring

From: Ildus Kurbangaliev
Date:
Hello.

Currently, PostgreSQL offers many metrics for monitoring, but detailed
monitoring of waits is not supported yet. Such monitoring would let a DBA
know how long a backend waited for a particular event, and therefore
identify bottlenecks. This functionality is very useful, especially for
high-load databases. Metrics for waits monitoring are provided by many
popular commercial DBMSes, and we currently have requests for this
feature from companies migrating to PostgreSQL from them. Thus, I think
it would be nice for PostgreSQL to have it too.

The main problem with monitoring waits is that waits can be very short,
and it is hard to implement the monitoring so that it introduces only
very low overhead. For instance, there have been a couple of attempts to
implement LWLock monitoring for PostgreSQL:

The attached patch implements waits monitoring for PostgreSQL. The
following kinds of monitoring are implemented:

1) History of waits (by sampling)
2) Waits profiling
3) Tracing of backend waits to a file

Monitored waits:

1) HW Locks (lock.c)
2) LW Locks (lwlock.c)
3) IO (md.c)
4) Network (be-secure.c)
5) Latch (pg_latch.c)

Two new GUC variables added:

* waits_monitoring = on/off - turns all monitoring on or off
* waits_flush_period = N - period in milliseconds after which backend-local
waits are flushed to shared memory

## Profiling

Implementation: before a wait, each process calls the function
`StartWait`. This function saves the wait's data and remembers the
current time. After the wait, the process calls `StopWait`, which
increments the count, calculates the duration of the wait, and saves this
info to the backend's local memory. Every `waits_flush_period`
milliseconds the data collected in local memory is flushed to shared
memory.

On Unix systems the `gettimeofday` function is used, which gives enough
accuracy for measuring wait times in microseconds.
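
As a minimal sketch of this pattern (with simplified signatures; the
patch's actual StartWait/StopWait also record the event class, event id,
and parameters):

#include <sys/time.h>
#include <stdint.h>

typedef struct WaitStat
{
	uint64_t	wait_time;		/* accumulated microseconds */
	uint64_t	wait_count;
} WaitStat;

static struct timeval wait_start;	/* a backend waits for one thing at a time */
static WaitStat local_stats;		/* backend-local accumulator */

static void
StartWait(void)
{
	/* remember when the wait began */
	gettimeofday(&wait_start, NULL);
}

static void
StopWait(void)
{
	struct timeval now;
	int64_t		elapsed_us;

	gettimeofday(&now, NULL);
	elapsed_us = (int64_t) (now.tv_sec - wait_start.tv_sec) * 1000000 +
		(now.tv_usec - wait_start.tv_usec);

	/* accumulate locally; flushed to shared memory every waits_flush_period ms */
	local_stats.wait_time += elapsed_us;
	local_stats.wait_count++;
}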

The `pg_stat_wait_get_profile` view shows the information collected in
shared memory. For now all functions are implemented in the
`pg_stat_wait` extension; in the future it may make sense to move them
into the core codebase.

I used atomic operations to avoid locks in the backends. By using a TAS
operation, a backend can skip writing to shared memory while it is being
read by some query, so the user always gets consistent data.
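
As an illustration only (the struct and function names below are
invented, not the patch's), the skip-on-contention idea looks roughly
like this:

#include "postgres.h"
#include "port/atomics.h"

typedef struct WaitProfileSlot
{
	pg_atomic_flag	busy;		/* held briefly by a reader or a writer */
	uint64		wait_time;
	uint64		wait_count;
} WaitProfileSlot;

/*
 * Returns false when a reader currently holds the slot; the backend then
 * just keeps its local counters and retries on the next flush cycle, so
 * neither side ever blocks.
 */
static bool
FlushLocalWaits(WaitProfileSlot *slot, uint64 time_us, uint64 count)
{
	if (!pg_atomic_test_set_flag(&slot->busy))
		return false;			/* TAS failed: slot is being read, skip */

	slot->wait_time += time_us;
	slot->wait_count += count;

	pg_atomic_clear_flag(&slot->busy);
	return true;
}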

I also did a little refactoring in lwlock.c to support grouping of
lwlocks. An LWLock now contains a `group` field that is used for lwlock
identification. The array `main_lwlock_groups` contains the offset of
each group in the main tranche.
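
A rough sketch of how that mapping could look (the semantics here are
assumed from the description above; the real code may differ):

/* hypothetical: each LWLock carries its group number */
typedef struct LWLock
{
	int			group;			/* used for lwlock identification */
	/* ... existing LWLock fields ... */
} LWLock;

/* offset of each group's first wait event within the main tranche */
extern int	main_lwlock_groups[];

/*
 * All locks of one group (e.g. the buffer mapping partitions) report
 * under a single wait event id.
 */
static inline int
LWLockGroupEventId(const LWLock *lock)
{
	return main_lwlock_groups[lock->group];
}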

## Waits history

Implementation: a collector background worker cycles through the list of
processes (every `history_period` ms) and stores the current timestamp
and wait info in the history.

The view `pg_stat_wait_history` can be used for reading the history.

Sampling of the history uses double buffering. Each backend has two
blocks for its current waits; while the collector reads from one, the
backend writes to the other. Like any sampling it can skip some waits,
but with this method the backends and the collector need no locks and
don't block each other.
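
A sketch of the double buffering under assumed names (the real layout in
the patch may differ):

#include "postgres.h"
#include "port/atomics.h"

typedef struct CurrentWait
{
	int			class_id;
	int			event_id;
	uint64		start_time;
} CurrentWait;

typedef struct BackendWaitSlot
{
	pg_atomic_uint32 write_idx;	/* 0 or 1: the buffer the backend writes */
	CurrentWait	buf[2];
} BackendWaitSlot;

/* backend side: lock-free write into the active buffer */
static void
PublishCurrentWait(BackendWaitSlot *slot, CurrentWait w)
{
	slot->buf[pg_atomic_read_u32(&slot->write_idx)] = w;
}

/*
 * Collector side: flip the index, then sample the buffer the backend has
 * just stopped writing to. A wait that slips in during the flip is
 * simply missed, which is acceptable for sampling.
 */
static CurrentWait
SampleCurrentWait(BackendWaitSlot *slot)
{
	uint32		cur = pg_atomic_read_u32(&slot->write_idx);

	pg_atomic_write_u32(&slot->write_idx, cur ^ 1);
	return slot->buf[cur];
}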

History GUC parameters:

* shared_preload_libraries = 'pg_stat_wait.so' - for the background
worker that samples the waits
* pg_stat_wait.history = on/off - turns history recording on or off
* pg_stat_wait.history_size - how many records to keep in the history
* pg_stat_wait.history_period - period in milliseconds between samples

## Views

* pg_stat_wait_history - waits history
* pg_stat_wait_profile - profile
* pg_stat_wait_current - current waits
* pg_wait_events - full list of wait classes and events

## Functions

* pg_wait_class_list() - returns wait classes
* pg_wait_event_list() - returns wait events
* pg_stat_wait_get_history() - returns history
* pg_stat_wait_get_profile(backend_pid int4, reset bool) - returns the current
profile; the `reset` parameter can be used to reset the collected data
* pg_stat_wait_get_current(backend_pid int4) - current waits
* pg_stat_wait_reset_profile() - resets profile

If `backend_pid` is NULL, these functions return information about all
processes.

### Trace functions

* pg_start_trace(backend_pid int4, filename cstring) - starts writing wait
timings for the process to a file
* pg_is_in_trace(backend_pid int4) - returns true if tracing is enabled for the process
* pg_stop_trace(backend_pid int4) - stops tracing waits to the file

## Testing

I tested the patch mostly on Linux and FreeBSD (Intel processors).
PostgreSQL supports a very wide variety of platforms and OSes; I hope the
community will help me with testing on other platforms.

Configuration: Intel(R) Xeon(R) CPU X5675@3.07GHz, 24 cores, pgbench
initialized with -S 200, fsync off, database on tmpfs. I got stable
results only with select queries; with update queries the variance
between runs is too wide to measure the overhead of monitoring. Here are
sample results:

Monitoring off:
 
[ildus@1-i-kurbangaliev postgrespro]$ pgbench -S b1 -c 96 -j 4 -T 300
starting vacuum...end.
transaction type: SELECT only
scaling factor: 500
query mode: simple
number of clients: 96
number of threads: 4
duration: 300 s
number of transactions actually processed: 39316058
latency average: 0.733 ms
tps = 131052.608425 (including connections establishing)
tps = 131076.202085 (excluding connections establishing)
[ildus@1-i-kurbangaliev postgrespro]$ 
 
Monitoring on:
 
[ildus@1-i-kurbangaliev postgrespro]$ pgbench -S b1 -c 96 -j 4 -T 300
starting vacuum...end.
transaction type: SELECT only
scaling factor: 500
query mode: simple
number of clients: 96
number of threads: 4
duration: 300 s
number of transactions actually processed: 39172607
latency average: 0.735 ms
tps = 130574.626755 (including connections establishing)
tps = 130600.767440 (excluding connections establishing)

According to these tests the overhead is less than 1% (131052.6 tps vs.
130574.6 tps, a drop of about 0.4%).

Monitoring results look like this:

1) select * from pg_stat_wait_profile;
 
  pid  | class_id | class_name | event_id |        event_name        | wait_time | wait_count
-------+----------+------------+----------+--------------------------+-----------+------------
 26393 |        1 | LWLocks    |        0 | BufFreelistLock          |        73 |          3
 26396 |        1 | LWLocks    |        0 | BufFreelistLock          |       259 |          1
 26395 |        1 | LWLocks    |        0 | BufFreelistLock          |       150 |          2
 26394 |        1 | LWLocks    |        0 | BufFreelistLock          |        12 |          1
 26394 |        1 | LWLocks    |        7 | WALBufMappingLock        |       391 |          2
 26395 |        1 | LWLocks    |       38 | BufferPartitionLock      |        17 |          1
 26393 |        1 | LWLocks    |       39 | LockManagerPartitionLock |        30 |          3
 26395 |        1 | LWLocks    |       39 | LockManagerPartitionLock |        14 |          1
 26392 |        1 | LWLocks    |       41 | SharedBufferLocks        |      2687 |          9
 26393 |        1 | LWLocks    |       41 | SharedBufferLocks        |      1633 |         10
 26396 |        1 | LWLocks    |       41 | SharedBufferLocks        |      3154 |         16
 26394 |        1 | LWLocks    |       41 | SharedBufferLocks        |      1597 |         11
 26395 |        1 | LWLocks    |       41 | SharedBufferLocks        |      2936 |         18
 26394 |        2 | Locks      |        4 | Transaction              |       457 |          1
 26393 |        2 | Locks      |        4 | Transaction              |       554 |          1
 26395 |        3 | Storage    |        0 | READ                     |      7292 |         60
 26391 |        3 | Storage    |        0 | READ                     |     36604 |       1526
 26396 |        3 | Storage    |        0 | READ                     |      6512 |         67
 26392 |        3 | Storage    |        0 | READ                     |      7411 |         62
 26344 |        3 | Storage    |        0 | READ                     |       560 |        100
 26393 |        3 | Storage    |        0 | READ                     |      8227 |         79
 26394 |        3 | Storage    |        0 | READ                     |      8385 |         73
 26342 |        3 | Storage    |        0 | READ                     |         9 |          1
 26340 |        4 | Latch      |        0 | Latch                    |  51270008 |        256
 26341 |        4 | Latch      |        0 | Latch                    |  49850661 |         57
 26394 |        5 | Network    |        0 | READ                     |      7660 |         71
 
 
2) select * from pg_stat_wait_current;
 
  pid  |           sample_ts           | class_id | class_name | event_id |    event_name     | wait_time |  p1  |  p2   |  p3   | p4 |  p5
-------+-------------------------------+----------+------------+----------+-------------------+-----------+------+-------+-------+----+-------
 26437 | 2015-06-25 11:00:13.561607-04 |        1 | LWLocks    |       11 | CLogControlLock   |   1002935 |    1 |     0 |     0 |  0 |     0
 26438 | 2015-06-25 11:00:13.561607-04 |        1 | LWLocks    |       41 | SharedBufferLocks |   1002970 |    1 |     0 |     0 |  0 |     0
 26440 | 2015-06-25 11:00:13.561607-04 |        3 | Storage    |        0 | READ              |   1001948 | 1663 | 16384 | 49759 |  0 |  9821
 26340 | 2015-06-25 11:00:13.561607-04 |        3 | Storage    |        1 | WRITE             |    995053 | 1663 | 16384 | 49752 |  0 | 53093
 26341 | 2015-06-25 11:00:13.561607-04 |        4 | Latch      |        0 | Latch             |    986350 |    0 |     0 |     0 |  0 |     0
 26339 | 2015-06-25 11:00:13.561607-04 |        4 | Latch      |        0 | Latch             |    869452 |    0 |     0 |     0 |  0 |     0
 26439 | 2015-06-25 11:00:13.561607-04 |        5 | Network    |        1 | WRITE             |   1002672 |    0 |     0 |     0 |  0 |     0
 
3) select * from pg_stat_wait_history;
 
  pid  |           sample_ts           | class_id | class_name | event_id | event_name | wait_time |  p1  |  p2   |  p3   | p4 | p5
-------+-------------------------------+----------+------------+----------+------------+-----------+------+-------+-------+----+----
 27564 | 2015-06-25 11:35:29.331803-04 |        3 | Storage    |        0 | READ       |     19855 | 1664 |     0 | 12171 |  0 |  0
 27568 | 2015-06-25 11:36:29.351151-04 |        3 | Storage    |        0 | READ       |      2216 | 1664 |     0 | 12173 |  0 |  0
 27567 | 2015-06-25 11:35:32.493138-04 |        3 | Storage    |        0 | READ       |     17818 | 1664 |     0 | 11917 |  0 |  0
 27567 | 2015-06-25 11:35:33.671287-04 |        3 | Storage    |        0 | READ       |      5700 | 1663 | 16384 | 12036 |  0 |  0
 27567 | 2015-06-25 11:35:56.731703-04 |        5 | Network    |        1 | WRITE      |     18339 |    0 |     0 |     0 |  0 |  0

-- 
Ildus Kurbangaliev
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments

Re: Waits monitoring

From: Haribabu Kommi
Date:
On Thu, Jul 9, 2015 at 1:52 AM, Ildus Kurbangaliev
<i.kurbangaliev@postgrespro.ru> wrote:
> Hello.
>
> Currently, PostgreSQL offers many metrics for monitoring. However, detailed
> monitoring of waits is still not supported yet. Such monitoring would
> let dba know how long backend waited for particular event and therefore
> identify
> bottlenecks. This functionality is very useful, especially for highload
> databases. Metric for waits monitoring are provided by many popular
> commercial
> DBMS. We currently have requests of this feature from companies migrating to
> PostgreSQL from commercial DBMS. Thus, I think it would be nice for
> PostgreSQL
> to have it too.

Yes, it is good to have such wait monitoring views in PostgreSQL.
You can add this patch to the open CommitFest.

Regards,
Hari Babu
Fujitsu Australia



Re: Waits monitoring

From: Fujii Masao
Date:
On Thu, Jul 9, 2015 at 2:12 PM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
> On Thu, Jul 9, 2015 at 1:52 AM, Ildus Kurbangaliev
> <i.kurbangaliev@postgrespro.ru> wrote:
>> Hello.
>>
>> Currently, PostgreSQL offers many metrics for monitoring. However, detailed
>> monitoring of waits is still not supported yet. Such monitoring would
>> let dba know how long backend waited for particular event and therefore
>> identify
>> bottlenecks. This functionality is very useful, especially for highload
>> databases. Metric for waits monitoring are provided by many popular
>> commercial
>> DBMS. We currently have requests of this feature from companies migrating to
>> PostgreSQL from commercial DBMS. Thus, I think it would be nice for
>> PostgreSQL
>> to have it too.
>
> Yes, It is good have such wait monitoring views in PostgreSQL.
> you can add this patch to the open commitfest.

Robert and Amit proposed a very similar idea, and their patch is now being
reviewed in the current CommitFest. So I think that you should join
that discussion rather than starting a new one.

http://www.postgresql.org/message-id/CA+TgmoYd3GTz2_mJfUHF+RPe-bCy75ytJeKVv9x-o+SonCGApw@mail.gmail.com

Regards,

-- 
Fujii Masao



Re: Waits monitoring

From: Ildus Kurbangaliev
Date:

On Jul 9, 2015, at 5:18 PM, Fujii Masao <masao.fujii@gmail.com> wrote:

> On Thu, Jul 9, 2015 at 2:12 PM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>> On Thu, Jul 9, 2015 at 1:52 AM, Ildus Kurbangaliev
>> <i.kurbangaliev@postgrespro.ru> wrote:
>>> Hello.
>>>
>>> Currently, PostgreSQL offers many metrics for monitoring. However, detailed
>>> monitoring of waits is still not supported yet. Such monitoring would
>>> let dba know how long backend waited for particular event and therefore
>>> identify
>>> bottlenecks. This functionality is very useful, especially for highload
>>> databases. Metric for waits monitoring are provided by many popular
>>> commercial
>>> DBMS. We currently have requests of this feature from companies migrating to
>>> PostgreSQL from commercial DBMS. Thus, I think it would be nice for
>>> PostgreSQL
>>> to have it too.
>>
>> Yes, It is good have such wait monitoring views in PostgreSQL.
>> you can add this patch to the open commitfest.
>
> Robert and Amit proposed very similar idea and its patch is now being
> reviewed in the current CommitFest. So I think that you should attend
> that discussion rather than starting new one.
>
> http://www.postgresql.org/message-id/CA+TgmoYd3GTz2_mJfUHF+RPe-bCy75ytJeKVv9x-o+SonCGApw@mail.gmail.com

This thread raises the waits monitoring problem much more generally than just adding one more column to pg_stat_activity. The patch contains profiling, history, and many more details about current wait events. We appreciate joint work in this direction, but we have only just started by publishing our current work and raising the waits monitoring question in its full weight.

----
Ildus Kurbangaliev
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: Waits monitoring

From: Alexander Korotkov
Date:
On Thu, Jul 9, 2015 at 6:00 PM, Ildus Kurbangaliev <i.kurbangaliev@postgrespro.ru> wrote:

> On Jul 9, 2015, at 5:18 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
>
>> On Thu, Jul 9, 2015 at 2:12 PM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>>> On Thu, Jul 9, 2015 at 1:52 AM, Ildus Kurbangaliev
>>> <i.kurbangaliev@postgrespro.ru> wrote:
>>>> Hello.
>>>>
>>>> Currently, PostgreSQL offers many metrics for monitoring. However, detailed
>>>> monitoring of waits is still not supported yet. Such monitoring would
>>>> let dba know how long backend waited for particular event and therefore
>>>> identify
>>>> bottlenecks. This functionality is very useful, especially for highload
>>>> databases. Metric for waits monitoring are provided by many popular
>>>> commercial
>>>> DBMS. We currently have requests of this feature from companies migrating to
>>>> PostgreSQL from commercial DBMS. Thus, I think it would be nice for
>>>> PostgreSQL
>>>> to have it too.
>>>
>>> Yes, It is good have such wait monitoring views in PostgreSQL.
>>> you can add this patch to the open commitfest.
>>
>> Robert and Amit proposed very similar idea and its patch is now being
>> reviewed in the current CommitFest. So I think that you should attend
>> that discussion rather than starting new one.
>>
>> http://www.postgresql.org/message-id/CA+TgmoYd3GTz2_mJfUHF+RPe-bCy75ytJeKVv9x-o+SonCGApw@mail.gmail.com
>
> This thread raising waits monitoring problem much more general than just adding one more column to pg_stat_activity. This patch contains profiling, history and much more details about current wait events. We appreciate joint work in this direction, but we have just started with publishing our current work and raise waits monitoring question in its full weight.

Ildus,

I see that Robert and Amit's patch adds the current lock type into PgBackendStatus and, in turn, into pg_stat_activity. This is a small part of what you implement in the waits monitoring patch. Dividing big features into small pieces makes it much easier to get them into PostgreSQL. I think you should join Robert and Amit's work on including the current lock type in PgBackendStatus; there are things you could help them improve. And then you can rebase the waits monitoring patch on top of it.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: Waits monitoring

From: Simon Riggs
Date:
On 9 July 2015 at 21:10, Alexander Korotkov <a.korotkov@postgrespro.ru> wrote:
 
> I see that patch of Robert and Amit adds current lock type into PgBackendStatus and, in turn, adds it into pg_stat_activity. This is small part of what you implement in waits monitoring patch. Dividing big features into small pieces makes it much easier to include PostgreSQL. I think you should join work of Robert and Amit in inclusion current lock type into PgBackendStatus. There are things you could help them to improve. And then you can rebase waits monitoring patch according to it.

I see the need for both current wait information and for cumulative historical detail.

I'm willing to wait before reviewing this, but not for more than 1 more CF.

Andres, please decide whether we should punt to next CF now, based upon other developments. Thanks

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Waits monitoring

From: Andres Freund
Date:
On 2015-09-04 23:44:21 +0100, Simon Riggs wrote:
> I see the need for both current wait information and for cumulative
> historical detail.
> 
> I'm willing to wait before reviewing this, but not for more than 1 more CF.
> 
> Andres, please decide whether we should punt to next CF now, based upon
> other developments. Thanks

I think we can do some of the work concurrently - the whole lwlock
infrastructure piece is rather independent, and it is what most of the
current arguments are about. I agree that the actual interface will need
to be coordinated.

Ildus, could you please review Amit & Robert's patch?

Greetings,

Andres Freund



Re: Waits monitoring

From: Amit Kapila
Date:
On Sun, Sep 6, 2015 at 5:58 PM, Andres Freund <andres@anarazel.de> wrote:
>
> On 2015-09-04 23:44:21 +0100, Simon Riggs wrote:
> > I see the need for both current wait information and for cumulative
> > historical detail.
> >
> > I'm willing to wait before reviewing this, but not for more than 1 more CF.
> >
> > Andres, please decide whether we should punt to next CF now, based upon
> > other developments. Thanks
>
> I think we can do some of the work concurrently - the whole lwlock
> infrastructure piece is rather independent and what currently most of
> the arguments are about. I agree that the actual interface will need to
> be coordinated.
>
> Ildus, could you please review Amit & Robert's patch?
>

Are you talking about the patch where I fixed a few issues in Robert's
patch [1], or the original patch [3] written by me?

IIUC, this is somewhat different from what Ildus is doing in his latest
patch [2].

Sorry, but I think there is some confusion around the development of
patch [1]. Let me try to summarize what I think has happened and why I
feel there is some confusion.  Initially Robert proposed on hackers the
idea of having a column in pg_stat_activity for the wait_event, and then
I wrote an initial patch so that we could discuss it in a more meaningful
way, intending to extend that patch based on consensus and on what any
other patch, like waits monitoring, would need from it.  In between,
Ildus proposed the waits monitoring patch and also started hacking on the
other pg_stat_activity patch, and that was the starting point of the
confusion.  Now I think the current situation is that there is one
patch [1] from Robert (with some fixes by me) standardising the LWLock
stuff, so that we can build the pg_stat_activity patch on top of it, and
then a patch [2] from Ildus doing something similar, but I think it
hasn't used Robert's patch.

I think the intention of having multiple people develop the same patch
is to get the work done faster, but here I think it is causing confusion,
and I feel that is one reason the patch is dragging on as well.  Let me
know your thoughts on the best way to proceed.

[1] - http://www.postgresql.org/message-id/CAA4eK1KdeC1Tm5ya9gkV85Vtn4qqsRxzKJrU-tu70G_tL1xkFA@mail.gmail.com
[2] - http://www.postgresql.org/message-id/3F71DA37-A17B-4961-9908-016E6323E612@postgrespro.ru
[3] - http://www.postgresql.org/message-id/CAA4eK1Kt2e6XhViGisR5o1aC9NfO0j2wTb8N0ggD1_JkLdeKdQ@mail.gmail.com

P.S. - This mail is not meant to point out anything wrong with any
particular individual, but rather that the development of a particular
patch has become haphazard because of some confusion.  I am not sure that
this is the right thread to write about it, but since the request to
review the patch from the other related thread was made here, I have
mentioned my thoughts on the same.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: Waits monitoring

From: Alexander Korotkov
Date:
On Mon, Sep 7, 2015 at 6:28 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Sun, Sep 6, 2015 at 5:58 PM, Andres Freund <andres@anarazel.de> wrote:
>>
>> On 2015-09-04 23:44:21 +0100, Simon Riggs wrote:
>>> I see the need for both current wait information and for cumulative
>>> historical detail.
>>>
>>> I'm willing to wait before reviewing this, but not for more than 1 more CF.
>>>
>>> Andres, please decide whether we should punt to next CF now, based upon
>>> other developments. Thanks
>>
>> I think we can do some of the work concurrently - the whole lwlock
>> infrastructure piece is rather independent and what currently most of
>> the arguments are about. I agree that the actual interface will need to
>> be coordinated.
>>
>> Ildus, could you please review Amit & Robert's patch?
>>
>
> Are you talking about patch where I have fixed few issues in Robert's
> patch [1] or the original patch [3] written by me.

AFAICS, Andres meant [3]. 


For me, the major issues of [3] are:
1) We fit all the information about the current wait event into a single byte.
2) We put this byte into PgBackendStatus.
Both of these decisions are essential to the design of [3].

Issue #1 means that we can actually expose to the user only the event
type (as long as there are fewer than 256 types), with no parameters [4].
Issue #2 means that we can monitor only backends, not auxiliary
processes [5].

Both of these look like serious limitations to me. It makes me think we
either need a different design, or need to additionally expose the
current wait event in some other way. Anyway, I think attracting more
attention to this problem is good. It would be very nice if Simon or
Andres could review this.
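
To illustrate the difference (the field names here are invented, not the
actual patches' code): a single byte in PgBackendStatus can only say
which event is happening, while a per-wait record like the one exposed
by pg_stat_wait_current needs considerably more state:

#include <stdint.h>

/* roughly the design of [3]: one byte per backend */
typedef struct PgBackendStatus
{
	/* ... existing fields ... */
	uint8_t		st_wait_event;	/* at most 256 event types, no parameters */
} PgBackendStatus;

/* what pg_stat_wait_current exposes per wait, for comparison */
typedef struct RichWaitInfo
{
	int			class_id;
	int			event_id;
	uint64_t	wait_time;
	uint64_t	params[5];		/* p1..p5 in the views upthread */
} RichWaitInfo;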


------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: Waits monitoring

From: Ildus Kurbangaliev
Date:
On Mon, 7 Sep 2015 08:58:15 +0530
Amit Kapila <amit.kapila16@gmail.com> wrote:

> On Sun, Sep 6, 2015 at 5:58 PM, Andres Freund <andres@anarazel.de>
> wrote:
> >
> > On 2015-09-04 23:44:21 +0100, Simon Riggs wrote:
> > > I see the need for both current wait information and for
> > > cumulative historical detail.
> > >
> > > I'm willing to wait before reviewing this, but not for more than
> > > 1 more
> CF.
> > >
> > > Andres, please decide whether we should punt to next CF now,
> > > based upon other developments. Thanks
> >
> > I think we can do some of the work concurrently - the whole lwlock
> > infrastructure piece is rather independent and what currently most
> > of the arguments are about. I agree that the actual interface will
> > need to be coordinated.
> >
> > Ildus, could you please review Amit & Robert's patch?
> >
> 
> Are you talking about patch where I have fixed few issues in Robert's
> patch [1] or the original patch [3] written by me.
> 
> IIUC, this is somewhat different than what Ildus is doing in his
> latest patch [2].
> 
> Sorry, but I think there is some confusion about that patch [1]
> development. Let me try to summarize what I think has happened and
> why I feel there is some confusion.  Initially Robert has proposed
> the idea of having a column in pg_stat_activity for wait_event on
> hackers and then I wrote an initial patch so that we can discuss the
> same in a more meaningful way and wanted to extend that patch based
> on consensus and what any other patch like Waits monitoring would
> need from it.  In-between Ildus has proposed
> Waits monitoring patch and also started hacking the other
> pg_stat_activity patch and that was the starting point of confusion.
> Now I think that the current
> situation is there's one patch [1] of Robert (with some fixes by
> myself) for standardising
> LWLock stuff, so that we can build pg_stat_activity patch on top of
> it and then
> a patch [2] from Ildus for doing something similar but I think it
> hasn't used Robert's
> patch.
> 
> I think the intention of having multiple people develop same patch is
> to get the work done faster, but I think here it is causing confusion
> and I feel that
> is one reason the patch is getting dragged as well.  Let me know your
> thoughts
> about what is the best way to proceed?
> 
> [1] -
> http://www.postgresql.org/message-id/CAA4eK1KdeC1Tm5ya9gkV85Vtn4qqsRxzKJrU-tu70G_tL1xkFA@mail.gmail.com
> [2] -
> http://www.postgresql.org/message-id/3F71DA37-A17B-4961-9908-016E6323E612@postgrespro.ru
> [3] -
> http://www.postgresql.org/message-id/CAA4eK1Kt2e6XhViGisR5o1aC9NfO0j2wTb8N0ggD1_JkLdeKdQ@mail.gmail.com
> 
> P.S. - This mail is not to point anything wrong with any particular
> individual,
> rather about the development of a particular patch getting haphazard
> because of some confusion.  I am not sure that this is the right
> thread to write about
> it, but as it has been asked here to review the patch in other related
> thread,
> so I have mentioned my thoughts on the same.

About [1] and [2]: they conflict slightly, but if [1] is going to be
committed I can easily use it in [2], and it will simplify my patch, so I
have no objections here. At the same time, [3] can be significantly
simplified and improved on top of [1] and [2].

For example, this code from [3]:

static void
LWLockReportStat(LWLock *lock)
{
	int			lockid;
	const char *tranche_name;

	tranche_name = T_NAME(lock);

	if (strcmp(tranche_name, "main") == 0)
	{
		lockid = T_ID(lock);

		if (lockid < NUM_INDIVIDUAL_LWLOCKS)
			pgstat_report_wait_event(LWLockTypeToWaitEvent(lockid));
		else if (lockid >= BUFFER_MAPPING_LWLOCK_OFFSET &&
				 lockid < LOCK_MANAGER_LWLOCK_OFFSET)
			pgstat_report_wait_event(WaitEvent_BufferMappingLock);
		else if (lockid >= LOCK_MANAGER_LWLOCK_OFFSET &&
				 lockid < PREDICATELOCK_MANAGER_LWLOCK_OFFSET)
			pgstat_report_wait_event(WaitEvent_LockManagerLock);
		else if (lockid >= PREDICATELOCK_MANAGER_LWLOCK_OFFSET &&
				 lockid < NUM_FIXED_LWLOCKS)
			pgstat_report_wait_event(WaitEvent_PredicateManagerLock);
		else
			pgstat_report_wait_event(WaitEvent_OtherLWLock);
	}
	else if (strcmp(tranche_name, "WALInsertLocks") == 0)
		pgstat_report_wait_event(WaitEvent_WALInsertLocks);
	else if (strcmp(tranche_name, "ReplicationOrigins") == 0)
		pgstat_report_wait_event(WaitEvent_ReplicationOrigins);
	else
		pgstat_report_wait_event(WaitEvent_OtherTrancheLWLock);
}

can be changed to something like:

#define LWLOCK_WAIT_ID(lock) \
	((lock)->tranche == 0 ? T_ID(lock) : (lock)->tranche + NUM_INDIVIDUAL_LWLOCKS)

static void
LWLockReportStat(LWLock *lock)
{
	int			offset = <pgstat offset for lwlocks>;

	pgstat_report_wait_event(offset + LWLOCK_WAIT_ID(lock));
}

So it would have almost no effect on the performance of LWLockAcquire.

----
Ildus Kurbangaliev
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: Waits monitoring

From: Robert Haas
Date:
On Mon, Sep 7, 2015 at 7:33 AM, Ildus Kurbangaliev
<i.kurbangaliev@postgrespro.ru> wrote:
>> > Ildus, could you please review Amit & Robert's patch?
>> [1] -
>> http://www.postgresql.org/message-id/CAA4eK1KdeC1Tm5ya9gkV85Vtn4qqsRxzKJrU-tu70G_tL1xkFA@mail.gmail.com
> About [1] and [2]. They are slightly conflicting, but if [1] is
> going to be committed I can easily use it in [2]. And it will simplify
> my patch, so I have no objections here. And the same time [3] can be
> significantly simplified and improved on top of [1] and [2].

Great - let's try to deal with [1] first, then.

Does anyone wish to object to me committing that part?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Waits monitoring

From: Alexander Korotkov
Date:
On Tue, Sep 8, 2015 at 8:01 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Sep 7, 2015 at 7:33 AM, Ildus Kurbangaliev
> <i.kurbangaliev@postgrespro.ru> wrote:
>>> > Ildus, could you please review Amit & Robert's patch?
>>> [1] -
>>> http://www.postgresql.org/message-id/CAA4eK1KdeC1Tm5ya9gkV85Vtn4qqsRxzKJrU-tu70G_tL1xkFA@mail.gmail.com
>> About [1] and [2]. They are slightly conflicting, but if [1] is
>> going to be committed I can easily use it in [2]. And it will simplify
>> my patch, so I have no objections here. And the same time [3] can be
>> significantly simplified and improved on top of [1] and [2].
>
> Great - let's try to deal with [1] first, then.
>
> Does anyone wish to object to me committing that part?

No objections from me.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: Waits monitoring

From: Kyotaro HORIGUCHI
Date:
Hi,

> Great - let's try to deal with [1] first, then.
> 
> Does anyone wish to object to me committing that part?

I have no objection to committing this either, but some fixes would be
needed.

===
The generated lwlocknames.[ch] don't have the header comment, because
generate-lwlocknames.pl writes it into the wrong place.

lmgr/Makefile looks to have some mistakes:

- lwlocknames.c is not generated from (or using) lwlocknames.h, so the
  entry "lwlocknames.c: lwlocknames.h" doesn't look appropriate.

- maintainer-clean in lmgr/Makefile forgets to remove lwlocknames.c.

lwlock.c now includes lwlocknames.c, so such an entry would be needed.
(But it might not be needed, because it is naturally generated along
with the .h file.)

Perhaps the uncommenting in pg_config_manual.h has been left alone.
(This is not included in the diff below.)


regards,


At Tue, 8 Sep 2015 13:01:23 -0400, Robert Haas <robertmhaas@gmail.com> wrote in
<CA+TgmoaKyg2p5ni_HeVxp-7U7AJ3B9k3rqPshmcBA6Zp1HE2iQ@mail.gmail.com>
> On Mon, Sep 7, 2015 at 7:33 AM, Ildus Kurbangaliev
> <i.kurbangaliev@postgrespro.ru> wrote:
> >> > Ildus, could you please review Amit & Robert's patch?
> >> [1] -
> >> http://www.postgresql.org/message-id/CAA4eK1KdeC1Tm5ya9gkV85Vtn4qqsRxzKJrU-tu70G_tL1xkFA@mail.gmail.com
> > About [1] and [2]. They are slightly conflicting, but if [1] is
> > going to be committed I can easily use it in [2]. And it will simplify
> > my patch, so I have no objections here. And the same time [3] can be
> > significantly simplified and improved on top of [1] and [2].
> 
> Great - let's try to deal with [1] first, then.
> 
> Does anyone wish to object to me committing that part?


========================
--- generate-lwlocknames.pl.org	2015-09-10 15:46:02.291079423 +0900
+++ generate-lwlocknames.pl	2015-09-10 15:53:22.177903045 +0900
@@ -8,5 +8,6 @@
-print
-  "/* autogenerated from src/backend/storage/lmgr/lwlocknames.txt, do not edit */\n";
-print "/* there is deliberately not an #ifndef LWLOCKNAMES_H here */\n";
+my $head_comment_c =
+    "/* autogenerated from src/backend/storage/lmgr/lwlocknames.txt, do not edit */\n\n";
+my $head_comment_h = $head_comment_c .
+  "/* there is deliberately not an #ifndef LWLOCKNAMES_H here */\n\n";
@@ -23,2 +24,6 @@
+# Write the caution comment at the beginning of the files
+print H $head_comment_h;
+print C $head_comment_c;
+print C "static char *MainLWLockNames[] = {";

--- Makefile.org	2015-09-10 16:16:38.610660526 +0900
+++ Makefile	2015-09-10 16:39:09.718916378 +0900
@@ -26,6 +26,6 @@
-# see explanation in ../../parser/Makefile
-lwlocknames.c: lwlocknames.h ;
+lwlock.c: lwlocknames.c
-lwlocknames.h,lwlocknames.c: $(top_srcdir)/src/backend/storage/lmgr/lwlocknames.txt generate-lwlocknames.pl
+# see explanation in ../../parser/Makefile
+lwlocknames.h lwlocknames.c: $(top_srcdir)/src/backend/storage/lmgr/lwlocknames.txt generate-lwlocknames.pl
 	$(PERL) $(srcdir)/generate-lwlocknames.pl $<

@@ -39,2 +39,2 @@ maintainer-clean: clean
-	rm -f lwlocknames.h
+	rm -f lwlocknames.[ch]

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Waits monitoring

From: Robert Haas
Date:
On Thu, Sep 10, 2015 at 3:43 AM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> Generated lwlocknames.[ch] don't have header comment because
> generate-lwlocknames.pl writes them into wrong place.
>
> lmgr/Makefile looks to have some mistakes.

Fixed.

>  - lwlocknames.c is not generated from (or using) lwlocknames.c
>    so the entry "lwlocknames.c: lwlocknames.h" doesn't looks to
>    be appropriate.

I think that's a pretty standard way of handling a case where a single
command generates multiple files.

>  - maintainer-clean in lmgr/Makefile forgets to remove lwlocknames.c.

Fixed.

> Perhaps uncommenting in pg_config_manual.h is left alone.
> (This is not included in the diff below)

Fixed.

And committed.  Thanks for the review, let's see what the buildfarm thinks.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company