Обсуждение: proposal for 9.5: monitoring lock time for slow queries
Hello
I was asked about possibility to show a lock time of slow queries. Any idea how to show a lock time in some practical form together with logged slow query?
Regards
Pavel
On Wed, Aug 13, 2014 at 4:59 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote: > Any idea how to show a lock time in some practical form together with logged > slow query? Doing a join on pg_stat_activity and pg_locks is not going to help much as you could only get the moment when query has started or its state has changed. Have you thought about the addition of a new column in pg_locks containing the timestamp of the moment a lock has been taken? I am sure that we are concerned about the performance impact that extra calls to gettimeofday could have though... Regards, -- Michael
Michael Paquier <michael.paquier@gmail.com> writes:
> Doing a join on pg_stat_activity and pg_locks is not going to help
> much as you could only get the moment when query has started or its
> state has changed. Have you thought about the addition of a new column
> in pg_locks containing the timestamp of the moment a lock has been
> taken? I am sure that we are concerned about the performance impact
> that extra calls to gettimeofday could have though...
In theory this could be driven off the same gettimeofday needed to
start the deadlock_timeout timer.  Not sure how messy that'd be.
        regards, tom lane
			
		Hi
			
		2014-08-13 6:18 GMT+02:00 Michael Paquier <michael.paquier@gmail.com>:
 
On Wed, Aug 13, 2014 at 4:59 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:Doing a join on pg_stat_activity and pg_locks is not going to help
> Any idea how to show a lock time in some practical form together with logged
> slow query?
much as you could only get the moment when query has started or its
state has changed. Have you thought about the addition of a new column
in pg_locks containing the timestamp of the moment a lock has been
taken? I am sure that we are concerned about the performance impact
that extra calls to gettimeofday could have though...
Regards,
There are two relative independent tasks 
a) monitor and show total lock time of living queries
b) monitor and log total lock time of executed queries.
I am interested by @b now. When we work with slow query log, then we would to identify reason for long duration. Locks are important source of these queries on some systems.
What I know, we do monitoring these times for deadlock identification trigger and for log_lock_waits - so there should not be any new pressure to performance.
Regards
Pavel
Pavel
--
Michael
2014-08-13 7:19 GMT+02:00 Tom Lane <tgl@sss.pgh.pa.us>:
 
Michael Paquier <michael.paquier@gmail.com> writes:In theory this could be driven off the same gettimeofday needed to
> Doing a join on pg_stat_activity and pg_locks is not going to help
> much as you could only get the moment when query has started or its
> state has changed. Have you thought about the addition of a new column
> in pg_locks containing the timestamp of the moment a lock has been
> taken? I am sure that we are concerned about the performance impact
> that extra calls to gettimeofday could have though...
start the deadlock_timeout timer. Not sure how messy that'd be.
we use it in out custom patch without problems
Pavel
regards, tom lane
From: "Pavel Stehule" <pavel.stehule@gmail.com> > There are two relative independent tasks > > a) monitor and show total lock time of living queries > > b) monitor and log total lock time of executed queries. > > I am interested by @b now. When we work with slow query log, then we would > to identify reason for long duration. Locks are important source of these > queries on some systems. I'm interested in b, too. I was thinking of proposing a performance diagnostics feature like Oracle's wait events (V$SYSTEM_EVENT and V$SESSION_EVENT). So, if you do this, I'd like to contribute to the functional design, code and doc review, and testing. The point is to collect as much information about bottlenecks as possible, including lock waits. The rough sketch is: What info to collect: * heavyweight lock waits shown by pg_locks * lightweight lock waits * latch waits * socket waits (mainly for client input) How the info is delivered: * pg_stat_system_events shows the accumulated total accross the server instance * pg_stat_session_events shows the accumulated total for each session * EXPLAIN ANALYZE and auto_explain shows the accumulated total for each query We need to describe in the manual how to diagnose and tne the system with these event info. Regards MauMau
2014-08-13 11:14 GMT+02:00 MauMau <maumau307@gmail.com>:
 
 
From: "Pavel Stehule" <pavel.stehule@gmail.com>I'm interested in b, too. I was thinking of proposing a performance diagnostics feature like Oracle's wait events (V$SYSTEM_EVENT and V$SESSION_EVENT). So, if you do this, I'd like to contribute to the functional design, code and doc review, and testing.There are two relative independent tasks
a) monitor and show total lock time of living queries
b) monitor and log total lock time of executed queries.
I am interested by @b now. When we work with slow query log, then we would
to identify reason for long duration. Locks are important source of these
queries on some systems.
isn't it too heavy?
I have just terrible negative experience with Vertica, where this design is used - almost all information about queries are available, but any query to related tables are terrible slow, so I am inclined to more simple design oriented to log based solution. Table based solutions is not practical when you exec billions queries per day. I understand to motivation, but I afraid so it can be very expensive and slow on highly load servers. 
The point is to collect as much information about bottlenecks as possible, including lock waits. The rough sketch is:
What info to collect:
* heavyweight lock waits shown by pg_locks
* lightweight lock waits
* latch waits
* socket waits (mainly for client input)
How the info is delivered:
* pg_stat_system_events shows the accumulated total accross the server instance
* pg_stat_session_events shows the accumulated total for each session
* EXPLAIN ANALYZE and auto_explain shows the accumulated total for each query
We need to describe in the manual how to diagnose and tne the system with these event info.
Regards
MauMau
From: "Pavel Stehule" <pavel.stehule@gmail.com> > isn't it too heavy? Are you concerned about the impactof collection overhead on the queries diagnosed? Maybe not light, but I'm optimistic. Oracle has the track record of long use, and MySQL provides performance schema starting from 5.6. > I have just terrible negative experience with Vertica, where this design > is > used - almost all information about queries are available, but any query > to > related tables are terrible slow, so I am inclined to more simple design > oriented to log based solution. Table based solutions is not practical > when > you exec billions queries per day. I understand to motivation, but I > afraid > so it can be very expensive and slow on highly load servers. Which do you mean by "query related to tables", the queries from applications being diagnosed, or the queries that diagnose the performance using statistics views? Could you elaborate on your experience with Vertica? That trouble may be just because Vertica's implementation is not refined. I understand the feeling of being inclined to log based solution for its implementation simplicity. However, the server log is difficult or impossible to access using SQL queries. This prevents the development of performance diagnostics functionality in GUI administration tools. Also, statistics views allow for easy access on PAAS like Amazon RDS and Heroku. Regards MauMau
2014-08-13 13:59 GMT+02:00 MauMau <maumau307@gmail.com>:
 
  
 
From: "Pavel Stehule" <pavel.stehule@gmail.com>Are you concerned about the impactof collection overhead on the queries diagnosed? Maybe not light, but I'm optimistic. Oracle has the track record of long use, and MySQL provides performance schema starting from 5.6.isn't it too heavy?
partially, I afraid about total performance (about impact on IO) - when we use a usual tables, then any analyses without indexes are slow, so you need a indexes, and we cannot deferred index update. You should thinking about retention policy - and without partitioning you got massive deletes. So I cannot to imagine a usage of table based solution together with some higher load. Our MVCC storage is not practical for storing only inserted data, and some custom storage has no indexes - so this design is relative big project.
Which do you mean by "query related to tables", the queries from applications being diagnosed, or the queries that diagnose the performance using statistics views?I have just terrible negative experience with Vertica, where this design is
used - almost all information about queries are available, but any query to
related tables are terrible slow, so I am inclined to more simple design
oriented to log based solution. Table based solutions is not practical when
you exec billions queries per day. I understand to motivation, but I afraid
so it can be very expensive and slow on highly load servers.
Could you elaborate on your experience with Vertica? That trouble may be just because Vertica's implementation is not refined.
sure - Vertica is not mature database. More it has only one storage type optimized for OLAP, what is wrong for long catalog, and for working with performance events too.
I understand the feeling of being inclined to log based solution for its implementation simplicity. However, the server log is difficult or impossible to access using SQL queries. This prevents the development of performance diagnostics functionality in GUI administration tools. Also, statistics views allow for easy access on PAAS like Amazon RDS and Heroku.
I prefer a possibility to read log via SQL (maybe some FDW) than use tables for storing log. These tables can be relative very large in few days - and we cannot to write specialized engine like MySQL simply.
Pavel 
Regards
MauMau
From: "Pavel Stehule" <pavel.stehule@gmail.com> > 2014-08-13 13:59 GMT+02:00 MauMau <maumau307@gmail.com>: >> Are you concerned about the impactof collection overhead on the queries >> diagnosed? Maybe not light, but I'm optimistic. Oracle has the track >> record of long use, and MySQL provides performance schema starting from >> 5.6. > > > partially, I afraid about total performance (about impact on IO) - when we > use a usual tables, then any analyses without indexes are slow, so you > need > a indexes, and we cannot deferred index update. You should thinking about > retention policy - and without partitioning you got massive deletes. So I > cannot to imagine a usage of table based solution together with some > higher > load. Our MVCC storage is not practical for storing only inserted data, > and > some custom storage has no indexes - so this design is relative big > project. > > I prefer a possibility to read log via SQL (maybe some FDW) than use > tables > for storing log. These tables can be relative very large in few days - and > we cannot to write specialized engine like MySQL simply. I didn't mean performance statistics data to be stored in database tables. I just meant: * pg_stat_system_events is a view to show data on memory, which returns one row for each event across the system. This is similar to V$SYSTEM_EVENT in Oracle. * pg_stat_session_events is a view to show data on memory, which returns one row for each event on one session. This is similar to V$SESSION_EVENT in Oracle. * The above views represent the current accumulated data like other pg_stat_xxx views. * EXPLAIN ANALYZE and auto_explain shows all events for one query. The lock waits you are trying to record in the server log is one of the events. Regards MauMau
2014-08-13 15:22 GMT+02:00 MauMau <maumau307@gmail.com>:
 
From: "Pavel Stehule" <pavel.stehule@gmail.com>2014-08-13 13:59 GMT+02:00 MauMau <maumau307@gmail.com>:Are you concerned about the impactof collection overhead on the queries
diagnosed? Maybe not light, but I'm optimistic. Oracle has the track
record of long use, and MySQL provides performance schema starting from 5.6.
partially, I afraid about total performance (about impact on IO) - when we
use a usual tables, then any analyses without indexes are slow, so you need
a indexes, and we cannot deferred index update. You should thinking about
retention policy - and without partitioning you got massive deletes. So I
cannot to imagine a usage of table based solution together with some higher
load. Our MVCC storage is not practical for storing only inserted data, and
some custom storage has no indexes - so this design is relative big project.I prefer a possibility to read log via SQL (maybe some FDW) than use tables
for storing log. These tables can be relative very large in few days - and
we cannot to write specialized engine like MySQL simply.
I didn't mean performance statistics data to be stored in database tables. I just meant:
* pg_stat_system_events is a view to show data on memory, which returns one row for each event across the system. This is similar to V$SYSTEM_EVENT in Oracle.
* pg_stat_session_events is a view to show data on memory, which returns one row for each event on one session. This is similar to V$SESSION_EVENT in Oracle.
* The above views represent the current accumulated data like other pg_stat_xxx views.
* EXPLAIN ANALYZE and auto_explain shows all events for one query. The lock waits you are trying to record in the server log is one of the events.
I am little bit sceptic about only memory based structure. Is it this concept acceptable for commiters?
Pavel
Regards
MauMau
Pavel Stehule wrote: > 2014-08-13 15:22 GMT+02:00 MauMau <maumau307@gmail.com>: > > I didn't mean performance statistics data to be stored in database tables. > > I just meant: > > > > * pg_stat_system_events is a view to show data on memory, which returns > > one row for each event across the system. This is similar to > > V$SYSTEM_EVENT in Oracle. > > > > * pg_stat_session_events is a view to show data on memory, which returns > > one row for each event on one session. This is similar to V$SESSION_EVENT > > in Oracle. > > > > * The above views represent the current accumulated data like other > > pg_stat_xxx views. > > > > * EXPLAIN ANALYZE and auto_explain shows all events for one query. The > > lock waits you are trying to record in the server log is one of the events. > > I am little bit sceptic about only memory based structure. Is it this > concept acceptable for commiters? Is this supposed to be session-local data, or is it visible from remote sessions too? How durable is it supposed to be? Keep in mind that in case of a crash, all pgstats data is erased. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
2014-08-18 7:42 GMT+02:00 Alvaro Herrera <alvherre@2ndquadrant.com>:
 
> > I didn't mean performance statistics data to be stored in database tables.Is this supposed to be session-local data, or is it visible from remote
> > I just meant:
> >
> > * pg_stat_system_events is a view to show data on memory, which returns
> > one row for each event across the system. This is similar to
> > V$SYSTEM_EVENT in Oracle.
> >
> > * pg_stat_session_events is a view to show data on memory, which returns
> > one row for each event on one session. This is similar to V$SESSION_EVENT
> > in Oracle.
> >
> > * The above views represent the current accumulated data like other
> > pg_stat_xxx views.
> >
> > * EXPLAIN ANALYZE and auto_explain shows all events for one query. The
> > lock waits you are trying to record in the server log is one of the events.
>
> I am little bit sceptic about only memory based structure. Is it this
> concept acceptable for commiters?
sessions too? How durable is it supposed to be? Keep in mind that in
case of a crash, all pgstats data is erased.
surely it should be visible from all sessions and least 48 hours.
I have no problem with cleaning pgstats after crash -  it is cost related to minimal overhead. And on server related hw there are (should be)  a minimal number of crash. 
Regards
Pavel
Pavel
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
From: "Alvaro Herrera" <alvherre@2ndquadrant.com> > Is this supposed to be session-local data, or is it visible from remote > sessions too? How durable is it supposed to be? Keep in mind that in > case of a crash, all pgstats data is erased. I want it to be visible from other sessions. I'm okay about the data erasure during recovery. We can probably extend pg_statsinfo to save the new info for long-term trend analysis. TBH, I want a feature like pg_statsinfo in core. Regards MauMau
On 8/13/14, 5:14 AM, MauMau wrote: > I'm interested in b, too. I was thinking of proposing a performance > diagnostics feature like Oracle's wait events (V$SYSTEM_EVENT and > V$SESSION_EVENT). So, if you do this, I'd like to contribute to the > functional design, code and doc review, and testing. I already wrote up a design like this once: http://www.postgresql.org/message-id/509300F7.5000803@2ndQuadrant.com The main problem when I tried to code it was figuring out how to store the data. When I wrote that, I thought I could just stuff it into a shared memory block the way pg_stat_statements did. That didn't really work out. I think it's manageable now because the Dynamic Shared Memory management written since then has the right sort of shape to do the job. This is one of those jobs where I think the coding itself is the hard part, not the design nor the review. What I really want is something that dumps this data into memory, then a second process that persists to disk in batches. I think that's the only way we'll get high performance on reads while still saving enough data to be useful after a crash. We're working on getting a few things in this area fully funded to dig into them harder. The idea of designing for high-speed audit logs into memory and then persisting to disk has a lot of overlap with this one too, and that may get picked up too. -- Greg Smith greg.smith@crunchydatasolutions.com Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/