Thread: Determining server load from client

From:
Dan Harris
Date:

I've found that it would be helpful to be able to tell how busy my
dedicated PG server is ( Linux 2.6 kernel, v8.0.3 currently ) before
pounding it with some OLAP-type queries.  Specifically, I have a
multi-threaded client program that needs to run several thousand
sequential queries.  I broke it into threads to take advantage of the
multi-core architecture of the server hardware.  It would be very nice
if I could check the load of the server at certain intervals to throttle
the number of concurrent queries and mitigate load problems when other
processes might be already inducing a significant load.

I have seen some other nice back-end things exposed through PG functions
( e.g. database size on disk ) and wondered if there was anything
applicable to this.  Even if it can't return the load average proper, is
there anything else in the pg_* tables that might give me a clue how
"busy" the server is for a period of time?

I've thought about allowing an ssh login without a passphrase to log in
and capture it.  But, the client process is running as an apache user.
Giving the apache user a shell login to the DB box does not seem like a
smart idea for obvious security reasons...

So far, that's all I can come up with, other than a dedicated socket
server daemon on the DB machine to do it.

Any creative ideas are welcomed :)

Thanks

-Dan

From:
"Craig A. James"
Date:

Dan Harris wrote:
> I've found that it would be helpful to be able to tell how busy my
> dedicated PG server is ...
>
> I have seen some other nice back-end things exposed through PG functions
> ( e.g. database size on disk ) and wondered if there was anything
> applicable to this.

I'd write a simple plperl function to do this.  You can access
operating-system calls to find out the system's load.  But notice that
you need untrusted Perl (plperlu) to do this, so you can only do it on a
system where you trust every application that connects to your database.
Something like this:

create or replace function get_stats()
  returns text as '
  open(STAT, "</proc/stat");
  my @stats = <STAT>;
  close STAT;
  return join("", @stats);
' language plperlu;

See http://www.postgresql.org/docs/8.1/interactive/plperl-trusted.html
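One caveat worth adding: the counters in /proc/stat are cumulative jiffies, so a single read of them doesn't tell you how busy the box is right now; the client needs two samples taken some seconds apart. A rough client-side sketch in Python of that arithmetic (the helper name and sample values are made up for illustration):

```python
def cpu_busy_fraction(stat_line_a, stat_line_b):
    # Each "cpu ..." line from /proc/stat lists cumulative jiffies:
    # user, nice, system, idle, [iowait, irq, softirq, ...]
    a = [int(f) for f in stat_line_a.split()[1:]]
    b = [int(f) for f in stat_line_b.split()[1:]]
    total = sum(b) - sum(a)
    idle = b[3] - a[3]  # field 4 is idle time
    return (total - idle) / float(total) if total else 0.0

# Two samples taken a few seconds apart (values here are invented):
sample_1 = "cpu 100 0 100 800"
sample_2 = "cpu 200 0 200 1400"
print(cpu_busy_fraction(sample_1, sample_2))
```

The same differencing applies whether the samples come from reading /proc/stat directly or from calling a function like get_stats() over the client's existing database connection.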

Craig

From:
Joe Healy
Date:

(forgot to send to list)
Dan Harris wrote:
> architecture of the server hardware.  It would be very nice if I could
> check the load of the server at certain intervals to throttle the
> number of concurrent queries and mitigate load problems when other
> processes might be already inducing a significant load.
>
> I have seen some other nice back-end things exposed through PG
> functions ( e.g. database size on disk ) and wondered if there was
> anything applicable to this.  Even if it can't return the load average
> proper, is there anything else in the pg_* tables that might give me a
> clue how "busy" the server is for a period of time?



I have installed munin (http://munin.projects.linpro.no/) on a few
systems. This lets you look at graphs of system resources/load etc. I
have also added python scripts which do sample queries to let me know if
performance/index size is changing dramatically. I have attached an
example script.



Hope that helps,



Joe


------------------------------------------------------------------------

#! /usr/bin/python
import psycopg
import sys

def fixName(name):
    return name[:19]

if len(sys.argv) > 1 and sys.argv[1] == "config":
    print """graph_title Postgresql Index Sizes
graph_vlabel Mb"""

    con = psycopg.connect("host=xxx user=xxx dbname=xxx password=xxx")
    cur = con.cursor()

    cur.execute("select relname, relpages from pg_class where relowner > 10 and relkind='i' and relpages > 256 order by reltuples desc;")
    results = cur.fetchall()
    for name, pages in results:
        print "%s.label %s" % (fixName(name), name)

else:
    con = psycopg.connect("host=xxx user=xxx dbname=xxx password=xxx")
    cur = con.cursor()

    cur.execute("select relname, relpages from pg_class where relowner > 10 and relkind='i' and relpages > 256 order by reltuples desc;")
    results = cur.fetchall()

    for name, pages in results:
        print "%s.value %.2f" % (name[:19], pages*8.0/1024.0)


From:
Dan Harris
Date:

Dan Harris wrote:
> I've found that it would be helpful to be able to tell how busy my
> dedicated PG server is ( Linux 2.6 kernel, v8.0.3 currently ) before
> pounding it with some OLAP-type queries.
..snip

Thank you all for your great ideas!  I'm going to try the perl function
as that seems like a very elegant way of doing it.

-Dan

From:
"Jim Buttafuoco"
Date:

Dan

Use the following plperlu function

create or replace function LoadAVG()
returns record
as
$$
use Sys::Statistics::Linux::LoadAVG;
my $lxs = new Sys::Statistics::Linux::LoadAVG;
my $stats = $lxs->get;
return $stats;

$$
language plperlu;


select * from LoadAVG() as (avg_1 float,avg_5 float,avg_15 float);

The Sys::Statistics::Linux module has all kinds of info (from the /proc
file system).
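To close the loop on Dan's original goal, once the client can read the load average (for instance via a function like the one above), it can scale its worker count between batches. A rough client-side sketch in Python - the linear scaling policy here is just one illustrative choice, not anything the module prescribes:

```python
def workers_for_load(load1, n_cores, max_workers):
    # Scale concurrency down linearly as the 1-minute load average
    # approaches the number of cores; always keep at least one worker.
    headroom = max(0.0, 1.0 - load1 / float(n_cores))
    return max(1, int(round(max_workers * headroom)))

# e.g. poll the server between query batches and re-size the pool:
for load in (0.0, 2.0, 8.0):
    print(load, workers_for_load(load, n_cores=4, max_workers=8))
```

An idle 4-core box gets the full 8 workers, and the client backs off to a single worker once load reaches or exceeds the core count.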

Jim

-----Original Message-----
From: 
[mailto:] On Behalf Of Dan Harris
Sent: Tuesday, March 20, 2007 8:48 PM
To: PostgreSQL Performance
Subject: [PERFORM] Determining server load from client

I've found that it would be helpful to be able to tell how busy my
dedicated PG server is ( Linux 2.6 kernel, v8.0.3 currently ) before
pounding it with some OLAP-type queries.  Specifically, I have a
multi-threaded client program that needs to run several thousand
sequential queries.  I broke it into threads to take advantage of the
multi-core architecture of the server hardware.  It would be very nice
if I could check the load of the server at certain intervals to throttle
the number of concurrent queries and mitigate load problems when other
processes might be already inducing a significant load.

I have seen some other nice back-end things exposed through PG functions
( e.g. database size on disk ) and wondered if there was anything
applicable to this.  Even if it can't return the load average proper, is
there anything else in the pg_* tables that might give me a clue how
"busy" the server is for a period of time?

I've thought about allowing an ssh login without a keyphrase to log in
and capture it.  But, the client process is running as an apache user.
Giving the apache user a shell login to the DB box does not seem like a
smart idea for obvious security reasons...

So far, that's all I can come up with, other than a dedicated socket
server daemon on the DB machine to do it.

Any creative ideas are welcomed :)

Thanks

-Dan




From:
Richard Huxton
Date:

Joe Healy wrote:
> (forgot to send to list)
> Dan Harris wrote:
>> architecture of the server hardware.  It would be very nice if I could
>> check the load of the server at certain intervals to throttle the
>> number of concurrent queries and mitigate load problems when other
>> processes might be already inducing a significant load.

> I have installed munin (http://munin.projects.linpro.no/) on a few
> systems. This lets you look at graphs of system resources/load etc. I
> have also added python scripts which do sample queries to let me know if
> performance/index size is changing dramatically. I have attached an
> example script.

For general monitoring of a handful of servers, I've been impressed with
munin. It's very simple to get it running and write your own plugins.

--
   Richard Huxton
   Archonet Ltd

From:
Tobias Brox
Date:

I have my postgres munin monitoring script at
http://oppetid.no/~tobixen/pg_activity.munin.txt (had to suffix it with
.txt to make the local apache happy).

I would like to see what others have done as well.


From:
Richard Huxton
Date:

Tobias Brox wrote:
> I have my postgres munin monitoring script at
> http://oppetid.no/~tobixen/pg_activity.munin.txt (had to suffix it with
> .txt to make the local apache happy).
>
> I would like to see what others have done as well.

Well, I use Perl rather than shell, but that's just me.

The main difference is that although I downloaded a couple of simple
pg-monitoring scripts from the web, I've concentrated on monitoring the
application(s) instead. Things like:
  - number of news items posted
  - searches run
  - logins, logouts

The main limitation with it for me is the fixed 5-min time interval.
It's a slight irritation that I've got hourly/daily cron jobs that are
being monitored continually.
--
   Richard Huxton
   Archonet Ltd

From:
Erik Jones
Date:


On Mar 21, 2007, at 5:13 AM, Tobias Brox wrote:

> I have my postgres munin monitoring script at
> http://oppetid.no/~tobixen/pg_activity.munin.txt (had to suffix it with
> .txt to make the local apache happy).
>
> I would like to see what others have done as well.

I use cacti (http://cacti.net) which does the same thing that munin does, but in PHP instead.  Here's what I use to report db stats to it (again, PHP):

You basically call the script with the database name and the stat you want.  I have the active_queries stat set up as a gauge in cacti and the others as counters:

if(!isset($argv[1])) {
    echo "DB name argument required!\n";
    exit();
}

$stats = array('xact_commit', 'xact_rollback', 'blks_read', 'blks_hit', 'active_queries');
if(!isset($argv[2]) || !in_array($argv[2], $stats)) {
    echo "Invalid stat arg!: {$argv[2]}";
    exit();
}
require_once('DB.php');

$db_name = $argv[1];
if(DB::isError($db = DB::connect("pgsql://user@host:5432/$db_name"))) {
    exit();
}

if($argv[2] == 'active_queries') {
    $actives_sql = "SELECT COUNT(*)
                    FROM pg_stat_activity
                    WHERE current_query NOT ILIKE '<idle>'
                        AND now() - query_start > '1 second';";
    if(DB::isError($db_stat = $db->getOne($actives_sql))) {
        exit();
    }
    echo "$db_stat\n";
    exit();
}

$db_stat_sql = "SELECT {$argv[2]}
                 FROM pg_stat_database
                 WHERE datname='$db_name';";
if(DB::isError($db_stat = $db->getOne($db_stat_sql))) {
    exit();
}

echo "$db_stat\n";
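A note on the gauge/counter distinction above: the pg_stat_database columns are cumulative counters, so whatever grapher you use has to difference successive samples to get a rate. A minimal sketch of that conversion in Python (hypothetical helper name), guarding against the counter going backwards after a stats reset:

```python
def counter_rate(prev, curr, interval_s):
    # Columns such as xact_commit only ever grow, so the interesting
    # number is the delta per second between two polls.
    if curr < prev:
        return None  # counter reset (stats reset or server restart)
    return (curr - prev) / float(interval_s)

# e.g. xact_commit went from 1000 to 1300 over a 5-minute poll interval:
print(counter_rate(1000, 1300, 300))  # commits per second
```

Tools like rrdtool (which both munin and cacti sit on) do exactly this when a data source is declared as a COUNTER rather than a GAUGE.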


erik jones <>
software developer
615-296-0838
emma(r)



From:
Tobias Brox
Date:

[Erik Jones - Wed at 09:31:48AM -0500]
> I use cacti (http://cacti.net) which does the same thing that munin
> does but in php instead.  Here's what I use to db stats to it (again,
> php):

I haven't tried cacti, but our sysadm has done a little bit of research
and concluded "cacti is better".  Maybe some day we'll move over.

Munin generates all the graphs statically every fifth minute, while
cacti generates them on demand as far as I've understood.  The munin
approach is pretty bloated, since one usually looks at the graphs much
more seldom than they are generated (at least, we do).  That's not
really an argument since CPU is cheap nowadays - but a real argument is
that the munin approach is less flexible.  Quite often one would like to
adjust the graph (like, min/max values for both axes) while viewing it.

>     $actives_sql = "SELECT COUNT(*)
>                     FROM pg_stat_activity
>                     WHERE current_query NOT ILIKE '<idle>'
>                         AND now() - query_start > '1 second';";

So this one is quite similar to mine ...

> $db_stat_sql = "SELECT {$argv[2]}
>                  FROM pg_stat_database
>                  WHERE datname='$db_name';";

I was not aware of this view - it can probably be useful for us.  I will
add this one when I get the time ... (I'm at vacation now).


From:
Erik Jones
Date:


On Mar 21, 2007, at 4:13 PM, Tobias Brox wrote:

> [Erik Jones - Wed at 09:31:48AM -0500]
>> I use cacti (http://cacti.net) which does the same thing that munin
>> does but in php instead.  Here's what I use to db stats to it (again,
>> php):
>
> I haven't tried cacti, but our sysadm has done a little bit of research
> and concluded "cacti is better".  Maybe some day we'll move over.
>
> Munin is generating all the graphs statically every fifth minute, while
> cacti generates them on demand as far as I've understood.  The munin
> approach is pretty bloat, since one usually would watch the graphs much
> more seldom than what they are generated (at least, we do).  That's not
> really an argument since CPU is cheap nowadays - but a real argument is
> that the munin approach is less flexible.  One would like to adjust the
> graph (like, min/max values for both axis) while watching quite some
> times.

Well, by default, Cacti polls all of the data sources you've set up every five minutes, since that's how the docs instruct you to set up the cron job for the poller.  However, with a little understanding of how the rrdtool RRAs work, you could definitely poll more often and simply edit the existing RRAs and data sources to expect that, or create new ones.  And, yes, the graph customization is pretty cool, although for the most part the graphs just map what's available from the rrdtool graph functionality.  If you do decide to set up Cacti, I suggest you go straight to the FAQ section of the manual and read the part about going from a simple script to a graph.  The main manual is almost entirely centered on the built-in networking (e.g. SNMP) data sources and, as such, doesn't do much to explain how to set up other data sources.

erik jones <>
software developer
615-296-0838
emma(r)



From:
"CAJ CAJ"
Date:



On 3/21/07, Erik Jones <> wrote:

> On Mar 21, 2007, at 4:13 PM, Tobias Brox wrote:
>
>> [Erik Jones - Wed at 09:31:48AM -0500]
>>> I use cacti (http://cacti.net) which does the same thing that munin
>>> does but in php instead.  Here's what I use to db stats to it (again,
>>> php):
>>
>> I haven't tried cacti, but our sysadm has done a little bit of research
>> and concluded "cacti is better".  Maybe some day we'll move over.
>>
>> Munin is generating all the graphs statically every fifth minute, while
>> cacti generates them on demand as far as I've understood.  The munin
>> approach is pretty bloat, since one usually would watch the graphs much
>> more seldom than what they are generated (at least, we do).  That's not
>> really an argument since CPU is cheap nowadays - but a real argument is
>> that the munin approach is less flexible.  One would like to adjust the
>> graph (like, min/max values for both axis) while watching quite some
>> times.
>
> Well, by default, Cacti polls all of the data sources you've set up every five minutes, since that's how the docs instruct you to set up the cron job for the poller.  However, with a little understanding of how the rrdtool RRAs work, you could definitely poll more often and simply edit the existing RRAs and data sources to expect that, or create new ones.  And, yes, the graph customization is pretty cool, although for the most part the graphs just map what's available from the rrdtool graph functionality.  If you do decide to set up Cacti, I suggest you go straight to the FAQ section of the manual and read the part about going from a simple script to a graph.  The main manual is almost entirely centered on the built-in networking (e.g. SNMP) data sources and, as such, doesn't do much to explain how to set up other data sources.


Has anyone had experience setting up something similar with Nagios?  We monitor servers using Nagios, and not having to install additional software (cacti/munin) for postgres resource usage monitoring would be great.

Thanks in advance!


From:
Stefan Kaltenbrunner
Date:

CAJ CAJ wrote:
>
>
> On 3/21/07, *Erik Jones* < <mailto:>> wrote:
>
>
>     On Mar 21, 2007, at 4:13 PM, Tobias Brox wrote:
>
>>     [Erik Jones - Wed at 09:31:48AM -0500]
>>>     I use cacti (http://cacti.net) which does the same thing that
>>>     munin
>>>     does but in php instead.  Here's what I use to db stats to it
>>>     (again,
>>>     php):
>>
>>     I haven't tried cacti, but our sysadm has done a little bit of
>>     research
>>     and concluded "cacti is better".  Maybe some day we'll move over.
>>
>>     Munin is generating all the graphs statically every fifth minute,
>>     while
>>     cacti generates them on demand as far as I've understood.  The munin
>>     approach is pretty bloat, since one usually would watch the graphs
>>     much
>>     more seldom than what they are generated (at least, we do).
>>     That's not
>>     really an argument since CPU is cheap nowadays - but a real
>>     argument is
>>     that the munin approach is less flexible.  One would like to
>>     adjust the
>>     graph (like, min/max values for both axis) while watching quite some
>>     times.
>
>     Well, by "default", Cacti polls all of the data sources you've set
>     up every five minutes as well as that's how the docs instruct you to
>     set up the cron job for the poller.  However, with a little
>     understanding of how the rrdtool rras work, you could definitely
>     poll more often and simply edit the existing rras and datasources to
>     expect that or create new ones.  And, yes, the graph customization
>     is pretty cool although for the most part the just map what's
>     available from the rrdtool graph functionality.  If you do decide to
>     set up Cacti I suggest you go straight to the faq section of the
>     manual and read the part about going from a simple script to a
>     graph.  The main manual is almost entirely centered on the built-in
>     networking ( e.g. snmp) data sources and, as such, doesn't do much
>     for explaining how to set up other data sources.
>
>
>
> Has anyone had experience setting up something similar with Nagios? We
> monitor servers using nagios and not having to install additional
> software (cacti/munin) for postgres resource usage monitoring would be
> great.

A lot of Nagios plugins can supply performance data in addition to the
OK/WARNING/CRITICAL state information.  There are a number of solutions
out there that can take that information and graph it on a
per-host/per-service basis automatically - examples of such addons are
NagiosGrapher and n2rrd (or look at www.nagiosexchange.org, which has a
large number of addons listed).
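For anyone writing such a plugin: the perfdata convention is just extra text appended after a '|' in the plugin's output, in label=value;warn;crit form, alongside the usual exit code. A minimal sketch in Python of a load-average check in that shape (the thresholds here are made up):

```python
def check_load(load1, warn, crit):
    # Standard plugin contract: exit code 0/1/2 for OK/WARNING/CRITICAL,
    # human-readable text, then machine-readable perfdata after a '|'
    # in label=value;warn;crit form.
    if load1 >= crit:
        state, code = "CRITICAL", 2
    elif load1 >= warn:
        state, code = "WARNING", 1
    else:
        state, code = "OK", 0
    output = "%s - load average %.2f | load1=%.2f;%.2f;%.2f" % (
        state, load1, load1, warn, crit)
    return code, output

code, line = check_load(0.5, warn=2.0, crit=4.0)
print(line)
```

Grapher addons like the ones mentioned above parse everything after the '|' and feed it into rrdtool, so the same check drives both alerting and graphing.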


Stefan