Обсуждение: pg_check_queries.pl: an ineffective way to check for slow queries?

Поиск
Список
Период
Сортировка

pg_check_queries.pl: an ineffective way to check for slow queries?

От
Mark Stosberg
Дата:
I'm using a Nagios plugin that's supposed to check PostgreSQL to see if there
are too many slow queries, and then send an alert if so.

After using it some time, I think the approach may be badly flawed. I'm
interested in your opinion. You don't need to know or use Nagios to consider this-- The SQL it uses to check for "slow
queries"boils down to this: 

SELECT datname
    , procpid
    , usename
    , current_query
    , waiting
    , client_addr
    , (((timeofday()::TIMESTAMP)-query_start)) AS duration
    FROM pg_stat_activity
    WHERE timeofday()::TIMESTAMP-query_start > INTERVAL '5 minutes';

###

I found several cases where it considered a something "slow" when it fact it
was not problem:

- Idle psql sessions counted are counted as slow
- Idle connections from a mod_perl server are counted as slow. ( Some connections were from a lightly used beta server
)
- VACUUMs were counted as slow, when in fact slower VACUUMs are one way to speed up performance
- Idle slony connections were counted as slow, when they are constantly there.

The default threshold is "3" slow queries before it considers the situation
CRITICAL.

I've considered ways to improve this logic, but I think the best approach may
be to replace this test with something application specific.

Option: We could simply raise the thresholds for "slow queries"
    ...but that doesn't address the issue of wrongly classifying connections as problematic.

Option: We could build the above exceptions into the checker
    ...but that feels like playing whack-a-mole. What other legitimate cases are other there?

In our case, we are already logging the executation time of our critical search
query. We could write our own plugin to check that no more than 10 of the last
100 searches took more than 10 seconds to execute. If that is ever true, we
definitely have a situation we want to look into.

What luck have others of you had with devising a way to monitor if your PostgreSQL server is "slow" ?

    Mark







--
 . . . . . . . . . . . . . . . . . . . . . . . . . . .
   Mark Stosberg            Principal Developer
   mark@summersault.com     Summersault, LLC
   765-939-9301 ext 202     database driven websites
 . . . . . http://www.summersault.com/ . . . . . . . .


Re: pg_check_queries.pl: an ineffective way to check for slow queries?

От
"Kevin Grittner"
Дата:
Mark Stosberg <mark@summersault.com> wrote:

> What luck have others of you had with devising a way to monitor if
> your PostgreSQL server is "slow" ?

For web apps we submit HTTP requests to our renderers periodically,
and monitor the response time.  Of course, this tests the whole stack,
but that's a good thing from our perspective.

For other software, we log queries which take more than two seconds,
and investigate performance complaints from our end users.  There is
too much variation in run times, partly based on selection criteria
entered by users at run time, for us to want to automatically alert
based on run time.

Ultimately, for us, the only performance metric which really matters
for much of our software is user satisfaction with response time.

-Kevin