PROPOSAL: tracking aggregated numbers from pg_stat_database

Поиск
Список
Период
Сортировка
От Tomas Vondra
Тема PROPOSAL: tracking aggregated numbers from pg_stat_database
Дата
Msg-id 51607CC2.9080702@fuzzy.cz
обсуждение исходный текст
Ответы Re: PROPOSAL: tracking aggregated numbers from pg_stat_database  (Tomas Vondra <tv@fuzzy.cz>)
Re: PROPOSAL: tracking aggregated numbers from pg_stat_database  (Peter Eisentraut <peter_e@gmx.net>)
Список pgsql-hackers
Hi,

I'm regularly using pg_stat_database view to analyze various aspects of
behavior of the cluster. The #1 issue I'm constantly running into is
that to get cluster-level view (across all the databases), the table
needs to be aggregated like this:
   SELECT       SUM(blks_hit) blks_hit,       SUM(blks_read) blks_read   FROM pg_stat_database

This more or less works in stable environments, but once you start
dropping databases (think of hosting with shared DB server) it gets
unusable because after DROP DATABASE the database suddenly disappears
from the sum.

Therefore I do propose tracking the aggregated stats, similar to the
pg_stat_bgwriter view. This does not require new messages (thanks to
reuse of the existing messages), and I expect the overhead to be
negligible (a few bytes of storage, minimal CPU).

I think it does not make sense to merge this into pg_stat_bgwriter,
creating a new view (can't think of a good name though), seems like a
much better choice to me.

And now a bit more detailed explanation of the issues ...

Analysis is usually based on comparing two snapshots (say a few minutes
apart), and this makes is rather much more difficult because the dropped
databases suddenly disappear from the second snapshot.

Say for example there are two databases, A and B, with stats snapshotted
at T1 and T2. The database B is dropped sometimes between the snapshots.

So the snaphots look like this:
time | db | blks_read | bkls_hit
-----------------------------------  T1 |  A |   1000000 |   500000  T1 |  B |   1000000 |   500000  T2 |  A |
1500000|   750000
 

Now, the aggregated data look like this:
time | blks_read | bkls_hit
------------------------------  T1 |   2000000 |  1000000  T2 |   1500000 |   750000

So the difference (T2-T1) is
blks_read | bkls_hit
----------------------  -500000 |  -250000

Yes, negative values do not make much sense. It's very difficult to
detect such behavior and account for that.

It might be possible to solve (some of) the issues with elaborate
snapshotting system, but it's awkward / difficult to use. Adding a new
system view works much nicer.

regards
Tomas



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Jeff Janes
Дата:
Сообщение: Re: [BUGS] BUG #8043: 9.2.4 doesn't open WAL files from archive, only looks in pg_xlog
Следующее
От: Jeff Janes
Дата:
Сообщение: Process title for autovac