Re: PROPOSAL: tracking aggregated numbers from pg_stat_database

Поиск
Список
Период
Сортировка
От Tomas Vondra
Тема Re: PROPOSAL: tracking aggregated numbers from pg_stat_database
Дата
Msg-id 5161C106.4060304@fuzzy.cz
обсуждение исходный текст
Ответ на PROPOSAL: tracking aggregated numbers from pg_stat_database  (Tomas Vondra <tv@fuzzy.cz>)
Список pgsql-hackers
On 6.4.2013 21:51, Tomas Vondra wrote:
> Hi,
>
> I'm regularly using pg_stat_database view to analyze various aspects of
> behavior of the cluster. The #1 issue I'm constantly running into is
> that to get cluster-level view (across all the databases), the table
> needs to be aggregated like this:
>
>     SELECT
>         SUM(blks_hit) blks_hit,
>         SUM(blks_read) blks_read
>     FROM pg_stat_database
>
> This more or less works in stable environments, but once you start
> dropping databases (think of hosting with shared DB server) it gets
> unusable because after DROP DATABASE the database suddenly disappears
> from the sum.
>
> Therefore I do propose tracking the aggregated stats, similar to the
> pg_stat_bgwriter view. This does not require new messages (thanks to
> reuse of the existing messages), and I expect the overhead to be
> negligible (a few bytes of storage, minimal CPU).
>
> I think it does not make sense to merge this into pg_stat_bgwriter,
> creating a new view (can't think of a good name though), seems like a
> much better choice to me.

Attached is a first version of the patch, just to illustrate the idea.
It creates a new system view pg_stat_agg_database with aggregated data.
There are no docs, no regression tests etc.

Now, I'm thinking if we should do something similar with database object
(table/index) stats, i.e. keep maintain aggregated data. This might seem
silly at first, but consider for example a partitioned table. It's
common to have a db-level metrics on idx/seq_scans, but when you drop a
partition (e.g. the oldest one), you may get into the same trouble as
with database stats (see my previous post).

So I think it would be handy to define table/index stats aggregated at
the db-level, i.e. there would be one row for each database. I don't
think it makes much sense to aggregate vacuum/analyze info (counts and
timestamps), which means 18 BIGINT counters from pg_stat[io]_tables, and
10 BIGINT counters from pg_stat[io]_indexes. That's 224B for each
database, which I believe is negligible.

OTOH it would be very handy to have this info aggretated per-schema and
per-tablespace, but I'd say to do that later in a separate patch.

Opinions?

Tomas

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Kerem Kat
Дата:
Сообщение: Current syslogger filename
Следующее
От: Tomas Vondra
Дата:
Сообщение: how to pass data (tuples) to worker processes?