Re: Unexpectedly high disk space usage

Поиск
Список
Период
Сортировка
От Lists
Тема Re: Unexpectedly high disk space usage
Дата
Msg-id 509ABC88.6010305@benjamindsmith.com
обсуждение исходный текст
Ответ на Re: Unexpectedly high disk space usage  (Jeff Janes <jeff.janes@gmail.com>)
Ответы Re: Unexpectedly high disk space usage  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Unexpectedly high disk space usage  (Jeff Janes <jeff.janes@gmail.com>)
Список pgsql-general
On 11/07/2012 09:01 AM, Jeff Janes wrote:
> Ben, did you ever figure out where the space was going?

I think we've found where the space is going, but I still don't yet know
how to resolve it. I modified your query thusly in order to get a total
of space used, and got an answer that matches closely:

with stuff as (SELECT d.datname as Name,
pg_catalog.pg_get_userbyid(d.datdba) as Owner,
     CASE WHEN pg_catalog.has_database_privilege(d.datname, 'CONNECT')
         THEN pg_catalog.pg_database_size(d.datname)
         ELSE -1
     END as Size
FROM pg_catalog.pg_database d
     order by
     CASE WHEN pg_catalog.has_database_privilege(d.datname, 'CONNECT')
         THEN pg_catalog.pg_database_size(d.datname)
         ELSE NULL
     END desc nulls first) SELECT sum(size) AS overall from stuff;
Result: 171,276,369,124

# du -sbc /var/lib/pgsql/9.1/data/*
Result: 172,087,129,512

Now, the question is, I see several databases that uses disk usage with
sizes that are dramatically different than I get from a dump/restore to
another machine:

Production:
  santarosa444        | postgres | 44 GB

Dump/Restore:
  santarosa444        | postgres | 685 MB

Now, here's where it gets weird. From the disk space usage wiki,
(http://wiki.postgresql.org/wiki/Disk_Usage) you'll recognize how I used
it to get a total disk space used result:

with mytable AS (
                 SELECT
                         nspname || '.' || relname AS "relation",
                         pg_total_relation_size(C.oid) AS "size"
                 FROM
                         pg_class C
                 LEFT JOIN pg_namespace N ON
                         (N.oid = C.relnamespace)
                 WHERE nspname NOT IN ('pg_catalog', 'information_schema')
                         AND C.relkind <> 'i'
                         AND nspname !~ '^pg_toast'
                 ORDER BY
                         pg_total_relation_size(C.oid) DESC
                 )
         SELECT sum(size) AS size FROM mytable

... but the total result is 747,569,152 which is close to the
dump/restore value, not the production server value, even though I'm
running this query on the production server. So there's *something* that
the latter query isn't identifying that the former is.

On a hunch, ran this query:

with mytable AS (
                 SELECT
                         nspname || '.' || relname AS "relation",
                         pg_total_relation_size(C.oid) AS "size"
                 FROM
                         pg_class C
                 LEFT JOIN pg_namespace N ON
                         (N.oid = C.relnamespace) WHERE (true or
(nspname NOT IN ('pg_catalog', 'information_schema')))
                         AND C.relkind <> 'i'
                         AND nspname !~ '^pg_toast'
                 ORDER BY
                         pg_total_relation_size(C.oid) DESC
                 )
         SELECT sum(size) AS size FROM mytable;

And the result is 46,771,216,384! Removing the "mytable" wrapper stuff,
here are the top results:

  pg_catalog.pg_attribute                                  | 36727480320
  pg_catalog.pg_attrdef                                    | 3800072192
  pg_catalog.pg_depend                                     | 2665930752
  pg_catalog.pg_class                                      | 1508925440
  pg_catalog.pg_type                                       | 1113038848
  public.att_claims                                        | 451698688
  public.stgrades                                          | 127639552
  pg_catalog.pg_index                                      | 107806720


Google returns this page:
http://www.postgresql.org/docs/9.1/static/catalog-pg-attribute.html
which doesn't help me much. So, am I doing something wrong with admin?
Our current process is that every night in the middle of the night, a
script connects to each database on each server and runs a query to get
all tables in each database and, for each, run

"VACUUM ANALYZE $table"

for each table in the database.

And then once a week:
psql -U postgres -c "\l" | grep -Po "(\w+444)" | xargs -t -i psql -U
postgres {} -c "REINDEX DATABASE {};"

(note: there is a database for the "postgres" user on each DB server)
The script is a remnant from PG 8.x days, so am I missing something
fundamental about 9.x? I will note that autovacuum is off because it
occasionally causes transactions and queries to hang when an update
causes a vacuum mid-day, effectively taking us offline randomly. Our
scenario is pretty much a worst-possible case of transactions, prepared
transactions, temp tables, and concurrent read/write queries.


В списке pgsql-general по дате отправления:

Предыдущее
От: Bruce Momjian
Дата:
Сообщение: Re: Extra space when converting number with to_char
Следующее
От: Igor Neyman
Дата:
Сообщение: Re: How to verify pg_dump files