Discussion: storage calculations


storage calculations

From
pgboy@guthrie.charm.net
Date:
i've been browsing the documentation online, and i can't
find where the storage layouts for tables and indexes are
explained.

i'd like to be able to figure out, given a table or two
with a simple schema (no inheritance) and some indexes, and
a number of rows currently in these tables, how much disk
space is being used.

the underlying problem being worked here, by the way, is
to come up with a task that monitors space used by the database
so as to issue a warning of some sort when the partition
upon which the database resides is getting full. i had originally
used the statfs output from the (unix) os, but for this to accurately
reflect the "used" space, i found i had to do a "vacuum full" and
compress the database. i'd rather come up with a heuristic where
i can take a number of rows (which i get from the reltuples column
in pg_class, which seems to be accurate after running 'analyze')
and multiply by some number i have calculated based on the table
schema to get a fairly accurate guess at the real space used.
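
something along these lines is the sort of heuristic i'm after
('mydb', 'mytable' and the 64-byte row figure are placeholders, 8192
is the default block size, and relpages/reltuples are only as fresh
as the last vacuum/analyze):

ROW_BYTES=64    # per-row guess i'd like to derive from the schema
psql -d mydb -t -c "SELECT relname, reltuples, relpages,
                           relpages * 8192 AS on_disk_bytes,
                           reltuples * $ROW_BYTES AS approx_live_bytes
                    FROM pg_class WHERE relname = 'mytable';"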

any and all advice welcome, and thanks.

pg

Re: storage calculations

From
"Mendola Gaetano"
Date:
<pgboy@guthrie.charm.net> writes:
> i've been browsing the documentation online, and i can't
> find where the storage layouts for tables and indexes are
> explained.
>
> i'd like to be able to figure out, given a table or two
> with a simple schema (no inheritance) and some indexes, and
> a number of rows currently in these tables, how much disk
> space is being used.

oid2name | grep <your_db_name>

gives you the name of the directory (the database OID) where the
database is stored under your $PGDATA/base directory.

To find the name of the file that corresponds to your
table:

oid2name -d <your_db_name> -t <your_table_name>
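
For example (the OIDs below are made up, substitute the numbers
oid2name actually reports and your own $PGDATA), the size on disk
can then be read straight from the filesystem:

# suppose oid2name reported database OID 16384 and table file 16397
du -k $PGDATA/base/16384/16397*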



Regards
Gaetano Mendola


Re: storage calculations

From
pgboy@guthrie.charm.net
Date:
sorry, when i referred to "disk space used" i meant the actual
amount used by live rows. if i insert 1000 rows then delete
999 of them, the disk file will stay the size of the 1000-row
container until i do a "vacuum full" - which takes a table lock,
which is a bad thing.

given that, i'd like to be able to calculate the number of
bytes a row uses given its schema. i've seen this kind of
documentation for other dbs, i just cannot seem to find it
in the postgresql docs.

thanks.
pg

Re: storage calculations

From
Peter Eisentraut
Date:
pgboy@guthrie.charm.net writes:

> sorry, when i referred to "disk space used" i meant the actual amount
> used by live rows. if i insert 1000 rows then delete 999 of them, the
> disk file will stay the size of the 1000-row container until i do a
> "vacuum full" - which takes a table lock, which is a bad thing.

The task that you originally described was that you want to monitor when
the disk is getting full.  For that task, you need to take into account
the actual size of the data on disk, not the size after a "vacuum full"
which you say you do not want to execute.  Basing a disk full monitor on
hypothetical sizes sounds pretty useless.

> given that, i'd like to try to be able to calculate the number of
> bytes a row uses given its schema. i've seen this kind of
> documentation for other dbs, i just cannot seem to find it
> in the postgresql docs.

There is some information about that in the FAQ, but keep in mind that
rows may be compressed or moved to secondary storage automatically.
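
For example, a query along these lines (database and table names are
placeholders) shows whether a table has such a secondary ("TOAST")
table and how many pages it occupies:

psql -d mydb -c "SELECT t.relname AS toast_table, t.relpages
                 FROM pg_class c JOIN pg_class t ON t.oid = c.reltoastrelid
                 WHERE c.relname = 'mytable';"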

--
Peter Eisentraut   peter_e@gmx.net


Re: storage calculations

From
pgboy@guthrie.charm.net
Date:
On Thu, 31 Jul 2003, Peter Eisentraut wrote:
> pgboy@guthrie.charm.net writes:
> > sorry, when i referred to "disk space used" i meant the actual amount
> > used by live rows. if i insert 1000 rows then delete 999 of them, the
> > disk file will stay the size of the 1000-row container until i do a
> > "vacuum full" - which takes a table lock, which is a bad thing.
>
> The task that you originally described was that you want to monitor when
> the disk is getting full.  For that task, you need to take into account
> the actual size of the data on disk, not the size after a "vacuum full"
> which you say you do not want to execute.  Basing a disk full monitor on
> hypothetical sizes sounds pretty useless.
>
> > given that, i'd like to try to be able to calculate the number of
> > bytes a row uses given its schema. i've seen this kind of
> > documentation for other dbs, i just cannot seem to find it
> > in the postgresql docs.
>
> There is some information about that in the FAQ, but keep in mind that
> rows may be compressed or moved to secondary storage automatically.

well, i can admit that i am confused. my assumption is that when a row
is deleted in pg, that row is merely marked as 'gone' until a vacuum-full
is done. my further assumption is that if i continue to add rows, those
rows are not necessarily appended to the end of the physical data file,
but can be written over 'deleted' rows.

given that, a vacuum-full is the only way i know of to get an accurate
reflection of the number of bytes being used to store the data. without
the vacuum, i can tell how big a potentially sparse file is, but i don't
want to raise a warning just because the file size is getting large
(unless, of course, pg does not overwrite deleted rows, in which case
the warning, or a vacuum-full, seems appropriate).

i think i agree with you, too, that i cannot really calculate a
hypothetical size, unless i have all fixed-size fields. in that case,
i should be able to accurately calculate the size, yes? if not, what
are the variables i could not account for?
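
for instance, a back-of-the-envelope like this is what i had in mind
(the overhead numbers are my rough guesses, not figures from the docs):

# rough size estimate for a table of fixed-width columns;
# overhead figures are approximate and version-dependent
ROWS=100000
DATA_BYTES=40          # e.g. a handful of int4/int8/timestamp columns
TUPLE_OVERHEAD=32      # per-row header plus 4-byte line pointer (approx.)
PAGE_SIZE=8192
PAGE_HEADER=24
ROW=$((DATA_BYTES + TUPLE_OVERHEAD))
ROWS_PER_PAGE=$(( (PAGE_SIZE - PAGE_HEADER) / ROW ))
PAGES=$(( (ROWS + ROWS_PER_PAGE - 1) / ROWS_PER_PAGE ))
echo "roughly $PAGES pages, $((PAGES * PAGE_SIZE)) bytes"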

uh, any more info on your comment "rows may be compressed or moved
to secondary storage automatically"? i'd *love* to know how to do
that.

thanks.
pgboy


Re: storage calculations

From
Andrew Sullivan
Date:
On Thu, Jul 31, 2003 at 08:51:09AM -0400, pgboy@guthrie.charm.net wrote:
>
> well, i can admit that i am confused. my assumption is that when a row
> is deleted in pg, that row is merely marked as 'gone' until a vacuum-full
> is done. my further assumption is that if i continue to add rows, those
> rows are not necessarily appended to the end of the physical data file,
> but can be written over 'deleted' rows.

Your assumption is wrong.  VACUUM without FULL will allow you to
write over the free space now available in your data files, subject
to the limitations of tracking as determined by your FSM settings.
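
The relevant settings here are max_fsm_pages and max_fsm_relations;
you can check what they are set to like so (database name is a
placeholder):

psql -d mydb -c "SHOW max_fsm_pages;"
psql -d mydb -c "SHOW max_fsm_relations;"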

VACUUM FULL actually shortens the data file.  Except for cases having
to do with large tuples that won't fit in the previous page, VACUUM
FULL means that you can't fit any more data in that file, once the
VACUUM FULL is done.

A

--
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110


Re: storage calculations

From
pgboy@guthrie.charm.net
Date:
On Thu, 31 Jul 2003, Andrew Sullivan wrote:

> On Thu, Jul 31, 2003 at 08:51:09AM -0400, pgboy@guthrie.charm.net wrote:
> >
> > well, i can admit that i am confused. my assumption is that when a row
> > is deleted in pg, that row is merely marked as 'gone' until a vacuum-full
> > is done. my further assumption is that if i continue to add rows, those
> > rows are not necessarily appended to the end of the physical data file,
> > but can be written over 'deleted' rows.
>
> Your assumption is wrong.  VACUUM without FULL will allow you to
> write over the free space now available in your data files, subject
> to the limitations of tracking as determined by your FSM settings.
>
> VACUUM FULL actually shortens the data file.  Except for cases having
> to do with large tuples that won't fit in the previous page, VACUUM
> FULL means that you can't fit any more data in that file, once the
> VACUUM FULL is done.
>
> A
>
>

well, i was close. i didn't realize that i had to do a (non-full)
vacuum to mark deleted space as free.

but after that, i am still left with a potentially sparse file
and i don't really have a way to guess how much free space is
available until i do a full vacuum, correct? (at which time the
file size(s) will indicate the amount being used)

just started looking at fsm. thanks for the pointer there. i hope
i'm not acting too obtuse here.

thanks.
pg


Re: storage calculations

From
Andrew Sullivan
Date:
On Thu, Jul 31, 2003 at 01:41:54PM -0400, pgboy@guthrie.charm.net wrote:
>
> well, i was close. i didn't realize that i had to do a (non-full)
> vacuum to mark deleted space as free.

Oops, I think I was still unclear.  VACUUM FULL actually re-arranges
the file and returns the freed space to the filesystem.  Plain VACUUM
does indeed leave you with a file that is bigger than the actual data
stored there.

You can learn how much more data you could fit in the files using
VACUUM VERBOSE, keeping in mind that tuples may not always fit in an
already-allocated page (like when the tuples are large).
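
If you have the contrib pgstattuple module installed, it reports the
same kind of numbers (dead tuple bytes, free space) directly, along
these lines (database and table names are placeholders):

psql -d mydb -c "SELECT * FROM pgstattuple('mytable');"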

> available until i do a full vacuum, correct? (at which time the
> file size(s) will indicate the amount being used)

That's right, yes.

> just started looking at fsm. thanks for the pointer there. i hope
> i'm not acting too obtuse here.

Nope.  If you don't ask, you won't learn about it.

A

--
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110