Storage consumption

Поиск
Список
Период
Сортировка
От Troels Arvin
Тема Storage consumption
Дата
Msg-id pan.2003.11.14.14.27.58.319177@arvin.dk
обсуждение исходный текст
Ответы Re: Storage consumption  (Bruce Momjian <pgman@candle.pha.pa.us>)
Re: Storage consumption  (Kris Jurka <books@ejurka.com>)
Список pgsql-general
Hello,

For some very data-intensive projects it's interesting how much space the
DBMS uses for the storage of data, so I'm investigating how space
efficient different DBMSes are.

In the PostgreSQL manual, it's written that values of the type INTEGER
take op four bytes. I was curious about how close to real-World this
number is, so I did a test: How much space does PostgreSQL use when
storing 100000 rows where each row consists of a single INTEGER value?

With help from http://random.org/ I created a file with 100000 random
integer insertions. The SQL used to do that is available at
http://troels.arvin.dk/db/tests/storage-sizes/randomints.zip

About installation: PostgreSQL v. 7.3.4 on Red Hat Linux 9, file system
ext3. PostgreSQL data-area in /var/lib/pgsql/data.

For this test, PostgreSQL is being used for nothing else.

Before test start:
-----------------
Access to a default database ('psql' brings you right into a working
database) from psql.
Access to do a 'du' (disk usage unix-command) on /var/lib/pgsql/data from
the command line.
No existing table 'inttab' in database. PostgreSQL stopped.

Test starts.
-----------
Output of 'du -sb /var/lib/pgsql/data': 77946519.
Start PostgreSQL.
Do: "CREATE TABLE inttab (intval INT) WITHOUT OIDS;"
psql -q -f random_ints.sql
(Wait for a long time.)
Do: "VACUUM FULL;"
Shut down PostgreSQL.
Output of 'du -sb /var/lib/pgsql/data': 81190551.

Result:
------
Real difference: 81190551-77946519 = 3244032
Optimal difference: 100000*4       =  400000
Storage consumption rate ((real/optimal)*100)% = 811%

I'm surprised by an overhead _that_ high. Any comments on my methology?
Does it need adjustments? If you think it's rotten: What methology would
you use to measure space overhead for a DBMS? (Again: Space overhead is
seldomly interesting, but sometimes it is.)

I guess that transaction log files are a joker in this context, but then
again: A number which reflects the DBMS' disk usage before and after an
operation does have real-World meaning, I think.

(Of course, I'll need another methology for DBMSes which preallocate a
fixed amount of storage for a database.)

--
Greetings from Troels Arvin, Copenhagen, Denmark


В списке pgsql-general по дате отправления:

Предыдущее
От: pmorettini@earthlink.net (PhilM)
Дата:
Сообщение: Can Postgresql run under Windows?
Следующее
От: Christopher Browne
Дата:
Сообщение: Re: embedded postgresql