Tuple storage overhead

Поиск
Список
Период
Сортировка
От Peter Bex
Тема Tuple storage overhead
Дата
Msg-id 20100416094101.GE11616@frohike.homeunix.org
обсуждение исходный текст
Ответы Re: Tuple storage overhead  (Szymon Guz <mabewlun@gmail.com>)
Re: Tuple storage overhead  (Richard Huxton <dev@archonet.com>)
Re: Tuple storage overhead  (Merlin Moncure <mmoncure@gmail.com>)
Список pgsql-general
Hi all,

I have a table with three columns: one integer and two doubles.
There are two indexes defined (one on the integer and one on one
of the doubles).  This table stores 700000 records, which take up
30 Mb according to pg_relation_size(), and the total relation size
is 66 Mb.

I expected the disk space usage to be halved by changing the doubles
to floats, but it only dropped by 5 MB!  (I tried various approaches,
including dumping and restoring to make sure there was no uncollected
garbage lying around)

Someone on IRC told me the per-tuple storage overhead is pretty big,
and asked me to create a similar table containing only integers:

db=# create table testing (  x integer );
db=# INSERT INTO testing (x) VALUES (generate_series(1, 700000));
dacolt_development=# SELECT pg_size_pretty(pg_total_relation_size('testing'));
 pg_size_pretty
 ----------------
  24 MB
  (1 row)
db=# SELECT pg_size_pretty(pg_relation_size('testing'));
 pg_size_pretty
 ----------------
  24 MB
db=# CREATE INDEX testing_1 ON testing (x);
db=# CREATE INDEX testing_2 ON testing (x);
db=# SELECT pg_size_pretty(pg_relation_size('testing'));
 pg_size_pretty
 ----------------
  24 MB
  (1 row)
db=# SELECT pg_size_pretty(pg_total_relation_size('testing'));
 pg_size_pretty
 ----------------
  54 MB
  (1 row)

Is there a way to reduce the per-tuple storage overhead?

The reason I'm asking is that I have tons of tables like this,
and some data sets are much bigger than this.  In a relatively
simple testcase I'm importing data from text files which are
5.7 Gb in total, and this causes the db size to grow to 34Gb.

This size is just one small sample of many such datasets that I
need to import, so disk size is really an important factor.

Regards,
Peter
--
http://sjamaan.ath.cx
--
"The process of preparing programs for a digital computer
 is especially attractive, not only because it can be economically
 and scientifically rewarding, but also because it can be an aesthetic
 experience much like composing poetry or music."
                            -- Donald Knuth

В списке pgsql-general по дате отправления:

Предыдущее
От: Scott Marlowe
Дата:
Сообщение: Re: VACUUM process running for a long time
Следующее
От: raghavendra t
Дата:
Сообщение: tar error, in pg_start_backup()