Re: Smaller data types use same disk space

Поиск
Список
Период
Сортировка
От Steve Crawford
Тема Re: Smaller data types use same disk space
Дата
Msg-id 500F43C7.9090708@pinpointresearch.com
обсуждение исходный текст
Ответ на Smaller data types use same disk space  ("McGehee, Robert" <Robert.McGehee@geodecapital.com>)
Список pgsql-general
On 07/24/2012 03:21 PM, McGehee, Robert wrote:
> Hi,
> I've created two tables labeled "Big" and "Small" that both store the same 10 million rows of data using 493MB and
487MBof disk space respectively. The difference is that the "Big" table uses data types that take up more space
(integerrather than smallint, float rather than real, etc). The "Big" table should need about 27 bytes/row versus 16
bytes/rowfor the "Small" table, indicating to me that the "Big" table should be 70% bigger in actual disk size. In
reality,it's only 1% bigger or 6MB (after clustering, vacuuming and analyzing). Why is this? Shouldn't the "Small"
tablebe about 110MB smaller (11 bytes for 10 million rows)? I'm estimating table size with \d+ 
>
> Thanks, Robert
>
>           Table "Big"
>    Column  |       Type       | Bytes
> ----------+------------------+-----------
>   rmid     | integer          | 4
>   date     | date             | 4
>   rmfactor | text             | 7 (about 3 characters/cell)
>   id       | integer          | 4
>   value    | double precision | 8
> ---------------------------------
>   Total Bytes/Row               27
>   Rows                          10M
>   Actual Size                   493MB
>
>
>      Table "Small"
>   Column |   Type   | Bytes
> --------+----------+-----------
>   rmid   | smallint | 2
>   date   | date     | 4
>   rmfid  | smallint | 2 (rmfid is a smallint index into the rmfactor table)
>   id     | integer  | 4
>   value  | real     | 4
> ---------------------------------
>   Total Bytes/Row     16
>   Rows                10M
>   Actual Size         487MB
>
>
>

More questions than answers:

What version of PostgreSQL?

How are your determining the space used by a table?

Why are you assuming 7 bytes for a 3-character value? (Character values
up to 126 characters long only have 1-character overhead.)

What is the fill-factor on the tables? (Should default to 100% but don't
know how you are configured.)

Do the tables have OIDs or not?

Other considerations are that rows don't split across pages so there is
a bit of waste per page. Also there could be compression considerations
though I'm not sure that small rows like this will be compressed.

Cheers,
Steve



В списке pgsql-general по дате отправления:

Предыдущее
От: Adrian Klaver
Дата:
Сообщение: Re: Smaller data types use same disk space
Следующее
От: jtkells
Дата:
Сообщение: Re: insert binary data into a table column with psql