Обсуждение: Physical column size

Поиск
Список
Период
Сортировка

Physical column size

От
"Paul Mackay"
Дата:
Hi,

I've created a table like this :
CREATE TABLE tmp_A (
c "char",
i int4
);

And another one
CREATE TABLE tmp_B (
i int4,
ii int4
);

I then inserted a bit more than 19 million rows in each table (exactly the same number of rows in each).

The end result is that the physical size on disk used by table tmp_A is exactly the same as table tmp_B (as revealed by the pg_relation_size function) ! Given that a "char" field is supposed to be 1 byte in size and a int4 4 bytes, shouldn't the tmp_A use a smaller disk space ? Or is it that any value, whatever the type, requires at least 4 bytes to be stored ?

Thanks,
Paul

Re: Physical column size

От
Peter Eisentraut
Дата:
Am Freitag, 3. März 2006 11:03 schrieb Paul Mackay:
> I've created a table like this :
> CREATE TABLE tmp_A (
> c "char",
> i int4
> );
>
> And another one
> CREATE TABLE tmp_B (
> i int4,
> ii int4
> );

> The end result is that the physical size on disk used by table tmp_A is
> exactly the same as table tmp_B (as revealed by the pg_relation_size
> function) !

An int4 field is required to be aligned at a 4-byte boundary internally, so
there are 3 bytes wasted between tmp_A.c and tmp_A.i.  If you switch the
order of the fields you should see space savings.  (Note, however, that the
per-row overhead is about 32 bytes, so you'll probably only save about 10%
overall, rather than the 37.5% that one might expect.)

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

Re: Physical column size

От
Martijn van Oosterhout
Дата:
On Fri, Mar 03, 2006 at 11:03:24AM +0100, Paul Mackay wrote:
> The end result is that the physical size on disk used by table tmp_A is
> exactly the same as table tmp_B (as revealed by the pg_relation_size
> function) ! Given that a "char" field is supposed to be 1 byte in size and a
> int4 4 bytes, shouldn't the tmp_A use a smaller disk space ? Or is it that
> any value, whatever the type, requires at least 4 bytes to be stored ?

Alignment. An int4 value must start on a multiple of 4 offset, so you
get three bytes of padding. If you put the int4, then the char it
should work better. Although whole rows have alignment requirements
too...

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Вложения

Re: Physical column size

От
Ragnar
Дата:
On fös, 2006-03-03 at 11:03 +0100, Paul Mackay wrote:
> Hi,
>
> I've created a table like this :
> CREATE TABLE tmp_A (
> c "char",
> i int4
> );
>
> And another one
> CREATE TABLE tmp_B (
> i int4,
> ii int4
> );
>
> I then inserted a bit more than 19 million rows in each table (exactly
> the same number of rows in each).
>
> The end result is that the physical size on disk used by table tmp_A
> is exactly the same as table tmp_B (as revealed by the
> pg_relation_size function) ! Given that a "char" field is supposed to
> be 1 byte in size and a int4 4 bytes, shouldn't the tmp_A use a
> smaller disk space ? Or is it that any value, whatever the type,
> requires at least 4 bytes to be stored ?

the int4 needs to be aligned at 4 bytes boundaries,
making wasted space after the char.

this would probably be the same size:

CREATE TABLE tmp_C (
 c "char",
 cc "char",
 i int4
);

and this would be smaller:

CREATE TABLE tmp_D (
 c "char",
 cc "char",
 ccc "char",
);

P.S.: I did not actually check to
see if the "char" type needs to be aligned,
by I assumed not.



Re: Physical column size

От
Tom Lane
Дата:
Peter Eisentraut <peter_e@gmx.net> writes:
> An int4 field is required to be aligned at a 4-byte boundary internally, so
> there are 3 bytes wasted between tmp_A.c and tmp_A.i.  If you switch the
> order of the fields you should see space savings.

Probably not, because the row-as-a-whole has alignment requirements too.
In this example you'll just move the pad bytes from one place to
another.

            regards, tom lane