Обсуждение: Row size overhead
Hi,
I have a table
CREATE TABLE "MsgCommon"
(
"UTC" timestamp without time zone NOT NULL,
"UID" bigint NOT NULL,
"DataSourceID" integer NOT NULL,
"DataSourceType" integer NOT NULL,
"BSCArchive" boolean NOT NULL,
"Oddities" integer NOT NULL,
"Encapsulated" boolean NOT NULL,
"Formatter" character(3),
"MMSI" integer
);
with 3358604 rows.
I’m confused with the table size which is 245 MB.
Simple calculations show that each row occupies 76 bytes approximately.
But anticipated row size would be 41 or near.
select
pg_column_size( '2001-01-01'::timestamp without time zone ) +
pg_column_size( 0::bigint ) +
pg_column_size( 0::integer ) +
pg_column_size( 0::integer ) +
pg_column_size( true ) +
pg_column_size( 0::integer ) +
pg_column_size( true ) +
pg_column_size( '0'::character(3) ) +
pg_column_size( 0::integer );
Does the presence of HeapTupleHeaderData, ‘null bitmap’, aligning to MAXALIGN distance and other additional per row/page fields explain this difference (as described in “53.3. Database Page Layout”)?
Is there a way to reduce these overheads?
It may seem strange to you but in MSSQLServer2005 the same table occupies 150 MB only (47 bytes per row).
Thanks in advance,
Zubkovsky Sergey
2008/3/19 Zubkovsky, Sergey <Sergey.Zubkovsky@transas.com>: > > Simple calculations show that each row occupies 76 bytes approximately. > > But anticipated row size would be 41 or near. > You haven't mentioned PG version. For 8.2 onwards, the tuple header is 23 bytes. Add another 4 bytes for one line pointer for each row. If you have null values, another 5 bytes for null bitmap and alignment. Plus add few bytes for page header and any unusable space in a page (because a row can not fit in the remaining space). Also ISTM that you might be loosing some space because of alignment in the tuple itself. Try moving booleans and char(3) at the end. There is not much you can do with other overheads. Thanks, Pavan -- Pavan Deolasee EnterpriseDB http://www.enterprisedb.com
Thanks for your reply. I had used PG 8.3.1 on 32-bit WinXP platform. "PostgreSQL 8.3.1, compiled by Visual C++ build 1400" But I suppose this fact doesn't change anything essentially. Thanks, Sergey Zubkovsky -----Original Message----- From: Pavan Deolasee [mailto:pavan.deolasee@gmail.com] Sent: Wednesday, March 19, 2008 8:23 PM To: Zubkovsky, Sergey Cc: pgsql-general@postgresql.org Subject: Re: [GENERAL] Row size overhead 2008/3/19 Zubkovsky, Sergey <Sergey.Zubkovsky@transas.com>: > > Simple calculations show that each row occupies 76 bytes approximately. > > But anticipated row size would be 41 or near. > You haven't mentioned PG version. For 8.2 onwards, the tuple header is 23 bytes. Add another 4 bytes for one line pointer for each row. If you have null values, another 5 bytes for null bitmap and alignment. Plus add few bytes for page header and any unusable space in a page (because a row can not fit in the remaining space). Also ISTM that you might be loosing some space because of alignment in the tuple itself. Try moving booleans and char(3) at the end. There is not much you can do with other overheads. Thanks, Pavan -- Pavan Deolasee EnterpriseDB http://www.enterprisedb.com
On Thursday 20 March 2008 7:24 am, Zubkovsky, Sergey wrote: > Thanks for your reply. > > I had used PG 8.3.1 on 32-bit WinXP platform. > "PostgreSQL 8.3.1, compiled by Visual C++ build 1400" > But I suppose this fact doesn't change anything essentially. > > Thanks, > Sergey Zubkovsky What you are probably seeing is row depth not row width. Postgres uses MVCC and so there can be multiple versions of a row in existence at one time. For a better explanation see: http://www.postgresql.org/docs/8.3/interactive/routine-vacuuming.html Try running Vacuum and/or Vacuum Full and see what happens to table size. > > > -----Original Message----- > From: Pavan Deolasee [mailto:pavan.deolasee@gmail.com] > Sent: Wednesday, March 19, 2008 8:23 PM > To: Zubkovsky, Sergey > Cc: pgsql-general@postgresql.org > Subject: Re: [GENERAL] Row size overhead > > 2008/3/19 Zubkovsky, Sergey <Sergey.Zubkovsky@transas.com>: > > Simple calculations show that each row occupies 76 bytes > > approximately. > > > But anticipated row size would be 41 or near. > > You haven't mentioned PG version. For 8.2 onwards, the tuple header is > 23 bytes. Add another 4 bytes for one line pointer for each row. If you > have > null values, another 5 bytes for null bitmap and alignment. Plus add few > bytes > for page header and any unusable space in a page (because a row can not > fit in the remaining space). > > Also ISTM that you might be loosing some space because of alignment > in the tuple itself. Try moving booleans and char(3) at the end. There > is not > much you can do with other overheads. > > > Thanks, > Pavan > > > > -- > Pavan Deolasee > EnterpriseDB http://www.enterprisedb.com -- Adrian Klaver aklaver@comcast.net
In my test all 3358604 rows were inserted by single 'COPY FROM file' command. So it's obviously that each row has only one version. > For 8.2 onwards, the tuple header is > 23 bytes. Add another 4 bytes for one line pointer for each row. If you > have > null values, another 5 bytes for null bitmap and alignment. Plus add few > bytes > for page header and any unusable space in a page (because a row can not > fit in the remaining space). > Simple calculations show that each row occupies 76 bytes approximately. > But anticipated row size would be 41 or near. Row overhead: 23 + 4 + 5 = 32 Total row size: 32 + 41 = 73 Unfortunately value '76' has an objective explanation. Thanks, Sergey Zubkovsky -----Original Message----- From: Adrian Klaver [mailto:aklaver@comcast.net] Sent: Thursday, March 20, 2008 5:44 PM To: pgsql-general@postgresql.org Cc: Zubkovsky, Sergey; Pavan Deolasee Subject: Re: [GENERAL] Row size overhead On Thursday 20 March 2008 7:24 am, Zubkovsky, Sergey wrote: > Thanks for your reply. > > I had used PG 8.3.1 on 32-bit WinXP platform. > "PostgreSQL 8.3.1, compiled by Visual C++ build 1400" > But I suppose this fact doesn't change anything essentially. > > Thanks, > Sergey Zubkovsky What you are probably seeing is row depth not row width. Postgres uses MVCC and so there can be multiple versions of a row in existence at one time. For a better explanation see: http://www.postgresql.org/docs/8.3/interactive/routine-vacuuming.html Try running Vacuum and/or Vacuum Full and see what happens to table size. > > > -----Original Message----- > From: Pavan Deolasee [mailto:pavan.deolasee@gmail.com] > Sent: Wednesday, March 19, 2008 8:23 PM > To: Zubkovsky, Sergey > Cc: pgsql-general@postgresql.org > Subject: Re: [GENERAL] Row size overhead > > 2008/3/19 Zubkovsky, Sergey <Sergey.Zubkovsky@transas.com>: > > Simple calculations show that each row occupies 76 bytes > > approximately. > > > But anticipated row size would be 41 or near. > > You haven't mentioned PG version. For 8.2 onwards, the tuple header is > 23 bytes. Add another 4 bytes for one line pointer for each row. If you > have > null values, another 5 bytes for null bitmap and alignment. Plus add few > bytes > for page header and any unusable space in a page (because a row can not > fit in the remaining space). > > Also ISTM that you might be loosing some space because of alignment > in the tuple itself. Try moving booleans and char(3) at the end. There > is not > much you can do with other overheads. > > > Thanks, > Pavan > > > > -- > Pavan Deolasee > EnterpriseDB http://www.enterprisedb.com -- Adrian Klaver aklaver@comcast.net