Discussion: double precision[] storage space questions


double precision[] storage space questions

From:
greg
Date:
Hi all

I cannot find any documentation on the space taken by a double precision
array, and the few tests I ran surprised me.

Here are a few tries I did to understand it:

select pg_column_size(1.1::double precision)            returns 8    --- as expected
select pg_column_size('{}'::double precision[])         returns 16   --- OK, maybe an array header
select pg_column_size('{1.111}'::double precision[])    returns 32   --- I expected 16 + sizeof(double) = 24

select pg_column_size('{1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0}'::double precision[])
returns 104  --- I'm lost, because I expected 16 + 10*16 = 176. It is not
16 + 10*8 (96) either.

So what is happening behind the scenes? I could not find any documentation.
Since the queries are done in memory, I suppose no compression is going on.
Furthermore, select
pg_column_size('{1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0}'::double
precision[]) returns 104 as well, so I rule out compression.
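
For reference, the same check can be done on a stored value rather than an
in-memory one. A minimal sketch (the temp table name here is made up):

create temp table arr_test (a double precision[]);
insert into arr_test values ('{1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0}');
select pg_column_size(a) from arr_test;
-- roughly the same size (it can come back a few bytes smaller, because small
-- stored values may use a 1-byte instead of a 4-byte length header); either
-- way the value is far below the ~2 kB TOAST threshold, so compression is
-- never even attempted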

The whole point is that in the application I work on, we store double arrays
as bytea (using some serialization before storing the data).
I was very surprised to see that storing an array of doubles takes more
space using double precision[] than serializing it and storing it as a
bytea.

Thanks for any help and pointers to docs





Re: double precision[] storage space questions

From:
Tom Lane
Date:
greg <gregory.jevardat@unige.ch> writes:
> I cannot find any documentation on the space taken by a double precision
> array, and the few tests I ran surprised me.

> Here are a few tries I did to understand it:
> select pg_column_size(1.1::double precision)           returns 8    --- as expected
> select pg_column_size('{}'::double precision[])        returns 16   --- OK, maybe an array header
> select pg_column_size('{1.111}'::double precision[])   returns 32   --- I expected 16 + sizeof(double) = 24

'{}' is a zero-dimensional array so it doesn't have the same
dimensionality information that your third case does.  See
the comments at the head of
http://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/include/utils/array.h;hb=HEAD
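
In short, per those comments: every array value carries a fixed header
(length word, number of dimensions, data offset, element type OID = 16 bytes)
plus one dimension/lower-bound pair of ints per dimension, which is where the
extra bytes come from. A small illustration (a sketch, assuming that layout):

select pg_column_size('{}'::double precision[]);             -- 16: zero dimensions, header only
select pg_column_size('{1.111}'::double precision[]);        -- 32: 16 + (4+4) + 1*8
select pg_column_size('{{1.0},{2.0}}'::double precision[]);  -- 48: 16 + 2*(4+4) + 2*8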

> The whole point is that in the application I work on, we store double arrays
> as bytea (using some serialization before storing the data).

TBH, that seems like a pretty silly decision.  It guarantees that you
cannot do any useful manipulations of the array on the database side.
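
For instance (just a sketch of the kind of thing a native array column allows
directly in SQL, none of which works on an opaque bytea):

select ('{1.0,2.0,3.0}'::double precision[])[2];                            -- element access: 2
select avg(x) from unnest('{1.0,2.0,3.0}'::double precision[]) as t(x);     -- aggregation: 2
select '{1.0,2.0,3.0}'::double precision[] @> '{2.0}'::double precision[];  -- containment: true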

            regards, tom lane


Re: double precision[] storage space questions

From:
Francisco Olarte
Date:
Hi Greg:

On Fri, Jun 12, 2015 at 4:08 PM, greg <gregory.jevardat@unige.ch> wrote:
> I cannot find any documentation on the space taken by a double precision
> array, and the few tests I ran surprised me.
>
> Here are a few tries I did to understand it:
> select pg_column_size(1.1::double precision)           returns 8    --- as expected
> select pg_column_size('{}'::double precision[])        returns 16   --- OK, maybe an array header
> select pg_column_size('{1.111}'::double precision[])   returns 32   --- I expected 16 + sizeof(double) = 24
>
> select pg_column_size('{1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0}'::double precision[])
> returns 104  --- I'm lost, because I expected 16 + 10*16 = 176. It is not
> 16 + 10*8 (96) either.

That's not a very good set of tests; look at mine (slightly edited for size):

apc=# select pg_column_size('{}'::double precision[]);
             16
apc=# select pg_column_size('{1.0}'::double precision[]);
             32
apc=# select pg_column_size('{1.0,2.0}'::double precision[]);
             40
apc=# select pg_column_size('{1.0,2.0,3.0}'::double precision[]);
             48
apc=# select pg_column_size('{1.0,2.0,3.0,4.0}'::double precision[]);
             56

Here I already expect 8*n + 24 bytes, so the data point for 10 elements

apc=# select pg_column_size('{1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0}'::double
precision[]);
            104

does not surprise me: 8*10 + 24 = 104.
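
To see the whole pattern at once, something like this (a sketch) lines the
sizes up:

select n,
       pg_column_size(array(select i::double precision
                            from generate_series(1, n) as g(i))) as bytes
from generate_series(0, 10) as t(n);
-- 16 for n = 0 (a zero-dimensional array), then 8*n + 24 from n = 1 onwards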

> So what is happening behind the scenes? I could not find any documentation.
> Since the queries are done in memory, I suppose no compression is going on.
> Furthermore, select
> pg_column_size('{1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0}'::double
> precision[]) returns 104 as well, so I rule out compression.

You're hunting for exotic explanations. As said elsewhere, it's just headers
plus the special case for zero-dimensional arrays; that kind of per-value
overhead is a classic thing in databases.

> The whole point is that in the application I work on, we store double arrays
> as bytea (using some serialization before storing the data).
> I was very surprised to see that storing an array of doubles takes more
> space using double precision[] than serializing it and storing it as a
> bytea.

Not much: just 20 bytes more per stored array (the 24-byte array overhead
versus the 4-byte bytea header), unless you play compression tricks. Unless
you have lots of small values, I doubt serializing/deserializing is worth the
hassle. Postgres does not always use the most compact form for storage. In
fact, I would be greatly surprised if any database stored an array (which can
be multidimensional; I do not know whether other databases have
single-dimensional array types) in a more compact way than a specialized
serialization format for one-dimensional double arrays.
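
To put rough numbers on that, a sketch (an 80-byte bytea standing in for ten
serialized doubles):

select pg_column_size(decode(repeat('00', 80), 'hex'));              -- 84: 80 bytes + 4-byte header
select pg_column_size('{1,2,3,4,5,6,7,8,9,10}'::double precision[]); -- 104: 80 bytes + 24-byte header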

Francisco Olarte.


Re: double precision[] storage space questions

From:
greg
Date:
Francisco Olarte wrote:
> Hi Greg:
>
> On Fri, Jun 12, 2015 at 4:08 PM, greg <gregory.jevardat@unige.ch> wrote:
> [...]
> Here I already expect 8*n + 24 bytes, so the data point for 10 elements
> [...] does not surprise me: 8*10 + 24 = 104.
> [...]
> Not much: just 20 bytes more per stored array, unless you play compression
> tricks. Unless you have lots of small values, I doubt
> serializing/deserializing is worth the hassle.

Thanks a lot for this crystal clear explanation. Indeed my tests were bad; I
should have found it by myself... So there are 24 extra bytes of overhead per
array. Given the average length of my arrays, it is not clear that the bytea
storage is worth it.


Greg


