Thread: What is the difference in storage between a blank string and null?

What is the difference in storage between a blank string and null?

From
"Chris Hoover"
Date:
I'm doing some testing on how to decrease our database size as I work on a partitioning scheme.

I have found that if I have the database store all empty strings as nulls, I get a significant savings over storing them as blank strings (i.e. '').  Below is an example of the savings I am seeing for the same table:

In my test case, storing empty strings gives me a table size of 20,635,648 bytes.
Storing empty strings as nulls gives me a table size of 5,742,592 bytes.

As you can see, storing empty strings as nulls saves me approximately 72% on this table.  So I want to understand what Postgres is doing differently with the nulls.  Would someone kindly enlighten me on this?
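
For anyone who wants to check the arithmetic, here is a minimal Python sketch using only the two sizes quoted above. The function names are made up for illustration, and the 8192-byte block size is an assumption (it is the PostgreSQL default, but the post does not say which block size this server uses).

```python
# Table sizes reported in the post, in bytes.
SIZE_EMPTY_STRINGS = 20_635_648
SIZE_NULLS = 5_742_592

# Assumption: the default PostgreSQL block size of 8192 bytes.
BLOCK_SIZE = 8192


def savings_percent(before, after):
    """Return the size reduction as a percentage of the original size."""
    return (before - after) / before * 100


def pages(size_bytes, block_size=BLOCK_SIZE):
    """Return how many whole blocks a relation of this size occupies."""
    return size_bytes // block_size


if __name__ == "__main__":
    print(f"savings: {savings_percent(SIZE_EMPTY_STRINGS, SIZE_NULLS):.1f}%")
    print(f"pages:   {pages(SIZE_EMPTY_STRINGS)} -> {pages(SIZE_NULLS)}")
```

Under that block-size assumption both sizes are whole multiples of 8192, so the table shrinks from 2519 pages to 701.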

(P.S. I am using a nullif(trim(column),'') in my partition and view rules to store the nulls, and coalesce(column,'') to give my application the data back without nulls.)
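
The round trip described in the P.S. can be sketched as follows. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in because it runs without a server, and SQLite happens to provide nullif, trim and coalesce with the same meaning for this case. The table name t, the column name note and the helper round_trip are hypothetical, and the partition and view rules themselves are not reproduced.

```python
import sqlite3


def round_trip(values):
    """Store values with nullif(trim(x), '') and read them back with
    coalesce(x, ''). Returns (stored, read_back) as two lists."""
    conn = sqlite3.connect(":memory:")  # SQLite stand-in, not PostgreSQL
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, note TEXT)")

    # On the way in: blank or whitespace-only strings become NULL.
    conn.executemany(
        "INSERT INTO t (note) VALUES (nullif(trim(?), ''))",
        [(v,) for v in values],
    )

    # What is actually stored (None means NULL).
    stored = [row[0] for row in conn.execute("SELECT note FROM t ORDER BY id")]

    # On the way out: NULL is turned back into an empty string.
    read_back = [
        row[0]
        for row in conn.execute("SELECT coalesce(note, '') FROM t ORDER BY id")
    ]
    conn.close()
    return stored, read_back


if __name__ == "__main__":
    stored, read_back = round_trip(["", "   ", " abc ", "x"])
    print(stored)
    print(read_back)
```

Note that trim() also strips the surrounding spaces from non-empty values, so ' abc ' comes back as 'abc', and a value that really was NULL comes back as '' as well.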

Thanks,

Chris

PG 8.1


Re: What is the difference in storage between a blank string and null?

From
Kenneth Marshall
Date:
On Fri, Apr 11, 2008 at 04:02:36PM -0400, Chris Hoover wrote:
> I'm doing some testing on how to decrease our database size as I work on a
> partitioning scheme.
>
> I have found that if I have the database store all empty strings as nulls, I
> get a significant savings over saving them as blank strings (i.e. '').
> Below is an example of savings I am seeing for the same table:
>
> In my test case, storing empty strings  give me a table size of 20,635,648
> Storing empty strings as nulls gives me a table size of: 5,742,592.
>
> As you can see, storing empty strings as nulls is saving me approximately
> 72% on this table.  So, I am wanting to understand what Postgres is doing
> differently with the nulls.  Would someone kindly enlighten me on this.
>
> (P.S. I am using a nullif(trim(column),'') in my partition and view rules to
> store the nulls, and coalesce(column,'') to give my application the data
> back without nulls.)
>
> Thanks,
>
> Chris
>
> PG 8.1
>

PostgreSQL stores NULLs differently. This accounts for your space
difference. If your application can work with NULLs instead of ''
(not the same thing), go for it.

Cheers,
Ken

Re: What is the difference in storage between a blank string and null?

From
Shane Ambler
Date:
Chris Hoover wrote:
> I'm doing some testing on how to decrease our database size as I work on a
> partitioning scheme.
>
> I have found that if I have the database store all empty strings as nulls, I
> get a significant savings over saving them as blank strings (i.e. '').
> Below is an example of savings I am seeing for the same table:
>
> In my test case, storing empty strings  give me a table size of 20,635,648
> Storing empty strings as nulls gives me a table size of: 5,742,592.
>
> As you can see, storing empty strings as nulls is saving me approximately
> 72% on this table.  So, I am wanting to understand what Postgres is doing
> differently with the nulls.  Would someone kindly enlighten me on this.
>
> (P.S. I am using a nullif(trim(column),'') in my partition and view rules to
> store the nulls, and coalesce(column,'') to give my application the data
> back without nulls.)
>
> Thanks,
>
> Chris
>
> PG 8.1
>

Without looking at the exact storage algorithms or being real picky
about exact specifics -

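NULL only needs one bit per column in the row's null bitmap to indicate
that there is no value stored. The bitmap is rounded up to whole bytes
and is only present when the row contains at least one NULL. It lives
in the row header, so it is most likely absorbed by row storage
overhead anyway.

An empty string still stores its length header (holding a length of 0).
On 8.1 that header is 4 bytes, plus any alignment padding; from 8.3 on,
short values can use a 1-byte header.

So one bit against 4 bytes on your version...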

How much you save really depends on how many real-world rows have
empty strings.
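
To put rough numbers on that, here is a back-of-the-envelope Python sketch. It is a simplified model, not the actual storage code: it ignores alignment padding and the fixed row header, it assumes the row held no NULLs before the change, and the function names are made up. The header size is a parameter; 4 bytes applies to releases before 8.3 (including 8.1), 1 byte to short values from 8.3 on.

```python
def null_bitmap_bytes(n_columns, row_has_null):
    """Bytes for the per-row null bitmap: one bit per column, rounded up
    to whole bytes, and present only if the row holds at least one NULL."""
    if not row_has_null:
        return 0
    return (n_columns + 7) // 8


def empty_string_bytes(n_empty, header_bytes):
    """Bytes spent on length headers for n_empty empty strings."""
    return n_empty * header_bytes


def bytes_saved_per_row(n_columns, n_empty, header_bytes):
    """Rough per-row saving from storing n_empty empty strings as NULL.
    Simplification: alignment padding is ignored and the row is assumed
    to have had no NULLs (so no bitmap) before the change."""
    if n_empty == 0:
        return 0
    return empty_string_bytes(n_empty, header_bytes) - null_bitmap_bytes(
        n_columns, True
    )


if __name__ == "__main__":
    # Hypothetical row: 20 columns, 15 of them empty strings.
    print(bytes_saved_per_row(20, 15, header_bytes=4))  # pre-8.3 header
    print(bytes_saved_per_row(20, 15, header_bytes=1))  # 8.3+ short header
```

For this hypothetical 20-column row the model gives 57 bytes saved per row with 4-byte headers and 12 bytes with 1-byte headers, so the more empty columns a typical row has, the closer you get to the kind of reduction reported above.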

Chapter 53 in the manual gives a brief overview of data storage but most
of the details will be found in the source.


--

Shane Ambler
pgSQL (at) Sheeky (dot) Biz

Get Sheeky @ http://Sheeky.Biz