Обсуждение: Size comparison between a Composite type and an equivalent Text field

Поиск
Список
Период
Сортировка

Size comparison between a Composite type and an equivalent Text field

От
denis@edistar.com
Дата:
I need to store very large integers (more of 30 digits).

I found two solutions to this problem:
- using a text field
- splitting the integer into 2 parts and then storing them in a
composite type with 2 bigint fields

The definitive choice will depend on the disk space used by one solution
instead of the other.

I think the storage size of the text field will be the number of the
digits plus some extra data for the structure of the text type.
For the composite type the size will be of 2*8 bytes plus the extra data
for maintaning the structure of the type.

For example if i have an integer of 30 digits:

Text Field
30 bytes + sizeof(text data structure)

Composite type
8 bytes + sizeof(composite data structure)

What of the two data structures (text or composite) will use more disk
space?

Thank you,

--
Doct. Eng. Denis Gasparin
---------------------------
Edistar srl


Re: Size comparison between a Composite type and an

От
Douglas McNaught
Дата:
denis@edistar.com writes:

> I need to store very large integers (more of 30 digits).

Er,

What's wrong with the NUMERIC type?  That can go up to hundreds of
digits.

-Doug

Re: Size comparison between a Composite type and an

От
Denis Gasparin
Дата:
Hi Doug.

I considered also the numeric type. In that case if the number is of 32
digits the storage size is of 2*8 + 8 = 24 bytes.
If i store it using a composite data type of two bigints the size is 2*8
+ composite data structure overhead bytes.

If the composite data type has 4 bytes overhead, I save 4 bytes for each
number... that is important because I must store many many numbers.

Performance speaking, the numeric type can be indexed?
In the case of composite data types, I must create an operator class for
indexing the fields of that type...
What is the performance gap between indexed numeric and composite?

Thank you,
Denis


Douglas McNaught wrote:
> denis@edistar.com writes:
>
>
>> I need to store very large integers (more of 30 digits).
>>
>
> Er,
>
> What's wrong with the NUMERIC type?  That can go up to hundreds of
> digits.
>
> -Doug
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>        choose an index scan if your joining column's datatypes do not
>        match
>
>


Re: Size comparison between a Composite type and an

От
Douglas McNaught
Дата:
Denis Gasparin <denis@edistar.com> writes:

> If the composite data type has 4 bytes overhead, I save 4 bytes for
> each number... that is important because I must store many many
> numbers.

Yes, if size is a big issue you might be better off with a specialized
type.

-Doug

Re: Size comparison between a Composite type and an

От
Martijn van Oosterhout
Дата:
On Tue, Feb 28, 2006 at 09:51:54PM +0100, Denis Gasparin wrote:
> Hi Doug.
>
> I considered also the numeric type. In that case if the number is of 32
> digits the storage size is of 2*8 + 8 = 24 bytes.
> If i store it using a composite data type of two bigints the size is 2*8
> + composite data structure overhead bytes.
>
> If the composite data type has 4 bytes overhead, I save 4 bytes for each
> number... that is important because I must store many many numbers.

Well, that's an assumption. There was a time where composite types had
a 10+ byte overhead. I'm not sure what the current overhead is but I'm
fairly sure it's more than 4.

> Performance speaking, the numeric type can be indexed?

Ofcourse.

> In the case of composite data types, I must create an operator class for
> indexing the fields of that type...
> What is the performance gap between indexed numeric and composite?

I think numeric will win easily. a few other things:

1. Alignment will probably eat any small savings you make
2. Numeric will take less space for small numbers
3. You have to make functions to index these composite type. Unless you
write them directly in C, the numeric code will win just on the
overhead of whatever other language you use to do it.

The best performance might come from making your own int16 type but all
this seems like premature optimisation to me. How about you start with
numeric and optimise when you see there's an actual problem.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Вложения

Re: Size comparison between a Composite type and an

От
denis@edistar.com
Дата:
I made some tests with three different types:

numeric, text and a specialized type written in c.

The tests were made with 20 digit codes.

The specialized type was a struct defined as:

typdef struct mycode {
    char c1;
    char c2;
    int32 c3;
    int32 c4;
} mycode

The sizeof(mycode) returns 12 bytes so i assume this as storage size of
mycode type.
The mycode type was created with the following sql command:
CREATE TYPE mycode (
   internallength = 12,
   input = mycode_in,
   output = mycode_out,
   alignment = int
);


The text data type has a storage size of 20 bytes + 4 bytes overhead =
24 bytes.

The numeric data type has a storage size of 10 bytes + 8 bytes overhead
= 18 bytes.

I made three tables of one column using the three different data types
and checked the size in bytes of the three tables.

The results were not as expected.

I was expecting these theoretical results:
mycode: 1.000.000 of records =>  12.000.000 bytes
numeric: 1.000.000 of records => 18.000.000 bytes
text: 1.000.000 of records => 24.000.000 bytes

That is the final size of the table with the text data type to be the
double of mycode type.

The real results were:
mycode: 1.000.000 of records =>  65.159.168 bytes
numeric: 1.000.000 of records => 74.895.702 bytes
text: 1.000.000 of records => 77.340.672 bytes

The "text" table is only 16% larger than mycode one (I was expecting 100%!).

Any idea?

Thank you,
Denis

Douglas McNaught wrote:
> Denis Gasparin <denis@edistar.com> writes:
>
>
>> If the composite data type has 4 bytes overhead, I save 4 bytes for
>> each number... that is important because I must store many many
>> numbers.
>>
>
> Yes, if size is a big issue you might be better off with a specialized
> type.
>
> -Doug
>
>


Re: Size comparison between a Composite type and an

От
Tom Lane
Дата:
denis@edistar.com writes:
> I made three tables of one column using the three different data types
> and checked the size in bytes of the three tables.
> The results were not as expected.

You forgot to consider per-row overhead, including alignment padding.

            regards, tom lane

Re: Size comparison between a Composite type and an

От
Martijn van Oosterhout
Дата:
On Wed, Mar 01, 2006 at 05:24:03PM +0100, denis@edistar.com wrote:
> I made some tests with three different types:
>
> numeric, text and a specialized type written in c.
>
> The tests were made with 20 digit codes.

<snip>

> The results were not as expected.
>
> I was expecting these theoretical results:
> mycode: 1.000.000 of records =>  12.000.000 bytes
> numeric: 1.000.000 of records => 18.000.000 bytes
> text: 1.000.000 of records => 24.000.000 bytes
>
> That is the final size of the table with the text data type to be the
> double of mycode type.
>
> The real results were:
> mycode: 1.000.000 of records =>  65.159.168 bytes
> numeric: 1.000.000 of records => 74.895.702 bytes
> text: 1.000.000 of records => 77.340.672 bytes
>
> The "text" table is only 16% larger than mycode one (I was expecting 100%!).

You're missing the per-tuple overhead which is approximatly 28 bytes.
Once you take alignment into account, it's not surprising the results
are closer than you expected.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Вложения