Обсуждение: Re: [HACKERS] varchar(), text,char() overhead

Поиск
Список
Период
Сортировка

Re: [HACKERS] varchar(), text,char() overhead

От
darrenk@insightdist.com (Darren King)
Дата:
> Do people want the overhead of char(), varchar(), and text to be reduced
> from 4-bytes to 2-bytes.  We store the length in this overhead, but
> since we have a size limit on tuple size, we can't have a field over 8k
> in size anyway.  Even if we up that to 32k for 6.3, we still only use 2
> bytes.
>
> I have added it to the TODO list.  Most of the code already supports it
> by using VARSIZE and VARDATA macros.  Once the structure size changes,
> the macros change too.  The only issue is places where they take the
> first four bytes of the variable-length type and cast it to an int32,
> which will not work in this case.  We have to change this so it uses the
> macros too.

Would be a nice space-saver if you have tables with many small text fields.

Dig out that old message of mine concerning block size and check out item #4.

Excerpted below if you've finally deleted it... :) :)

> Date: Wed, 29 Jan 1997 13:38:10 -0500
> From: aixssd!darrenk (Darren King)
> Subject: [HACKERS] Max size of data types and tuples.
> ...
> 4.  Since only 13 bits are needed for storing the size of these
> textual fields in a tuple, could PostgreSql use a 16-bit int to
> store it?  Currently, the size is padded to four bytes in the
> tuple and this eats space if you have many textual fields.
> Without further digging, I'm assuming that the size is double-word
> aligned so that the actual text starts on a double-word boundary.
> ...

darrenk

Re: [HACKERS] varchar(), text,char() overhead

От
Bruce Momjian
Дата:
> > macros too.
>
> Would be a nice space-saver if you have tables with many small text fields.
>
> Dig out that old message of mine concerning block size and check out item #4.
>
> Excerpted below if you've finally deleted it... :) :)
>
> > Date: Wed, 29 Jan 1997 13:38:10 -0500
> > From: aixssd!darrenk (Darren King)
> > Subject: [HACKERS] Max size of data types and tuples.
> > ...
> > 4.  Since only 13 bits are needed for storing the size of these
> > textual fields in a tuple, could PostgreSql use a 16-bit int to
> > store it?  Currently, the size is padded to four bytes in the
> > tuple and this eats space if you have many textual fields.
> > Without further digging, I'm assuming that the size is double-word
> > aligned so that the actual text starts on a double-word boundary.
> > ...

I had forgotten about your mention of this.  I am running some tests
now, and things look promising.  However, if we go to 64k or 128k
tuples, we would be in trouble.  (We can do 64k tuples by changing the
'special variable' length value from -1 to 0.


--
Bruce Momjian
maillist@candle.pha.pa.us

Re: [HACKERS] varchar(), text,char() overhead

От
Bruce Momjian
Дата:
>
> I had forgotten about your mention of this.  I am running some tests
> now, and things look promising.  However, if we go to 64k or 128k
> tuples, we would be in trouble.  (We can do 64k tuples by changing the
> 'special variable' length value from -1 to 0.
>

I am not going to make any changes to the variable length overhead for
char(), varchar(), and text at this time.  It is too close to beta.  I
will keep the item on the TODO list, and we can hash it out later.

--
Bruce Momjian
maillist@candle.pha.pa.us

Re: [HACKERS] varchar(), text,char() overhead

От
"Vadim B. Mikheev"
Дата:
Bruce Momjian wrote:
>
> > > macros too.
> >
> > Would be a nice space-saver if you have tables with many small text fields.
> >
> > Dig out that old message of mine concerning block size and check out item #4.
> >
> > Excerpted below if you've finally deleted it... :) :)
> >
> > > Date: Wed, 29 Jan 1997 13:38:10 -0500
> > > From: aixssd!darrenk (Darren King)
> > > Subject: [HACKERS] Max size of data types and tuples.
> > > ...
> > > 4.  Since only 13 bits are needed for storing the size of these
> > > textual fields in a tuple, could PostgreSql use a 16-bit int to
> > > store it?  Currently, the size is padded to four bytes in the
> > > tuple and this eats space if you have many textual fields.
> > > Without further digging, I'm assuming that the size is double-word
> > > aligned so that the actual text starts on a double-word boundary.
> > > ...
>
> I had forgotten about your mention of this.  I am running some tests
> now, and things look promising.  However, if we go to 64k or 128k
> tuples, we would be in trouble.  (We can do 64k tuples by changing the
          ^^^^^^^^^^^^^^^^^^^^^^
Also, multi-representation feature allows to have 2Gb in varlena fields.

> 'special variable' length value from -1 to 0.

Yes, it's way.

Vadim

Re: [HACKERS] varchar(), text,char() overhead

От
Bruce Momjian
Дата:
> > I had forgotten about your mention of this.  I am running some tests
> > now, and things look promising.  However, if we go to 64k or 128k
> > tuples, we would be in trouble.  (We can do 64k tuples by changing the
>           ^^^^^^^^^^^^^^^^^^^^^^
> Also, multi-representation feature allows to have 2Gb in varlena fields.

What is multi-representation feature?  Large objects?



--
Bruce Momjian
maillist@candle.pha.pa.us

Re: [HACKERS] varchar(), text,char() overhead

От
"Vadim B. Mikheev"
Дата:
Bruce Momjian wrote:
>
> > > I had forgotten about your mention of this.  I am running some tests
> > > now, and things look promising.  However, if we go to 64k or 128k
> > > tuples, we would be in trouble.  (We can do 64k tuples by changing the
> >           ^^^^^^^^^^^^^^^^^^^^^^
> > Also, multi-representation feature allows to have 2Gb in varlena fields.
>
> What is multi-representation feature?  Large objects?

Yes. Server could store varlena fields in LO when size of field or
tuple at whole is too big to be stored in relation blocks.
This allows to have tuples much longer than data blocks.
This is also Ok for performance sometime (if big varlenas are not used
in WHERE they could be not read from disk for each tuple; if UPDATE don't
change out-stored varlenas they could be not stored twice).

We could use vl_len < 0 for out-stored varlenas: vl_len = -1000
could mean that size of data is 1000 bytes, data stored in LO and
LO' id (oid?) is in vl_dat. It seems easy to implement (without
optimization of access to data).

Vadim