Обсуждение: varchar() change

Поиск
Список
Период
Сортировка

varchar() change

От
Bruce Momjian
Дата:
Let me go over the issues with the varchar() change.

char() will continue to store full bytes, while varchar() function like
text, but with a limited length.

Now, pg_attribute.attlen is access everywhere, trying to find out how
long the data field is.  With text, the length is -1, but with varchar
currently, it is the max length, and hence, it has to store all those
bytes.

Now, my idea is to add a new pg_attribute column called 'attmaxlen'
which will hold the maximum length of the field.  char() and varchar()
will use this field, and the code will have be changed.  Cases where
attlen is referenced to determine data size will continue to use -1, but
references to all functions that create a data entry will use the
attmaxlen.  I see 124 references to attlen in the code.  Not too bad.
Most are obvious.

We had some of this work in the past, fixing places where the size was
not properly passed into the table creation code, because varchar() and
char() do not have lengths defined in pg_type like everyone else, but it
is only in pg_attribute.

This is a related change to allow data reference and tuple max length
reference to be separate.  I can see other new types using this field
to.

Come to think of it, I wonder if I could have the disk copy of
pg_attribute use the pg_type length, and use the pg_attribute length
only when creating/updating entries?  I wonder if that is what it does
already.  Looks like that may be true.

Comments?

--
Bruce Momjian
maillist@candle.pha.pa.us

Re: [HACKERS] varchar() change

От
"Vadim B. Mikheev"
Дата:
Bruce Momjian wrote:
>
> Let me go over the issues with the varchar() change.
>
> char() will continue to store full bytes, while varchar() function like
> text, but with a limited length.
>
> Now, pg_attribute.attlen is access everywhere, trying to find out how
> long the data field is.  With text, the length is -1, but with varchar
> currently, it is the max length, and hence, it has to store all those
> bytes.
>
> Now, my idea is to add a new pg_attribute column called 'attmaxlen'
> which will hold the maximum length of the field.  char() and varchar()
> will use this field, and the code will have be changed.  Cases where
> attlen is referenced to determine data size will continue to use -1, but
> references to all functions that create a data entry will use the
> attmaxlen.  I see 124 references to attlen in the code.  Not too bad.
> Most are obvious.

Ok. I agreed that we have to add new column to pg_attribute, but I recommend

1. use some other name - not attmaxlen: this field could be used for
   NUMBER, etc and "maxlen" is not good name for storing precision, etc
   (atttspec ?)
2. use -2 for varchar: let's think about attlen -1 as about "un-limited"
   varlena, and about attlen -2 as about "limited" one, with maxlen
   specified in att???. I don't see problem with -2 - just new case of
   switch (attlen) - and this will allow leave text (-1) untouched
   (or you will have to store -1 in att??? for text to differentiate
    text from varchar)...
   Hmm, ... on the other hand, we could check atttype before switch(attlen)
   in heaptuple.c and other places - don't know what's better...

Vadim

Re: [HACKERS] varchar() change

От
"Thomas G. Lockhart"
Дата:
> Let me go over the issues with the varchar() change.
>
> char() will continue to store full bytes, while varchar() function like
> text, but with a limited length.
>
> Now, pg_attribute.attlen is access everywhere, trying to find out how
> long the data field is.  With text, the length is -1, but with varchar
> currently, it is the max length, and hence, it has to store all those
> bytes.
>
> Now, my idea is to add a new pg_attribute column called 'attmaxlen'
> which will hold the maximum length of the field.  char() and varchar()
> will use this field, and the code will have be changed.  Cases where
> attlen is referenced to determine data size will continue to use -1, but
> references to all functions that create a data entry will use the
> attmaxlen.  I see 124 references to attlen in the code.  Not too bad.
> Most are obvious.
>
> We had some of this work in the past, fixing places where the size was
> not properly passed into the table creation code, because varchar() and
> char() do not have lengths defined in pg_type like everyone else, but it
> is only in pg_attribute.
>
> This is a related change to allow data reference and tuple max length
> reference to be separate.  I can see other new types using this field
> to.
>
> Come to think of it, I wonder if I could have the disk copy of
> pg_attribute use the pg_type length, and use the pg_attribute length
> only when creating/updating entries?  I wonder if that is what it does
> already.  Looks like that may be true.
>
> Comments?

Is what you are trying to do related to what could be used to implement
other (SQL92) data types like numeric(precision,scale) where there are one
or two additional parameters which are assigned when a column/class/type is
defined and which must be available when working with column/class/type
instances? We probably don't want to do anything about the latter for v6.3
(spread pretty thin with the work we've already picked up) but I'd like to
do something for v6.4...

Oh, while I'm thinking about it, this kind of thing is probably also
necessary to get arrays working as expected (enforcing dimensions specified
in the declaration).

                                                               - Tom


Re: [HACKERS] varchar() change

От
Bruce Momjian
Дата:
> > Come to think of it, I wonder if I could have the disk copy of
> > pg_attribute use the pg_type length, and use the pg_attribute length
> > only when creating/updating entries?  I wonder if that is what it does
> > already.  Looks like that may be true.
> >
> > Comments?
>
> Is what you are trying to do related to what could be used to implement
> other (SQL92) data types like numeric(precision,scale) where there are one
> or two additional parameters which are assigned when a column/class/type is
> defined and which must be available when working with column/class/type
> instances? We probably don't want to do anything about the latter for v6.3
> (spread pretty thin with the work we've already picked up) but I'd like to
> do something for v6.4...
>
> Oh, while I'm thinking about it, this kind of thing is probably also
> necessary to get arrays working as expected (enforcing dimensions specified
> in the declaration).

Yes, I had the numeric in mind.

--
Bruce Momjian
maillist@candle.pha.pa.us