Обсуждение: Primary key efficiency

Поиск
Список
Период
Сортировка

Primary key efficiency

От
Russell Shaw
Дата:
Hi,

Is a long field ok for a primary key, or should i use
a unique integer?:

create table parts (
     manufacturers_code char(40) primary key,
     description char(40),
     man_id int references manufacturers (man_id)
);


Re: Primary key efficiency

От
"V i s h a l Kashyap @ [Sai Hertz And Control Systems]"
Дата:
Dear Russell Shaw ,

> Hi,

Hello

>
> Is a long field ok for a primary key, or should i use
> a unique integer?:

Unique Integer  is the most preferable thing as this retrive your data
at faster.
All said , if your select queries would use this primary key  for
identifying  tuples then better use integer


--
Regards,
Vishal Kashyap

~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*
I Know you believe my words so logon to Jabber.org
and add vishalkashyap@jabber.org to your roster.
~*~*~*~*~*~*~*~*
I am usually called as Vishal Kashyap
but my Girlfriend calls me as Vishal CASH UP.
This is because others identify me because of my generosity
but my Girlfriend identify me because of my CASH.
~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*


Re: Primary key efficiency

От
Tom Lane
Дата:
Russell Shaw <rjshaw@iprimus.com.au> writes:
> Is a long field ok for a primary key, or should i use
> a unique integer?:

> create table parts (
>      manufacturers_code char(40) primary key,
>      description char(40),
>      man_id int references manufacturers (man_id)
> );

There's nothing wrong with using a character field as primary key,
but I'd advise you to think twice about defining it as char(40).
Almost certainly you want varchar(40), so as not to be wasting huge
amounts of space on padding blanks.  For that matter, where did
the "40" come from in the first place?  Is there a really good
application-driven reason to limit the codes or descriptions to 40
characters?  If your answer reveals that the number was picked out
of the air, I'd suggest dropping the constraint entirely.  Text or
unconstrained varchar is a better choice, though it's not completely
SQL-standard.

One thing you should think carefully about before using externally
supplied data as a primary key is "what happens if the manufacturer's
code changes"?  You'll have to update not only this table, but all
references to it from other tables.  It's usually considered good
practice to choose primary keys that will *never* change, and that
essentially means that they can't have any externally-imposed meaning.
The popularity of auto-generated serial numbers as primary keys comes
from this consideration.

            regards, tom lane

Re: Primary key efficiency

От
Russell Shaw
Дата:
Tom Lane wrote:
> Russell Shaw <rjshaw@iprimus.com.au> writes:
>
>>Is a long field ok for a primary key, or should i use
>>a unique integer?:
>
>
>>create table parts (
>>     manufacturers_code char(40) primary key,
>>     description char(40),
>>     man_id int references manufacturers (man_id)
>>);
>
>
> There's nothing wrong with using a character field as primary key,
> but I'd advise you to think twice about defining it as char(40).
> Almost certainly you want varchar(40), so as not to be wasting huge
> amounts of space on padding blanks.

I'd assumed databases would save the number of blanks needed in any
position, and restore/add the blanks when the field was retrieved.

 > For that matter, where did
> the "40" come from in the first place?  Is there a really good
> application-driven reason to limit the codes or descriptions to 40
> characters?  If your answer reveals that the number was picked out
> of the air, I'd suggest dropping the constraint entirely.  Text or
> unconstrained varchar is a better choice, though it's not completely
> SQL-standard.

I assumed that for some reason it would be faster or more efficient
than something of unfixed length or else why does it exist? Maybe it
is only kept for compatability purposes?

> One thing you should think carefully about before using externally
> supplied data as a primary key is "what happens if the manufacturer's
> code changes"?  You'll have to update not only this table, but all
> references to it from other tables.  It's usually considered good
> practice to choose primary keys that will *never* change, and that
> essentially means that they can't have any externally-imposed meaning.
> The popularity of auto-generated serial numbers as primary keys comes
> from this consideration.

That is something hard to decide on. If a part becomes obsolete, i
thought maybe i could fill in an "obsolete" field so that any report
that is generated using it will alert me to change to a new part in
anything that uses it.

I'm using postgresql from php script embedded in html (apache on debian, etc).


Re: Primary key efficiency

От
Tom Lane
Дата:
Russell Shaw <rjshaw@iprimus.com.au> writes:
> Tom Lane wrote:
>> Almost certainly you want varchar(40), so as not to be wasting huge
>> amounts of space on padding blanks.

> I'd assumed databases would save the number of blanks needed in any
> position, and restore/add the blanks when the field was retrieved.

In Postgres, if the field is really wide (like kilobytes) then
compression will kick in and pretty much eliminate runs of spaces,
or runs of anything else for that matter.  But I doubt it would get
applied to 40-byte fields.

> I assumed that for some reason it would be faster or more efficient
> than something of unfixed length or else why does it exist? Maybe it
> is only kept for compatability purposes?

Didn't you notice the contradiction to your previous assumption?  Either
the field is fixed-width or it's not, you don't get to have it both
ways.  It is true that there are certain optimizations that can be
applied to fixed-width fields, but they are relatively minor in
Postgres.  (In databases that do overwrite-in-place, it can be a
significant win to ensure that *all* the fields of a record are fixed
width and so the total record size is fixed, but Postgres doesn't do
that anyway.)  In any case, CHAR(n) is never considered a fixed-width
type in Postgres, because N is measured in characters not bytes and so
the physical width is variable anyway, at least when using
variable-width character set encodings.

Bottom line is that there are no efficiency advantages to CHAR(n) in
Postgres, although there can be some in old-line databases.  You should
only use it if your data actually has a semantic constraint to a fixed
width --- postal codes are a common example of something that really is
appropriate for CHAR(n).

            regards, tom lane