Обсуждение: Database Best Practices

Поиск
Список
Период
Сортировка

Database Best Practices

От
ghoffman@ucsd.edu (Gary Hoffman)
Дата:
To all the database experts out there:

As I begin to design my PostGreSQL tables, I'm looking for some
over-the-shoulder advice on the order of "best practices" recommendations.
Has anyone gathered their experience together into such a book or website?

For example, I'd like to know, in designing a name-and-address table,
  - should a title field be provided? (Mr., Dr., etc.)
  - how about a suffix (Jr., III, etc.)
  - how many address lines should I have for a home address?
  - should all the fields be type 'text'? Should Zip be 'text' or 'int4'?
  - what schema handles Zip-plus-four best (including the blasted hyphen)?
  - what is the best database schema for a universal, international
address table that includes individuals and companies?

Now that I know _how_ to create tables and columns, I want to know
_how_best_ to do so.

Thanks for your tips.

Gary

**************************************************************************
* Gary B. Hoffman, Computing Services Manager  e-mail: ghoffman@ucsd.edu *
* Graduate School of International Relations and Pacific Studies (IR/PS) *
* University of California, San Diego (UCSD)       voice: (858) 534-1989 *
* 9500 Gilman Dr., La Jolla, CA 92093-0519 USA       fax: (858) 534-3939 *
**************************************************************************


Re: [GENERAL] Database Best Practices

От
"Oliver Elphick"
Дата:
Gary Hoffman wrote:
  >To all the database experts out there:
  >
  >As I begin to design my PostGreSQL tables, I'm looking for some
  >over-the-shoulder advice on the order of "best practices" recommendations.
  >Has anyone gathered their experience together into such a book or website?
  >
  >For example, I'd like to know, in designing a name-and-address table,
  >  - should a title field be provided? (Mr., Dr., etc.)
  >  - how about a suffix (Jr., III, etc.)
  >  - how many address lines should I have for a home address?
  >  - should all the fields be type 'text'? Should Zip be 'text' or 'int4'?
  >  - what schema handles Zip-plus-four best (including the blasted hyphen)?
  >  - what is the best database schema for a universal, international
  >address table that includes individuals and companies?

If you want a database that can take international addresses, forget about
the rigid scheme of US addresses; most countries are much more free-form.

British postcodes are not numeric (mine is "PO30 1XP"), so a zip field that
is typed int4 would not accommodate British codes at all.

A full postal address might look like this:

  Mr A.B. Cousins                      name
  Personnel Manager                    position
  Contorted Widgets Ltd                company
  Contortion House                     building name
  5 Park Road                          street address
  Acme Industrial Estate               block
  Little Wittering                     village
  Trumpington                          post town
  Hexhamshire                          county
  TR12 5SQ                             post code
  UK                                   country

That's extreme, but I've seen some nearly as bad.  The Post Office
don't need all of that, but it helps someone who is driving around and
trying to find the place.

Putting them together is more of a problem, since the order of fields
differs between countries.  I believe German addresses go from larger
area to smaller rather than smaller to larger, for example.


Here's a sample schema that differentiates between individuals and
organisations.  It regards addresses as separate from people, because a
person can have more than one address, and people can share the same
address.

======================== schema ===================================
-- refint functions (from postgresql-contrib package).  These are
-- needed until foreign keys are implemented in PostgreSQL
\i /usr/lib/postgresql/modules/refint.sql

-------------------------- country --------------------------------
create table country
(
        id              char(2)         primary key
                                        check (id ~ '[A-Z]{2}'),
        name            text            not null,
        region          text,
        telcode         text
)
;

-- foreign key triggers
create trigger country_fref
        before DELETE or UPDATE on country
        for each row execute procedure
        check_foreign_key(1, 'restrict', 'id',
                'address', 'country'
                );

-- country contains the full list of ISO country codes, and the region of
-- the world where those countries are to be found.
-- id      = ISO country code
-- name    = country name
-- region  = continent or area where the country is
-- telcode = telephone code for the country

-------------------------- address --------------------------------
create table address
(
        id              int             primary key,
        house           text,
        number          text,
        street          text,
        village         text,
        town            text            not null,
        county          text,
        postcode        char(9),
        country         char(2)         references country (id),
        phone1_country  text,
        phone1_area     text,
        phone1_number   text,
        phone1_ext      text,
        phone2_country  text,
        phone2_area     text,
        phone2_number   text,
        phone2_ext      text,
        fax_country     text,
        fax_area        text,
        fax_number      text,
        fax_ext         text,
        comment         text
)
;

-- Use SPI triggers until foreign keys are implemented
create trigger address_pkref
create trigger address_pkref
        before INSERT or UPDATE on address
        for each row execute procedure
        check_primary_key('country', 'country', 'id');

create trigger address_fref1
        before DELETE or UPDATE on address
        for each row execute procedure
        check_foreign_key(1, 'restrict', 'id',
                'person_address', 'address');

create trigger address_fref2
        before DELETE or UPDATE on address
        for each row execute procedure
        check_foreign_key(3, 'setnull', 'id',
                'individual', 'address',
                'organisation', 'address',
                'person', 'address'
                );

-- id       = unique identifier of this address
-- house    = house name
-- number   = house number
-- street   = street address (excluding house name or number)
-- village  = second part of street address, if any
-- town     = city or post town
-- county   = county (may sometimes be blank)
-- postcode = post or zip code
-- country  = country code
-- phone1_country   = primary telephone country code
-- phone1_area      = primary telephone area code
-- phone1_number    = primary telephone number
-- phone1_ext       = primary telephone extension
-- phone2_country   = secondary telephone country code
-- phone2_area      = secondary telephone area code
-- phone2_number    = secondary telephone number
-- phone2_ext       = secondary telephone extension
-- fax_country   = fax telephone country code
-- fax_area      = fax telephone area code
-- fax_number    = fax telephone number
-- fax_ext       = fax telephone extension
-- comment       = free text

-- A person (of any kind) can have one or more addresses.  All fields here
-- are associated with a particular physical location


-------------------------- person --------------------------------
create table person
(
        id              char(10)        primary key,
        name            text            not null,
        address         int             references address (id),
        salutation      text            default 'Dear Sir',
        envelope        text,
        email           text,
        www             text,
        comment         text
)
;

-- Use SPI triggers until foreign keys are implemented


create trigger person_pkref
        before INSERT or UPDATE on person
        for each row execute procedure
        check_primary_key('address', 'address', 'id');

create trigger person_fref_x
        before DELETE or UPDATE on person
        for each row execute procedure
        check_foreign_key(1, 'restrict', 'id',
                'person_address', 'person'
                );


-- Top-level class that describes persons (living or otherwise)
-- id         = identifier
-- address    = id of the primary address of this person
-- salutation = how to address this person in the start of a letter
--                  (e.g.: "Dear Fred")
-- envelope   = how to address this person on an envelope
--                  (e.g.: "Mr F.G. Bloggs")
-- email      = email address
-- www        = Web URL
-- comment       = free text

-- Dependencies: address


-------------------------- individual --------------------------------
create function name(bpchar, bpchar, bpchar) returns bpchar
        as '/usr1/proj/bray/sql/funcs.so' language 'c';

create table individual
(
        gender          char(1)         check (gender = 'M' or gender = 'F' or gender is null),
        born            date        check ((born >= '1 Jan 1880' and born <= 'today') or born is null),
        surname         text,
        forenames       text,
        title           text,
        old_surname     text,
        mobile          text,
        ni_no           text

        constraint is_named check (not (surname isnull and forenames isnull))
)
        inherits (person)
;

create trigger individual_pkref
        before INSERT or UPDATE on individual
        for each row execute procedure
        check_primary_key('address', 'address', 'id');

create trigger individual_fref_x
        before DELETE or UPDATE on individual
        for each row execute procedure
        check_foreign_key(1, 'restrict', 'id',
                'person_address', 'person',
                'organisation', 'contact'
                );

-- Table of living individuals (as opposed to firms or legal persons)
-- gender    = 'M' or 'F'
-- born      = date of birth
-- surname   = surname
-- forenames = forenames
-- title     = Mr, Mrs, etc
-- old_surname   = maiden name, etc.

-- Inheritance: person



-------------------------- organisation --------------------------------
create table organisation
(
        structure      char(1)         check (structure='L' or structure='C' or structure='U' or structure='O'),
    contact        char(10)        references individual (id),
    old_name       text
)
        inherits (person)
;


create trigger organisation_pkref
        before INSERT or UPDATE on organisation
        for each row execute procedure
        check_primary_key('address', 'address', 'id');

create trigger organisation_pkref_c
        before INSERT or UPDATE on organisation
        for each row execute procedure
        check_primary_key('contact', 'individual', 'id');

create trigger organisation_fref_x
        before DELETE or UPDATE on organisation
        for each row execute procedure
        check_foreign_key(1, 'restrict', 'id',
                'person_address', 'person'
                );



-- Defines persons that are not individuals
-- contact      = id of the primary person to be contacted when dealing
--                with this organisation
-- structure    = L(imited), C(orporation), U(nincorporated) or O(ther)
-- old_name     = previous organisation name (if any)

-- Inheritance: person
-- Dependencies: individual, address

---------------- person_address relation ----------------------
create table person_address
(
        person          char(10)        not null
                                         -- references person*,
                                         references person(id),
        address         int             not null
                                         references address(id),
        function        text            not null,

        primary key (person, address, function)
)
;

-- refint triggers
create trigger person_address_pkref_1
        before INSERT or UPDATE on person_address
        for each row execute procedure
        check_primary_key('person', 'person*', 'id');

create trigger person_address_pkref_2
        before INSERT or UPDATE on person_address
        for each row execute procedure
        check_primary_key('address', 'address', 'id');

-- Relation of people to addresses (many-to-many relationship)
-- person   =  id of a person
-- address  =  id of an address
-- function =  use of address (e.g. accounts, deliveries...)

======================== schema ===================================
--
      Vote against SPAM: http://www.politik-digital.de/spam/
                 ========================================
Oliver Elphick                                Oliver.Elphick@lfix.co.uk
Isle of Wight                              http://www.lfix.co.uk/oliver
               PGP key from public servers; key ID 32B8FAA1
                 ========================================
     "Behold, what manner of love the Father hath bestowed
      upon us, that we should be called the sons of God..."
                                I John 3:1



RE: [GENERAL] Database Best Practices

От
"Terence Chan"
Дата:
Yes, Oliver is right.

We don't even have city here in Hong Kong. You know Hong Kong is now a city
of China, but Hong Kong also has its own Country Code.

I really hate to fill in a web form which force me to enter n/a into the Zip
Code, City etc...

R.

Terence Chan
Cetec Ltd.
Hong Kong


-----Original Message-----
From:    owner-pgsql-general@postgreSQL.org
[mailto:owner-pgsql-general@postgreSQL.org] On Behalf Of Oliver Elphick
Sent:    Tuesday, July 20, 1999 2:27 PM
To:    Gary Hoffman
Cc:    pgsql-general@postgreSQL.org
Subject:    Re: [GENERAL] Database Best Practices

Gary Hoffman wrote:
  >To all the database experts out there:
  >
  >As I begin to design my PostGreSQL tables, I'm looking for some
  >over-the-shoulder advice on the order of "best practices"
recommendations.
  >Has anyone gathered their experience together into such a book or
website?
  >
  >For example, I'd like to know, in designing a name-and-address table,
  >  - should a title field be provided? (Mr., Dr., etc.)
  >  - how about a suffix (Jr., III, etc.)
  >  - how many address lines should I have for a home address?
  >  - should all the fields be type 'text'? Should Zip be 'text' or 'int4'?
  >  - what schema handles Zip-plus-four best (including the blasted
hyphen)?
  >  - what is the best database schema for a universal, international
  >address table that includes individuals and companies?

If you want a database that can take international addresses, forget about
the rigid scheme of US addresses; most countries are much more free-form.

British postcodes are not numeric (mine is "PO30 1XP"), so a zip field that
is typed int4 would not accommodate British codes at all.

A full postal address might look like this:

  Mr A.B. Cousins                      name
  Personnel Manager                    position
  Contorted Widgets Ltd                company
  Contortion House                     building name
  5 Park Road                          street address
  Acme Industrial Estate               block
  Little Wittering                     village
  Trumpington                          post town
  Hexhamshire                          county
  TR12 5SQ                             post code
  UK                                   country

That's extreme, but I've seen some nearly as bad.  The Post Office
don't need all of that, but it helps someone who is driving around and
trying to find the place.

Putting them together is more of a problem, since the order of fields
differs between countries.  I believe German addresses go from larger
area to smaller rather than smaller to larger, for example.


Here's a sample schema that differentiates between individuals and
organisations.  It regards addresses as separate from people, because a
person can have more than one address, and people can share the same
address.

======================== schema ===================================
-- refint functions (from postgresql-contrib package).  These are
-- needed until foreign keys are implemented in PostgreSQL
\i /usr/lib/postgresql/modules/refint.sql

-------------------------- country --------------------------------
create table country
(
        id              char(2)         primary key
                                        check (id ~ '[A-Z]{2}'),
        name            text            not null,
        region          text,
        telcode         text
)
;

-- foreign key triggers
create trigger country_fref
        before DELETE or UPDATE on country
        for each row execute procedure
        check_foreign_key(1, 'restrict', 'id',
                'address', 'country'
                );

-- country contains the full list of ISO country codes, and the region of
-- the world where those countries are to be found.
-- id      = ISO country code
-- name    = country name
-- region  = continent or area where the country is
-- telcode = telephone code for the country

-------------------------- address --------------------------------
create table address
(
        id              int             primary key,
        house           text,
        number          text,
        street          text,
        village         text,
        town            text            not null,
        county          text,
        postcode        char(9),
        country         char(2)         references country (id),
        phone1_country  text,
        phone1_area     text,
        phone1_number   text,
        phone1_ext      text,
        phone2_country  text,
        phone2_area     text,
        phone2_number   text,
        phone2_ext      text,
        fax_country     text,
        fax_area        text,
        fax_number      text,
        fax_ext         text,
        comment         text
)
;

-- Use SPI triggers until foreign keys are implemented
create trigger address_pkref
create trigger address_pkref
        before INSERT or UPDATE on address
        for each row execute procedure
        check_primary_key('country', 'country', 'id');

create trigger address_fref1
        before DELETE or UPDATE on address
        for each row execute procedure
        check_foreign_key(1, 'restrict', 'id',
                'person_address', 'address');

create trigger address_fref2
        before DELETE or UPDATE on address
        for each row execute procedure
        check_foreign_key(3, 'setnull', 'id',
                'individual', 'address',
                'organisation', 'address',
                'person', 'address'
                );

-- id       = unique identifier of this address
-- house    = house name
-- number   = house number
-- street   = street address (excluding house name or number)
-- village  = second part of street address, if any
-- town     = city or post town
-- county   = county (may sometimes be blank)
-- postcode = post or zip code
-- country  = country code
-- phone1_country   = primary telephone country code
-- phone1_area      = primary telephone area code
-- phone1_number    = primary telephone number
-- phone1_ext       = primary telephone extension
-- phone2_country   = secondary telephone country code
-- phone2_area      = secondary telephone area code
-- phone2_number    = secondary telephone number
-- phone2_ext       = secondary telephone extension
-- fax_country   = fax telephone country code
-- fax_area      = fax telephone area code
-- fax_number    = fax telephone number
-- fax_ext       = fax telephone extension
-- comment       = free text

-- A person (of any kind) can have one or more addresses.  All fields here
-- are associated with a particular physical location


-------------------------- person --------------------------------
create table person
(
        id              char(10)        primary key,
        name            text            not null,
        address         int             references address (id),
        salutation      text            default 'Dear Sir',
        envelope        text,
        email           text,
        www             text,
        comment         text
)
;

-- Use SPI triggers until foreign keys are implemented


create trigger person_pkref
        before INSERT or UPDATE on person
        for each row execute procedure
        check_primary_key('address', 'address', 'id');

create trigger person_fref_x
        before DELETE or UPDATE on person
        for each row execute procedure
        check_foreign_key(1, 'restrict', 'id',
                'person_address', 'person'
                );


-- Top-level class that describes persons (living or otherwise)
-- id         = identifier
-- address    = id of the primary address of this person
-- salutation = how to address this person in the start of a letter
--                  (e.g.: "Dear Fred")
-- envelope   = how to address this person on an envelope
--                  (e.g.: "Mr F.G. Bloggs")
-- email      = email address
-- www        = Web URL
-- comment       = free text

-- Dependencies: address


-------------------------- individual --------------------------------
create function name(bpchar, bpchar, bpchar) returns bpchar
        as '/usr1/proj/bray/sql/funcs.so' language 'c';

create table individual
(
        gender          char(1)         check (gender = 'M' or gender = 'F'
or gender is null),
        born            date        check ((born >= '1 Jan 1880' and born <=
'today') or born is null),
        surname         text,
        forenames       text,
        title           text,
        old_surname     text,
        mobile          text,
        ni_no           text

        constraint is_named check (not (surname isnull and forenames
isnull))
)
        inherits (person)
;

create trigger individual_pkref
        before INSERT or UPDATE on individual
        for each row execute procedure
        check_primary_key('address', 'address', 'id');

create trigger individual_fref_x
        before DELETE or UPDATE on individual
        for each row execute procedure
        check_foreign_key(1, 'restrict', 'id',
                'person_address', 'person',
                'organisation', 'contact'
                );

-- Table of living individuals (as opposed to firms or legal persons)
-- gender    = 'M' or 'F'
-- born      = date of birth
-- surname   = surname
-- forenames = forenames
-- title     = Mr, Mrs, etc
-- old_surname   = maiden name, etc.

-- Inheritance: person



-------------------------- organisation --------------------------------
create table organisation
(
        structure      char(1)         check (structure='L' or structure='C'
or structure='U' or structure='O'),
    contact        char(10)        references individual (id),
    old_name       text
)
        inherits (person)
;


create trigger organisation_pkref
        before INSERT or UPDATE on organisation
        for each row execute procedure
        check_primary_key('address', 'address', 'id');

create trigger organisation_pkref_c
        before INSERT or UPDATE on organisation
        for each row execute procedure
        check_primary_key('contact', 'individual', 'id');

create trigger organisation_fref_x
        before DELETE or UPDATE on organisation
        for each row execute procedure
        check_foreign_key(1, 'restrict', 'id',
                'person_address', 'person'
                );



-- Defines persons that are not individuals
-- contact      = id of the primary person to be contacted when dealing
--                with this organisation
-- structure    = L(imited), C(orporation), U(nincorporated) or O(ther)
-- old_name     = previous organisation name (if any)

-- Inheritance: person
-- Dependencies: individual, address

---------------- person_address relation ----------------------
create table person_address
(
        person          char(10)        not null
                                         -- references person*,
                                         references person(id),
        address         int             not null
                                         references address(id),
        function        text            not null,

        primary key (person, address, function)
)
;

-- refint triggers
create trigger person_address_pkref_1
        before INSERT or UPDATE on person_address
        for each row execute procedure
        check_primary_key('person', 'person*', 'id');

create trigger person_address_pkref_2
        before INSERT or UPDATE on person_address
        for each row execute procedure
        check_primary_key('address', 'address', 'id');

-- Relation of people to addresses (many-to-many relationship)
-- person   =  id of a person
-- address  =  id of an address
-- function =  use of address (e.g. accounts, deliveries...)

======================== schema ===================================
--
      Vote against SPAM: http://www.politik-digital.de/spam/
                 ========================================
Oliver Elphick                                Oliver.Elphick@lfix.co.uk
Isle of Wight                              http://www.lfix.co.uk/oliver
               PGP key from public servers; key ID 32B8FAA1
                 ========================================
     "Behold, what manner of love the Father hath bestowed
      upon us, that we should be called the sons of God..."
                                I John 3:1







Re: [GENERAL] Database Best Practices

От
David Warnock
Дата:
Gary Hoffman

> For example, I'd like to know, in designing a name-and-address table,
>   - should a title field be provided? (Mr., Dr., etc.)
>   - how about a suffix (Jr., III, etc.)
>   - how many address lines should I have for a home address?
>   - should all the fields be type 'text'? Should Zip be 'text' or 'int4'?
>   - what schema handles Zip-plus-four best (including the blasted hyphen)?
>   - what is the best database schema for a universal, international
> address table that includes individuals and companies?

If you are serious about name and address tables then you have chosen
one of the most complex things to model in a useable way.

We have developed a fundraising system for charities, the first
customers are Bible Societies around the world and the system replaces
one written by me nearly 10 years ago.

We use the following

Titles table. Has format strings to create Salutation prefix,
salutation, addressee from the title, any part of the name and also any
constants. Each person has their own salutation prefix, saluation and
addressee which are initially generated from the title but which can be
changed.  Some countries have titles that span more than 1 line (eg
Austria, Italy) and it is common to require a diferent name presentation
for the addressee (top of address) and salutation (top of letter). In
many languages the Salutation prefix (usually "Dear" in English is
diferent for men and women.  In some countries titles are almost never
used (eg Denmark) in others they are essential and there are 1000's of
variations (Austria).

Names. We have Lastname, firstname, initials and honours (eg BSc (hons)
or Jr). This is still not ideal for some cultures which dom not have
western naming conventions (eg India at least traditionally).

Addresses. There are lots of complications. Basically we have an address
format for each country and dynamically rearrange the address
presentation according to that (eg house number on right in most of
Europe on left in UK, postcode before city in most of europe and on a
new line after county in the UK).

We have 5 address lines plus postcode, state (from a lookup list for the
selected country), and country. We enter addresses backwards ie country,
state, postcode so that we can check correctly and adjust the formats as
we go (plus automatic address completion from the postcode). For example
UK does not have states but Spain and USA do.  The postcode format is
kept with the country. The address format has descriptions for each
address line which are displayed so that the same line is used for City
in all UK addresses. The address format also controls whether particular
address lines are "Not used", "Optional" or "Required".

NB House number is not a short or simple column. In many countries where
lots of people live in apartments it will combine the floor number and
apartment number.

Obviously you also need country specific formats for phone numbers (eg
Denmark does not have area codes).

When you support finding people you need to remember things like

- postcode finding is useless in much of the world eg Denmark has 1
postcode for a whole town. UK has 1 postcode for 17 houses on average.
- Surname finding is useless in many countries eg (Smith or Patel in UK,
Hansen or Jensen in Denmark


In terms of column types we use varchar with a unicode character set
otherwise you may not be able to have an address in Moscow in the same
dbms as one in Portugal, one in Finland, Latvia, Malta, Jordan etc

Also in countries like Norway you need a summer and winter address for
people as many move out of the city in the summer.

There is a book "Guide to worldwide Postal-Code & Address formats" from
Marian Nelson/Nelson Intersearch Company tel +1 (212) 580-4819 fax +1
(212) 362-9855 email MarNelson@aol.com

Regards

Dave

--
David Warnock
Sundayta Ltd