Обсуждение: Surrogate VS natural keys

Поиск
Список
Период
Сортировка

Surrogate VS natural keys

От
Naz Gassiep
Дата:
OK so which is the "correct" way to do it?

E.g., Say I have a table with users, and a table with clubs, and a table
that links them. Each user can be in more than one club and each club
has more than one member. Standard M:M relationship. Which link table is
the "right" way to do it?

This:

CREATE TABLE (
    userid INTEGER NOT NULL REFERENCES users,
    clubid INTEGER NOT NULL REFERENCES clubs,
    PRIMARY KEY (userid, clubid)
);

Or this:

CREATE TABLE (
    id SERIAL PRIMARY KEY,
    userid INTEGER NOT NULL REFERENCES users,
    clubid INTEGER NOT NULL REFERENCES clubs
);

I've always favored natural keys (the first option) as it just seems to
me a more natural and semantic representation of the data, however I
often get app designers complaining about it being more complex or
something.

Comments?

Re: Surrogate VS natural keys

От
brian
Дата:
Naz Gassiep wrote:
> OK so which is the "correct" way to do it?
>
> E.g., Say I have a table with users, and a table with clubs, and a table
> that links them. Each user can be in more than one club and each club
> has more than one member. Standard M:M relationship. Which link table is
> the "right" way to do it?
>
> This:
>
> CREATE TABLE (
>     userid INTEGER NOT NULL REFERENCES users,
>     clubid INTEGER NOT NULL REFERENCES clubs,
>     PRIMARY KEY (userid, clubid)
> );
>
> Or this:
>
> CREATE TABLE (
>     id SERIAL PRIMARY KEY,
>     userid INTEGER NOT NULL REFERENCES users,
>     clubid INTEGER NOT NULL REFERENCES clubs
> );
>

The former uses a primary key across both columns to enforce a unique
constraint. In the latter, you have a seperate ID column, which does not
enforce that constraint. And you have to ask yourself if you'll ever be
referencing that ID column for anything at all. I doubt i ever would.
Generally, you'd be using this to relate rows from a more generalised
table using either the club ID or the user ID. I can't see how having a
seperate serial ID column would be useful for any kind of select.

brian

Re: Surrogate VS natural keys

От
Rich Shepard
Дата:
On Wed, 20 Jun 2007, brian wrote:

> The former uses a primary key across both columns to enforce a unique
> constraint. In the latter, you have a seperate ID column, which does not
> enforce that constraint. And you have to ask yourself if you'll ever be
> referencing that ID column for anything at all. I doubt i ever would.
> Generally, you'd be using this to relate rows from a more generalised
> table using either the club ID or the user ID. I can't see how having a
> seperate serial ID column would be useful for any kind of select.

   Also, the reason for a third, M-M, table is to relate multiple players and
multiple clubs. If you think of the logic involved, your third table has
only one row for each player-club combination. Therefore, each row is unique
by definition and a surrogate key adds no value.

Rich

--
Richard B. Shepard, Ph.D.               |    The Environmental Permitting
Applied Ecosystem Services, Inc.        |          Accelerator(TM)
<http://www.appl-ecosys.com>     Voice: 503-667-4517      Fax: 503-667-8863

Re: Surrogate VS natural keys

От
Martijn van Oosterhout
Дата:
On Wed, Jun 20, 2007 at 08:39:23AM -0700, Rich Shepard wrote:
>   Also, the reason for a third, M-M, table is to relate multiple players and
> multiple clubs. If you think of the logic involved, your third table has
> only one row for each player-club combination. Therefore, each row is unique
> by definition and a surrogate key adds no value.

While true in this simple case, it can quickly become more complicated
if your relationship starts gaining attributes. For example, if you add
start and stop dates, so the (player,club) combination is not unique
anymore. If you track invoices, games or scores it may be easier to
reference the relatioship via a surrogate key rather than copying the
other IDs around everywhere.

For simple tables like this I generally don't bother, but sometimes I
find myself adding a surrogate key later.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Вложения

Re: Surrogate VS natural keys

От
"Joshua D. Drake"
Дата:
Martijn van Oosterhout wrote:
> On Wed, Jun 20, 2007 at 08:39:23AM -0700, Rich Shepard wrote:
>>   Also, the reason for a third, M-M, table is to relate multiple players and
>> multiple clubs. If you think of the logic involved, your third table has
>> only one row for each player-club combination. Therefore, each row is unique
>> by definition and a surrogate key adds no value.
>
> While true in this simple case, it can quickly become more complicated
> if your relationship starts gaining attributes. For example, if you add
> start and stop dates, so the (player,club) combination is not unique
> anymore. If you track invoices, games or scores it may be easier to
> reference the relatioship via a surrogate key rather than copying the
> other IDs around everywhere.
>
> For simple tables like this I generally don't bother, but sometimes I
> find myself adding a surrogate key later.

The value of a surrogate key is easy retrieval and really has nothing to
do with normalization or proper modeling.

I often add a surrogate key, even when one is not required just so I
don't have to worry about have a 4 element where clause.

Joshua D. Drake


>
> Have a nice day,


--

       === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
              http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


Re: Surrogate VS natural keys

От
Rich Shepard
Дата:
On Wed, 20 Jun 2007, Martijn van Oosterhout wrote:

> While true in this simple case, it can quickly become more complicated if
> your relationship starts gaining attributes. For example, if you add start
> and stop dates, so the (player,club) combination is not unique anymore. If
> you track invoices, games or scores it may be easier to reference the
> relatioship via a surrogate key rather than copying the other IDs around
> everywhere.

   That's very true, Martijn. It did not seem to be the case in the original
post. When you add a time history and need to identify a player's team at a
specified point in time, it gets much more complicated. That's when reading
Rick Snodgrass' book helps a lot.

Rich

--
Richard B. Shepard, Ph.D.               |    The Environmental Permitting
Applied Ecosystem Services, Inc.        |          Accelerator(TM)
<http://www.appl-ecosys.com>     Voice: 503-667-4517      Fax: 503-667-8863

Re: Surrogate VS natural keys

От
Richard Broersma Jr
Дата:
--- "Joshua D. Drake" <jd@commandprompt.com> wrote:
> The value of a surrogate key is easy retrieval and really has nothing to
> do with normalization or proper modeling.
>
> I often add a surrogate key, even when one is not required just so I
> don't have to worry about have a 4 element where clause.


I've often wondered about this.  Since PostgreSQL allows FOREIGN KEYS to be referenced from UNIQUE
(non-primary) natural keys, couldn't the schema be designed so that every table has a surrogate
PRIMARY KEY and yet still maintain the relationships using UNIQUE natural keys.

Would a design like this be practical?

Regards,
Richard Broersma Jr.

Re: Surrogate VS natural keys

От
"Joshua D. Drake"
Дата:
Richard Broersma Jr wrote:
> --- "Joshua D. Drake" <jd@commandprompt.com> wrote:
>> The value of a surrogate key is easy retrieval and really has nothing to
>> do with normalization or proper modeling.
>>
>> I often add a surrogate key, even when one is not required just so I
>> don't have to worry about have a 4 element where clause.
>
>
> I've often wondered about this.  Since PostgreSQL allows FOREIGN KEYS to be referenced from UNIQUE
> (non-primary) natural keys, couldn't the schema be designed so that every table has a surrogate
> PRIMARY KEY and yet still maintain the relationships using UNIQUE natural keys.
>
> Would a design like this be practical?

I would do it the other way. Have your primary keys be natural.

Joshua D. Drake


>
> Regards,
> Richard Broersma Jr.
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend
>


--

       === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
              http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


Re: Surrogate VS natural keys

От
Michael Glaesemann
Дата:
On Jun 20, 2007, at 11:28 , Joshua D. Drake wrote:

> Richard Broersma Jr wrote:
>
>> I've often wondered about this.  Since PostgreSQL allows FOREIGN
>> KEYS to be referenced from UNIQUE
>> (non-primary) natural keys, couldn't the schema be designed so
>> that every table has a surrogate
>> PRIMARY KEY and yet still maintain the relationships using UNIQUE
>> natural keys.
>> Would a design like this be practical?
>
> I would do it the other way. Have your primary keys be natural.

The albeit small advantage of using PRIMARY KEY on your surrogate if
you're using the surrogate for foreign key constraints is that you
can leave off the column name when using REFERENCES: it'll default to
the PRIMARY KEY column(s). For example:

CREATE TABLE foos
(
     foo_id SERIAL PRIMARY KEY
     , foo TEXT NOT NULL
     , bal TEXT NOT NULL, UNIQUE (foo, bal)
);

CREATE TABLE bars
(
     bar_id SERIAL PRIMARY KEY
     , bar TEXT NOT NULL
     , foo_id INTEGER NOT NULL
         REFERENCES foos
);

\d bars
                            Table "public.bars"
Column |  Type   |                       Modifiers
--------+---------
+-------------------------------------------------------
bar_id | integer | not null default nextval('bars_bar_id_seq'::regclass)
bar    | text    | not null
foo_id | integer | not null
Indexes:
     "bars_pkey" PRIMARY KEY, btree (bar_id)
Foreign-key constraints:
     "bars_foo_id_fkey" FOREIGN KEY (foo_id) REFERENCES foos(foo_id)

It does work for multi-column primary keys as well:

CREATE TABLE baz_quuxen
(
     baz TEXT NOT NULL
     , quux TEXT NOT NULL
     , PRIMARY KEY (baz, quux)
);

CREATE TABLE blurfls
(
     blurfl TEXT PRIMARY KEY
     , baz TEXT NOT NULL
     , quux TEXT NOT NULL
     , FOREIGN KEY (baz, quux) REFERENCES baz_quuxen
);

\d blurfls
   Table "public.blurfls"
Column | Type | Modifiers
--------+------+-----------
blurfl | text | not null
baz    | text | not null
quux   | text | not null
Indexes:
     "blurfls_pkey" PRIMARY KEY, btree (blurfl)
Foreign-key constraints:
     "blurfls_baz_fkey" FOREIGN KEY (baz, quux) REFERENCES baz_quuxen
(baz, quux)

Having PRIMARY KEY on your natural key does provide some additional
documentation.

Michael Glaesemann
grzm seespotcode net



Re: Surrogate VS natural keys

От
"Joshua D. Drake"
Дата:
Michael Glaesemann wrote:
>
> On Jun 20, 2007, at 11:28 , Joshua D. Drake wrote:
>
>> Richard Broersma Jr wrote:
>>
>>> I've often wondered about this.  Since PostgreSQL allows FOREIGN KEYS
>>> to be referenced from UNIQUE
>>> (non-primary) natural keys, couldn't the schema be designed so that
>>> every table has a surrogate
>>> PRIMARY KEY and yet still maintain the relationships using UNIQUE
>>> natural keys.
>>> Would a design like this be practical?
>>
>> I would do it the other way. Have your primary keys be natural.
>
> The albeit small advantage of using PRIMARY KEY on your surrogate if
> you're using the surrogate for foreign key constraints is that you can
> leave off the column name when using REFERENCES: it'll default to the
> PRIMARY KEY column(s). For example:

Sure but for the sake of doing normalization correctly ;) a primary key
should be natural.

Joshua D. Drake


>
> CREATE TABLE foos
> (
>     foo_id SERIAL PRIMARY KEY
>     , foo TEXT NOT NULL
>     , bal TEXT NOT NULL, UNIQUE (foo, bal)
> );
>
> CREATE TABLE bars
> (
>     bar_id SERIAL PRIMARY KEY
>     , bar TEXT NOT NULL
>     , foo_id INTEGER NOT NULL
>         REFERENCES foos
> );
>
> \d bars
>                            Table "public.bars"
> Column |  Type   |                       Modifiers
> --------+---------+-------------------------------------------------------
> bar_id | integer | not null default nextval('bars_bar_id_seq'::regclass)
> bar    | text    | not null
> foo_id | integer | not null
> Indexes:
>     "bars_pkey" PRIMARY KEY, btree (bar_id)
> Foreign-key constraints:
>     "bars_foo_id_fkey" FOREIGN KEY (foo_id) REFERENCES foos(foo_id)
>
> It does work for multi-column primary keys as well:
>
> CREATE TABLE baz_quuxen
> (
>     baz TEXT NOT NULL
>     , quux TEXT NOT NULL
>     , PRIMARY KEY (baz, quux)
> );
>
> CREATE TABLE blurfls
> (
>     blurfl TEXT PRIMARY KEY
>     , baz TEXT NOT NULL
>     , quux TEXT NOT NULL
>     , FOREIGN KEY (baz, quux) REFERENCES baz_quuxen
> );
>
> \d blurfls
>   Table "public.blurfls"
> Column | Type | Modifiers
> --------+------+-----------
> blurfl | text | not null
> baz    | text | not null
> quux   | text | not null
> Indexes:
>     "blurfls_pkey" PRIMARY KEY, btree (blurfl)
> Foreign-key constraints:
>     "blurfls_baz_fkey" FOREIGN KEY (baz, quux) REFERENCES
> baz_quuxen(baz, quux)
>
> Having PRIMARY KEY on your natural key does provide some additional
> documentation.
>
> Michael Glaesemann
> grzm seespotcode net
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: Don't 'kill -9' the postmaster
>


--

       === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
              http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


Re: Surrogate VS natural keys

От
"Merlin Moncure"
Дата:
On 6/20/07, Richard Broersma Jr <rabroersma@yahoo.com> wrote:
>
> --- "Joshua D. Drake" <jd@commandprompt.com> wrote:
> > The value of a surrogate key is easy retrieval and really has nothing to
> > do with normalization or proper modeling.
> >
> > I often add a surrogate key, even when one is not required just so I
> > don't have to worry about have a 4 element where clause.
>
>
> I've often wondered about this.  Since PostgreSQL allows FOREIGN KEYS to be referenced from UNIQUE
> (non-primary) natural keys, couldn't the schema be designed so that every table has a surrogate
> PRIMARY KEY and yet still maintain the relationships using UNIQUE natural keys.
>
> Would a design like this be practical?

yeah, although I prefer to throw the primary key on the natural.
Either way, the natural key is identified...my major issue with the
surrogate design style is that the natural key is often not identified
which inevitably leads to a mess.

I also find databases with natural keys to be much easier to follow
and feel much 'cleaner' to me.  People who've never seen a large
database without surrogates will be amazed at how much more expressive
the tables are.  Surrogates have certain advantages but I classify
them as an optimization, meaning they should be introduced at the last
possible moment in the design.

merlin