Обсуждение: Schema design question

Поиск
Список
Период
Сортировка

Schema design question

От
Bill Moseley
Дата:
I originally was planning on asking about Postgresql inheritance, but
after reading the docs again I do not think that it applies to my
design.

So, now I'm asking a rather basic schema design question.

I have two related objects where one inherits column values from
another.  No, that's incorrect.  The "child" receives default values
from the "parent" when the child is created.

A more concrete example: tables that represent classes taught at a
school.  The parent object is a general course description, and the
child object is a specific instance of a course -- a "class" -- which
is a course taught at a given time and location.  A course can be
taught multiple times, obviously.

A course (and thus a class) can have multiple instructors -- a
many-to-many relationship.  So I have a link table for that.  A class
normally uses the course's default instructors, but may be different
for specific classes instance.

How would you layout the tables for somethings like this?


I can think (out loud) of three ways to set this up:

1) Separate tables for "course" and "class" and when a class is
created simply copy column data from the course to the class.

  Pro: Selects are simple

  Con: Column duplication in the two tables -- two tables look a lot alike
       Need to have duplicate link tables (one pointing to each table)


2) Create a third "common_values" table that both "course" and "class"
tables reference.  Then when creating a class from a course clone the
common values row to a new row that the class can reference.

  Pro: No duplication of columns in multiple tables.
       Only need one linking table for instructors (but still need to
       create new links when creating the new row)

  Con: Need to always do joins on selects (not really a problem)

3) Create a single table with a flag to indicate if the row is a
"course" or a "class".

  Pro: Simple selects and no column duplication between tables

  Con: Columns for a course might be ok as NULL, but would be required
       for a specific class.

Again, a "course" and "class" are very similar.  But, once a class is
created from a course it really is its own entity.  For example, if
the course description changes in the future I don't want it to change
on previous classes.  There also needs to be a link between the two.
For example, you might want to show a list of courses, and then see
what classes are scheduled for a given course, so a class should
reference its parent course.

Thanks very much,


--
Bill Moseley
moseley@hank.org


Re: Schema design question

От
Matt Miller
Дата:
On Wed, 2005-08-17 at 10:49 -0700, Bill Moseley wrote:
> The parent object is a general course description, and the
> child object is a specific instance of a course
> ...
> tables that represent classes taught at a
> school.  The parent object is a general course ... the
> child object is ... a "class" -- which
> is a course taught at a given time and location.  A course can be
> taught multiple times ... A course (and thus a class) can have
> multiple instructors
>
> How would you layout the tables for somethings like this?

create table course (id serial primary key,
                     description varchar);
create table teacher (id serial primary key,
                      name varchar);
create table course_teacher (course_id integer not null,
                             teacher_id integer not null);
alter table course_teacher add primary key (course_id, teacher_id);
alter table course_teacher add foreign key (course_id)
                               references course
                               deferrable initially deferred;
create index course_teacher_teacher_ix on course_teacher (teacher_id);
alter table course_teacher add foreign key (teacher_id)
                               references teacher
                               deferrable initially deferred;
create table class (id serial primary key,
                    course_id integer not null,
                    teacher_id integer not null,
                    starts_on date,
                    location varchar);
create index class_course_ix on class (course_id);
alter table class add foreign key (course_id)
                      references course
                      deferrable initially deferred;
create index class_teacher_ix on class (teacher_id);
alter table class add foreign key (teacher_id)
                      references teacher
                      deferrable initially deferred;

> A class
> normally uses the course's default instructors, but may be different
> for specific classes instance.

When a class is created the user first specifies course_id.  At that
point the app can look at course_teacher and offer the list of default
teachers.  In case a non-default teacher is desired the app also offers
a lookup into teacher to see all available teachers.  The teacher_id
column of class is thus populated.  Set the "start_on" date and the
"location," and you're done.

> I can think (out loud) of three ways to set this up:

I'm sure there are many ways to get there.  To me, the way I've
described is the most-direct way to represent the relationships you've
described.


Re: Schema design question

От
Sebastian Hennebrueder
Дата:
Bill Moseley schrieb:

>On Wed, Aug 17, 2005 at 10:05:39PM +0200, Sebastian Hennebrueder wrote:
>
>
>>> Con: Column duplication in the two tables -- two tables look a lot alike
>>>      Need to have duplicate link tables (one pointing to each table)
>>>
>>>
>>>
>>>
>>They are not duplicated. As you say later in your explanation the course
>>settings may change. So the data may be the same right at the beginning
>>but can differ by the time. => It is not the same data!!
>>
>>
>
>I meant that I would have two tables that look like they hold very
>similar data.  That's not really a problem -- just bugs me to see
>duplication.  Plus, it could introduce errors if I ever modified, say,
>a columns type in one table and not the matching column in the other
>table.
>
>Any opinions on which table layout you would use?
>
>Thanks,
>
>
>
It is still not the same data. When my name is Sebastian Hennebrueder
and your name is Sebastian Hennebrueder, than we are not the same person.
Class and Course is not the same, they only have accidentally the same
data right at the beginning.
And as they have a reference to each other you should put them in two
separate tables, so that you can create a foreign key relation to
enforce the relation.

I do not now a good online tutorial for database normalisation but just
try Google or ask here.

--
Best Regards / Viele Grüße

Sebastian Hennebrueder

----

http://www.laliluna.de

Tutorials for JSP, JavaServer Faces, Struts, Hibernate and EJB

Get support, education and consulting for these technologies - uncomplicated and cheap.


Re: Schema design question

От
Bill Moseley
Дата:
On Wed, Aug 17, 2005 at 07:41:20PM +0000, Matt Miller wrote:

Thanks for responding, Matt:


> create table course (id serial primary key,
>                      description varchar);
> create table teacher (id serial primary key,
>                       name varchar);
> create table course_teacher (course_id integer not null,
>                              teacher_id integer not null);
[...]
> create table class (id serial primary key,
>                     course_id integer not null,
>                     teacher_id integer not null,
>                     starts_on date,
>                     location varchar);

There may be more than one teacher in the class so instead I'd need
another "class_teacher" link table.

I guess what "bugged" me about this type of layout is that a course
and class share so many columns.  Duplication just looks wrong -- and
I worry about changing a column type on one table and forgetting to
change it on the other table.  Also have to remember to copy all
columns every time a specific class is created.

On the other hand, if I used a single table to represent both types of
entities, then selects are always going to have something like WHERE
type = 'course' added onto the WHERE.  That's extra processing for no
good reason.

> I'm sure there are many ways to get there.  To me, the way I've
> described is the most-direct way to represent the relationships you've
> described.

And thanks very much for you help.

--
Bill Moseley
moseley@hank.org


Re: Schema design question

От
Matt Miller
Дата:
On Wed, 2005-08-17 at 13:40 -0700, Bill Moseley wrote:
> a course
> and class share so many columns.  ...and
> I worry about changing a column type on one table and forgetting to
> change it on the other table.

Postgres types might help here.  You could probably create a type that
contains the common columns, and then embed that type where you need it.

Re: Schema design question

От
Matt Miller
Дата:
On Wed, 2005-08-17 at 13:40 -0700, Bill Moseley wrote:
> I guess what "bugged" me about this type of layout is that a course
> and class share so many columns.  Duplication just looks wrong -- and
> I worry about changing a column type on one table and forgetting to
> change it on the other table.

If your design leads you to many shared attributes between class and
course, then ask yourself if the class-specific versions must always be
equal to the course-defaulted versions.  If yes, then consider factoring
out the common columns into a third table and using foreign keys to
reference that table from the other two.  However, if the course merely
suggests defaults for the class, and if any of these defaults can be
overridden by any class, then I would definitely duplicate the columns
in class.  I would do this even if a large percentage of classes will
never override the defaults.

A similar situation arises if the child table is supposed to remember
what the values of the parent were at the time the child was inserted.
If the parent can change values over time but the child should be
unaffected then, as before, I would copy the info into each child
record.