design of queries for sparse data

Поиск
Список
Период
Сортировка
От Andreas
Тема design of queries for sparse data
Дата
Msg-id 47374577.8020208@gmx.net
обсуждение исходный текст
Ответы Re: design of queries for sparse data  ("Sean Davis" <sdavis2@mail.nih.gov>)
Re: design of queries for sparse data  (Hans-Peter Oeri <hp@oeri.ch>)
Список pgsql-novice
Hi,

I need some help to improve my design skills.   :)

I lately read an article about table design, that teached one shouldn't
designe tables where it's clear that some columns aren't relevant for
every row. It didn't span into the dirty usage details beyond the table
design, though.

E.g. a really simple example like a school that stores pupils like this:
pupil (pupil_id, pupil_name, attends_english, attends_history,
attends_maths, attends_football, attends_swimming)

1)   Some pupils don't attend to football, swimming or both.
2)   Occasionally there will be new classes added and others get
dropped. Say in a year a column "attends_knitting" gets introduced. Now
all those 50,000 existing rows get a column where the person hadn't even
the occasion to apply.
If for some reason the knitting class gets discontinued every row in the
future will still get this column.

So it was better to create 3 normalized tables:
pupil  (pupil_id,  pupil_name,  start_date,  exit_date)
classes  (class_id,  class_name,  is_available,  output_order)
attends_to  (pupil_id,  class_id,  in_year)   as an n:m-relation

Fine. Now I got rid off those empty columns in the pupil table.


MY QUESTIONS:

1)   How would I SELECT a report that looks like the first version of
the pupil table out of the 3 table design?
There must be a nontrivial SELECT statement that combines all 3 tables.
E.g. I want the result:
pupil_id, pupil_name, attends_to_english, ....., attends_to_football,
attends_to_swimming, attends_to_knitting
(42, Frank Miller, yes, ...., no, yes, yes)
(43, Suzy Smith, yes, ..., yes, yes, no)
...

2)   Could I control the order in which those attends_to-columns appear
by a numerical field output_order?

3)   Could I restrict the classes list so that only those appear when
there are pupils actually attending them in a given time frame?
3) a)   Like "competitve knitting" was only available from 2000-2005.
Now I'd produce a list of 2007 so there shouldn't appear an empty
knitting-column.  -->  classes.is_availlable
3) b)   Or it is availlable but no one has chosen it in 2007. -->
attends_to.in_year


Regards
Andreas



В списке pgsql-novice по дате отправления:

Предыдущее
От: Brad Nicholson
Дата:
Сообщение: Re: Autovacuum - what does it actually do?
Следующее
От: "Sean Davis"
Дата:
Сообщение: Re: design of queries for sparse data