Re: a newbie question on table design
От | Leif B. Kristensen |
---|---|
Тема | Re: a newbie question on table design |
Дата | |
Msg-id | 200802151543.57443.leif@solumslekt.org обсуждение исходный текст |
Ответ на | a newbie question on table design ("fdu.xiaojf@gmail.com" <fdu.xiaojf@gmail.com>) |
Ответы |
Re: a newbie question on table design
|
Список | pgsql-general |
On Friday 15. February 2008, fdu.xiaojf@gmail.com wrote: >Hi all, > >I have a large sdf file with many records of molecules and associated >data items and I want to save them in a PostgreSQL database. There are >about less than 40 data items for every molecule(the names of the data >items fore each molecule are the same, but values differ). The number >of molecules may exceed 20 million. > >Now I have come up with two ways to construct the table: > >1) a table with about 40 columns, every data item has its > corresponding column, and one molecule corresponds to one row in the > table. > >This is direct and simple. The drawbacks is if I want to add more data >types to the database, I have to modify the structure of the table. > >2) a table with just 3 columns: > > CREATE TABLE mytable( > id serial, > data_name text, > data_value text > ); > >Then a single molecule will corresonds to about 40 rows in the > database. This is a sound concept, but I'd rather store the data_name in a separate table with an integer key, and replace data_name in mytable with a data_name_fk REFERENCES data_names (data_name_id). That's just Occam's Razor applied to database design, aka first normal form. You'd probably store the name of the molecule in a third table. Then you have a model very similar to the classic 'book database' where a book can have multiple authors, and an author can have multiple books. There are examples for this design all over the place. >If I need to add more data types to the table, I just need to add new >rows with new "data_name" column values. The drawback of this table is >it has too many rows(40 times of the former one) and waste a lot > space. > >Which one is better, or there are some other smart ways ? > >I have another question. Since the data I need to save is huge, is it >appropriate that I save the data value in compressed format ? That sounds a lot like premature optimization. Postgres is actually quite good at compacting data natively. -- Leif Biberg Kristensen | Registered Linux User #338009 http://solumslekt.org/ | Cruising with Gentoo/KDE My Jazz Jukebox: http://www.last.fm/user/leifbk/
В списке pgsql-general по дате отправления: