Обсуждение: Organization of tables

Поиск

Список

Период

Сортировка

Organization of tables

От

"Salman Tahir"

Дата:

14 июня 2007 г., 08:21:21

Hi,

I have a question regarding the organization of a table I want to
create in my database: I have the following setup:

Table Fragments (simplified example):
Primary key = {mass}

Approach (1)
fragment   |   mass (of fragment)   |   peptide
---------------+--------------------------------+--------------
A             |      x                          | Peptide1, Peptide2
Q             |      y                          | Peptide1
K             |      z                          | Peptide 2, Peptide3

The idea here is that a peptide can be composed of many fragments e.g.
Peptide 2 is made up of fragments A and K; Peptide1 is made up of A
and Q and so on.

My idea is to create an index on the mass column and be able to
retrieve all Peptides that contain a certain fragment mass e.g

SELECT peptide
FROM Fragments
WHERE mass = x;

Should give me: Peptide1, Peptide2

The alternative way I have thought of to organize this table is to
have something as follows:
Approach (2)
Primary Key = {fragment, mass, peptide}
fragment   |   mass (of fragment)   |   peptide
---------------+--------------------------------+--------------
A             |      x                          | Peptide1
A             |      x                          | Peptide2
Q             |      y                          | Peptide1
K             |      z                          | Peptide 2
K             |      z                          | Peptide 3

If I consider 2500 unique fragments then, using approach (1), table
Fragments will hold 2,500 tuples. If I consider the same number of
fragments then table Fragments using approach 2 holds 15,000 tuples.

I have considered using approach (1) whereby I would have less tuples
to search but if I wanted to access the peptides they belong to I
would retrieve the list of corresponding peptides e.g the string
"Peptide1, Peptide2" and process it in my program. However this seems
like a hack around the way a database table should be organised. The
problem increases further when I have to scale up and consider more
unique fragments (>2500).

Any help on how best to structure such data would be mostly appreciated.

- Salman Tahir

Re: Organization of tables

От

"Rodrigo De León"

Дата:

14 июня 2007 г., 12:22:16

On 6/14/07, Salman Tahir <salmantahir1@gmail.com> wrote:
> Any help on how best to structure such data would be mostly appreciated.

See: http://en.wikipedia.org/wiki/Database_normalization

*** Grossly oversimplified example follows ***

CREATE TABLE PEPTIDE( NAME TEXT PRIMARY KEY
);

CREATE TABLE FRAGMENT( NAME TEXT PRIMARY KEY , MASS TEXT
);

CREATE TABLE PEPTIDE_FRAGMENT( FRAGMENT TEXT NOT NULL REFERENCES FRAGMENT(NAME) , PEPTIDE TEXT NOT NULL REFERENCES
PEPTIDE(NAME)
);

INSERT INTO PEPTIDE
VALUES ('Peptide 1'),('Peptide 2'),('Peptide 3');

INSERT INTO FRAGMENT
VALUES ('A','x'),('Q','y'),('K','z');

INSERT INTO PEPTIDE_FRAGMENT
VALUES
('A','Peptide 1'),('A','Peptide 2'),('Q','Peptide 1')
,('K','Peptide 2'),('K','Peptide 3');

SELECT F.NAME AS FRAGMENT, F.MASS
, (SELECT ARRAY_TO_STRING(ARRAY(    SELECT PEPTIDE FROM PEPTIDE_FRAGMENT    WHERE FRAGMENT = F.NAME ORDER BY NAME  ),
','))AS PEPTIDE
 
FROM FRAGMENT F;
fragment | mass |       peptide
----------+------+---------------------A        | x    | Peptide 1,Peptide 2Q        | y    | Peptide 1K        | z
|Peptide 2,Peptide 3

Re: Organization of tables

От

Дата:

15 июня 2007 г., 13:43:03

Hi Salman,

will this achieve your needs?


peptide_table

primary_key

name (ie, Peptide1, 
Peptide2)

mass



fragment_table

primary_key

name (ie, A, Q, 
K)


link_table

primary_key

peptide_id

fragment_id

it gives you easy access to peptide mass and it allows you to link multiple fragments to individual peptides.

if the peptide mass ultimately comes form the fragments, you could enter the mass value there and then have the db do a
calculationto sum up the masses for a given peptide.
 

sorry if i've misunderstood you problem.

best of luck,

oe1


----- Original Message ----
From: Salman Tahir <salmantahir1@gmail.com>
To: pgsql-sql@postgresql.org
Sent: Thursday, June 14, 2007 4:21:06 AM
Subject: [SQL] Organization of tables

Hi,

I have a question regarding the organization of a table I want to
create in my database: I have the following setup:

Table Fragments (simplified example):
Primary key = {mass}

Approach (1)
fragment   |   mass (of fragment)   |   peptide
---------------+--------------------------------+--------------
A             |      x                          | Peptide1, Peptide2
Q             |      y                          | Peptide1
K             |      z                          | Peptide 2, Peptide3

The idea here is that a peptide can be composed of many fragments e.g.
Peptide 2 is made up of fragments A and K; Peptide1 is made up of A
and Q and so on.

My idea is to create an index on the mass column and be able to
retrieve all Peptides that contain a certain fragment mass e.g

SELECT peptide
FROM Fragments
WHERE mass = x;

Should give me: Peptide1, Peptide2

The alternative way I have thought of to organize this table is to
have something as follows:
Approach (2)
Primary Key = {fragment, mass, peptide}
fragment   |   mass (of fragment)   |   peptide
---------------+--------------------------------+--------------
A             |      x                          | Peptide1
A             |      x                          | Peptide2
Q             |      y                          | Peptide1
K             |      z                          | Peptide 2
K             |      z                          | Peptide 3

If I consider 2500 unique fragments then, using approach (1), table
Fragments will hold 2,500 tuples. If I consider the same number of
fragments then table Fragments using approach 2 holds 15,000 tuples.

I have considered using approach (1) whereby I would have less tuples
to search but if I wanted to access the peptides they belong to I
would retrieve the list of corresponding peptides e.g the string
"Peptide1, Peptide2" and process it in my program. However this seems
like a hack around the way a database table should be organised. The
problem increases further when I have to scale up and consider more
unique fragments (>2500).

Any help on how best to structure such data would be mostly appreciated.

- Salman Tahir

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?
              http://www.postgresql.org/docs/faq





____________________________________________________________________________________
TV dinner still cooling? 
Check out "Tonight's Picks" on Yahoo! TV.
http://tv.yahoo.com/

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: Organization of tables

Organization of tables

Re: Organization of tables

Re: Organization of tables