data modeling genes and alleles... help!

From: Modulok
Subject: data modeling genes and alleles... help!
Msg-id: CAN2+EpaJjahG4QEmRyXohBUCQs6hdb85=8Y_AKRpRQ6mA3svGw@mail.gmail.com
List: pgsql-general

List,

I have a data modeling problem. That much, I know. The question is how do I
model this? (Below.)

I'm making a database which will store pseudo-genetic data. It's a basic
many-to-many setup::

    create table person(
        name varchar(32) primary key
    );
    create table gene(
        name varchar(32) primary key
    );
    create table person_gene(
        person varchar(32) references person(name),
        gene varchar(32) references gene(name)
    );

And I have data like::

    insert into person(name)
    values
        ('foo')
    ;
    insert into gene(name)
    values
        ('hair'),
        ('eye')
    ;
    insert into person_gene(person, gene)
    values
        ('foo', 'hair'),
        ('foo', 'eye')
    ;

Great. This is important as I need to be able to ask questions like "who
carries gene 'x'?" as well as "what genes does person 'y' carry?" But then
things get thorny...
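
For concreteness, here is a minimal sketch of those two lookups against the tables above. It uses Python's sqlite3 module with an in-memory database purely as a stand-in for PostgreSQL (an assumption, so the example is self-contained), and the helper names `carriers_of` and `genes_of` are hypothetical:

```python
import sqlite3

# Assumption: in-memory SQLite stands in for PostgreSQL in this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table person(name varchar(32) primary key);
    create table gene(name varchar(32) primary key);
    create table person_gene(
        person varchar(32) references person(name),
        gene varchar(32) references gene(name)
    );
    insert into person(name) values ('foo');
    insert into gene(name) values ('hair'), ('eye');
    insert into person_gene(person, gene)
    values ('foo', 'hair'), ('foo', 'eye');
""")


def carriers_of(gene):
    """Who carries gene 'x'? (hypothetical helper)"""
    rows = conn.execute(
        "select person from person_gene where gene = ? order by person",
        (gene,),
    )
    return [row[0] for row in rows]


def genes_of(person):
    """What genes does person 'y' carry? (hypothetical helper)"""
    rows = conn.execute(
        "select gene from person_gene where person = ? order by gene",
        (person,),
    )
    return [row[0] for row in rows]
```

Both questions are plain filters on the person_gene join table, which is why this part of the model is not the problem.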

I also need to store the properties of the individual genes (the alleles). This
is akin to an instance of one of the many gene classes in my application code.
So I make more tables::

    create table hair(
        id serial primary key,
        density float,
        thickness float
    );
    create table eye(
        id serial primary key,
        pupil_type int
    );

How do I store a reference to this data? I'd add a column to the person_gene
table, but what would it point to? It can't reference a single column because
the alleles are all stored in different tables. I also can't store them in the
same table, as they all store different data. Do I store the *table name*
itself in a column of the person_gene table? (Smells like a kludge.)
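
To make the "table name in a column" idea concrete, here is a rough sketch of what it would look like. Everything beyond the hair and eye tables is an assumption for illustration: the `allele_table` and `allele_id` columns, the `allele_for` helper, the sample values, and the use of in-memory SQLite in place of PostgreSQL:

```python
import sqlite3

# Assumption: in-memory SQLite stands in for PostgreSQL in this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table hair(id integer primary key, density float, thickness float);
    create table eye(id integer primary key, pupil_type int);
    create table person_gene(
        person varchar(32),
        gene varchar(32),
        allele_table varchar(32),  -- hypothetical: which table holds the allele
        allele_id int              -- hypothetical: row id inside that table
    );
    insert into hair(id, density, thickness) values (1, 0.5, 0.1);
    insert into eye(id, pupil_type) values (1, 1);
    insert into person_gene(person, gene, allele_table, allele_id)
    values ('foo', 'hair', 'hair', 1), ('foo', 'eye', 'eye', 1);
""")

# Table names cannot be bound as query parameters, so they are checked
# against a fixed list before being spliced into the SQL text.
KNOWN_ALLELE_TABLES = {"hair", "eye"}


def allele_for(person, gene):
    """Return the allele row for (person, gene) as a dict, or None."""
    row = conn.execute(
        "select allele_table, allele_id from person_gene"
        " where person = ? and gene = ?",
        (person, gene),
    ).fetchone()
    if row is None:
        return None
    table, allele_id = row
    if table not in KNOWN_ALLELE_TABLES:
        raise ValueError(f"unknown allele table: {table!r}")
    cur = conn.execute(f"select * from {table} where id = ?", (allele_id,))
    values = cur.fetchone()
    if values is None:
        return None  # dangling reference: nothing in the schema prevents it
    columns = [d[0] for d in cur.description]
    return dict(zip(columns, values))
```

The lookup takes two round trips and the application has to build the second query itself. More importantly, the `allele_table`/`allele_id` pair is not a real foreign key, so the database cannot stop a row from pointing at an allele that does not exist.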

A person might not carry all genes. The number of genes in existence is not
fixed. New ones are introduced infrequently. There may be genes that no one
carries. (I assume I just make a new table each time a new gene is introduced?)

I thought about just pickling/marshaling the instances of my various gene
classes and just having a single 'genes' table which has a blob column but I
hesitate to do that because I want to be able to do queries on aggregate allele
stats. Things like "how many persons have pupil type 1?", etc.
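
A small sketch of why the blob approach worries me. The table layout, the extra persons 'bar' and 'baz', the `count_pupil_type` helper, and the use of a plain dict in place of a real gene-class instance are all assumptions for illustration, with in-memory SQLite again standing in for PostgreSQL:

```python
import pickle
import sqlite3

# Assumption: in-memory SQLite stands in for PostgreSQL in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table genes(person varchar(32), gene varchar(32), data blob)"
)

# A plain dict stands in for a pickled instance of an application gene class.
conn.executemany(
    "insert into genes(person, gene, data) values (?, ?, ?)",
    [
        ("foo", "eye", pickle.dumps({"pupil_type": 1})),
        ("bar", "eye", pickle.dumps({"pupil_type": 2})),
        ("baz", "eye", pickle.dumps({"pupil_type": 1})),
    ],
)


def count_pupil_type(pupil_type):
    """Count persons with the given pupil type.

    The database cannot see inside the blob, so every 'eye' row has to be
    fetched and unpickled on the client before it can be counted.
    """
    rows = conn.execute("select data from genes where gene = 'eye'")
    return sum(
        1 for (blob,) in rows
        if pickle.loads(blob)["pupil_type"] == pupil_type
    )
```

With the per-gene tables, the same question would be a single `count(*)` with a `where pupil_type = 1` clause, evaluated by the database instead of in application code.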

It's late and I've probably overcomplicated it. Any pointers or advice on how
to model this would be greatly appreciated.

Cheers!
-Modulok-

