Обсуждение: complex queries
Hi Sean and group members,
with all your help i could upload a part of my dataset
successfully into my database.
I am queying my database. currently I am querying two
tables for data.
There are two tables processed_data and
sequence_annotation.
My first query (which is successful):
db=> select proc_exprs, proc_pval from processed_data,
sequence_annotation
where processed_data.seq_anno_id =
sequence_annotation.seq_anno_id
and sequence_annotation.seq_gene_symbol = 'FN1';
proc_exprs | proc_pval
------------+-----------
6.72867 | 0.141216
5.01667 | 0.212055
4.94133 | 0.216501
0.214667 | 0.624944
0.0846667 | 0.525513
7.19867 | 0.128558
11.7807 | 0.053066
11.5487 | 0.0545418
7.57867 | 0.0984488
0.33 | 0.678788
0.0546667 | 0.483358
11.7227 | 0.0534283
4.806 | 0.221026
2.86333 | 0.434024
3.60933 | 0.324803
0.43 | 0.742388
0.018 | 0.427739
3.222 | 0.375615
(18 rows)
Query 2: A little complicated one (successful):
db=> select proc_exprs, proc_pval, seq_anno_id from
processed_data
where proc_pval < 0.004;
proc_exprs | proc_pval | seq_anno_id
------------+------------+-------------
107.803 | 0.00382485 | 50237
106.607 | 0.00387695 | 51903
158.191 | 0.00240939 | 69829
170.103 | 0.00220903 | 70008
137.591 | 0.00284852 | 70009
217.086 | 0.00165253 | 70214
186.021 | 0.00198545 | 70223
213.575 | 0.00168479 | 70558
121.545 | 0.00330776 | 71717
221.715 | 0.00161171 | 71757
248.226 | 0.0014101 | 75187
111.929 | 0.00365456 | 80492
138.5 | 0.0022871 | 70008
90.9887 | 0.00380519 | 70009
Query 3: A complicated one( unsuccessful):
bob=> select proc_exprs, proc_pval, seq_gene_symbol
from processed_data, sequence_annotation
where proc_pval < 0.05 AND
processed_data.seq_anno_id =
sequence_annotation.seq_gene_symbol;
proc_exprs | proc_pval | seq_gene_symbol
------------+-----------+-----------------
(0 rows)
seq_gene_symbol is in sequence_annotation table and
its primary key is value of seq_anno_id.
Why I am getting 0 row. Whats wrong with my query.
Could you help me please.
Thank you.
K
My tables:
Processed_data table:
Column | Type |
Modifiers
-------------+------------------------+---------------------------------------------------------------------
proc_id | integer | not null
default
nextval('public.processed_data_proc_id_seq'::text)
exp_id | integer |
seq_anno_id | integer |
cel_id | integer |
proc_symbol | character varying(100) |
proc_exprs | real |
proc_pval | real |
Indexes:
"processed_data_pkey" primary key, btree (proc_id)
Foreign-key constraints:
"$2" FOREIGN KEY (exp_id) REFERENCES
experiment(exp_id)
"$3" FOREIGN KEY (seq_anno_id) REFERENCES
sequence_annotation(seq_anno_id)
"$4" FOREIGN KEY (cel_id) REFERENCES cel(cel_id)
Sequence_annotation table:
Column | Type |
Modifiers
-----------------------+--------------------------+------------------------------------------------------------------------------
seq_anno_id | integer |
not null default
nextval('public.sequence_annotation_seq_anno_id_seq'::text)
seq_probeset_id | character varying(15000) |
seq_source | character varying(15000) |
seq_transcript_id | character varying(15000) |
seq_target_desc | character varying(15000) |
seq_rep_pub_id | character varying(15000) |
seq_unigene_id | character varying(15000) |
seq_alignment | character varying(15000) |
seq_gene_title | character varying(15000) |
seq_gene_symbol | character varying(15000) |
seq_chr_location | character varying(15000) |
seq_ensembl | character varying(15000) |
seq_locus_link | character varying(15000) |
seq_swissport | character varying(15000) |
seq_ec | character varying(15000) |
seq_omim | character varying(15000) |
seq_refseq_prot_id | character varying(15000) |
seq_refseq_trans_id | character varying(15000) |
seq_flybase | character varying(15000) |
seq_agi | character varying(15000) |
seq_wormbase | character varying(15000) |
seq_mgi_name | character varying(15000) |
seq_rgd_name | character varying(15000) |
seq_sgd_acc | character varying(15000) |
seq_go_bp | character varying(15000) |
seq_go_cc | character varying(15000) |
seq_go_mf | character varying(15000) |
seq_pathway | character varying(15000) |
seq_prot_family | character varying(15000) |
seq_prot_domain | character varying(15000) |
seq_interpro_id | character varying(15000) |
seq_tm | character varying(15000) |
seq_qtl | character varying(15000) |
seq_ann_trasc_cluster | character varying(15000) |
seq_transc_assign | character varying(15000) |
seq_anno_notes | character varying(15000) |
genechip_id | integer |
Indexes:
"sequence_annotation_pkey" primary key, btree
(seq_anno_id)
Foreign-key constraints:
"$1" FOREIGN KEY (genechip_id) REFERENCES
genechip(genechip_id)
__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com
On Mar 11, 2005, at 11:45 AM, Kumar S wrote: > bob=> select proc_exprs, proc_pval, seq_gene_symbol > from processed_data, sequence_annotation > where proc_pval < 0.05 AND > processed_data.seq_anno_id = > sequence_annotation.seq_gene_symbol; > > > proc_exprs | proc_pval | seq_gene_symbol > ------------+-----------+----------------- > (0 rows) > > > seq_gene_symbol is in sequence_annotation table and > its primary key is value of seq_anno_id. > You probably need to spend some time reading about queries in an SQL book (many of which are online), but in this particular case, it doesn't look like seq_gene_symbol is going to equal seq_anno_id. It looks like you want: processed_data.seq_anno_id=sequence_annotation.seq_anno_id Off-topic, but have you thought about looking at using BASE or CHADO to keep your microarray data? Sean
It's very hard to tell exactly which table your columns are coming
from in your SQL. It makes it cleaner and more readable if you alias
your table.
select
pd.proc_exprs
, pd.proc_pval
, sa.seq_gene_symbol
from
processed_data pd -- pd is and alias name for processed_data
, sequence_annotation sa
where
pd.seq_anno_id = sa.seq_gene_symbol
AND pd.proc_pval < 0.05;
don't depend upon the query to recognize that the column you are
stating is unique between all the tables selected. Plus using a short
alias that makes sense makes your query more easily readable.
As you start making more complex queries these tips will help.
1. use alias table names in your query.
2. when you have multple tables define the relations at the beginning
of the where clause. Then define your constraint.
3. Try do define your relationships starting with the larger tables.
This speeds up processing of the query.
You can take some real performance hits when your query uses 12
tables.
> Hi Sean and group members,
>
> with all your help i could upload a part of my dataset
> successfully into my database.
>
> I am queying my database. currently I am querying two
> tables for data.
>
> There are two tables processed_data and
> sequence_annotation.
>
>
> My first query (which is successful):
>
> db=> select proc_exprs, proc_pval from processed_data,
> sequence_annotation
> where processed_data.seq_anno_id =
> sequence_annotation.seq_anno_id
> and sequence_annotation.seq_gene_symbol = 'FN1';
> proc_exprs | proc_pval
> ------------+-----------
> 6.72867 | 0.141216
> 5.01667 | 0.212055
> 4.94133 | 0.216501
> 0.214667 | 0.624944
> 0.0846667 | 0.525513
> 7.19867 | 0.128558
> 11.7807 | 0.053066
> 11.5487 | 0.0545418
> 7.57867 | 0.0984488
> 0.33 | 0.678788
> 0.0546667 | 0.483358
> 11.7227 | 0.0534283
> 4.806 | 0.221026
> 2.86333 | 0.434024
> 3.60933 | 0.324803
> 0.43 | 0.742388
> 0.018 | 0.427739
> 3.222 | 0.375615
> (18 rows)
>
>
> Query 2: A little complicated one (successful):
>
> db=> select proc_exprs, proc_pval, seq_anno_id from
> processed_data
> where proc_pval < 0.004;
> proc_exprs | proc_pval | seq_anno_id
> ------------+------------+-------------
> 107.803 | 0.00382485 | 50237
> 106.607 | 0.00387695 | 51903
> 158.191 | 0.00240939 | 69829
> 170.103 | 0.00220903 | 70008
> 137.591 | 0.00284852 | 70009
> 217.086 | 0.00165253 | 70214
> 186.021 | 0.00198545 | 70223
> 213.575 | 0.00168479 | 70558
> 121.545 | 0.00330776 | 71717
> 221.715 | 0.00161171 | 71757
> 248.226 | 0.0014101 | 75187
> 111.929 | 0.00365456 | 80492
> 138.5 | 0.0022871 | 70008
> 90.9887 | 0.00380519 | 70009
>
> Query 3: A complicated one( unsuccessful):
>
> bob=> select proc_exprs, proc_pval, seq_gene_symbol
> from processed_data, sequence_annotation
> where proc_pval < 0.05 AND
> processed_data.seq_anno_id =
> sequence_annotation.seq_gene_symbol;
>
>
> proc_exprs | proc_pval | seq_gene_symbol
> ------------+-----------+-----------------
> (0 rows)
>
>
> seq_gene_symbol is in sequence_annotation table and
> its primary key is value of seq_anno_id.
>
>
> Why I am getting 0 row. Whats wrong with my query.
>
> Could you help me please.
>
> Thank you.
>
> K
>
>
> My tables:
> Processed_data table:
> Column | Type |
> Modifiers
> -------------+------------------------+---------------------------------------------------------------------
> proc_id | integer | not null
> default
> nextval('public.processed_data_proc_id_seq'::text)
> exp_id | integer |
> seq_anno_id | integer |
> cel_id | integer |
> proc_symbol | character varying(100) |
> proc_exprs | real |
> proc_pval | real |
> Indexes:
> "processed_data_pkey" primary key, btree (proc_id)
> Foreign-key constraints:
> "$2" FOREIGN KEY (exp_id) REFERENCES
> experiment(exp_id)
> "$3" FOREIGN KEY (seq_anno_id) REFERENCES
> sequence_annotation(seq_anno_id)
> "$4" FOREIGN KEY (cel_id) REFERENCES cel(cel_id)
>
>
>
> Sequence_annotation table:
> Column | Type |
> Modifiers
>
-----------------------+--------------------------+------------------------------------------------------------------------------
> seq_anno_id | integer |
> not null default
> nextval('public.sequence_annotation_seq_anno_id_seq'::text)
> seq_probeset_id | character varying(15000) |
> seq_source | character varying(15000) |
> seq_transcript_id | character varying(15000) |
> seq_target_desc | character varying(15000) |
> seq_rep_pub_id | character varying(15000) |
> seq_unigene_id | character varying(15000) |
> seq_alignment | character varying(15000) |
> seq_gene_title | character varying(15000) |
> seq_gene_symbol | character varying(15000) |
> seq_chr_location | character varying(15000) |
> seq_ensembl | character varying(15000) |
> seq_locus_link | character varying(15000) |
> seq_swissport | character varying(15000) |
> seq_ec | character varying(15000) |
> seq_omim | character varying(15000) |
> seq_refseq_prot_id | character varying(15000) |
> seq_refseq_trans_id | character varying(15000) |
> seq_flybase | character varying(15000) |
> seq_agi | character varying(15000) |
> seq_wormbase | character varying(15000) |
> seq_mgi_name | character varying(15000) |
> seq_rgd_name | character varying(15000) |
> seq_sgd_acc | character varying(15000) |
> seq_go_bp | character varying(15000) |
> seq_go_cc | character varying(15000) |
> seq_go_mf | character varying(15000) |
> seq_pathway | character varying(15000) |
> seq_prot_family | character varying(15000) |
> seq_prot_domain | character varying(15000) |
> seq_interpro_id | character varying(15000) |
> seq_tm | character varying(15000) |
> seq_qtl | character varying(15000) |
> seq_ann_trasc_cluster | character varying(15000) |
> seq_transc_assign | character varying(15000) |
> seq_anno_notes | character varying(15000) |
> genechip_id | integer |
> Indexes:
> "sequence_annotation_pkey" primary key, btree
> (seq_anno_id)
> Foreign-key constraints:
> "$1" FOREIGN KEY (genechip_id) REFERENCES
> genechip(genechip_id)
>
>
>
> __________________________________________________
> Do You Yahoo!?
> Tired of spam? Yahoo! Mail has the best spam protection around
> http://mail.yahoo.com
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
> (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)