Re: Adding a suffix array index
From | Simon Riggs |
---|---|
Subject | Re: Adding a suffix array index |
Date | |
Msg-id | 1100875810.4113.13675.camel@localhost.localdomain |
In reply to | Adding a suffix array index (Troels Arvin <troels@arvin.dk>) |
List | pgsql-hackers |
On Fri, 2004-11-19 at 10:42, Troels Arvin wrote:
> Hello,
>
> I'm working on a thesis project where I explore the addition of a
> specialized, bioinformatics-related data type to a RDBMS. My choice of
> RDBMS is PostgreSQL, of course, and I've started by adding a "dnaseq" (DNA
> sequence) data type, using PostgreSQL's APIs for type additions.
>
> The idea is to try to make it practical and even "attractive" to work with
> DNA sequences in an RDBMS. My starting goal is to make it viable to work
> with sequences in the 50-500 million base range. Some may think that
> RDBMSes and long chunks of data don't match well. My opinion is that the
> increasing power of computers and RDBMS software should at some point make
> it possible to work with DNA sequences in a "normal" data management
> setting like a RDBMS, instead of solely using stand-alone tools and
> stand-alone data files. Anyways, it's an open question if my hypothesis is
> right.
>

Presumably you know about these?

http://www.ncbi.nih.gov/BLAST/
http://www.ciri.upc.es/cela_pblade/BLAST.htm
http://www.netezza.com/products/bio.cfm

I think you're right, but you'd need to have more than one application
of the data for it to be a convincing argument.

Without parallelism, your best efforts will be to equal the speed of the
single-use data structures used in BLAST.

--
Best Regards, Simon Riggs