Lianhe,
> I want to use index on the gene_symbol column in my
> query and gene_symbol is indexed. but when I use
> lower (gene_symbol) like lower('%mif%'), the index
> is not used. While when I change to
> lower(gene_symbol) = lower('mif'), the index is used
> and index scan works, but this is not what I like. I
> want all the gene_symbols containing substring
> 'mif' are pulled out, and not necessarily exactly match.
LIKE '%mif%' is what's called an "unanchored text search" and it cannot use an
index. The database *has* to scan the full text looking for the substring.
This is true of all database platforms I know of.
In regular text fields containing words, your problem is solvable with full
text indexing (FTI). Unfortunately, FTI is not designed for arbitrary
non-language strings. It could be adapted, but would require a lot of
hacking.
So you will need to find a way to restructure you data to avoid needing
unanchored text searches. One way would be to break down the gene_symbol
field into its smallest atomic components and store those in an indexed child
table. Or if you're searching on the same values all the time, you can
create a partial index.
--
-Josh Berkus
Aglio Database Solutions
San Francisco