Обсуждение: Very slow search using basic pattern matching
I'm trying to use the following statement as part of a search facility on a site I develop. However the search is taking in excess of 10 seconds some times to return results. I'ts a dual opteron server with a couple gigs of RAM so should be more than enough to deal with this database. SELECT da_records.RECORD_ID, da_records.TITLE_EN AS TITLE, da_records.AUTHOR_EN AS AUTHOR, da_records.DESCRIPTION_CY AS DESCRIPTION, da_records.PUBLISHER_CY AS PUBLISHER, da_records.URL_CY AS URL, da_records.RECORD_ID, da_records.KEYWORD_LIST_ID, da_records.LANGUAGE, da_records.CONTRIBUTOR_NAME, da_records.CONTRIBUTOR_EMAIL, da_records.CONTRIBUTOR_ORGANISATION, da_records.CONTRIBUTOR_CREDIT, da_records.DEWEY_LIST_ID, da_records.LISTING_PRIORITY, da_records.SUBMITTED_DATE, da_records.LAST_EDIT_DATE, da_records.STATUS FROM da_records WHERE da_records.DESCRIPTION_CY ~* '.*Aberystwyth*.' OR da_records.TITLE_CY ~* '.*Aberystwyth*.' limit 100 Is there a better way of matching the string? (Must be case insensitive) TIA -- Dan Field <dof@llgc.org.uk> - Support Programmer: Cymru ar y we cy_GB: http://www.cymruarywe.org en_GB: http://www.walesontheweb.org
On Tue, May 18, 2004 at 16:47:11 +0100, Dan Field <dof@llgc.org.uk> wrote: > FROM > da_records > WHERE > da_records.DESCRIPTION_CY ~* '.*Aberystwyth*.' > OR > da_records.TITLE_CY ~* '.*Aberystwyth*.' > limit 100 > > Is there a better way of matching the string? (Must be case insensitive) This is going to require a sequential scan and if you aren't going to hit 100 entries near the beginning of the table, you will be effectively doing a full scan of the table. You might try looking at the tsearch2 contrib entry (a package for full text searching) as that will likely be able to help you out.
> WHERE > da_records.DESCRIPTION_CY ~* '.*Aberystwyth*.' > OR > da_records.TITLE_CY ~* '.*Aberystwyth*.' > limit 100 > > Is there a better way of matching the string? (Must be case > insensitive) Are you sure you can't anchor the search pattern ? eg ~* '^' || lower('Aberystwyth') || '.*' That'd allow for use of (functional) indices. Karsten -- GPG key ID E4071346 @ wwwkeys.pgp.net E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
On 18 May 2004, at 17:12, Bruno Wolff III wrote: > On Tue, May 18, 2004 at 16:47:11 +0100, > Dan Field <dof@llgc.org.uk> wrote: >> FROM >> da_records >> WHERE >> da_records.DESCRIPTION_CY ~* '.*Aberystwyth*.' >> OR >> da_records.TITLE_CY ~* '.*Aberystwyth*.' >> limit 100 >> >> Is there a better way of matching the string? (Must be case >> insensitive) > > This is going to require a sequential scan and if you aren't going to > hit 100 entries near the beginning of the table, you will be > effectively > doing a full scan of the table. > > You might try looking at the tsearch2 contrib entry (a package for > full text > searching) as that will likely be able to help you out. OK It looks like the full text search is the way forward here, but I can only find the tsearch.sql file in contrib. I'm using RedHat EL 3.0 Advanced Server with PG version 7.3-RH. Is there any documentation available on how to go about creating these indexes on my text fields? I'm at a bit of a loss here, Google is failing me sadly -- Dan Field <dof@llgc.org.uk> - Support Programmer: Cymru ar y we cy_GB: http://www.cymruarywe.org en_GB: http://www.walesontheweb.org
Dan Field wrote: > Is there any documentation available on how to go about creating these > indexes on my text fields? > > I'm at a bit of a loss here, Google is failing me sadly http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/ We made quite good experiences with tsearch2. The problem for you could be that the case sensititity gets lost. Raph