Обсуждение: Very slow search using basic pattern matching

Поиск
Список
Период
Сортировка

Very slow search using basic pattern matching

От
Dan Field
Дата:
I'm trying to use the following statement as part of a search facility
on a site I develop. However the search is taking in excess of 10
seconds some times to return results. I'ts a dual opteron server with a
couple gigs of RAM so should be more than enough to deal with this
database.


SELECT
    da_records.RECORD_ID,
    da_records.TITLE_EN AS TITLE,
    da_records.AUTHOR_EN AS AUTHOR,
    da_records.DESCRIPTION_CY AS DESCRIPTION,
    da_records.PUBLISHER_CY AS PUBLISHER,
    da_records.URL_CY AS URL,
    da_records.RECORD_ID,
    da_records.KEYWORD_LIST_ID,
    da_records.LANGUAGE,
    da_records.CONTRIBUTOR_NAME,
    da_records.CONTRIBUTOR_EMAIL,
    da_records.CONTRIBUTOR_ORGANISATION,
    da_records.CONTRIBUTOR_CREDIT,
    da_records.DEWEY_LIST_ID,
    da_records.LISTING_PRIORITY,
    da_records.SUBMITTED_DATE,
    da_records.LAST_EDIT_DATE,
    da_records.STATUS
FROM
    da_records
WHERE
    da_records.DESCRIPTION_CY ~* '.*Aberystwyth*.'
  OR
    da_records.TITLE_CY ~* '.*Aberystwyth*.'
limit 100

Is there a better way of matching the string? (Must be case insensitive)

TIA
--
Dan Field <dof@llgc.org.uk> - Support Programmer: Cymru ar y we
cy_GB: http://www.cymruarywe.org
en_GB: http://www.walesontheweb.org

Re: Very slow search using basic pattern matching

От
Bruno Wolff III
Дата:
On Tue, May 18, 2004 at 16:47:11 +0100, Dan Field <dof@llgc.org.uk> wrote:
> FROM
>     da_records
> WHERE
>     da_records.DESCRIPTION_CY ~* '.*Aberystwyth*.'
>  OR
>     da_records.TITLE_CY ~* '.*Aberystwyth*.'
> limit 100
> 
> Is there a better way of matching the string? (Must be case insensitive)

This is going to require a sequential scan and if you aren't going to
hit 100 entries near the beginning of the table, you will be effectively
doing a full scan of the table.

You might try looking at the tsearch2 contrib entry (a package for full text
searching) as that will likely be able to help you out.


Re: Very slow search using basic pattern matching

От
Karsten Hilbert
Дата:
> WHERE  
>     da_records.DESCRIPTION_CY ~* '.*Aberystwyth*.' 
>  OR  
>     da_records.TITLE_CY ~* '.*Aberystwyth*.'  
> limit 100 
> 
> Is there a better way of matching the string? (Must be case 
> insensitive) 
Are you sure you can't anchor the search pattern ?

eg  ~* '^' || lower('Aberystwyth') || '.*'

That'd allow for use of (functional) indices.

Karsten
-- 
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346


Re: Very slow search using basic pattern matching

От
Dan Field
Дата:
On 18 May 2004, at 17:12, Bruno Wolff III wrote:

> On Tue, May 18, 2004 at 16:47:11 +0100,
>   Dan Field <dof@llgc.org.uk> wrote:
>> FROM
>>     da_records
>> WHERE
>>     da_records.DESCRIPTION_CY ~* '.*Aberystwyth*.'
>>  OR
>>     da_records.TITLE_CY ~* '.*Aberystwyth*.'
>> limit 100
>>
>> Is there a better way of matching the string? (Must be case 
>> insensitive)
>
> This is going to require a sequential scan and if you aren't going to
> hit 100 entries near the beginning of the table, you will be 
> effectively
> doing a full scan of the table.
>
> You might try looking at the tsearch2 contrib entry (a package for 
> full text
> searching) as that will likely be able to help you out.

OK It looks like the full text search is the way forward here, but I 
can only find the tsearch.sql file in contrib. I'm using RedHat EL 3.0 
Advanced Server with PG version 7.3-RH.

Is there any documentation available on how to go about creating these 
indexes on my text fields?

I'm at a bit of a loss here, Google is failing me sadly

-- 
Dan Field <dof@llgc.org.uk> - Support Programmer: Cymru ar y we
cy_GB: http://www.cymruarywe.org
en_GB: http://www.walesontheweb.org



Re: Very slow search using basic pattern matching

От
"Raphael A. Bauer"
Дата:
Dan Field wrote:
> Is there any documentation available on how to go about creating these 
> indexes on my text fields?
> 
> I'm at a bit of a loss here, Google is failing me sadly

http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/

We made quite good experiences with tsearch2.
The problem for you could be that the case sensititity gets lost.

Raph