Re: Slow search.. quite clueless

Поиск
Список
Период
Сортировка
От Oleg Bartunov
Тема Re: Slow search.. quite clueless
Дата
Msg-id Pine.GSO.4.63.0509202114360.20320@ra.sai.msu.su
обсуждение исходный текст
Ответ на Slow search.. quite clueless  (Yonatan Ben-Nes <da@canaan.co.il>)
Ответы Re: Slow search.. quite clueless  (Alex Turner <armtuk@gmail.com>)
Re: Slow search.. quite clueless  (Philip Hallstrom <postgresql@philip.pjkh.com>)
Re: Slow search.. quite clueless  (Yonatan Ben-Nes <da@canaan.co.il>)
Список pgsql-general
contrib/tsearch2 ( http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/ )
might works for you. It might because performance depends on
cardinality of your keywords.

      Oleg
On Tue, 20 Sep 2005, Yonatan Ben-Nes wrote:

> Hi all,
>
> Im building a site where the users can search for products with up to 4
> diffrent keywords which all MUST match to each product which found as a
> result to the search.
>
> I got 2 tables (which are relevant to the issue :)), one is the product table
> (5 million rows) and the other is the keyword table which hold the keywords
> of each product (60 million rows).
>
> The scheme of the tables is as follows:
>
>                      Table "public.product"
>           Column           |     Type      |      Modifiers
> ----------------------------+---------------+---------------------
> product_id                 | text          | not null
> product_name               | text          | not null
> retail_price               | numeric(10,2) | not null
> etc...
> Indexes:
>    "product_product_id_key" UNIQUE, btree (product_id)
>
>         Table "public.keyword"
>   Column    |     Type      | Modifiers
> -------------+---------------+-----------
> product_id  | text          | not null
> keyword     | text          | not null
> Indexes:
>    "keyword_keyword" btree (keyword)
>
> The best query which I succeded to do till now is adding the keyword table
> for each keyword searched for example if someone search for "belt" & "black"
> & "pants" it will create the following query:
>
> poweraise.com=# EXPLAIN ANALYZE SELECT
>
product_id,product_name,product_image_url,short_product_description,long_product_description,discount,discount_type,sale_price,retail_price

> FROM product INNER JOIN keyword t1 USING(product_id) INNER JOIN keyword t2
> USING(product_id) INNER JOIN keyword t3 USING(product_id) WHERE
> t1.keyword='belt' AND t2.keyword='black' AND t3.keyword='pants' LIMIT 13;
>
>       QUERY PLAN
>
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Limit  (cost=37734.15..39957.20 rows=13 width=578) (actual
> time=969.798..1520.354 rows=6 loops=1)
>   ->  Hash Join  (cost=37734.15..3754162.82 rows=21733 width=578) (actual
> time=969.794..1520.337 rows=6 loops=1)
>         Hash Cond: ("outer".product_id = "inner".product_id)
>         ->  Nested Loop  (cost=18867.07..2858707.34 rows=55309 width=612)
> (actual time=82.266..1474.018 rows=156 loops=1)
>               ->  Hash Join  (cost=18867.07..2581181.09 rows=55309 width=34)
> (actual time=82.170..1462.104 rows=156 loops=1)
>                     Hash Cond: ("outer".product_id = "inner".product_id)
>                     ->  Index Scan using keyword_keyword on keyword t2
> (cost=0.00..331244.43 rows=140771 width=17) (actual time=0.033..1307.167
> rows=109007 loops=1)
>                           Index Cond: (keyword = 'black'::text)
>                     ->  Hash  (cost=18851.23..18851.23 rows=6337 width=17)
> (actual time=16.145..16.145 rows=0 loops=1)
>                           ->  Index Scan using keyword_keyword on keyword t1
> (cost=0.00..18851.23 rows=6337 width=17) (actual time=0.067..11.050 rows=3294
> loops=1)
>                                 Index Cond: (keyword = 'belt'::text)
>               ->  Index Scan using product_product_id_key on product
> (cost=0.00..5.01 rows=1 width=578) (actual time=0.058..0.060 rows=1
> loops=156)
>                     Index Cond: (product.product_id = "outer".product_id)
>         ->  Hash  (cost=18851.23..18851.23 rows=6337 width=17) (actual
> time=42.863..42.863 rows=0 loops=1)
>               ->  Index Scan using keyword_keyword on keyword t3
> (cost=0.00..18851.23 rows=6337 width=17) (actual time=0.073..36.120 rows=3932
> loops=1)
>                     Index Cond: (keyword = 'pants'::text)
> Total runtime: 1521.441 ms
> (17 rows)
>
> Sometimes the query work fast even for 3 keywords but that doesnt help me if
> at other times it take ages....
>
> Now to find a result for 1 keyword its really flying so I also tried to make
> 3 queries and do INTERSECT between them but it was found out to be extremly
> slow...
>
> Whats make this query slow as far as I understand is all the merging between
> the results of each table... I tried to divide the keyword table into lots of
> keywords table which each hold keywords which start only with a specific
> letter, it did improve the speeds but not in a real significant way.. tried
> clusters,indexes,SET STATISTICS,WITHOUT OIDS on the keyword table and what
> not.. im quite clueless...
>
> Actually I even started to look on other solutions and maybe you can say
> something about them also.. maybe they can help me:
> 1. Omega (From the Xapian project) - http://www.xapian.org/
> 2. mnoGoSearch - http://www.mnogosearch.org/doc.html
> 3. Swish-e - http://swish-e.org/index.html
>
> To add on everything I want at the end to be able to ORDER BY the results
> like order the product by price, but im less concerned about that cause I saw
> that with cluster I can do it without any extra overhead.
>
> Thanks alot in advance,
> Yonatan Ben-Nes
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
>      subscribe-nomail command to majordomo@postgresql.org so that your
>      message can get through to the mailing list cleanly
>

      Regards,
          Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

В списке pgsql-general по дате отправления:

Предыдущее
От: Yonatan Ben-Nes
Дата:
Сообщение: Slow search.. quite clueless
Следующее
От: Alex Turner
Дата:
Сообщение: Re: Slow search.. quite clueless