Re: Incorrect FTS result with GIN index
От | Oleg Bartunov |
---|---|
Тема | Re: Incorrect FTS result with GIN index |
Дата | |
Msg-id | Pine.LNX.4.64.1007171450210.32129@sn.sai.msu.ru обсуждение исходный текст |
Ответ на | Incorrect FTS result with GIN index (Artur Dabrowski <ad@astec.com.pl>) |
Ответы |
Re: Incorrect FTS result with GIN index
(Artur Dabrowski <ad@astec.com.pl>)
|
Список | pgsql-general |
Artur, I downloaded your dump and tried your queries with index, I see no problem so far. Table "public.search_tab" Column | Type | Modifiers ----------------+---------+---------------------------------------------------------- id | integer | not null default nextval('search_tab_id_seq1'::regclass) keywords | text | collection_urn | text | bbox | text | object_urn | text | not null description | text | category | text | summary | text | priority | integer | Indexes: "search_tab_pkey1" PRIMARY KEY, btree (id) "idx_keywords_ger" gin (to_tsvector('german'::regconfig, keywords)) test=# explain analyze select count(*) from search_tab where (to_tsvector('german', keywords ) @@ to_tsquery('german', 'ee:*')) and (to_tsvector('german', keywords ) @@ to_tsquery('german', 'dd:*')); QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=103.87..103.88 rows=1 width=0) (actual time=24.784..24.784 rows=1 loops=1) -> Bitmap Heap Scan on search_tab (cost=5.21..103.80 rows=25 width=0) (actual time=24.642..24.769 rows=123 loops=1) Recheck Cond: ((to_tsvector('german'::regconfig, keywords) @@ '''ee'':*'::tsquery) AND (to_tsvector('german'::regconfig,keywords) @@ '''dd'':*'::tsquery)) -> Bitmap Index Scan on idx_keywords_ger (cost=0.00..5.21 rows=25 width=0) (actual time=24.620..24.620 rows=123loops=1) Index Cond: ((to_tsvector('german'::regconfig, keywords) @@ '''ee'':*'::tsquery) AND (to_tsvector('german'::regconfig,keywords) @@ '''dd'':*'::tsquery)) Total runtime: 24.830 ms (6 rows) see rows=123 On Thu, 15 Jul 2010, Artur Dabrowski wrote: > > Hello, > > I was trying to use GIN index, but the results seem be incorrect. > > > 1. QUERY WITHOUT INDEX > select count(*) from search_tab where > (to_tsvector('german', keywords ) @@ to_tsquery('german', 'ee:*')) and > (to_tsvector('german', keywords ) @@ to_tsquery('german', 'dd:*')); > > count > ------- > 123 > (1 row) > > > 2. CREATING INDEX > create index idx_keywords_ger on search_tab > using gin(to_tsvector('german', keywords)); > > > 3. QUERY WITH INDEX > select count(*) from search_tab where > (to_tsvector('german', keywords ) @@ to_tsquery('german', 'ee:*')) and > (to_tsvector('german', keywords ) @@ to_tsquery('german', 'dd:*')); > > count > ------- > 116 > (1 row) > > > The number of rows is different. To make things more funny and ensure > problem is not caused by dictionary normalisation: > > 4. EQUIVALENT QUERY WITH INDEX > select count(*) from search_tab where > (to_tsvector('german', keywords ) @@ to_tsquery('german', 'ee:* & dd:*')); > > count > ------- > 123 > (1 row) > > I tried the same with simple-based dictionary. The problem is always > reproducible. > > Total count of records in my database is 1 006 300 if it matters. > > One of missing results is the following: "lSWN eeIf hInEI IN > SIL3WugEOANcEGVWL1L LBAGAeLlGS ttfL DDhuDEIni9 ce". If the query is more > specifically targeted to find this row then it founds it: > > 5. MORE DETAILED QUERY WITH INDEX > select keywords from search_tab where > (to_tsvector('german', keywords ) @@ to_tsquery('german', 'eeI:* & dd:*')); > > keywords > > -------------------------------------------------------------------------------- > > lSWN eeIf hInEI IN SIL3WugEOANcEGVWL1L LBAGAeLlGS ttfL DDhuDEIni9 ce > tSALWIEEIn-3WNecGAINfLuLAV DDLIWNG E Lt h c8 BiIfgGl1 EeIhulSLenS6LDe5O > hGn DDlhIgGEAcS1O eeiEEI WnILWELS68VBLL AGNIAfINt6 lLuWuNeDc ItLfe SL > hGe WIiI EeItnLLuA1efOh3ALWc uGINEltcIBE LnegLDNA3 DD SVNG LSSIlWfE > eeIW ItueS W39LnELg-GuDLEhAn8BeFG IVi DDNEfLG1SI 1tNIOA lAhNLLccfWISE l > 6em on.0nsRH nehSA2l1HAsauncu0I65l7 ddnsn1SAS i u0eLAnlr t70gaains w gzsH > eeiog > rfiwgso0g364l1 1wU eei1n 5lL dDA 0 > DDInNcEfSWAEAtcL1IeSuAG5LE Lilh8tEGeDg f3B eEIOL7h uWV-L1IGN LINWeIn l S > ils eeiru00ewH.6sgAeHoSlLhglso0 asn0u2a atisA0 ddcngAnzRA Se Au2 nm8ns0 > uS8snH > DDD EWlE1GShhLe8L NENI tuL cgGGInfcBAlLfIO L1S eeIWeAEnILStu AViWNI > n IOLLt 0Alih tuWNE L nAGlVSNSDI DDeW BIegfG EeIhL9ELeScELWGAIfN1uIc > DnSE eeIWLu9tLNhNEuAt I1BelhGGfLWLS nSWINI eiELgAIG DDLEclV7 IO c Af > EeIElfN L4I lE2G cSOLniAWgSVItc ILDN L57BuDfALtSIe-WnGhGIW DDA NE1Lhuee > hNILN DD L6flSEeW1gthfI L1WAlENE eEIGIAt VGBDO uGLeLccAeSuLWIn Ii nS > (14 rows) > > > Did I misunderstood something or is it a bug? > > Best regards > Artur > Regards, Oleg _____________________________________________________________ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83
В списке pgsql-general по дате отправления: