Re: possible bug in cover density ranking?
From | Sushant Sinha |
---|---|
Subject | Re: possible bug in cover density ranking? |
Date | |
Msg-id | 1241227234.4633.1.camel@dragflick |
In response to | Re: possible bug in cover density ranking? (Sushant Sinha <sushant354@gmail.com>) |
List | pgsql-hackers |
I see this listed as an open item here:
http://wiki.postgresql.org/wiki/PostgreSQL_8.4_Open_Items

Any interest in fixing this?

-Sushant.

On Thu, 2009-01-29 at 13:54 -0500, Sushant Sinha wrote:
> On Thu, Jan 29, 2009 at 12:38 PM, Teodor Sigaev <teodor@sigaev.ru> wrote:
> > > Is this what is desired? It seems to me that Wdoc is getting a high
> > > ranking even when we are not sure of the position information.
> >
> > 0.1 is not a very high rank, and we could not suggest any reasonable
> > rank in this case. This document may be good, may be bad. rank_cd is
> > not limited by 1.
>
> For a cover of 2 query items, 0.1 is actually the maximum rank. This is
> only possible when both query items are adjacent to each other.
>
> 0.1 may not seem too high when we look at its absolute value. But the
> problem is that we are ranking a document for which we have no
> positional information higher than a document for which we may have
> positional information with, let's suppose, a cover length of 3. I think
> we should rank the document with cover length 3 higher than the document
> for which we have no positional information (and for which we assume a
> cover length of 2, as we are doing now).
>
> I feel that if ext.p = ext.q for query items > 1, then we should not
> count that cover for ranking at all. Another option would be to
> significantly inflate nNoise in this scenario, to say 100. Setting
> nNoise = (ext.end - ext.begin)/2 is way too low for covers that we have
> no idea about (it is 0 for query items = 2).
>
> I am not assuming or suggesting that rank_cd is bounded by one. Of
> course its rank increases as more and more covers are added.
>
> Thanks,
> Sushant.
>
> > > The comment above says that "In this case we approximate number of
> > > noise word as half cover's length". But we do not know the cover's
> > > length in this case, as ext.p and ext.q are both unreliable. And
> > > ext.end - ext.begin is not the "cover's length". It is the number of
> > > query items found in the cover.
> >
> > Yeah, but if there is no information then information is absent :),
> > but I agree with you to change the comment.
> > --
> > Teodor Sigaev    E-mail: teodor@sigaev.ru
> >                  WWW: http://www.sigaev.ru/
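
To make the arithmetic in the quoted discussion concrete, here is a minimal sketch of how a single cover might contribute to the rank. It is not the PostgreSQL source. The function name `cover_rank`, its parameters, and the formula `Cpos / (1 + nNoise)` with every query item carrying the same weight of 0.1 are assumptions, chosen only so that the numbers match those mentioned in the thread (0.1 for two adjacent items, nNoise of 0 in the fallback case for two items).

```python
# Hypothetical sketch of the per-cover arithmetic discussed in this thread.
# Not the PostgreSQL implementation: the formula, names, and the 0.1 weight
# are assumptions made for illustration.

def cover_rank(p, q, n_items, weight=0.1, unknown_noise=None):
    """Contribution of one cover to the cover density rank.

    p, q          -- positions of the first and last query item (ext.p, ext.q)
    n_items       -- query items in the cover (ext.end - ext.begin + 1)
    weight        -- weight shared by every item (assumed equal for simplicity)
    unknown_noise -- nNoise to use when positions are unreliable; None keeps
                     the fallback discussed above, (ext.end - ext.begin) / 2
    """
    span = n_items - 1                     # ext.end - ext.begin
    inv_sum = n_items * (1.0 / weight)     # sum of inverse weights
    cpos = n_items / inv_sum               # equals `weight` when weights match
    n_noise = (q - p) - span               # words in the cover that are not query items
    if n_noise < 0:                        # e.g. p == q with more than one item
        n_noise = span // 2 if unknown_noise is None else unknown_noise
    return cpos / (1 + n_noise)


adjacent = cover_rank(5, 6, 2)             # two items next to each other
length_3 = cover_rank(5, 7, 2)             # one noise word between the items
unknown = cover_rank(5, 5, 2)              # no usable positions, current fallback
inflated = cover_rank(5, 5, 2, unknown_noise=100)  # the proposed inflation
```

Under these assumptions the cover with unknown positions scores the same as the adjacent case and above the cover of length 3, which is the ordering objected to in the quoted message. Passing `unknown_noise=100` pushes it below both.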