Re: tsearch comments
From | Uros Gruber |
---|---|
Subject | Re: tsearch comments |
Date | |
Msg-id | 4521990343.20030128234833@sir-mag.com |
In reply to | Re: tsearch comments (Oleg Bartunov <oleg@sai.msu.su>) |
Responses | Re: tsearch comments ("eric@did-it.com" <eric@did-it.com>) |
List | pgsql-general |
Hi!

OpenFTS is great so far, but let me give an example. We are working on a directory engine and would like to rank the data we get from tsearch. There we have data such as page caption, description, keywords, URL, page content and so on. We also have another project where we search a completely different kind of data.

Full text search is very easy to use in this scenario, because everything is in the database and the work is done at the database level. The developer does not need to worry about how to index something. That is great, because you can simply say that this column is full-text indexed.

The second stage is ordering the data you get from tsearch, and that is where OpenFTS comes in. But then you have to write some middleware, which is fine, but we need to focus on other problems, not on middleware. Moving this to C would be great, but it is not a solution for all of us who just want to make our searches good. I think relkov and relor are good for a start and that things should go that way. I think everybody can very simply accomplish highlighting and headline generation once the results are ordered.

As I said in my earlier mail, to which Oleg asked "Could you elaborate this ?": I tried to make some changes in OpenFTS, especially in relkov and relkor. But I am not good at advanced C programming, so I spent a lot of time finding out what exactly the code does.

Here is my idea of what would be great, if it is possible at all, because I don't really know how pg works internally. Let's say we create a table where we want to use full text search:

    CREATE table .....
    ..
    mycolumn varchar,
    another_column varchar,
    ....
    fulltext(mycolumn,another_column)
    )

The system would then create all the necessary index tables, where the positions are saved when data is inserted. I don't know whether this can be done somewhere in the background so that the user doesn't actually see those tables, but that is not a problem.

Anybody can easily parse the search words in their own language, or use the OpenFTS functionality. I did it in PHP.
Once you have the search words, you pass them to an SQL query, something like this:

    SELECT mycolumn,another_column FROM mytable
    WHERE mycolumn @ 'search string' AND another_column @ 'search string';

This is done by tsearch, and we get the matching data, but not ordered by relevance. For that we add something like this:

    SELECT mycolumn,another_column,rank() AS sumofrank FROM mytable
    WHERE.......
    ORDER BY sumofrank

I'll write out the rank call here for better understanding:

    rank({mycolumn=>0.01},{another_column=>0.001},'search string') AS sumofrank

This reads as: mycolumn has a base weight of 0.01 and another_column 0.001, so if the search string is found at the beginning of another_column it is ranked lower than the same string found in the middle of mycolumn. The weights could be summed. This would make it possible to set which column is more important, not only in general but for every query we make. The syntax is only there to make it easier to understand what I'm trying to solve.

At this point the data is ordered, and we can then do highlighting and the rest in any language we want.

I hope everybody understands my idea. I would like to help; I just have to learn more from the code and about what is done internally with the data. I did some ranking in PHP, but it was not fast, because there was a lot of data and PHP is not as fast as C. But I got pretty good results and also a concept of how to rank things. A rule engine for ranking could also be built, but I think we first have to start with something trivial and simple, and move to the advanced things once that works. Say, checking whether the text is bold or in CAPS...

--
bye,
 Uros

Tuesday, January 28, 2003, 8:11:36 PM, you wrote:

OB> On Tue, 28 Jan 2003, Tomaz Borstnar wrote:

>> At 13:47 28.1.2003 +0300, Oleg Bartunov wrote the following message:
>> >We want to keep tsearch as simple as it's and now we just add
>> >better and friendly configurability. Do we need complicate tsearch ?
>>
>> Sometimes you need that because some other app is putting data into database.
>>

OB> So, you'll end up with something like OpenFTS, which was designed as
OB> *engine* to be integrated into other apps. The real problem is that
OB> OpenFTS is written in perl and porting to other languages is
OB> difficult task. new tsearch already has some features of OpenFTS and
OB> we're slowly moving to idea we should rewrite OpenFTS in 'C',
OB> so writing interfaces would be much simpler.

OB> There is major problem with moving ALL features of OpenFTS to tsearch
OB> we don't know how to resolve - generation of headlines, text fragments
OB> with hilighted query terms. Once we resolve that we could concentrate
OB> on tsearch with ranking support.
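
To make the per-column weighting proposed in this message more concrete, here is a minimal Python sketch of the idea. It is not tsearch or OpenFTS code. The function names `column_score` and `rank`, the sample rows, and especially the position factor are assumptions made for illustration; the message only says that each column has a base weight, that earlier matches should count for more, and that the weights could be summed. The sketch also handles a single search word only.

```python
def column_score(text, term, base_weight):
    """Score one column: base weight scaled by how early the term appears."""
    words = text.lower().split()
    term = term.lower()
    if term not in words:
        return 0.0
    position = words.index(term)  # 0 for the first word
    # Hypothetical position factor: 1.0 at the start of the text,
    # approaching 0.5 towards the end, so position never outweighs
    # a tenfold difference in base weight.
    position_factor = 1.0 / (1.0 + position / len(words))
    return base_weight * position_factor


def rank(row, weights, term):
    """Sum the per-column scores over all weighted columns."""
    return sum(
        column_score(row.get(column, ""), term, weight)
        for column, weight in weights.items()
    )


# Hypothetical sample data: the term sits in the middle of mycolumn in the
# first row and at the very beginning of another_column in the second row.
rows = [
    {"mycolumn": "cheap hotels and postgres hosting offers", "another_column": "travel"},
    {"mycolumn": "travel", "another_column": "postgres tips and tricks"},
]
weights = {"mycolumn": 0.01, "another_column": 0.001}

ranked = sorted(rows, key=lambda r: rank(r, weights, "postgres"), reverse=True)
```

With these weights the first row sorts ahead of the second, which matches the behaviour described above: a hit in the middle of mycolumn outranks a hit at the beginning of another_column. Because the weights are passed in per call, a different query can use a different column order of importance.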