Discussion: turning a tsvector without position in a weighted tsvector
If I convert a string to a tsvector by simply casting it (::tsvector), I get a vector without positions. tsvectors without positions don't have weights either.

I haven't found a way to turn a vector without weights/positions into a vector with weights/positions.

Is there a way to apply weights or add positions to tsvectors without positions? Is there any use case?

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
Ivan,

what's wrong with:

postgres=# select 'abc:1'::tsvector;
 tsvector
----------
 'abc':1

postgres=# select setweight('abc:1'::tsvector,'a');
 setweight
-----------
 'abc':1A

or just use to_tsvector() instead of casting?

Oleg

On Mon, 8 Feb 2010, Ivan Sergio Borgonovo wrote:

> If I convert a string to a tsvector just casting (::tsvector) I
> obtain a vector without positions.
> tsvectors without positions don't have weights too.
>
> I haven't found a way to turn a vector without weight/pos, into a
> vector with weight/pos.
>
> Is there a way to apply weight/add positions to tsvectors without
> positions?
> Is there any use-case?
>
>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
On Mon, 8 Feb 2010 23:01:45 +0300 (MSK)
Oleg Bartunov <oleg@sai.msu.su> wrote:

> Ivan,
>
> what's wrong with:
>
> postgres=# select 'abc:1'::tsvector;
>  tsvector
> ----------
>  'abc':1

Yes, you're right. I think I misplaced some quotes.
But still, once a vector has no positions, I can't add weights to it.

test=# select setweight('tano'::tsvector, 'A');
 setweight
-----------
 'tano'
(1 row)

test=# select setweight('tano:1'::tsvector, 'A');
 setweight
-----------
 'tano':1A
(1 row)

Since I'm writing some helpers to manipulate tsvectors, I was wondering whether
a) there is any reasonable use case for adding weights to vectors with no positions
b) I missed any obvious way to add weights to tsvectors that were initially without positions

thanks

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
border case ::tsvector vs. to_tsvector (was: turning a tsvector without position in a weighted tsvector)
From: Ivan Sergio Borgonovo
This is what I was after:

test=# select version();
                                            version
------------------------------------------------------------------------------------------------
 PostgreSQL 8.3.9 on x86_64-pc-linux-gnu, compiled by GCC gcc-4.3.real (Debian 4.3.2-1.1) 4.3.2

test=# select to_tsvector('pino gino');
    to_tsvector
-------------------
 'gino':2 'pino':1
(1 row)

test=# select 'pino gino'::tsvector;
   tsvector
---------------
 'gino' 'pino'
(1 row)

test=# select to_tsvector('pino gino') @@ 'gino:B'::tsquery;
 ?column?
----------
 f
(1 row)

test=# select to_tsvector('pino gino') @@ 'gino:D'::tsquery;
 ?column?
----------
 t
(1 row)

test=# select ('pino gino'::tsvector) @@ 'gino:B'::tsquery;
 ?column?
----------
 t
(1 row)

test=# select to_tsvector('pino:1B gino') @@ 'pino'::tsquery;
 ?column?
----------
 t
(1 row)

test=# select 'pino gino'::tsvector || to_tsvector('pino gino');
     ?column?
-------------------
 'gino':2 'pino':1
(1 row)

test=# select 'pino gino'::tsvector || 'pino gino'::tsvector;
   ?column?
---------------
 'gino' 'pino'
(1 row)

test=# select to_tsvector('pino gino') || to_tsvector('pino gino');
       ?column?
-----------------------
 'gino':2,4 'pino':1,3
(1 row)

test=# select 'pino gino'::tsvector || to_tsvector('gino tano');
         ?column?
--------------------------
 'gino':1 'pino' 'tano':2

test=# select setweight('pino gino'::tsvector || to_tsvector('gino tano'), 'A');
         setweight
----------------------------
 'gino':1A 'pino' 'tano':2A
(1 row)

So (even if it may sound obvious to many):
- tsvectors may be a mix of lexemes with and without weights
- a lexeme without a weight (which is != a lexeme with the default D weight) behaves as a lexeme with ALL weights: it matches a query term restricted to any weight
- you can't assign a weight to a lexeme without a position, and it would be hard to assign a position after a document has been parsed into a tsvector. So while in theory it could be reasonable to have lexemes with a weight and no position, in practice you'll have to assign meaningless positions if you'd like to assign a weight to a tsvector with no positions.
I still wonder whether it would be reasonable to write a function that forcibly assigns a position and a weight to vectors so they can be used with ts_rank.
I have some ideas about possible use cases, but I'm still unsure whether they are reasonable.
E.g. someone may want to save storage and CPU cycles by storing parts of documents in precomputed tsvectors with no weights, and then build up a search on merged tsvectors with weights using ts_rank.

OK... I'll try to finish my tsvector_to_tsquery function in a reasonable way first.

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
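
[Editor's sketch] The "forcibly assign a position and a weight" idea above could look roughly like the following. This is only a sketch under stated assumptions, not something proposed in the thread: the function name force_weight is hypothetical, it relies on unnest(tsvector), which is only available in PostgreSQL 9.6 and later (not in the 8.3.9 used above), and it assumes lexemes contain no backslashes, since quote_literal would then emit an E'...' string that the tsvector input parser does not accept. It gives every lexeme the meaningless position 1 and discards any positions the input already had.

```sql
-- Hypothetical helper, not part of PostgreSQL.
-- Assumes PostgreSQL 9.6+ (unnest(tsvector)) and lexemes without backslashes.
CREATE FUNCTION force_weight(v tsvector, w "char") RETURNS tsvector
LANGUAGE sql IMMUTABLE AS $$
  SELECT setweight(
           -- Rebuild the vector as text, giving each lexeme the dummy
           -- position 1; an empty input yields an empty tsvector.
           coalesce(string_agg(quote_literal(lexeme) || ':1', ' '), '')::tsvector,
           w)  -- setweight now has positions to attach the weight to
  FROM unnest(v);
$$;

-- Usage: weight a position-less vector, then rank it.
SELECT force_weight('pino gino'::tsvector, 'A');
SELECT ts_rank(force_weight('pino gino'::tsvector, 'A'), 'gino'::tsquery);
```

Because every lexeme ends up at the same dummy position, any ranking that depends on proximity (ts_rank_cd, for instance) would not be meaningful on such a vector; only the weights carry information.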