The following bug has been logged on the website:
Bug reference: 8354
Logged by: Alex Hill
Email address: alex@hill.net.au
PostgreSQL version: 9.2.4
Operating system: OS X 10.8.4 Mountain Lion
Description:
Hi all,
The docs for ts_rank_cd state:
"This function requires positional information in its input. Therefore it
will not work on "stripped" tsvector values -- it will always return zero."
However, if a tsvector contains some stripped lexemes and some non-stripped
ones, ts_rank_cd will rank extents that include the non-stripped values.
For example, this evaluates to zero as expected:
SELECT ts_rank_cd(strip(to_tsvector('text search')),
plainto_tsquery('text search'));
But this doesn't:
SELECT ts_rank_cd(to_tsvector('text') || strip(to_tsvector('search')),
plainto_tsquery('text search'));
I think this is a bug, if not in the code then in the documentation, which
isn't clear on what happens when stripped and positioned lexemes are mixed
in one tsvector.
I would prefer that stripped lexemes were completely ignored by ts_rank_cd:
my use case is using this as a fifth pseudo-weight, which matches a @@ query
but doesn't add to a ts_rank_cd ranking.
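To illustrate the use case, here is a minimal sketch. It assumes a hypothetical table docs with columns id, title and tags; those names are made up for the example and are not from any real schema:

```sql
-- Hypothetical schema: docs(id, title, tags). Names are illustrative only.
-- title keeps its positions (weight A); tags is stripped, so it can still
-- match with @@ but carries no positional information.
SELECT d.id,
       ts_rank_cd(d.v, q) AS rank
FROM (
    SELECT id,
           setweight(to_tsvector('english', title), 'A')
           || strip(to_tsvector('english', tags)) AS v
    FROM docs
) AS d,
     plainto_tsquery('english', 'text search') AS q
WHERE d.v @@ q
ORDER BY rank DESC;
```

With the behaviour I am asking for, a row matching only through tags would still pass the @@ filter, but the stripped lexemes would contribute nothing to rank.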
What do you think?
Cheers,
Alex