Further tsquery comparison fun:
=> SELECT q.qid, q.query, count(*) FROM doc.documents d, util.queries q
   WHERE d.words @@ q.query AND (q.query::text=$$'tender'$$) GROUP BY
   q.qid, q.query ;
 qid |  query   | count
-----+----------+-------
 195 | 'tender' |   374
 248 | 'tender' |   374
 257 | 'tender' |   374
 332 | 'tender' |   374
 401 | 'tender' |   374
 409 | 'tender' |   374
 519 | 'tender' |   374
 557 | 'tender' |   374
 736 | 'tender' |   374
 749 | 'tender' |   374
 869 | 'tender' |   374
 879 | 'tender' |   374
 926 | 'tender' |   374
(13 rows)

=> SELECT q.query, count(*) FROM doc.documents d, util.queries q WHERE
   d.words @@ q.query AND (q.query::text=$$'tender'$$) GROUP BY q.query ;
  query   | count
----------+-------
 'tender' |  1870
 'tender' |  1496
 'tender' |  1496
(3 rows)

It seems that the tsquery is remembering the shape of the original
query, even though it's been trimmed down to a single term.

=> SELECT q.query, min(qid), max(qid), count(*) FROM doc.documents d,
   util.queries q WHERE d.words @@ q.query AND (q.query::text=$$'tender'$$)
   GROUP BY q.query ;
  query   | min | max | count
----------+-----+-----+-------
 'tender' | 736 | 926 |  1870   (5 rows aggregated)
 'tender' | 401 | 557 |  1496   (4 rows aggregated)
 'tender' | 195 | 332 |  1496   (4 rows aggregated)
(3 rows)

=> SELECT * FROM util.queries WHERE qid IN (195,248,257,332,
   401,409,519,557,736,749,869,879,926) ORDER BY qid;
 qid |        words        |  query
-----+---------------------+----------
 195 | can & of & tenders  | 'tender'   (3 clauses)
 248 | tender & the & this | 'tender'   (3 clauses)
 257 | have & tender & for | 'tender'   (3 clauses)
 332 | for & tenders & of  | 'tender'   (3 clauses)
 401 | tender & with       | 'tender'   (2 clauses)
 409 | tenders & to        | 'tender'   (2 clauses)
 519 | tender & to         | 'tender'   (2 clauses)
 557 | tenders & be        | 'tender'   (2 clauses)
 736 | tenderer            | 'tender'   (1 clause)
 749 | tender              | 'tender'   (1 clause)
 869 | tender              | 'tender'   (1 clause)
 879 | tender              | 'tender'   (1 clause)
 926 | tender              | 'tender'   (1 clause)
(13 rows)
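
For anyone wanting to poke at this, here is a rough sketch of two follow-up
queries. They only reuse the tables and columns from the queries above; the
aliases query_text and same_tsquery are just illustrative names. The first
groups on the text form instead of the tsquery value. Since the WHERE clause
already pins the text form to one value, it cannot split into several groups
the way GROUP BY q.query does. The second compares a 3-clause row with a
1-clause row directly; I am assuming the plain = operator on tsquery is the
same comparison GROUP BY relies on, which has not been verified here.

```sql
-- Group on the text representation, not on the tsquery value itself.
SELECT q.query::text AS query_text, count(*)
FROM doc.documents d, util.queries q
WHERE d.words @@ q.query
  AND q.query::text = $$'tender'$$
GROUP BY q.query::text;

-- Compare a 3-clause row (qid 195) with a 1-clause row (qid 749) directly.
SELECT a.qid, b.qid, a.query = b.query AS same_tsquery
FROM util.queries a, util.queries b
WHERE a.qid = 195 AND b.qid = 749;
```
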

So - is this a bug, a feature, or a "feature"?

--
  Richard Huxton
  Archonet Ltd