Re: [HACKERS] Rethinking our fulltext phrase-search implementation

From Tom Lane
Subject Re: [HACKERS] Rethinking our fulltext phrase-search implementation
Date
Msg-id 18847.1482009337@sss.pgh.pa.us
In response to [HACKERS] Rethinking our fulltext phrase-search implementation  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
I wrote:
> It's worth noting that with these rules, phrase searches will act as
> though "!x" always matches somewhere; for instance "!a <-> !b" will match
> any tsvector.  I argue that this is not wrong, not even if the tsvector is
> empty: there could have been adjacent stopwords matching !a and !b in the
> original text.  Since we've adjusted the phrase matching rules to treat
> stopwords as unknown-but-present words in a phrase, I think this is
> consistent.  It's also pretty hard to assert this is wrong and at the same
> time accept "!a <-> b" matching b at the start of the document.

To clarify this point, I'm imagining that the patch would include
documentation changes like the attached.
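
As a rough illustration of the cases in the quoted paragraph, the queries below exercise "!a <-> b" and "!a <-> !b". This is only a sketch: the 'simple' configuration and the sample documents are assumptions chosen for illustration, and the comments restate what the proposed rules are described as doing, not output captured from a patched server.

```sql
-- Hypothetical sample data; the 'simple' configuration is assumed so that
-- no words are dropped as stopwords.

-- "!a <-> b": per the quoted text, b at the very start of the document is
-- accepted as a match, since nothing known to be "a" precedes it.
SELECT to_tsvector('simple', 'b c') @@ '!a <-> b'::tsquery AS not_a_then_b;

-- "!a <-> !b": per the quoted text, this is treated as matching any tsvector ...
SELECT to_tsvector('simple', 'a b') @@ '!a <-> !b'::tsquery AS both_negated;

-- ... including an empty one, on the argument that adjacent stopwords could
-- have occupied those positions in the original text.
SELECT ''::tsvector @@ '!a <-> !b'::tsquery AS empty_vector;
```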

            regards, tom lane

diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
index 67d0c34..464ce83 100644
*** a/doc/src/sgml/datatype.sgml
--- b/doc/src/sgml/datatype.sgml
*************** SELECT 'fat & rat & ! cat'::tsqu
*** 3959,3973 ****
          tsquery
  ------------------------
   'fat' & 'rat' & !'cat'
-
- SELECT '(fat | rat) <-> cat'::tsquery;
-               tsquery
- -----------------------------------
-  'fat' <-> 'cat' | 'rat' <-> 'cat'
  </programlisting>
-
-      The last example demonstrates that <type>tsquery</type> sometimes
-      rearranges nested operators into a logically equivalent formulation.
      </para>

      <para>
--- 3959,3965 ----
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index 2da7595..bc33a70 100644
*** a/doc/src/sgml/textsearch.sgml
--- b/doc/src/sgml/textsearch.sgml
*************** text @@ text
*** 323,328 ****
--- 323,330 ----
      at least one of its arguments must appear, while the <literal>!</> (NOT)
      operator specifies that its argument must <emphasis>not</> appear in
      order to have a match.
+     For example, the query <literal>fat & ! rat</> matches documents that
+     contain <literal>fat</> but not <literal>rat</>.
     </para>

     <para>
*************** SELECT phraseto_tsquery('the cats ate th
*** 377,382 ****
--- 379,401 ----
      then <literal>&</literal>, then <literal><-></literal>,
      and <literal>!</literal> most tightly.
     </para>
+
+    <para>
+     It's worth noticing that the AND/OR/NOT operators mean something subtly
+     different when they are within the arguments of a FOLLOWED BY operator
+     than when they are not, because then the position of the match is
+     significant.  Normally, <literal>!x</> matches only documents that do not
+     contain <literal>x</> anywhere.  But <literal>x <-> !y</>
+     matches <literal>x</> if it is not immediately followed by <literal>y</>;
+     an occurrence of <literal>y</> elsewhere in the document does not prevent
+     a match.  Another example is that <literal>x & y</> normally only
+     requires that <literal>x</> and <literal>y</> both appear somewhere in the
+     document, but <literal>(x & y) <-> z</> requires <literal>x</>
+     and <literal>y</> to match at the same place, immediately before
+     a <literal>z</>.  Thus this query behaves differently from <literal>x
+     <-> z & y <-> z</>, which would match a document
+     containing two separate sequences <literal>x z</> and <literal>y z</>.
+    </para>
    </sect2>

    <sect2 id="textsearch-intro-configurations">
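
The cases described in the new textsearch.sgml paragraph can be tried out with queries along these lines. This is a minimal sketch, not part of the patch: the tsvector literals are hypothetical sample data, and the comments paraphrase what the proposed documentation text says should happen, not verified results.

```sql
-- x <-> !y: x at position 1 is followed by z, not y.  The documentation text
-- says a y elsewhere in the document does not prevent a match.
SELECT 'x:1 z:2 y:3'::tsvector @@ 'x <-> !y'::tsquery AS x_not_followed_by_y;

-- (x & y) <-> z: x and y must match at the same place, immediately before z.
-- Here x and y sit at different positions, so the text implies no match.
SELECT 'x:1 z:2 y:3 z:4'::tsvector @@ '(x & y) <-> z'::tsquery AS same_place;

-- x <-> z & y <-> z: the two phrases may match at separate places, so the
-- text implies this document, with sequences "x z" and "y z", does match.
SELECT 'x:1 z:2 y:3 z:4'::tsvector @@ 'x <-> z & y <-> z'::tsquery AS separate_places;
```

Using tsvector literals with explicit positions keeps the example independent of any text search configuration.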

