Don't guess; read the docs:
http://www.postgresql.org/docs/8.4/interactive/textsearch-dictionaries.html#TEXTSEARCH-SIMPLE-DICTIONARY
12.6.2. Simple Dictionary
The simple dictionary template operates by converting the input token to lower case and checking it against a file of
stopwords. If it is found in the file then an empty array is returned, causing the token to be discarded. If not, the
lower-cased form of the word is returned as the normalized lexeme. Alternatively, the dictionary can be configured to
report non-stop-words as unrecognized, allowing them to be passed on to the next dictionary in the list.
d=# \dFd+ simple
                                          List of text search dictionaries
   Schema   |  Name  |     Template      | Init options |                        Description
------------+--------+-------------------+--------------+-----------------------------------------------------------
 pg_catalog | simple | pg_catalog.simple |              | simple dictionary: just lower case and check for stopword
By default it has no Init options, so it doesn't check for stopwords.
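
A quick way to check this directly is ts_lexize(), which runs a single token through one dictionary. The following is only a sketch: the dictionary name simple_with_stop is made up for illustration, and STOPWORDS = english assumes the stock english.stop file is installed under $SHAREDIR/tsearch_data.

```sql
-- Built-in "simple" dictionary: no stopword file, so "The" is only lower-cased.
SELECT ts_lexize('simple', 'The');                         -- {the}
SELECT to_tsvector('simple', 'The quick brown fox');       -- 'brown':3 'fox':4 'quick':2 'the':1

-- Hypothetical dictionary built on the same template, now with a stopword file.
-- Assumes english.stop exists in $SHAREDIR/tsearch_data.
CREATE TEXT SEARCH DICTIONARY simple_with_stop (
    TEMPLATE = pg_catalog.simple,
    STOPWORDS = english
);

SELECT ts_lexize('simple_with_stop', 'The');               -- {}  (stopword, discarded)
SELECT ts_lexize('simple_with_stop', 'Postgres');          -- {postgres}

-- Report non-stop-words as unrecognized so the next dictionary in the list sees them.
ALTER TEXT SEARCH DICTIONARY simple_with_stop ( Accept = false );

SELECT ts_lexize('simple_with_stop', 'Postgres');          -- NULL (passed on)
```

An empty array means the token was recognized as a stopword and dropped; NULL means the dictionary did not recognize the token at all.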
On Thu, 22 Jul 2010, Andreas Joseph Krogh wrote:
> On 07/22/2010 06:27 PM, John Gage wrote:
>> The easiest way to look at this is to give the simple dictionary a document
>> with to_tsvector() and see if stopwords pop out.
>>
>> In my experience they do. In my experience, the simple dictionary just
>> breaks the document down into the space etc. separated words in the
>> document. It doesn't analyze further.
>
> That's my experience too, I just want to make sure it doesn't actually have
> any stopwords which I've missed. Trying many phrases and checking for
> stopwords isn't really proving anything.
>
> Can anybody confirm the "simple" dict. only lowercases the words and
> "uniques" them?
>
>
Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83