Discussion: TSearch2: Problems with compound words and stop words

TSearch2: Problems with compound words and stop words

From
Timo Haberkern
Date:
Hi there,

 I am having some trouble with my TSearch2 installation. I set it up as
 described in

 http://www.sai.msu.su/~megera/oddmuse/index.cgi/Tsearch_V2_compound_words

 I used the German myspell dictionary from
 http://lingucomponent.openoffice.org/spell_dic.html and converted it with
 my2ispell.

 Nearly everything works fine so far, except for two problems:

 1.) The stop word file seems to be ignored. If I run
 SELECT to_tsvector('default_german', 'ein Haus') I get

 'ein':1 'haus':2

 "ein" should be a stop word for German (and it is listed in the german.stop
 file as well).


 2.) The compound words feature doesn't work either. I have tried a lot of
 words, e.g. "Fehlermeldung". With
 SELECT to_tsvector('default_german', 'Fehlermeldung')
 I only get
 'fehlermeldung':1 but I would expect "fehler" and "meldung" as separate
 entries. Is there anything wrong with the dictionary or my configuration?


 My current configuration:

 pg_ts_cfg:

 default    default    C
 default_russian    default    ru_RU.KOI8-R
 simple    default    NULL
 default_german    default    de_DE.ISO8859-1

 pg_ts_cfgmap:

 default_german    host    {simple}
 default_german    hword    {simple}
 default_german    int    {simple}
 default_german    nlhword    {simple}
 default_german    nlpart_hword    {simple}
 default_german    nlword    {simple}
 default_german    part_hword    {simple}
 default_german    sfloat    {simple}
 default_german    uint    {simple}
 default_german    uri    {simple}
 default_german    url    {simple}
 default_german    version    {simple}
 default_german    word    {simple}
 default_german    lpart_hword    {de_ispell,german_snowball}
 default_german    lword    {de_ispell,german_snowball}
 default_german    lhword    {de_ispell,german_snowball}


 pg_ts_dict:

 de_ispell       | 17166 | DictFile="/usr/local/pgsql/share/contrib/dictonary/german.dict", AffFile="/usr/local/pgsql/share/contrib/dictonary/german.aff", StopFile="/usr/local/pgsql/share/contrib/dictonary/german.stop" | 17167 | NULL
 german_snowball | 17357 | NULL | 17162 | Snowball stemmer for german




 Can anyone help me?

 regards

 Timo
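
For readers following the configuration above: the cfgmap sends lword tokens through de_ispell first and german_snowball second. As a rough mental model only (this is not the tsearch2 implementation), each dictionary in such a chain either returns lexemes, returns an empty list for a stop word, or declines so that the next dictionary is tried. The Python sketch below models that with toy dictionaries. The word lists, the stop word set, and all function names are illustrative assumptions, not taken from the german.* files.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

# A dictionary returns a list of lexemes (word recognized), an empty list
# (stop word, drop it), or None (unknown word, try the next dictionary).
Dictionary = Callable[[str], Optional[List[str]]]


def make_ispell_like(known: Dict[str, List[str]], stop_words: Set[str]) -> Dictionary:
    """Toy stand-in for an ispell-style dictionary (hypothetical helper)."""
    def lexize(token: str) -> Optional[List[str]]:
        word = token.lower()
        if word in stop_words:
            return []            # stop word: recognized, but produces nothing
        if word in known:
            return list(known[word])
        return None              # unknown: let the next dictionary decide
    return lexize


def make_stemmer_like(stop_words: Set[str]) -> Dictionary:
    """Toy stand-in for a stemmer; it accepts every word and does no real stemming."""
    def lexize(token: str) -> Optional[List[str]]:
        word = token.lower()
        return [] if word in stop_words else [word]
    return lexize


def run_chain(token: str, chain: List[Dictionary]) -> List[str]:
    """Return the result of the first dictionary that does not decline."""
    for dictionary in chain:
        result = dictionary(token)
        if result is not None:
            return result
    return [token.lower()]


def to_vector(text: str, chain: List[Dictionary]) -> List[Tuple[str, int]]:
    """Collect (lexeme, position) pairs; stop words still consume a position."""
    pairs = []
    for position, token in enumerate(text.split(), start=1):
        for lexeme in run_chain(token, chain):
            pairs.append((lexeme, position))
    return pairs


STOP = {"ein", "der", "die", "das"}  # assumed sample, not the real german.stop
chain = [
    make_ispell_like({"haus": ["haus"], "springt": ["springen"]}, STOP),
    make_stemmer_like(STOP),
]
print(to_vector("ein Haus", chain))
```

Under this model "ein" would be dropped by the first dictionary and only "haus" at position 2 would remain, which is the result the message above expects. If a stop word still shows up, one thing to check is whether the dictionary that sees the token first has actually loaded its stop list.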


Re: TSearch2: Problems with compound words and stop words

From
Oleg Bartunov
Date:
Timo,

I have forwarded your message to the openfts mailing list.
Also, could you check whether the locale settings are correct for your
database, and tell us which dictionary you downloaded?

     Oleg
On Fri, 5 Nov 2004, Timo Haberkern wrote:

> [...]

     Regards,
         Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

Re: TSearch2: Problems with compound words and stop words

From
Oleg Bartunov
Date:
Timo,

Please check that you applied the patch for compound word support.
Which version of PostgreSQL are you running?
Does the ispell dictionary work for non-compound words?

     Oleg


Re: TSearch2: Problems with compound words and stop words

From
Timo Haberkern
Date:
Oleg,

I use TSearch2 with PostgreSQL 7.4.6, and I applied the compound word
patch yesterday. The configuration changed a little, but the result
is the same: I get no compound words. I'm using the locale de_DE with
encoding ISO8859-1 for the database.

I think ispell is working correctly apart from the compound words. If I try

SELECT lexize('de_ispell', 'springt')

I get

lexize
{springen,springen}

which seems correct.


But SELECT lexize('de_ispell', 'Autobahn')

results in

lexize
{autobahn}

I would expect {auto,bahn,autobahn}

The new configuration after the compound word patch:


pg_ts_dict:

dict_name | dict_init | dict_initoption | dict_lexize | dict_comment
simple | dex_init(text) | NULL | dex_lexize(internal,internal,integer) | Simple example of dictionary.
en_stem | snb_en_init(text) | /usr/local/pgsql/share/contrib/english.stop | snb_lexize(internal,internal,integer) | English Stemmer. Snowball.
ru_stem | snb_ru_init(text) | /usr/local/pgsql/share/contrib/russian.stop | snb_lexize(internal,internal,integer) | Russian Stemmer. Snowball.
ispell_template | spell_init(text) | NULL | spell_lexize(internal,internal,integer) | ISpell interface. Must have .dict and .aff files
synonym | syn_init(text) | NULL | syn_lexize(internal,internal,integer) | Example of synonym dictionary
de_ispell | spell_init(text) | DictFile="/usr/local/pgsql/share/contrib/dictonary/german_comb.dict", AffFile="/usr/local/pgsql/share/contrib/dictonary/german_comb.aff", StopFile="/usr/local/pgsql/share/contrib/dictonary/german.stop" | spell_lexize(internal,internal,integer) | NULL



Timo



Re: TSearch2: Problems with compound words and stop words

From
Oleg Bartunov
Date:
On Fri, 5 Nov 2004, Timo Haberkern wrote:

> Oleg,
>
> I use TSearch2 with PostgreSQL 7.4.6, and I applied the compound word patch
> yesterday. The configuration changed a little, but the result is the same:
> I get no compound words. I'm using the locale de_DE with encoding ISO8859-1
> for the database.
>
> I think ispell is working correctly apart from the compound words. If I try
>
> SELECT lexize('de_ispell', 'springt')
>
> I get
>
> lexize
> {springen,springen}
>
> which seems correct.
>
>
> But SELECT lexize('de_ispell', 'Autobahn')
>
> results in
>
> lexize
> {autobahn}
>
> I would expect {auto,bahn,autobahn}

Hmm, have you checked that 'Autobahn' is in the ispell dictionary? Does the
dictionary you used support the 'Z' flag for compound words?


>
> The new configuration after the compound word patch:
>

It seems you overestimate my capabilities :)



     Regards,
         Oleg

Re: TSearch2: Problems with compound words and stop words

From
Timo Haberkern
Date:
Sorry for the late answer, I was on holiday.

See my remarks below.


Oleg Bartunov wrote:

> On Fri, 5 Nov 2004, Timo Haberkern wrote:
>
>> But a SELECT lexize('de_ispell', 'Autobahn')
>>
>> results in
>>
>> lexize
>> {autobahn}
>>
>> I would expect {auto,bahn,autobahn}
>
>
> Hmm, have you checked that 'Autobahn' is in the ispell dictionary? Does the
> dictionary you used support the 'Z' flag for compound words?

Autobahn is in the ispell dictionary. What does an ispell dictionary
need in order to support the Z flag?


Timo
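
One way to approach the Z-flag question is to inspect the converted files directly. In PostgreSQL's ispell support as documented for later releases, the affix file declares the compound flag with a line such as "compoundwords controlled z", and dictionary entries that may take part in compounds carry that flag after the slash. Whether the 7.4 compound patch and the my2ispell output follow exactly that convention is an assumption here, as is the flag letter. The Python sketch below uses made-up sample lines and hypothetical function names; it reports the declared flag and lists the entries that carry it.

```python
import re
from typing import Iterable, List, Optional


def declared_compound_flag(aff_lines: Iterable[str]) -> Optional[str]:
    """Return the flag named in a 'compoundwords controlled X' line, if any."""
    pattern = re.compile(r"\s*compoundwords\s+controlled\s+(\S)", re.IGNORECASE)
    for line in aff_lines:
        match = pattern.match(line)
        if match:
            return match.group(1)
    return None


def entries_with_flag(dict_lines: Iterable[str], flag: str) -> List[str]:
    """Return the words whose flag list (the part after '/') contains the flag."""
    words = []
    for line in dict_lines:
        word, slash, flags = line.strip().partition("/")
        if slash and flag in flags:
            words.append(word)
    return words


# Made-up sample content; real files would be read with something like
# open(path, encoding="latin-1") to match the ISO8859-1 database encoding.
sample_aff = ["# sample affix file", "compoundwords controlled z", "prefixes"]
sample_dict = ["auto/z", "bahn/zN", "autobahn/N", "haus"]

flag = declared_compound_flag(sample_aff)
compound_words = entries_with_flag(sample_dict, flag) if flag else []
print(flag, compound_words)
```

If no flag is declared, or no entries for words like "auto" and "bahn" carry it, that would be consistent with lexize returning only {autobahn} as reported above.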





>
>
>>
>> The new configuration after the compound word patch:
>>
>
> Seems you overestimate my capabilities :)
>
>
>>
>> dict_name | dict_init | dict_initoption | dict_lexize | dict_comment
>>
>> simple | dex_init(text) | NULL | dex_lexize(internal,internal,integer) | Simple example of dictionary.
>> en_stem | snb_en_init(text) | /usr/local/pgsql/share/contrib/english.stop | snb_lexize(internal,internal,integer) | English Stemmer. Snowball.
>> ru_stem | snb_ru_init(text) | /usr/local/pgsql/share/contrib/russian.stop | snb_lexize(internal,internal,integer) | Russian Stemmer. Snowball.
>> ispell_template | spell_init(text) | NULL | spell_lexize(internal,internal,integer) | ISpell interface. Must have .dict and .aff files
>> synonym | syn_init(text) | NULL | syn_lexize(internal,internal,integer) | Example of synonym dictionary
>> de_ispell | spell_init(text) |
>> DictFile="/usr/local/pgsql/share/contrib/dictonary/german_comb.dict",
>> AffFile="/usr/local/pgsql/share/contrib/dictonary/german_comb.aff",
>> StopFile="/usr/local/pgsql/share/contrib/dictonary/german.stop" |
>> spell_lexize(internal,internal,integer) | NULL
>>
>>
>>
>> Timo

Re: TSearch2: Problems with compound words and stop words

From
Oleg Bartunov
Date:
On Wed, 17 Nov 2004, Timo Haberkern wrote:

> sorry for the late answer, i was on holyday,
>
> see my remarks below
>
>
> Oleg Bartunov wrote:
>
>> On Fri, 5 Nov 2004, Timo Haberkern wrote:
>>
>>> Oleg,
>>>
>>> i use TSearch2 with PostgreSQL 7.4.6 and i applied the compoundword patch
>>> yesterday. The configuration changed a little bit but the result is the
>>> same. I get no compound words. I'm using the locale de_DE with encoding
>>> ISO8859-1 for the database.
>>>
>>> I think i spell is working correctly except the compound words. If i try
>>>
>>> SELECT lexize('de_ispell', 'springt')
>>>
>>> i get
>>>
>>> lexize
>>> {springen,springen}
>>>
>>> which seems correct.
>>>
>>>
>>> But a SELECT lexize('de_ispell', 'Autobahn')
>>>
>>> results in
>>>
>>> lexize
>>> {autobahn}
>>>
>>> i would expect {auto,bahn, autobahn}
>>
>>
>> Hmm, have you checked 'Autobahn' in ispell dictionary ? Does dictionary you
>> used supports 'Z' flag for compound words ?
>
> Autobahn is in the ispell dictionary. What does a ispell dictionary  need to
> support the Z flag???
>

Try ispell -C Autobahn
and search for 'compound' in 'man ispell' for details.
There is a tsearch2 problem only if ispell *does* split the word correctly
while tsearch2 doesn't. Otherwise you need to find an ispell dictionary for
German that supports compound words, or create one yourself. You may want to
consult monzilla.net:
http://staff.science.uva.nl/~christof/monzilla/research/project-dr.html



     Regards,
         Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

Re: TSearch2: Problems with compound words and stop words

From
Oleg Bartunov
Date:
Timo,

take a look at the .aff file and search for 'compoundwords'.
The German ispell file I got from http://j3e.de/ispell/igerman98/ has no
support for compound words: 'compoundwords off'

Norwegian, for example, has:

compoundwords controlled z

compoundmin 4
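
The check described above can be sketched in a few lines of Python. This is only an illustration, not part of tsearch2 or ispell: the function name compound_settings is made up here, and the parsing assumes the directives appear one per line exactly as in the two examples quoted above ('compoundwords off', 'compoundwords controlled z', 'compoundmin 4').

```python
def compound_settings(aff_text):
    """Collect the compound-word directives from the text of an ispell .aff file.

    Hypothetical helper for illustration; it only recognises the
    'compoundwords' and 'compoundmin' lines shown in this thread.
    """
    settings = {}
    for raw in aff_text.splitlines():
        line = raw.strip()
        # Skip blank lines and full-line comments.
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        key = parts[0].lower()
        if key == "compoundwords":
            # e.g. "compoundwords off" or "compoundwords controlled z"
            settings["mode"] = parts[1].lower() if len(parts) > 1 else None
            settings["flag"] = parts[2] if len(parts) > 2 else None
        elif key == "compoundmin" and len(parts) > 1:
            # Minimum length of a compound part, e.g. "compoundmin 4"
            settings["min_length"] = int(parts[1])
    return settings


if __name__ == "__main__":
    norwegian = "compoundwords controlled z\n\ncompoundmin 4\n"
    german = "compoundwords off\n"
    print(compound_settings(norwegian))
    print(compound_settings(german))
```

If the result has mode 'off' (or no 'compoundwords' line is found at all), the dictionary presumably cannot split compounds, whatever the tsearch2 configuration looks like. To run it against a real file, pass the file's contents, for example the german_comb.aff mentioned earlier in the thread, read with the ISO8859-1 encoding.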


     Oleg


On Wed, 17 Nov 2004, Oleg Bartunov wrote:

> On Wed, 17 Nov 2004, Timo Haberkern wrote:
>
>> sorry for the late answer, i was on holyday,
>>
>> see my remarks below
>>
>>
>> Oleg Bartunov wrote:
>>
>>> On Fri, 5 Nov 2004, Timo Haberkern wrote:
>>>
>>>> Oleg,
>>>>
>>>> i use TSearch2 with PostgreSQL 7.4.6 and i applied the compoundword patch
>>>> yesterday. The configuration changed a little bit but the result is the
>>>> same. I get no compound words. I'm using the locale de_DE with encoding
>>>> ISO8859-1 for the database.
>>>>
>>>> I think i spell is working correctly except the compound words. If i try
>>>>
>>>> SELECT lexize('de_ispell', 'springt')
>>>>
>>>> i get
>>>>
>>>> lexize
>>>> {springen,springen}
>>>>
>>>> which seems correct.
>>>>
>>>>
>>>> But a SELECT lexize('de_ispell', 'Autobahn')
>>>>
>>>> results in
>>>>
>>>> lexize
>>>> {autobahn}
>>>>
>>>> i would expect {auto,bahn, autobahn}
>>>
>>>
>>> Hmm, have you checked 'Autobahn' in ispell dictionary ? Does dictionary
>>> you used supports 'Z' flag for compound words ?
>>
>> Autobahn is in the ispell dictionary. What does a ispell dictionary  need
>> to support the Z flag???
>>
>
> Try ispell -C Autobahn search 'compound' in  'man ispell' for details. the
> problem exists only if ispell *does* splits word correctly while tsearch2
> doesn't. You should find correct ispell dictionary for german or create it
> yourself. You may consult monzilla.net
> http://staff.science.uva.nl/~christof/monzilla/research/project-dr.html
>
>
>>
>> Timo
>>
>>
>>
>>
>>
>>>
>>>
>>>>
>>>> The new configuration after the compound word patch:
>>>>
>>>
>>> Seems you overestimate my capabilities :)
>>>
>>>
>>>>
>>>> Actions     dict_name
>>>>
<http://www.rotex-service.com/phppgadmin/display.php?database=selina_rotex&schema=public&table=pg_ts_dict&return_url=tblproperties.php%3Fdatabase%3Dselina_rotex%26amp%3Bschema%3Dpublic%26table%3Dpg_ts_dict&return_desc=Back&sortkey=2&sortdir=asc&strings=expanded&page=1>

>>>> dict_init
>>>>
<http://www.rotex-service.com/phppgadmin/display.php?database=selina_rotex&schema=public&table=pg_ts_dict&return_url=tblproperties.php%3Fdatabase%3Dselina_rotex%26amp%3Bschema%3Dpublic%26table%3Dpg_ts_dict&return_desc=Back&sortkey=3&sortdir=asc&strings=expanded&page=1>

>>>> dict_initoption
>>>>
<http://www.rotex-service.com/phppgadmin/display.php?database=selina_rotex&schema=public&table=pg_ts_dict&return_url=tblproperties.php%3Fdatabase%3Dselina_rotex%26amp%3Bschema%3Dpublic%26table%3Dpg_ts_dict&return_desc=Back&sortkey=4&sortdir=asc&strings=expanded&page=1>

>>>> dict_lexize
>>>>
<http://www.rotex-service.com/phppgadmin/display.php?database=selina_rotex&schema=public&table=pg_ts_dict&return_url=tblproperties.php%3Fdatabase%3Dselina_rotex%26amp%3Bschema%3Dpublic%26table%3Dpg_ts_dict&return_desc=Back&sortkey=5&sortdir=asc&strings=expanded&page=1>

>>>> dict_comment
>>>>
<http://www.rotex-service.com/phppgadmin/display.php?database=selina_rotex&schema=public&table=pg_ts_dict&return_url=tblproperties.php%3Fdatabase%3Dselina_rotex%26amp%3Bschema%3Dpublic%26table%3Dpg_ts_dict&return_desc=Back&sortkey=6&sortdir=asc&strings=expanded&page=1>

>>>> Edit
>>>>
<http://www.rotex-service.com/phppgadmin/display.php?action=confeditrow&strings=expanded&page=1&key%5Bdict_name%5D=simple&database=selina_rotex&schema=public&table=pg_ts_dict&return_url=tblproperties.php%3Fdatabase%3Dselina_rotex%26amp%3Bschema%3Dpublic%26table%3Dpg_ts_dict&return_desc=Back&sortkey=&sortdir=>

>>>> Delete
>>>>
<http://www.rotex-service.com/phppgadmin/display.php?action=confdelrow&strings=expanded&page=1&key%5Bdict_name%5D=simple&database=selina_rotex&schema=public&table=pg_ts_dict&return_url=tblproperties.php%3Fdatabase%3Dselina_rotex%26amp%3Bschema%3Dpublic%26table%3Dpg_ts_dict&return_desc=Back&sortkey=&sortdir=>

>>>> simple     dex_init(text)     /NULL/
>>>> dex_lexize(internal,internal,integer) Simple example of dictionary.
>>>> Edit
>>>>
<http://www.rotex-service.com/phppgadmin/display.php?action=confeditrow&strings=expanded&page=1&key%5Bdict_name%5D=en_stem&database=selina_rotex&schema=public&table=pg_ts_dict&return_url=tblproperties.php%3Fdatabase%3Dselina_rotex%26amp%3Bschema%3Dpublic%26table%3Dpg_ts_dict&return_desc=Back&sortkey=&sortdir=>

>>>> Delete
>>>>
<http://www.rotex-service.com/phppgadmin/display.php?action=confdelrow&strings=expanded&page=1&key%5Bdict_name%5D=en_stem&database=selina_rotex&schema=public&table=pg_ts_dict&return_url=tblproperties.php%3Fdatabase%3Dselina_rotex%26amp%3Bschema%3Dpublic%26table%3Dpg_ts_dict&return_desc=Back&sortkey=&sortdir=>

>>>> en_stem     snb_en_init(text) /usr/local/pgsql/share/contrib/english.stop
>>>> snb_lexize(internal,internal,integer)     English Stemmer. Snowball.
>>>> Edit
>>>>
<http://www.rotex-service.com/phppgadmin/display.php?action=confeditrow&strings=expanded&page=1&key%5Bdict_name%5D=ru_stem&database=selina_rotex&schema=public&table=pg_ts_dict&return_url=tblproperties.php%3Fdatabase%3Dselina_rotex%26amp%3Bschema%3Dpublic%26table%3Dpg_ts_dict&return_desc=Back&sortkey=&sortdir=>

>>>> Delete
>>>>
<http://www.rotex-service.com/phppgadmin/display.php?action=confdelrow&strings=expanded&page=1&key%5Bdict_name%5D=ru_stem&database=selina_rotex&schema=public&table=pg_ts_dict&return_url=tblproperties.php%3Fdatabase%3Dselina_rotex%26amp%3Bschema%3Dpublic%26table%3Dpg_ts_dict&return_desc=Back&sortkey=&sortdir=>

>>>> ru_stem     snb_ru_init(text) /usr/local/pgsql/share/contrib/russian.stop
>>>> snb_lexize(internal,internal,integer)     Russian Stemmer. Snowball.
>>>> Edit
>>>>
<http://www.rotex-service.com/phppgadmin/display.php?action=confeditrow&strings=expanded&page=1&key%5Bdict_name%5D=ispell_template&database=selina_rotex&schema=public&table=pg_ts_dict&return_url=tblproperties.php%3Fdatabase%3Dselina_rotex%26amp%3Bschema%3Dpublic%26table%3Dpg_ts_dict&return_desc=Back&sortkey=&sortdir=>

>>>> Delete
>>>>
<http://www.rotex-service.com/phppgadmin/display.php?action=confdelrow&strings=expanded&page=1&key%5Bdict_name%5D=ispell_template&database=selina_rotex&schema=public&table=pg_ts_dict&return_url=tblproperties.php%3Fdatabase%3Dselina_rotex%26amp%3Bschema%3Dpublic%26table%3Dpg_ts_dict&return_desc=Back&sortkey=&sortdir=>

>>>> ispell_template     spell_init(text)     /NULL/
>>>> spell_lexize(internal,internal,integer)     ISpell interface. Must have
>>>> .dict and .aff files
>>>> Edit
>>>>
<http://www.rotex-service.com/phppgadmin/display.php?action=confeditrow&strings=expanded&page=1&key%5Bdict_name%5D=synonym&database=selina_rotex&schema=public&table=pg_ts_dict&return_url=tblproperties.php%3Fdatabase%3Dselina_rotex%26amp%3Bschema%3Dpublic%26table%3Dpg_ts_dict&return_desc=Back&sortkey=&sortdir=>

>>>> Delete
>>>>
<http://www.rotex-service.com/phppgadmin/display.php?action=confdelrow&strings=expanded&page=1&key%5Bdict_name%5D=synonym&database=selina_rotex&schema=public&table=pg_ts_dict&return_url=tblproperties.php%3Fdatabase%3Dselina_rotex%26amp%3Bschema%3Dpublic%26table%3Dpg_ts_dict&return_desc=Back&sortkey=&sortdir=>

>>>> synonym     syn_init(text)     /NULL/
>>>> syn_lexize(internal,internal,integer) Example of synonym dictionary
>>>> Edit
>>>>
<http://www.rotex-service.com/phppgadmin/display.php?action=confeditrow&strings=expanded&page=1&key%5Bdict_name%5D=de_ispell&database=selina_rotex&schema=public&table=pg_ts_dict&return_url=tblproperties.php%3Fdatabase%3Dselina_rotex%26amp%3Bschema%3Dpublic%26table%3Dpg_ts_dict&return_desc=Back&sortkey=&sortdir=>

>>>> Delete
>>>>
<http://www.rotex-service.com/phppgadmin/display.php?action=confdelrow&strings=expanded&page=1&key%5Bdict_name%5D=de_ispell&database=selina_rotex&schema=public&table=pg_ts_dict&return_url=tblproperties.php%3Fdatabase%3Dselina_rotex%26amp%3Bschema%3Dpublic%26table%3Dpg_ts_dict&return_desc=Back&sortkey=&sortdir=>

>>>> de_ispell     spell_init(text)
>>>> DictFile="/usr/local/pgsql/share/contrib/dictonary/german_comb.dict",
>>>> AffFile="/usr/local/pgsql/share/contrib/dictonary/german_comb.aff",
>>>> StopFile="/usr/local/pgsql/share/contrib/dictonary/german.stop"
>>>> spell_lexize(internal,internal,integer)     /NULL/
>>>>
>>>>
>>>>
>>>> Timo
>>>>
>>>>
>>>> Oleg Bartunov wrote:
>>>>
>>>>> Timo,
>>>>>
>>>>> please, check you apply patch for compound word support.
>>>>> What is version of postgresql ?
>>>>> Does ispell dict works for non-compound words ?
>>>>>
>>>>>     Oleg
>>>>>
>>>>> On Fri, 5 Nov 2004, Timo Haberkern wrote:
>>>>>
>>>>>> Hi there,
>>>>>>
>>>>>> i have some troubles with my TSearch2 Installation. I have done this
>>>>>> installation as described in
>>>>>> http://www.sai.msu.su/~megera/oddmuse/index.cgi/Tsearch_V2_compound_words
>>>>>> <http://www.sai.msu.su/%7Emegera/oddmuse/index.cgi/Tsearch_V2_compound_words>
>>>>>> I used the german myspell dictionary from
>>>>>> http://lingucomponent.openoffice.org/spell_dic.html and converted it
>>>>>> with
>>>>>> my2ispell
>>>>>>
>>>>>> Nearly everything is working fine so far, except two problems:
>>>>>>
>>>>>> 1.) The stopword-file seems to be ignored: If i try it with SELECT
>>>>>> to_tsvector("default_german", "ein Haus") i get     "ein":1 "haus":2
>>>>>>
>>>>>> ein should be a Stopword for german (and is defined the german.stop
>>>>>> file as
>>>>>> well)
>>>>>>
>>>>>> 2.) The compound words feature doesn"t work too. I have tried a lot of
>>>>>> words,
>>>>>> i.e. "Fehlermeldung" with SELECT to_tsvector("default_german",
>>>>>> "Fehlermeldung")
>>>>>> i only get
>>>>>> "fehlermeldung":1 but i would expect "fehler" and "meldung" as
>>>>>> seperated
>>>>>> entries. Is there anything wrong with the dictonary or my
>>>>>> configuration?
>>>>>>
>>>>>>
>>>>>> My current configuration:
>>>>>>
>>>>>> pg_ts_cfg:
>>>>>>
>>>>>> default    default    C
>>>>>> default_russian    default    ru_RU.KOI8-R
>>>>>> simple    default    NULL
>>>>>> default_german    default    de_DE.ISO8859-1
>>>>>>
>>>>>> pg_ts_cfgmap:
>>>>>>
>>>>>> default_german    host    {simple}
>>>>>> default_german    hword    {simple}
>>>>>> default_german    int    {simple}
>>>>>> default_german    nlhword    {simple}
>>>>>> default_german    nlpart_hword    {simple}
>>>>>> default_german    nlword    {simple}
>>>>>> default_german    part_hword    {simple}
>>>>>> default_german    sfloat    {simple}
>>>>>> default_german    uint    {simple}
>>>>>> default_german    uri    {simple}
>>>>>> default_german    url    {simple}
>>>>>> default_german    version    {simple}
>>>>>> default_german    word    {simple}
>>>>>> default_german    lpart_hword    {de_ispell,german_snowball}
>>>>>> default_german    lword    {de_ispell,german_snowball}
>>>>>> default_german    lhword    {de_ispell,german_snowball}
>>>>>>
>>>>>>
>>>>>> pg_ts_dict:
>>>>>>
>>>>>> de_ispell | 17166 |
>>>>>> DictFile="/usr/local/pgsql/share/contrib/dictonary/german.dict",
>>>>>> AffFile="/usr/local/pgsql/share/contrib/dictonary/german.aff",
>>>>>> StopFile="/usr/local/pgsql/share/contrib/dictonary/german.stop"
>>>>>> | 17167 | NULL
>>>>>> german_snowball | 17357 | NULL | 17162 | Snowball stemmer for german
>>>>>>
>>>>>>
>>>>>>
>>>>>> Can anyone help me?
>>>>>>
>>>>>> regards
>>>>>>
>>>>>> Timo
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>>
>>>
>>
>
>
>

     Regards,
         Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83