Discussion: Tsearch2 - spanish
Hi
I have installed postgresql-8.2.4 and tsearch2 with a Spanish dictionary.
My problem is:
        prueba=# select to_tsvector('espanol','melón');
        ERROR:  Affix parse error at 506 line
And if I execute:
        prueba=# select lexize('sp','melón');
         lexize
        ---------
         {melon}
        (1 row)
I tried many dictionaries with the same result. I also changed the
encoding of the .aff and .dict files (from latin1 to utf8 and from utf8 to
iso88591) and got the same error.
Where can I look to resolve this problem?
My dictionary has this at line 506:
flag *J:        # isimo
    E   > -E, ÍSIMO     # grande grandísimo
    E   > -E, ÍSIMOS    # grande grandísimos
    E   > -E, ÍSIMA     # grande grandísima
    E   > -E, ÍSIMAS    # grande grandísimas
    O   > -O, ÍSIMO     # tonto tontísimo
    O   > -O, ÍSIMA     # tonto tontísima
    O   > -O, ÍSIMOS    # tonto tontísimos
    O   > -O, ÍSIMAS    # tonto tontísimas
    L   > ÍSIMO # formal formalísimo
    L   > ÍSIMA # formal formalísima
    L   > ÍSIMOS        # formal formalísimos
    L   > ÍSIMAS        # formal formalísimas
If I remove the "Í" I don't get the error, but the lexeme is incorrect.
I saw this post:
http://archives.postgresql.org/pgsql-general/2007-07/msg00888.php
Maybe Marcelo has solved the problem. Can you tell me your
tsearch2 configuration?
Best regards
PS: I need to resolve this for my work.
			
>         prueba=# select to_tsvector('espanol','melón');
>         ERROR:  Affix parse error at 506 line
and
>         prueba=# select lexize('sp','melón');
>          lexize
>         ---------
>          {melon}
>         (1 row)
Looks very strange. Can you provide the list of dictionaries and the configuration map?
> I tried many dictionaries with the same results. Also I change the
> codeset of files :aff and dict (from "latin1 to utf8" and "utf8 to
> iso88591") and got the same error
>
> where  can I investigate for resolve about this problem?
>
> My dictionary at 506 line had:
Where did you get this file? And what are the encoding and locale settings of your db?
--
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/
			
Hi
You are right, the output of "show lc_ctype;" was C.
This is what I have now:
prueba1=# show lc_ctype;
    lc_ctype
-----------------
 es_MX.ISO8859-1
(1 row)
I got there by running
 % initdb -D /YOUR/PATH -E LATIN1 --locale es_ES.ISO8859-1
(as you suggested),
then "createdb -E iso8859-1 prueba1", and finally installing tsearch2.
The original problem is resolved:
prueba1=# select to_tsvector('espanol','melón');
 to_tsvector
-------------
 'melón':1
(1 row)
But if I change the sentence to this:
prueba1=# select to_tsvector('espanol','melón  perro mordelón');
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!>
??? I lost the connection, but the server is up. Any idea?
The position of the synonym dictionary is intentional.
Thanks in advance
On Tue, 18-09-2007 at 21:40 +0400, Teodor Sigaev wrote:
> >         LC_CTYPE="POSIX"
>
>
> Please send the output of the "show lc_ctype;" command. If it's the C locale then I can identify
> the problem: characters with diacritical marks (such as ó) are not alpha characters there, and
> the ispell dictionary will fail. To fix that you should run initdb with these options:
> % initdb -D /YOUR/PATH -E LATIN1 --locale es_ES.ISO8859-1
> or
> % initdb -D /YOUR/PATH -E UTF8 --locale es_ES.UTF8
>
> In the latter case you should also recode all of the dictionary's data files to utf8 encoding.
>
> >>>         prueba=# select to_tsvector('espanol','melón');
> >>>         ERROR:  Affix parse error at 506 line
> >> and
> >>>         prueba=# select lexize('sp','melón');
> >>>          lexize
> >>>         ---------
> >>>          {melon}
> >>>         (1 row)
> sp is a Snowball stemmer; it doesn't require an affix file, so it works.
>
> By the way, why is the synonym dictionary placed after ispell? Is it intentional?
> Usually the synonym dictionary goes first, then ispell, and after all of them snowball.
>
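Teodor's last point, recoding the dictionary data files when the database uses UTF8, could be sketched roughly as follows. This is only an illustration in Python, not part of tsearch2: the helper name `recode_latin1_to_utf8` is made up for this example, and it assumes the source files really are ISO-8859-1.

```python
def recode_latin1_to_utf8(data: bytes) -> bytes:
    """Recode ISO-8859-1 bytes to UTF-8 (hypothetical helper, not a tsearch2 API)."""
    return data.decode("iso-8859-1").encode("utf-8")


# One affix rule from this thread, as it would be stored in a Latin-1 file.
latin1_line = "    E   > -E, ÍSIMO     # grande grandísimo\n".encode("iso-8859-1")
utf8_line = recode_latin1_to_utf8(latin1_line)

# In Latin-1, "Í" is the single byte 0xCD; in UTF-8 it is the pair 0xC3 0x8D.
print(latin1_line.count(b"\xcd"), utf8_line.count(b"\xc3\x8d"))
```

To convert a whole file, read it in binary mode, pass the bytes through the helper, and write the result to a new file so the original is kept. The encoding of the data files has to match the database encoding chosen at initdb time.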
			
> prueba1=# select to_tsvector('espanol','melón  perro mordelón');
> server closed the connection unexpectedly
>         This probably means the server terminated abnormally
>         before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
> !>
>
Hmm, can you provide a backtrace?
--
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/
			
Felipe
--- Felipe de Jesús Molina Bravo
<felipe.molina@inegi.gob.mx> wrote:
> prueba1=# select to_tsvector('espanol','melón  perro mordelón');
> server closed the connection unexpectedly
>         This probably means the server terminated abnormally
>         before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
> !>
The same thing happened to me the first time with
Tsearch2 - spanish. I think you need to
patch snowball with the tsearch_snowball_82 file;
googling, you will find instructions on how to do it.
Best regards
mdc
			
Hi
Thanks, Teodor and Marcelo.
The problem is solved.
Regards
-----Original message-----
From: marcelo Cortez [mailto:jmdc_marcelo@yahoo.com.ar]
Sent: Thu 20/09/2007 7:13
To: MOLINA BRAVO FELIPE DE JESUS; Teodor Sigaev
CC: PostgreSQL General
Subject: Re: [GENERAL] Tsearch2 - spanish
			
Hello group :)
How do I clear bits in a number in PostgreSQL?
In C++ it's:
        0xffffff00 &~ 0x0000ffff
What is it in PostgreSQL, from the psql command line app?
        select ...
Thanx :)
Never mind, I figured it out ...
fails:
        0xffffff00 &~ 0x0000ffff
succeeds:
        0xffffff00 & ~ 0x0000ffff
I had to add a space.
----- Original Message -----
From: "madhtr" <madhtr@schif.org>
To: "PostgreSQL General" <pgsql-general@postgresql.org>
Sent: Thursday, September 20, 2007 13:01
Subject: [GENERAL] How to clear bits?
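The arithmetic behind the working expression can be checked outside the database. Below is a minimal sketch that uses Python only as a calculator; the function name `clear_bits` is an illustrative assumption. The space matters in PostgreSQL most likely because its lexer reads `&~` as a single operator name rather than as `&` followed by `~`.

```python
def clear_bits(value: int, mask: int) -> int:
    """Return value with every bit that is set in mask cleared (AND with NOT mask)."""
    return value & ~mask


# Same operands as in the post: clear the low 16 bits of 0xffffff00.
result = clear_bits(0xFFFFFF00, 0x0000FFFF)
print(hex(result))  # 0xffff0000
```

Only bits 16 to 31 survive, since the mask covers the low 16 bits.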