Re: Feature: Add Greek language fulltext search

From Panagiotis Mavrogiorgos
Subject Re: Feature: Add Greek language fulltext search
Date
Msg-id CAAVvtwrnGCoiG5csey14=mrn_jTUEO2R2TzUWR2+TuezA3wR3A@mail.gmail.com
In response to Re: Feature: Add Greek language fulltext search  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
List pgsql-hackers


On Thu, Jul 4, 2019 at 1:39 PM Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
> On 2019-03-25 12:04, Panagiotis Mavrogiorgos wrote:
>> Last November snowball added support for Greek language [1]. Following
>> the instructions [2], I wrote a patch that adds fulltext search for
>> Greek in Postgres. The patch is attached.
>
> I have committed a full sync from the upstream snowball repository,
> which pulled in the new greek stemmer.
>
> Could you please clarify where you got the stopword list from?  The
> README says those need to be downloaded separately, but I wasn't able to
> find the download location.  It would be good to document this, for
> example in the commit message.  I haven't committed the stopword list yet.

Thank you, Peter,

Here is the repo with the stop-words: https://github.com/pmav99/greek_stopwords
The list is based on an earlier publication, with modifications by me. All the relevant info is on GitHub.

Disclaimer 1: The list has not been validated by an expert.

Disclaimer 2: There are other stop-word lists on the internet, but they are less complete and they also include Ancient Greek words. Furthermore, my testing showed that snowball needs to handle accents (tonous) and ς (teliko sigma, the word-final form of sigma) in a special way if you want the stemmer to work with capitalized words too.
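
To illustrate the kind of special handling meant in Disclaimer 2, here is a minimal Python sketch. It is not what snowball or the patch actually does; the function name normalize_greek is hypothetical and the choice to fold ς into σ is an assumption made for the example. It only shows why capitalized, accented and unaccented spellings of one word need to be brought to a common form before stemming or stop-word lookup.

```python
import unicodedata


def normalize_greek(word: str) -> str:
    """Fold case, accents and final sigma so spelling variants compare equal.

    Hypothetical helper for illustration only; not part of snowball or Postgres.
    """
    # Lowercase first: all-caps Greek is normally written without tonos.
    lowered = word.lower()
    # Decompose so that accents become separate combining marks.
    decomposed = unicodedata.normalize("NFD", lowered)
    # Drop combining marks (category Mn). Note this removes dialytika as well
    # as tonos, which is a simplification.
    stripped = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    # Assumption for this sketch: treat final sigma as a plain sigma.
    folded = stripped.replace("ς", "σ")
    return unicodedata.normalize("NFC", folded)


if __name__ == "__main__":
    for variant in ("ΟΔΟΣ", "Οδός", "οδός"):
        print(variant, "->", normalize_greek(variant))
```

With this folding, "ΟΔΟΣ", "Οδός" and "οδός" all end up as the same string, so a stop-word list stored in the same normalized form would match any of them.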


all the best,
Panagiotis
