On Sat, Nov 23, 2019 at 10:42 AM Christoph Gößmann <mail@goessmann.io> wrote:
Hi Jeff,
You're right about that point. Let me redefine. I would like to drop all tokens that are neither the stemmed nor the unstemmed version of a known word. Would it be possible to put a word list as a filter ahead of the stemming? Or do you know of a good English lexeme list that could be used to filter after stemming?
I think what you describe is the opposite of what Snowball was designed to do. You want an Ispell-based dictionary instead.
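
For illustration, here is a minimal sketch of how that might look, assuming this is about PostgreSQL full-text search. The names english_ispell and english_known are made up for this example, and the files en_us.dict and en_us.affix are an assumption: they do not ship with PostgreSQL and would have to be installed into $SHAREDIR/tsearch_data first. The idea is that an Ispell dictionary only recognizes words it can reduce to an entry in its word list, and a token that no dictionary in the mapping recognizes is discarded.

```sql
-- Hypothetical dictionary name; DictFile/AffFile refer to en_us.dict and
-- en_us.affix, which are assumed to be present in $SHAREDIR/tsearch_data.
CREATE TEXT SEARCH DICTIONARY english_ispell (
    TEMPLATE  = ispell,
    DictFile  = en_us,
    AffFile   = en_us,
    StopWords = english
);

-- Hypothetical configuration name, copied from the built-in English one.
CREATE TEXT SEARCH CONFIGURATION public.english_known (
    COPY = pg_catalog.english
);

-- Map the word-like token types to the Ispell dictionary only.
-- With no Snowball fallback after it, unknown words are dropped.
ALTER TEXT SEARCH CONFIGURATION public.english_known
    ALTER MAPPING FOR asciiword, asciihword, hword_asciipart,
                      word, hword, hword_part
    WITH english_ispell;
```

Something like SELECT to_tsvector('english_known', 'some text') should then keep only lexemes for words the dictionary knows, and ts_lexize('english_ispell', 'someword') can be used to check a single word (it returns NULL when the word is not recognized). Note that the resulting lexemes are the dictionary's base forms rather than Snowball stems, so they will not necessarily match the output of the default english configuration.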