| Class | Description |
|---|---|
| CommonPreprocessor | |
| CustomStemmingPreprocessor | A StemmingPreprocessor variant that works with any stemmer defined as a lucene/tartarus SnowballProgram, including but not limited to RussianStemmer, DutchStemmer, and FrenchStemmer. Note: this preprocessor is NOT thread-safe. |
| EndingPreProcessor | Strips the endings `ed`, `ing`, `ly`, `s`, and `.` from tokens. |
| LowCasePreProcessor | |
| StemmingPreprocessor | Applies the basic cleaning inherited from CommonPreprocessor, then English Porter stemming, to each token. |
| StringCleaning | Various string-cleaning utilities. |
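
The EndingPreProcessor row lists the endings it strips but not how they are applied. As a rough illustration only, the following stand-alone sketch removes at most one of those endings from a token. The class `EndingSketch` and the method `stripEnding` are hypothetical names, and the matching order and the one-ending-per-token rule are assumptions; the library's actual behavior may differ.

```java
// Hypothetical stand-alone sketch; this is NOT the library's EndingPreProcessor.
final class EndingSketch {
    private EndingSketch() {}

    // Endings named in the table. The order here is an assumption:
    // longer endings are tried before "s" and ".".
    private static final String[] ENDINGS = {"ing", "ed", "ly", "s", "."};

    /** Removes at most one of the listed endings from the token (assumed rule). */
    static String stripEnding(String token) {
        for (String ending : ENDINGS) {
            // Require the token to be longer than the ending, so a token
            // such as "s" or "." is returned unchanged rather than emptied.
            if (token.length() > ending.length() && token.endsWith(ending)) {
                return token.substring(0, token.length() - ending.length());
            }
        }
        return token;
    }
}
```

Under these assumptions, `EndingSketch.stripEnding("walked")` yields `walk` and `EndingSketch.stripEnding("cats")` yields `cat`. Plain suffix stripping does no linguistic analysis, so `running` becomes `runn`; StemmingPreprocessor is the better fit where proper stems matter.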

Copyright © 2016. All Rights Reserved.