Class SegmenterImpl

  • All Implemented Interfaces:
    Segmenter

    public class SegmenterImpl
    extends java.lang.Object
    implements Segmenter
    Author:
    Simon Thoresen Hult
    • Method Summary

      Modifier and Type Method Description
      java.util.List<java.lang.String> segment(java.lang.String input, Language language)
      Splits the input string into tokens and returns them as a list in unprocessed form (i.e. lowercased, normalized and stemmed if applicable).
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • SegmenterImpl

        public SegmenterImpl(Tokenizer tokenizer)
    • Method Detail

      • segment

        public java.util.List<java.lang.String> segment(java.lang.String input,
                                                        Language language)
        Description copied from interface: Segmenter
        Splits the input string into tokens and returns them as a list in unprocessed form (i.e. lowercased, normalized and stemmed if applicable; see StemMode for the list of stemming options). The input is assumed to contain only word characters; any punctuation and spacing tokens are removed.
        Specified by:
        segment in interface Segmenter
        Parameters:
        input - the text to segment.
        language - language of input text.
        Returns:
        the list of segments.
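
        Example (illustrative sketch, not the library implementation):
        The following self-contained sketch mirrors the shape of the API documented above: a segmenter constructed from a tokenizer, with a segment(input, language) method that drops punctuation and spacing tokens. Everything in it is an assumption made for illustration: the nested Language, Segmenter and Tokenizer types are hypothetical stand-ins for the library types of the same names, SimpleTokenizer and SketchSegmenter are invented here, and the tokenize(...) signature is not taken from the real Tokenizer interface. The sketch returns each word as it appears in the input and does not attempt lowercasing, normalization or stemming.

```java
import java.util.ArrayList;
import java.util.List;

public class SegmenterSketch {

    // Hypothetical stand-in for the library's Language type.
    enum Language { ENGLISH }

    // Mirrors the method signature shown in the documentation above.
    interface Segmenter {
        List<String> segment(String input, Language language);
    }

    // Hypothetical stand-in for the Tokenizer passed to the constructor;
    // the real interface may look different.
    interface Tokenizer {
        List<String> tokenize(String input, Language language);
    }

    // Toy tokenizer: emits alternating runs of word characters
    // (letters/digits) and non-word characters as separate tokens.
    static class SimpleTokenizer implements Tokenizer {
        @Override
        public List<String> tokenize(String input, Language language) {
            List<String> tokens = new ArrayList<>();
            int start = 0;
            for (int i = 1; i <= input.length(); i++) {
                boolean atEnd = i == input.length();
                if (atEnd || isWord(input.charAt(i)) != isWord(input.charAt(start))) {
                    tokens.add(input.substring(start, i));
                    start = i;
                }
            }
            return tokens;
        }

        private static boolean isWord(char c) {
            return Character.isLetterOrDigit(c);
        }
    }

    // Sketch of a segmenter that delegates to a tokenizer and keeps
    // only the word tokens, discarding punctuation and spacing.
    static class SketchSegmenter implements Segmenter {
        private final Tokenizer tokenizer;

        SketchSegmenter(Tokenizer tokenizer) {
            this.tokenizer = tokenizer;
        }

        @Override
        public List<String> segment(String input, Language language) {
            List<String> segments = new ArrayList<>();
            for (String token : tokenizer.tokenize(input, language)) {
                if (!token.isEmpty() && Character.isLetterOrDigit(token.charAt(0))) {
                    segments.add(token);
                }
            }
            return segments;
        }
    }

    public static void main(String[] args) {
        Segmenter segmenter = new SketchSegmenter(new SimpleTokenizer());
        // Punctuation and spacing tokens are dropped; only the words remain.
        System.out.println(segmenter.segment("Hello, world!", Language.ENGLISH));
    }
}
```

        Passing the tokenizer through the constructor, as the documented SegmenterImpl(Tokenizer) constructor does, keeps the segmenter independent of any particular tokenization strategy, so a different tokenizer can be substituted without changing the calling code.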