Class NGramTokenizer

    • Method Detail

      • hasMoreTokens

        public boolean hasMoreTokens()
        Description copied from interface: Tokenizer
        Reports whether any tokens are left to iterate over
        Specified by:
        hasMoreTokens in interface Tokenizer
        Returns:
        whether there are any more tokens to iterate over
      • countTokens

        public int countTokens()
        Description copied from interface: Tokenizer
        The number of tokens in the tokenizer
        Specified by:
        countTokens in interface Tokenizer
        Returns:
        the number of tokens
      • nextToken

        public String nextToken()
        Description copied from interface: Tokenizer
        The next token (word usually) in the string
        Specified by:
        nextToken in interface Tokenizer
        Returns:
        the next token in the string if any
      • getTokens

        public List<String> getTokens()
        Description copied from interface: Tokenizer
        Returns a list of all the tokens
        Specified by:
        getTokens in interface Tokenizer
        Returns:
        a list of all the tokens
      • setTokenPreProcessor

        public void setTokenPreProcessor(TokenPreProcess tokenPreProcessor)
        Description copied from interface: Tokenizer
        Sets the token preprocessor
        Specified by:
        setTokenPreProcessor in interface Tokenizer
        Parameters:
        tokenPreProcessor - the token pre processor to set
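
The five methods above follow a familiar iterator-style contract, which the following minimal sketch tries to illustrate. It does not use the real NGramTokenizer: the nested `Tokenizer` and `TokenPreProcess` interfaces are hypothetical stand-ins that only mirror the signatures listed here, the single-method shape of `TokenPreProcess` (`preProcess(String)`) is an assumption, and `WhitespaceTokenizer` is an illustrative class invented for this example. Returning `null` from `nextToken()` once the tokens are exhausted, and applying the preprocessor inside `getTokens()`, are choices made for the sketch, not documented behavior.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class TokenizerSketch {

    // Hypothetical stand-in for TokenPreProcess; the method name and shape are assumptions.
    interface TokenPreProcess {
        String preProcess(String token);
    }

    // Hypothetical stand-in mirroring the five method signatures documented above.
    interface Tokenizer {
        boolean hasMoreTokens();
        int countTokens();
        String nextToken();
        List<String> getTokens();
        void setTokenPreProcessor(TokenPreProcess tokenPreProcessor);
    }

    // Illustrative whitespace tokenizer; NOT the real NGramTokenizer.
    static class WhitespaceTokenizer implements Tokenizer {
        private final List<String> tokens = new ArrayList<>();
        private int position = 0;
        private TokenPreProcess preProcessor; // optional; null means "leave tokens unchanged"

        WhitespaceTokenizer(String text) {
            for (String piece : text.trim().split("\\s+")) {
                if (!piece.isEmpty()) {
                    tokens.add(piece);
                }
            }
        }

        @Override
        public boolean hasMoreTokens() {
            return position < tokens.size();
        }

        @Override
        public int countTokens() {
            return tokens.size();
        }

        @Override
        public String nextToken() {
            // Sketch choice: return null when nothing is left.
            if (!hasMoreTokens()) {
                return null;
            }
            return apply(tokens.get(position++));
        }

        @Override
        public List<String> getTokens() {
            // Returns every token, regardless of how far nextToken() has advanced.
            List<String> all = new ArrayList<>();
            for (String token : tokens) {
                all.add(apply(token));
            }
            return all;
        }

        @Override
        public void setTokenPreProcessor(TokenPreProcess tokenPreProcessor) {
            this.preProcessor = tokenPreProcessor;
        }

        private String apply(String token) {
            return preProcessor == null ? token : preProcessor.preProcess(token);
        }
    }

    public static void main(String[] args) {
        Tokenizer tokenizer = new WhitespaceTokenizer("The quick Fox");
        // Lower-case every token before it is handed back to the caller.
        tokenizer.setTokenPreProcessor(token -> token.toLowerCase(Locale.ROOT));

        System.out.println("count = " + tokenizer.countTokens());
        while (tokenizer.hasMoreTokens()) {
            System.out.println(tokenizer.nextToken());
        }
    }
}
```

In this sketch the preprocessor is applied lazily, when a token is returned, so it can be set at any point before iteration begins. Whether the real implementation applies it eagerly or lazily is not stated in this section.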