Class SimpleToken

  • All Implemented Interfaces:
    Token

    public class SimpleToken
    extends java.lang.Object
    implements Token
    Author:
    Mathias Mølster Lidal
    • Constructor Detail

      • SimpleToken

        public SimpleToken(java.lang.String orig)
    • Method Detail

      • getOrig

        public java.lang.String getOrig()
        Description copied from interface: Token
        Returns the original form of this token.
        Specified by:
        getOrig in interface Token
      • getNumStems

        public int getNumStems()
        Description copied from interface: Token
        Returns the number of stem forms available for this token.
        Specified by:
        getNumStems in interface Token
      • getStem

        public java.lang.String getStem(int i)
        Description copied from interface: Token
        Returns the stem at position i.
        Specified by:
        getStem in interface Token
      • getNumComponents

        public int getNumComponents()
        Description copied from interface: Token
        Returns the number of components if this token is a compound word (e.g. German "kommunikationsfehler"). Otherwise, returns 0.
        Specified by:
        getNumComponents in interface Token
        Returns:
        number of components, or 0 if none
      • getComponent

        public Token getComponent(int i)
        Description copied from interface: Token
        Returns a component token of this token.
        Specified by:
        getComponent in interface Token
      • getTokenString

        public java.lang.String getTokenString()
        Description copied from interface: Token
        Returns the token string in a form suitable for indexing: the lowercased variant of the most processed token form available. If called on a compound token, this returns a lowercased form of the entire word.
        Specified by:
        getTokenString in interface Token
        Returns:
        token string value
      • setTokenString

        public SimpleToken setTokenString(java.lang.String str)
      • getType

        public TokenType getType()
        Description copied from interface: Token
        Returns the type of this token: word, space, punctuation, etc.
        Specified by:
        getType in interface Token
      • getScript

        public TokenScript getScript()
        Description copied from interface: Token
        Returns the script of this token.
        Specified by:
        getScript in interface Token
      • isSpecialToken

        public boolean isSpecialToken()
        Description copied from interface: Token
        Returns whether this is an instance of a declared special token (e.g. c++).
        Specified by:
        isSpecialToken in interface Token
      • setSpecialToken

        public SimpleToken setSpecialToken(boolean specialToken)
      • getOffset

        public long getOffset()
        Description copied from interface: Token
        Returns the offset position of this token.
        Specified by:
        getOffset in interface Token
      • setOffset

        public SimpleToken setOffset(long offset)
      • hashCode

        public int hashCode()
        Overrides:
        hashCode in class java.lang.Object
      • equals

        public boolean equals(java.lang.Object obj)
        Overrides:
        equals in class java.lang.Object
      • toString

        public java.lang.String toString()
        Overrides:
        toString in class java.lang.Object
      • isIndexable

        public boolean isIndexable()
        Description copied from interface: Token
        Returns whether this token should be indexed.
        Specified by:
        isIndexable in interface Token
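
The setters listed above (setTokenString, setOffset, setSpecialToken) are declared to return SimpleToken, which suggests they can be chained, and getNumComponents/getComponent suggest that compound tokens can be walked recursively. The following is a minimal sketch of both ideas. It does not use the library class: SketchToken is a hypothetical stand-in that mirrors only the signatures shown on this page, and addComponent, indexTerms, the default token string, and the way the German compound is split are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class TokenSketch {

    /** Hypothetical stand-in mirroring a few SimpleToken signatures; NOT the library class. */
    static final class SketchToken {
        private final String orig;
        private String tokenString;
        private long offset;
        private boolean specialToken;
        private final List<SketchToken> components = new ArrayList<>();

        SketchToken(String orig) {
            this.orig = orig;
            this.tokenString = orig; // assumption: the token string defaults to the original form
        }

        String getOrig() { return orig; }
        String getTokenString() { return tokenString; }
        long getOffset() { return offset; }
        boolean isSpecialToken() { return specialToken; }
        int getNumComponents() { return components.size(); }
        SketchToken getComponent(int i) { return components.get(i); }

        // Assumption: each setter returns this token so that calls can be chained.
        SketchToken setTokenString(String str) { this.tokenString = str; return this; }
        SketchToken setOffset(long offset) { this.offset = offset; return this; }
        SketchToken setSpecialToken(boolean specialToken) { this.specialToken = specialToken; return this; }

        // Hypothetical helper; this page does not show how components are attached.
        SketchToken addComponent(SketchToken component) { components.add(component); return this; }
    }

    /** Collects token strings: the components of a compound, or the token itself if it has none. */
    static List<String> indexTerms(SketchToken token) {
        List<String> terms = new ArrayList<>();
        if (token.getNumComponents() == 0) {
            terms.add(token.getTokenString());
        } else {
            for (int i = 0; i < token.getNumComponents(); i++) {
                terms.addAll(indexTerms(token.getComponent(i)));
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        // Chained setters on a special token.
        SketchToken special = new SketchToken("C++")
                .setTokenString("c++")
                .setOffset(12L)
                .setSpecialToken(true);
        System.out.println(special.getOrig() + " -> " + special.getTokenString()); // prints "C++ -> c++"

        // A compound word with two components (the split is an illustrative assumption).
        SketchToken compound = new SketchToken("Kommunikationsfehler")
                .setTokenString("kommunikationsfehler")
                .addComponent(new SketchToken("kommunikations").setOffset(0))
                .addComponent(new SketchToken("fehler").setOffset(14));
        System.out.println(indexTerms(compound)); // prints "[kommunikations, fehler]"
    }
}
```

Whether the real class behaves the same way in each detail should be checked against the library source; the sketch only illustrates the calling pattern implied by the signatures.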