Class Lexer<TT extends TokenType, T extends Token<TT>>

  • Type Parameters:
    TT - The token-type enum class.
    T - The token implementation class.
    All Implemented Interfaces:
    java.lang.Iterable<T>

    public class Lexer<TT extends TokenType, T extends Token<TT>>
    extends java.lang.Object
    implements java.lang.Iterable<T>
    Base lexer class with helper methods that do not need to be implemented by subclasses. The base lexer should keep returning tokens until the end of the stream is reached, or until the lexing process fails.

    Note: This class may not be suited for tokenization where the input must be treated as a true stream. See, for example, JsonTokenizer, which must be able to parse arbitrarily large inputs with no definite end of stream, without consuming any unread content.

    • Constructor Summary

      Constructors 
      Modifier Constructor Description
      protected Lexer(Tokenizer<TT, T> tokenizer)
      Create a lexer instance using a specific tokenizer.
    • Method Summary

      All Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      protected LexerException eofFailure(java.lang.CharSequence line, int lineNo, int linePos, java.lang.String message, java.lang.Object... args)
      Make a lexing / parsing failure exception.
      T expect(java.lang.String what)
      Expect a new token, and fail if there is no next token.
      T expect(java.lang.String what, java.util.function.Predicate<T> validator)
      Expect a new token, and fail if the token does not validate.
      T expect(java.lang.String what, TokenType type)
      Expect a new token, and fail if the token is not of the given token type.
      T expectSymbol(java.lang.String what, char... symbols)
      Expect a new token, and fail if the token is not one of the given symbols.
      LexerException failure(T token, java.lang.String message, java.lang.Object... args)
      Make a lexing / parsing failure exception.
      boolean hasNext()
      Return true if there is a 'next' token.
      java.util.Iterator<T> iterator()  
      T next()
      Consume and return the next token.
      T peek()
      Return the token that will be returned by 'next', but do not 'consume' it.
      T peek(java.lang.String what)
      Peek the next token, and fail if the token is not present.
      T readUntil(java.lang.String term, TT type, boolean allowEof)
      Read until the termination string.
      java.lang.String toString()  
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
      • Methods inherited from interface java.lang.Iterable

        forEach, spliterator
    • Constructor Detail

      • Lexer

        protected Lexer(Tokenizer<TT, T> tokenizer)
        Create a lexer instance using a specific tokenizer.
        Parameters:
        tokenizer - The tokenizer to be used to get tokens.
    • Method Detail

      • eofFailure

        @Nonnull
        protected LexerException eofFailure(java.lang.CharSequence line,
                                            int lineNo,
                                            int linePos,
                                            java.lang.String message,
                                            java.lang.Object... args)
        Make a lexing / parsing failure exception.
        Parameters:
        line - The content of the line where the failure occurred.
        lineNo - The line number of the failure.
        linePos - The position within the line of the failure.
        message - The message for the failure.
        args - Arguments for formatting the message.
        Returns:
        The failure exception.
      • failure

        @Nonnull
        public LexerException failure(T token,
                                      java.lang.String message,
                                      java.lang.Object... args)
        Make a lexing / parsing failure exception.
        Parameters:
        token - The token causing the failure.
        message - The message for the failure.
        args - Arguments for formatting the message.
        Returns:
        The failure exception.
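
Both failure helpers return the exception rather than throwing it, so the call site is expected to throw the result itself. The sketch below illustrates that pattern with hypothetical stand-ins: FailureSketch and SketchException are not part of the library, and formatting the message with String.format is an assumption, since the documentation only states that the args are used to format the message.

```java
/** Hypothetical stand-in illustrating the "make, then throw" pattern. */
class FailureSketch {
    /** Hypothetical exception type carrying a position; not the real LexerException. */
    static class SketchException extends Exception {
        final int lineNo;
        final int linePos;

        SketchException(String message, int lineNo, int linePos) {
            super(message);
            this.lineNo = lineNo;
            this.linePos = linePos;
        }
    }

    /** Builds (but does not throw) the exception. Assumes String.format-style args. */
    static SketchException failure(int lineNo, int linePos, String message, Object... args) {
        String text = args.length == 0 ? message : String.format(message, args);
        return new SketchException(text, lineNo, linePos);
    }

    /** The caller throws the returned exception, so the compiler sees the control flow. */
    static int parseDigit(String token, int lineNo, int linePos) throws SketchException {
        if (token.length() != 1 || !Character.isDigit(token.charAt(0))) {
            throw failure(lineNo, linePos, "Expected digit, but got '%s'", token);
        }
        return token.charAt(0) - '0';
    }
}
```

Returning the exception lets the caller write "throw failure(...)", which tells the compiler that the branch ends there.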
      • next

        @Nullable
        public T next()
               throws LexerException,
                      java.io.IOException
        Consume and return the next token. This should not trigger parsing anything after this token.
        Returns:
        The next token, or null at the end of the stream.
        Throws:
        LexerException - If parsing token failed.
        java.io.IOException - If reading failed.
      • hasNext

        public boolean hasNext()
                        throws LexerException,
                               java.io.IOException
        Return true if there is a 'next' token. If this method returns true, then 'peek' must return non-null until the token is consumed, and the next call to 'next' must return non-null.
        Returns:
        True if there is a next token.
        Throws:
        LexerException - If parsing token failed.
        java.io.IOException - If reading failed.
      • peek

        @Nullable
        public T peek()
               throws LexerException,
                      java.io.IOException
        Return the token that will be returned by 'next', but do not 'consume' it. If this method returns a non-null value, 'next' must return the same value exactly once.
        Returns:
        The next token, or null at the end of the stream.
        Throws:
        LexerException - If parsing token failed.
        java.io.IOException - If reading failed.
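
The contract between hasNext, peek and next can be illustrated with a one-token lookahead buffer. This is a minimal sketch under stated assumptions: PeekingSketch is a hypothetical stand-in that uses plain strings as tokens and a pre-split list as input. It is not the library's implementation.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

/** Hypothetical stand-in for a lexer with one token of lookahead. */
class PeekingSketch {
    private final Deque<String> source; // tokens not yet read
    private String buffered;            // token seen by peek() but not yet consumed

    PeekingSketch(List<String> tokens) {
        this.source = new ArrayDeque<>(tokens);
    }

    /** Returns the upcoming token without consuming it, or null at end of stream. */
    String peek() {
        if (buffered == null) {
            buffered = source.poll(); // poll() returns null when the deque is empty
        }
        return buffered;
    }

    /** True exactly when peek() would return a non-null token. */
    boolean hasNext() {
        return peek() != null;
    }

    /** Consumes and returns the upcoming token, or null at end of stream. */
    String next() {
        String token = peek();
        buffered = null; // the token is consumed exactly once
        return token;
    }
}
```

Repeated calls to peek() return the same token until next() consumes it, and nothing beyond that single token is read ahead.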
      • peek

        @Nonnull
        public T peek(java.lang.String what)
               throws LexerException,
                      java.io.IOException
        Peek the next token, and fail if the token is not present.
        Parameters:
        what - The exception message on failure.
        Returns:
        The next token.
        Throws:
        LexerException - On parse errors.
        java.io.IOException - If reading failed.
      • expect

        @Nonnull
        public T expect(java.lang.String what)
                 throws LexerException,
                        java.io.IOException
        Expect a new token, and fail if there is no next token.
        Parameters:
        what - What is expected.
        Returns:
        The next token.
        Throws:
        LexerException - On parse errors or missing next token.
        java.io.IOException - If reading failed.
      • expect

        @Nonnull
        public T expect(java.lang.String what,
                        TokenType type)
                 throws LexerException,
                        java.io.IOException
        Expect a new token, and fail if the token is not of the given token type.
        Parameters:
        what - The exception message on failure.
        type - The token type being expected.
        Returns:
        The next token.
        Throws:
        LexerException - On parse errors or validation failures.
        java.io.IOException - If reading failed.
      • expect

        @Nonnull
        public T expect(java.lang.String what,
                        java.util.function.Predicate<T> validator)
                 throws LexerException,
                        java.io.IOException
        Expect a new token, and fail if the token does not validate.
        Parameters:
        what - The exception message on failure.
        validator - Validator to check on the token.
        Returns:
        The next token.
        Throws:
        LexerException - On parse errors or validation failure.
        java.io.IOException - If reading failed.
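
The expect overloads can be read as layers: the plain variant fails on end of stream, and the validating variants add a check on top of it. The sketch below shows that layering with hypothetical stand-ins (ExpectSketch, SketchException, string tokens). It assumes that expect consumes the token and that 'what' is embedded in the failure message; neither detail is confirmed by this page.

```java
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical stand-in showing how the expect variants could build on each other. */
class ExpectSketch {
    /** Hypothetical unchecked stand-in for LexerException. */
    static class SketchException extends RuntimeException {
        SketchException(String message) {
            super(message);
        }
    }

    private final Iterator<String> tokens;

    ExpectSketch(List<String> tokens) {
        this.tokens = tokens.iterator();
    }

    /** Consumes the next token, failing if the stream has ended. */
    String expect(String what) {
        if (!tokens.hasNext()) {
            throw new SketchException("Expected " + what + ", but got end of stream");
        }
        return tokens.next();
    }

    /** Consumes the next token, failing if it is missing or rejected by the validator. */
    String expect(String what, Predicate<String> validator) {
        String token = expect(what);
        if (!validator.test(token)) {
            throw new SketchException("Expected " + what + ", but got '" + token + "'");
        }
        return token;
    }
}
```

A type check, as in expect(what, type), fits the same shape: it is a validator that compares the token's type with the expected one.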
      • expectSymbol

        @Nonnull
        public T expectSymbol(java.lang.String what,
                              char... symbols)
                       throws LexerException,
                              java.io.IOException
        Expect a new token, and fail if the token is not one of the given symbols.
        Parameters:
        what - The exception message on failure.
        symbols - Symbols to be expected.
        Returns:
        The token of the symbol.
        Throws:
        LexerException - On parse errors or validation failure.
        java.io.IOException - If unable to parse the token, or if the symbol is not applicable.
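
The varargs parameter allows several acceptable symbols in one call, for example a separator or a closing bracket. The following is a minimal sketch with hypothetical names (ExpectSymbolSketch, string tokens, IllegalStateException in place of LexerException). It assumes a symbol token is exactly one character long.

```java
import java.util.Iterator;

/** Hypothetical stand-in for matching the next token against a set of symbols. */
class ExpectSymbolSketch {
    /** Consumes the next token and returns it if it is one of the given symbols. */
    static String expectSymbol(Iterator<String> tokens, String what, char... symbols) {
        if (!tokens.hasNext()) {
            throw new IllegalStateException("Expected " + what + ", but got end of stream");
        }
        String token = tokens.next();
        if (token.length() == 1) { // assumption: a symbol token is a single character
            for (char symbol : symbols) {
                if (token.charAt(0) == symbol) {
                    return token;
                }
            }
        }
        throw new IllegalStateException("Expected " + what + ", but got '" + token + "'");
    }
}
```

The caller can inspect the returned token to see which of the accepted symbols was found.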
      • readUntil

        @Nullable
        public T readUntil(java.lang.String term,
                           TT type,
                           boolean allowEof)
                    throws LexerException,
                           java.io.IOException
        Read until the termination string.
        Parameters:
        term - The termination string.
        type - The type of token to be generated.
        allowEof - If we allow end of file to terminate the token.
        Returns:
        The token read, or null if it is empty.
        Throws:
        LexerException - On parse errors or validation failure.
        java.io.IOException - If unable to parse token.
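
The effect of allowEof can be sketched over an in-memory string. ReadUntilSketch is a hypothetical stand-in, not the library's implementation. It assumes the terminator is consumed but excluded from the result, and that empty content yields null. The token type argument is left out because the sketch returns plain strings.

```java
/** Hypothetical stand-in for reading raw content up to a terminator string. */
class ReadUntilSketch {
    private final String input;
    private int pos = 0; // index of the first unread character

    ReadUntilSketch(String input) {
        this.input = input;
    }

    /**
     * Reads up to (not including) term and skips past it.
     * Returns null when the content read is empty.
     */
    String readUntil(String term, boolean allowEof) {
        int end = input.indexOf(term, pos);
        String content;
        if (end >= 0) {
            content = input.substring(pos, end);
            pos = end + term.length();
        } else if (allowEof) {
            // End of input is accepted as a terminator.
            content = input.substring(pos);
            pos = input.length();
        } else {
            throw new IllegalStateException("End of stream before '" + term + "'");
        }
        return content.isEmpty() ? null : content;
    }
}
```

With allowEof set to false, a missing terminator is an error, which suits block comments. With allowEof set to true, the end of input closes the token, which suits line comments on the last line of a file.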
      • iterator

        @Nonnull
        public java.util.Iterator<T> iterator()
        Specified by:
        iterator in interface java.lang.Iterable<T extends Token<TT>>
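
Because java.util.Iterator methods cannot declare checked exceptions, an iterator built on methods that throw java.io.IOException has to translate failures somehow. The sketch below shows one possible approach, wrapping them in UncheckedIOException. This is an assumption for illustration only, as this page does not say how iterator() reports errors. IterableSketch and its string tokens are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical stand-in adapting checked-exception methods to Iterable. */
class IterableSketch implements Iterable<String> {
    private final String[] tokens;
    private int index = 0;

    IterableSketch(String... tokens) {
        this.tokens = tokens;
    }

    /** Stand-in for a next() that may fail while reading; null at end of stream. */
    String next() throws IOException {
        return index < tokens.length ? tokens[index++] : null;
    }

    /** Stand-in for a hasNext() that may fail while reading. */
    boolean hasNext() throws IOException {
        return index < tokens.length;
    }

    @Override
    public Iterator<String> iterator() {
        return new Iterator<String>() {
            @Override
            public boolean hasNext() {
                try {
                    return IterableSketch.this.hasNext();
                } catch (IOException e) {
                    throw new UncheckedIOException(e); // assumption: wrap, do not swallow
                }
            }

            @Override
            public String next() {
                try {
                    String token = IterableSketch.this.next();
                    if (token == null) {
                        throw new NoSuchElementException();
                    }
                    return token;
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }
        };
    }
}
```

With such an adapter, tokens can be consumed in a for-each loop, for example "for (String token : new IterableSketch("a", "=", "1"))".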
      • toString

        public java.lang.String toString()
        Overrides:
        toString in class java.lang.Object