Class ParserATNSimulator
java.lang.Object
  org.antlr.v4.runtime.atn.ATNSimulator
    org.antlr.v4.runtime.atn.ParserATNSimulator
Direct Known Subclasses: ProfilingATNSimulator
public class ParserATNSimulator extends ATNSimulator
The embodiment of the adaptive LL(*), ALL(*), parsing strategy.
The basic complexity of the adaptive strategy makes it harder to understand. We begin with ATN simulation to build paths in a DFA. Subsequent prediction requests go through the DFA first. If they reach a state without an edge for the current symbol, the algorithm fails over to the ATN simulation to complete the DFA path for the current input (until it finds a conflict state or uniquely predicting state).
All of that is done without using the outer context because we want to create a DFA that is not dependent upon the rule invocation stack when we do a prediction. One DFA works in all contexts. We avoid using context not necessarily because it's slower, although it can be, but because of the DFA caching problem. The closure routine only considers the rule invocation stack created during prediction beginning in the decision rule. For example, if prediction occurs without invoking another rule's ATN, there are no context stacks in the configurations. When lack of context leads to a conflict, we don't know if it's an ambiguity or a weakness in the strong LL(*) parsing strategy (versus full LL(*)).
When SLL yields a configuration set with conflict, we rewind the input and retry the ATN simulation, this time using full outer context without adding to the DFA. Configuration context stacks will be the full invocation stacks from the start rule. If we get a conflict using full context, then we can definitively say we have a true ambiguity for that input sequence. If we don't get a conflict, it implies that the decision is sensitive to the outer context. (It is not context-sensitive in the sense of context-sensitive grammars.)
The next time we reach this DFA state with an SLL conflict, through DFA simulation, we will again retry the ATN simulation using full context mode. This is slow because we can't save the results and have to "interpret" the ATN each time we get that input.
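The DFA-first flow described above can be sketched in plain Java. This is a toy model, not the ANTLR implementation: the class `DfaFirstSketch`, its `State` type, and the `slow` callback (standing in for ATN simulation) are all hypothetical names, and the "DFA" is keyed on single characters rather than token types.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToIntFunction;

/** Toy model: follow cached DFA edges first; fill in missing edges via a slow fallback. */
final class DfaFirstSketch {
    /** Hypothetical DFA state: a prediction (0 = undecided) plus edges keyed by symbol. */
    static final class State {
        final int prediction;
        final Map<Character, State> edges = new HashMap<>();
        State(int prediction) { this.prediction = prediction; }
    }

    private final State start = new State(0);
    private final ToIntFunction<String> slow; // stand-in for ATN simulation over a prefix
    int slowPathCalls = 0;                    // how often the fallback was needed

    DfaFirstSketch(ToIntFunction<String> slow) { this.slow = slow; }

    int predict(String input) {
        State s = start;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            State target = s.edges.get(c);
            if (target == null) {
                // No cached edge for this symbol: fail over to the slow path
                // and record the result so the next request stays in the DFA.
                slowPathCalls++;
                target = new State(slow.applyAsInt(input.substring(0, i + 1)));
                s.edges.put(c, target);
            }
            if (target.prediction != 0) return target.prediction; // uniquely predicting state
            s = target;
        }
        throw new IllegalStateException("no viable alternative");
    }
}
```

Repeating a prediction for the same input prefix should leave `slowPathCalls` unchanged, which is the point of building the DFA incrementally.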
CACHING FULL CONTEXT PREDICTIONS
We could cache results from full context to predicted alternative easily and that saves a lot of time but doesn't work in presence of predicates. The set of visible predicates from the ATN start state changes depending on the context, because closure can fall off the end of a rule. I tried to cache tuples (stack context, semantic context, predicted alt) but it was slower than interpreting and much more complicated. Also required a huge amount of memory. The goal is not to create the world's fastest parser anyway. I'd like to keep this algorithm simple. By launching multiple threads, we can improve the speed of parsing across a large number of files.
There is no strict ordering between the amount of input used by SLL vs LL, which makes it really hard to build a cache for full context. Let's say that we have input A B C that leads to an SLL conflict with full context X. That implies that using X we might only use A B but we could also use A B C D to resolve conflict. Input A B C D could predict alternative 1 in one position in the input and A B C E could predict alternative 2 in another position in input. The conflicting SLL configurations could still be non-unique in the full context prediction, which would lead us to requiring more input than the original A B C. To make a prediction cache work, we have to track the exact input used during the previous prediction. That amounts to a cache that maps X to a specific DFA for that context.
Something should be done for left-recursive expression predictions. They are likely LL(1) + pred eval. Easier to do the whole SLL unless error and retry with full LL thing Sam does.
AVOIDING FULL CONTEXT PREDICTION
We avoid doing full context retry when the outer context is empty, we did not dip into the outer context by falling off the end of the decision state rule, or when we force SLL mode.
As an example of the "not dip into outer context" case, consider super constructor calls versus function calls. One grammar might look like this:
ctorBody : '{' superCall? stat* '}' ;
Or, you might see something like
stat : superCall ';' | expression ';' | ... ;
In both cases I believe that no closure operations will dip into the outer context. In the first case ctorBody in the worst case will stop at the '}'. In the 2nd case it should stop at the ';'. Both cases should stay within the entry rule and not dip into the outer context.
PREDICATES
Predicates are always evaluated if present, in both SLL and LL. SLL and LL simulation deal with predicates differently. SLL collects predicates as it performs closure operations like ANTLR v3 did. It delays predicate evaluation until it reaches an accept state. This allows us to cache the SLL ATN simulation whereas, if we had evaluated predicates on-the-fly during closure, the DFA state configuration sets would be different and we couldn't build up a suitable DFA.
When building a DFA accept state during ATN simulation, we evaluate any predicates and return the sole semantically valid alternative. If there is more than 1 alternative, we report an ambiguity. If there are 0 alternatives, we throw an exception. Alternatives without predicates act like they have true predicates. The simple way to think about it is to strip away all alternatives with false predicates and choose the minimum alternative that remains.
When we start in the DFA and reach an accept state that's predicated, we test those and return the minimum semantically viable alternative. If no alternatives are viable, we throw an exception.
During full LL ATN simulation, closure always evaluates predicates on-the-fly. This is crucial to reducing the configuration set size during closure. It hits a landmine when parsing with the Java grammar, for example, without this on-the-fly evaluation.
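The "strip away all alternatives with false predicates and choose the minimum alternative that remains" rule can be illustrated with a small standalone sketch. The class `PredicatedAltSketch` and its `resolve` method are hypothetical; a `null` predicate stands in for an unpredicated alternative, which the text says acts like a true predicate.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BooleanSupplier;

final class PredicatedAltSketch {
    /**
     * Returns the minimum alternative whose predicate holds.
     * A null predicate means "no predicate", treated as always true.
     */
    static int resolve(Map<Integer, BooleanSupplier> altToPred) {
        // TreeMap iterates keys in ascending order, so the first viable alt is the minimum.
        for (Map.Entry<Integer, BooleanSupplier> e : new TreeMap<>(altToPred).entrySet()) {
            BooleanSupplier pred = e.getValue();
            if (pred == null || pred.getAsBoolean()) return e.getKey();
        }
        // Zero semantically viable alternatives: the text says an exception is thrown.
        throw new IllegalStateException("no viable alternative");
    }
}
```

The sketch only models the final selection; it does not model ambiguity reporting or how the predicates are collected during closure.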
SHARING DFA
All instances of the same parser share the same decision DFAs through a static field. Each instance gets its own ATN simulator but they share the same decisionToDFA field. They also share a PredictionContextCache object that makes sure that all PredictionContext objects are shared among the DFA states. This makes a big size difference.
THREAD SAFETY
The ParserATNSimulator locks on the decisionToDFA field when it adds a new DFA object to that array. addDFAEdge(org.antlr.v4.runtime.dfa.DFA, org.antlr.v4.runtime.dfa.DFAState, int, org.antlr.v4.runtime.dfa.DFAState) locks on the DFA for the current decision when setting the DFAState.edges field. addDFAState(org.antlr.v4.runtime.dfa.DFA, org.antlr.v4.runtime.dfa.DFAState) locks on the DFA for the current decision when looking up a DFA state to see if it already exists. We must make sure that all requests to add DFA states that are equivalent result in the same shared DFA object. This is because lots of threads will be trying to update the DFA at once. The addDFAState(org.antlr.v4.runtime.dfa.DFA, org.antlr.v4.runtime.dfa.DFAState) method also locks inside the DFA lock but this time on the shared context cache when it rebuilds the configurations' PredictionContext objects using cached subgraphs/nodes. No other locking occurs, even during DFA simulation. This is safe as long as we can guarantee that all threads referencing s.edge[t] get the same physical target DFAState, or null. Once into the DFA, the DFA simulation does not reference the DFA.states map. It follows the DFAState.edges field to new targets. The DFA simulator will either find DFAState.edges to be null, to be non-null and dfa.edges[t] null, or dfa.edges[t] to be non-null. The addDFAEdge(org.antlr.v4.runtime.dfa.DFA, org.antlr.v4.runtime.dfa.DFAState, int, org.antlr.v4.runtime.dfa.DFAState) method could be racing to set the field but in either case the DFA simulator works; if it sees null, it simply requests ATN simulation. It could also race trying to get dfa.edges[t], but either way it will work because it's not doing a test and set operation.
Starting with SLL then failing to combined SLL/LL (Two-Stage Parsing)
Sam pointed out that if SLL does not give a syntax error, then there is no point in doing full LL, which is slower. We only have to try LL if we get a syntax error. For maximum speed, Sam starts the parser set to pure SLL mode with the BailErrorStrategy:
parser.getInterpreter().setPredictionMode(PredictionMode.SLL);
parser.setErrorHandler(new BailErrorStrategy());
If it does not get a syntax error, then we're done. If it does get a syntax error, we need to retry with the combined SLL/LL strategy.
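The retry control flow can be sketched without the ANTLR runtime. Everything here is a hypothetical stand-in: `TwoStageSketch`, its `Bail` exception (playing the role of whatever a bail-out error strategy raises), and the two `Function` arguments (playing the roles of an SLL-mode parse and a combined SLL/LL parse). In a real setup the second stage would also have to rewind the input and reconfigure the parser before rerunning.

```java
import java.util.function.Function;

final class TwoStageSketch {
    /** Stand-in for the exception raised when the fast pass hits a syntax error. */
    static final class Bail extends RuntimeException {}

    int passes = 0; // total passes over the input, across all calls

    String parse(String input, Function<String, String> sllPass, Function<String, String> llPass) {
        passes++;
        try {
            return sllPass.apply(input); // stage 1: fast, bails on the first syntax error
        } catch (Bail e) {
            passes++;
            return llPass.apply(input);  // stage 2: rerun from the start with the full strategy
        }
    }
}
```

Valid input that SLL handles costs one pass; anything that makes the first stage bail costs two.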
The reason this works is as follows. If there are no SLL conflicts, then the grammar is SLL (at least for that input set). If there is an SLL conflict, the full LL analysis must yield a set of viable alternatives which is a subset of the alternatives reported by SLL. If the LL set is a singleton, then the grammar is LL but not SLL. If the LL set is the same size as the SLL set, the decision is SLL. If the LL set has size > 1, then that decision is truly ambiguous on the current input. If the LL set is smaller, then the SLL conflict resolution might choose an alternative that the full LL would rule out as a possibility based upon better context information. If that's the case, then the SLL parse will definitely get an error because the full LL analysis says it's not viable. If SLL conflict resolution chooses an alternative within the LL set, then both SLL and LL would choose the same alternative because they both choose the minimum of multiple conflicting alternatives.
Let's say we have a set of SLL conflicting alternatives {1, 2, 3} and a smaller LL set called s. If s is {2, 3}, then SLL parsing will get an error because SLL will pursue alternative 1. If s is {1, 2} or {1, 3} then both SLL and LL will choose the same alternative because alternative one is the minimum of either set. If s is {2} or {3} then SLL will get a syntax error. If s is {1} then SLL will succeed.
Of course, if the input is invalid, then we will get an error for sure in both SLL and LL parsing. Erroneous input will therefore require 2 passes over the input.
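The case analysis above reduces to one check: SLL resolves a conflict by taking the minimum alternative, and that choice is safe exactly when it is also in the LL set. A small sketch of that reasoning, using the hypothetical helper `SllVersusLlSketch.sllSucceeds` and assuming (as the text states) that the LL set is a subset of the SLL set:

```java
import java.util.Collections;
import java.util.Set;

final class SllVersusLlSketch {
    /**
     * True if SLL's choice (the minimum of its conflicting alternatives) is one that
     * full LL considers viable. Assumes llViable is a non-empty subset of sllConflict.
     */
    static boolean sllSucceeds(Set<Integer> sllConflict, Set<Integer> llViable) {
        int sllChoice = Collections.min(sllConflict);
        // Because llViable is a subset, containing the SLL minimum means it is also the LL minimum.
        return llViable.contains(sllChoice);
    }
}
```

With the SLL set {1, 2, 3}, this returns false for {2, 3}, {2} and {3}, and true for {1, 2}, {1, 3} and {1}, matching the cases listed in the text.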
-
-
Field Summary
- protected DFA _dfa
- protected TokenStream _input
- protected ParserRuleContext _outerContext
- protected int _startIndex
- static boolean debug
- static boolean debug_list_atn_decisions
- DFA[] decisionToDFA
- static boolean dfa_debug
- protected DoubleKeyMap<PredictionContext,PredictionContext,PredictionContext> mergeCache
  Each prediction operation uses a cache for merge of prediction contexts.
- protected Parser parser
- static boolean retry_debug
- static boolean TURN_OFF_LR_LOOP_ENTRY_BRANCH_OPT
  Just in case this optimization is bad, add an ENV variable to turn it off
Fields inherited from class org.antlr.v4.runtime.atn.ATNSimulator
atn, ERROR, sharedContextCache
-
-
Constructor Summary
- ParserATNSimulator(ATN atn, DFA[] decisionToDFA, PredictionContextCache sharedContextCache)
  Testing only!
- ParserATNSimulator(Parser parser, ATN atn, DFA[] decisionToDFA, PredictionContextCache sharedContextCache)
-
Method Summary
- protected ATNConfig actionTransition(ATNConfig config, ActionTransition t)
- int adaptivePredict(TokenStream input, int decision, ParserRuleContext outerContext)
- protected DFAState addDFAEdge(DFA dfa, DFAState from, int t, DFAState to)
  Add an edge to the DFA, if possible.
- protected DFAState addDFAState(DFA dfa, DFAState D)
  Add state D to the DFA if it is not already present, and return the actual instance stored in the DFA.
- protected ATNConfigSet applyPrecedenceFilter(ATNConfigSet configs)
  This method transforms the start state computed by computeStartState(org.antlr.v4.runtime.atn.ATNState, org.antlr.v4.runtime.RuleContext, boolean) to the special start state used by a precedence DFA for a particular precedence value.
- protected boolean canDropLoopEntryEdgeInLeftRecursiveRule(ATNConfig config)
  Implements first-edge (loop entry) elimination as an optimization during closure operations.
- void clearDFA()
  Clear the DFA cache used by the current instance.
- protected void closure(ATNConfig config, ATNConfigSet configs, Set<ATNConfig> closureBusy, boolean collectPredicates, boolean fullCtx, boolean treatEofAsEpsilon)
- protected void closure_(ATNConfig config, ATNConfigSet configs, Set<ATNConfig> closureBusy, boolean collectPredicates, boolean fullCtx, int depth, boolean treatEofAsEpsilon)
  Do the actual work of walking epsilon edges
- protected void closureCheckingStopState(ATNConfig config, ATNConfigSet configs, Set<ATNConfig> closureBusy, boolean collectPredicates, boolean fullCtx, int depth, boolean treatEofAsEpsilon)
- protected ATNConfigSet computeReachSet(ATNConfigSet closure, int t, boolean fullCtx)
- protected ATNConfigSet computeStartState(ATNState p, RuleContext ctx, boolean fullCtx)
- protected DFAState computeTargetState(DFA dfa, DFAState previousD, int t)
  Compute a target state for an edge in the DFA, and attempt to add the computed state and corresponding edge to the DFA.
- void dumpDeadEndConfigs(NoViableAltException nvae)
  Used for debugging in adaptivePredict around execATN but I cut it out for clarity now that alg.
- protected boolean evalSemanticContext(SemanticContext pred, ParserRuleContext parserCallStack, int alt, boolean fullCtx)
  Evaluate a semantic context within a specific parser context.
- protected BitSet evalSemanticContext(DFAState.PredPrediction[] predPredictions, ParserRuleContext outerContext, boolean complete)
  Look through a list of predicate/alt pairs, returning alts for the pairs that win.
- protected int execATN(DFA dfa, DFAState s0, TokenStream input, int startIndex, ParserRuleContext outerContext)
  Performs ATN simulation to compute a predicted alternative based upon the remaining input, but also updates the DFA cache to avoid having to traverse the ATN again for the same input sequence.
- protected int execATNWithFullContext(DFA dfa, DFAState D, ATNConfigSet s0, TokenStream input, int startIndex, ParserRuleContext outerContext)
- protected int getAltThatFinishedDecisionEntryRule(ATNConfigSet configs)
- protected BitSet getConflictingAlts(ATNConfigSet configs)
  Gets a BitSet containing the alternatives in configs which are part of one or more conflicting alternative subsets.
- protected BitSet getConflictingAltsOrUniqueAlt(ATNConfigSet configs)
  Sam pointed out a problem with the previous definition, v3, of ambiguous states.
- protected ATNConfig getEpsilonTarget(ATNConfig config, Transition t, boolean collectPredicates, boolean inContext, boolean fullCtx, boolean treatEofAsEpsilon)
- protected DFAState getExistingTargetState(DFAState previousD, int t)
  Get an existing target state for an edge in the DFA.
- String getLookaheadName(TokenStream input)
- Parser getParser()
- protected DFAState.PredPrediction[] getPredicatePredictions(BitSet ambigAlts, SemanticContext[] altToPred)
- PredictionMode getPredictionMode()
- protected SemanticContext[] getPredsForAmbigAlts(BitSet ambigAlts, ATNConfigSet configs, int nalts)
- protected ATNState getReachableTarget(Transition trans, int ttype)
- String getRuleName(int index)
- static String getSafeEnv(String envName)
- protected int getSynValidOrSemInvalidAltThatFinishedDecisionEntryRule(ATNConfigSet configs, ParserRuleContext outerContext)
  This method is used to improve the localization of error messages by choosing an alternative rather than throwing a NoViableAltException in particular prediction scenarios where the ATNSimulator.ERROR state was reached during ATN simulation.
- String getTokenName(int t)
- protected static int getUniqueAlt(ATNConfigSet configs)
- protected NoViableAltException noViableAlt(TokenStream input, ParserRuleContext outerContext, ATNConfigSet configs, int startIndex)
- ATNConfig precedenceTransition(ATNConfig config, PrecedencePredicateTransition pt, boolean collectPredicates, boolean inContext, boolean fullCtx)
- protected void predicateDFAState(DFAState dfaState, DecisionState decisionState)
- protected ATNConfig predTransition(ATNConfig config, PredicateTransition pt, boolean collectPredicates, boolean inContext, boolean fullCtx)
- protected ATNConfigSet removeAllConfigsNotInRuleStopState(ATNConfigSet configs, boolean lookToEndOfRule)
  Return a configuration set containing only the configurations from configs which are in a RuleStopState.
- protected void reportAmbiguity(DFA dfa, DFAState D, int startIndex, int stopIndex, boolean exact, BitSet ambigAlts, ATNConfigSet configs)
  If context sensitive parsing, we know it's ambiguity not conflict
- protected void reportAttemptingFullContext(DFA dfa, BitSet conflictingAlts, ATNConfigSet configs, int startIndex, int stopIndex)
- protected void reportContextSensitivity(DFA dfa, int prediction, ATNConfigSet configs, int startIndex, int stopIndex)
- void reset()
- protected ATNConfig ruleTransition(ATNConfig config, RuleTransition t)
- void setPredictionMode(PredictionMode mode)
- protected Pair<ATNConfigSet,ATNConfigSet> splitAccordingToSemanticValidity(ATNConfigSet configs, ParserRuleContext outerContext)
  Walk the list of configurations and split them according to those that have preds evaluating to true/false.
Methods inherited from class org.antlr.v4.runtime.atn.ATNSimulator
getCachedContext, getSharedContextCache
-
-
-
-
Field Detail
-
debug
public static final boolean debug
- See Also:
- Constant Field Values
-
debug_list_atn_decisions
public static final boolean debug_list_atn_decisions
- See Also:
- Constant Field Values
-
dfa_debug
public static final boolean dfa_debug
- See Also:
- Constant Field Values
-
retry_debug
public static final boolean retry_debug
- See Also:
- Constant Field Values
-
TURN_OFF_LR_LOOP_ENTRY_BRANCH_OPT
public static final boolean TURN_OFF_LR_LOOP_ENTRY_BRANCH_OPT
Just in case this optimization is bad, add an ENV variable to turn it off
-
parser
protected final Parser parser
-
decisionToDFA
public final DFA[] decisionToDFA
-
mergeCache
protected DoubleKeyMap<PredictionContext,PredictionContext,PredictionContext> mergeCache
Each prediction operation uses a cache for merge of prediction contexts. Don't keep around as it wastes huge amounts of memory. DoubleKeyMap isn't synchronized but we're ok since two threads shouldn't reuse same parser/atnsim object because it can only handle one input at a time. This maps graphs a and b to merged result c. (a,b)→c. We can avoid the merge if we ever see a and b again. Note that (b,a)→c should also be examined during cache lookup.
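The (a,b)→c lookup with the symmetric (b,a) check can be sketched with nested maps. `MergeCacheSketch` is a hypothetical class, not the DoubleKeyMap used by the runtime, and `slowMerge` stands in for the real context-merge operation, assumed here to be symmetric in its arguments. Like the cache described above, the sketch is not synchronized.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BinaryOperator;

final class MergeCacheSketch<T> {
    private final Map<T, Map<T, T>> cache = new HashMap<>();
    int merges = 0; // number of times the slow merge actually ran

    T merge(T a, T b, BinaryOperator<T> slowMerge) {
        T hit = lookup(a, b);
        if (hit == null) hit = lookup(b, a); // (b,a) -> c is examined as well
        if (hit != null) return hit;
        merges++;
        T c = slowMerge.apply(a, b);
        cache.computeIfAbsent(a, k -> new HashMap<>()).put(b, c);
        return c;
    }

    private T lookup(T a, T b) {
        Map<T, T> inner = cache.get(a);
        return inner == null ? null : inner.get(b);
    }
}
```

Creating one such cache per prediction and discarding it afterwards would follow the "don't keep around" advice in the description.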
-
_input
protected TokenStream _input
-
_startIndex
protected int _startIndex
-
_outerContext
protected ParserRuleContext _outerContext
-
_dfa
protected DFA _dfa
-
-
Constructor Detail
-
ParserATNSimulator
public ParserATNSimulator(ATN atn, DFA[] decisionToDFA, PredictionContextCache sharedContextCache)
Testing only!
-
ParserATNSimulator
public ParserATNSimulator(Parser parser, ATN atn, DFA[] decisionToDFA, PredictionContextCache sharedContextCache)
-
-
Method Detail
-
reset
public void reset()
- Specified by:
- reset in class ATNSimulator
-
clearDFA
public void clearDFA()
Description copied from class: ATNSimulator
Clear the DFA cache used by the current instance. Since the DFA cache may be shared by multiple ATN simulators, this method may affect the performance (but not accuracy) of other parsers which are being used concurrently.
- Overrides:
- clearDFA in class ATNSimulator
-
adaptivePredict
public int adaptivePredict(TokenStream input, int decision, ParserRuleContext outerContext)
-
execATN
protected int execATN(DFA dfa, DFAState s0, TokenStream input, int startIndex, ParserRuleContext outerContext)
Performs ATN simulation to compute a predicted alternative based upon the remaining input, but also updates the DFA cache to avoid having to traverse the ATN again for the same input sequence.
There are some key conditions we're looking for after computing a new set of ATN configs (proposed DFA state):
- if the set is empty, there is no viable alternative for current symbol
- does the state uniquely predict an alternative?
- does the state have a conflict that would prevent us from putting it on the work list?
We also have some key operations to do:
- add an edge from previous DFA state to potentially new DFA state, D, upon current symbol but only if adding to work list, which means in all cases except no viable alternative (and possibly non-greedy decisions?)
- collecting predicates and adding semantic context to DFA accept states
- adding rule context to context-sensitive DFA accept states
- consuming an input symbol
- reporting a conflict
- reporting an ambiguity
- reporting a context sensitivity
- reporting insufficient predicates
cover these cases:
- dead end
- single alt
- single alt + preds
- conflict
- conflict + preds
-
getExistingTargetState
protected DFAState getExistingTargetState(DFAState previousD, int t)
Get an existing target state for an edge in the DFA. If the target state for the edge has not yet been computed or is otherwise not available, this method returns null.
- Parameters:
- previousD - The current DFA state
- t - The next input symbol
- Returns:
- The existing target DFA state for the given input symbol t, or null if the target state for this edge is not already cached
-
computeTargetState
protected DFAState computeTargetState(DFA dfa, DFAState previousD, int t)
Compute a target state for an edge in the DFA, and attempt to add the computed state and corresponding edge to the DFA.
- Parameters:
- dfa - The DFA
- previousD - The current DFA state
- t - The next input symbol
- Returns:
- The computed target DFA state for the given input symbol t. If t does not lead to a valid DFA state, this method returns ATNSimulator.ERROR.
-
predicateDFAState
protected void predicateDFAState(DFAState dfaState, DecisionState decisionState)
-
execATNWithFullContext
protected int execATNWithFullContext(DFA dfa, DFAState D, ATNConfigSet s0, TokenStream input, int startIndex, ParserRuleContext outerContext)
-
computeReachSet
protected ATNConfigSet computeReachSet(ATNConfigSet closure, int t, boolean fullCtx)
-
removeAllConfigsNotInRuleStopState
protected ATNConfigSet removeAllConfigsNotInRuleStopState(ATNConfigSet configs, boolean lookToEndOfRule)
Return a configuration set containing only the configurations from configs which are in a RuleStopState. If all configurations in configs are already in a rule stop state, this method simply returns configs.
When lookToEndOfRule is true, this method uses ATN.nextTokens(org.antlr.v4.runtime.atn.ATNState, org.antlr.v4.runtime.RuleContext) for each configuration in configs which is not already in a rule stop state to see if a rule stop state is reachable from the configuration via epsilon-only transitions.
- Parameters:
- configs - the configuration set to update
- lookToEndOfRule - when true, this method checks for rule stop states reachable by epsilon-only transitions from each configuration in configs.
- Returns:
- configs if all configurations in configs are in a rule stop state, otherwise return a new configuration set containing only the configurations from configs which are in a rule stop state
-
computeStartState
protected ATNConfigSet computeStartState(ATNState p, RuleContext ctx, boolean fullCtx)
-
applyPrecedenceFilter
protected ATNConfigSet applyPrecedenceFilter(ATNConfigSet configs)
This method transforms the start state computed by computeStartState(org.antlr.v4.runtime.atn.ATNState, org.antlr.v4.runtime.RuleContext, boolean) to the special start state used by a precedence DFA for a particular precedence value. The transformation process applies the following changes to the start state's configuration set.
- Evaluate the precedence predicates for each configuration using SemanticContext.evalPrecedence(org.antlr.v4.runtime.Recognizer<?, ?>, org.antlr.v4.runtime.RuleContext).
- When ATNConfig.isPrecedenceFilterSuppressed() is false, remove all configurations which predict an alternative greater than 1, for which another configuration that predicts alternative 1 is in the same ATN state with the same prediction context. This transformation is valid for the following reasons:
  - The closure block cannot contain any epsilon transitions which bypass the body of the closure, so all states reachable via alternative 1 are part of the precedence alternatives of the transformed left-recursive rule.
  - The "primary" portion of a left recursive rule cannot contain an epsilon transition, so the only way an alternative other than 1 can exist in a state that is also reachable via alternative 1 is by nesting calls to the left-recursive rule, with the outer calls not being at the preferred precedence level. The ATNConfig.isPrecedenceFilterSuppressed() property marks ATN configurations which do not meet this condition, and therefore are not eligible for elimination during the filtering process.
The prediction context must be considered by this filter to address situations like the following.
grammar TA;
prog: statement* EOF;
statement: letterA | statement letterA 'b' ;
letterA: 'a';
In the above grammar, the ATN state immediately before the token reference 'a' in letterA is reachable from the left edge of both the primary and closure blocks of the left-recursive rule statement. The prediction context associated with each of these configurations distinguishes between them, and prevents the alternative which stepped out to prog (and then back in to statement) from being eliminated by the filter.
- Parameters:
- configs - The configuration set computed by computeStartState(org.antlr.v4.runtime.atn.ATNState, org.antlr.v4.runtime.RuleContext, boolean) as the start state for the DFA.
- Returns:
- The transformed configuration set representing the start state for a precedence DFA at a particular precedence level (determined by calling Parser.getPrecedence()).
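The second transformation step (dropping configurations with alternative greater than 1 that are shadowed by an alternative-1 configuration in the same ATN state with the same prediction context) can be sketched as a two-pass filter. The `Config` class here is a simplified, hypothetical stand-in for ATNConfig: the state is an int, the prediction context is a plain string, and precedence-predicate evaluation (the first step) is left out entirely.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class PrecedenceFilterSketch {
    /** Simplified stand-in for an ATN configuration. */
    static final class Config {
        final int state; final int alt; final String context; final boolean filterSuppressed;
        Config(int state, int alt, String context, boolean filterSuppressed) {
            this.state = state; this.alt = alt; this.context = context;
            this.filterSuppressed = filterSuppressed;
        }
    }

    static List<Config> filter(List<Config> configs) {
        // Pass 1: per ATN state, remember the contexts reached via alternative 1.
        Map<Integer, Set<String>> altOneContexts = new HashMap<>();
        for (Config c : configs) {
            if (c.alt == 1) {
                altOneContexts.computeIfAbsent(c.state, k -> new HashSet<>()).add(c.context);
            }
        }
        // Pass 2: drop alt > 1 configs shadowed by an alt-1 config, unless suppression is set.
        List<Config> kept = new ArrayList<>();
        for (Config c : configs) {
            Set<String> seen = altOneContexts.get(c.state);
            boolean shadowed = c.alt > 1 && !c.filterSuppressed
                    && seen != null && seen.contains(c.context);
            if (!shadowed) kept.add(c);
        }
        return kept;
    }
}
```

Comparing the context as well as the state is what keeps a configuration that reached the same state through a different invocation path, as in the grammar TA example.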
-
getReachableTarget
protected ATNState getReachableTarget(Transition trans, int ttype)
-
getPredsForAmbigAlts
protected SemanticContext[] getPredsForAmbigAlts(BitSet ambigAlts, ATNConfigSet configs, int nalts)
-
getPredicatePredictions
protected DFAState.PredPrediction[] getPredicatePredictions(BitSet ambigAlts, SemanticContext[] altToPred)
-
getSynValidOrSemInvalidAltThatFinishedDecisionEntryRule
protected int getSynValidOrSemInvalidAltThatFinishedDecisionEntryRule(ATNConfigSet configs, ParserRuleContext outerContext)
This method is used to improve the localization of error messages by choosing an alternative rather than throwing a NoViableAltException in particular prediction scenarios where the ATNSimulator.ERROR state was reached during ATN simulation.
The default implementation of this method uses the following algorithm to identify an ATN configuration which successfully parsed the decision entry rule. Choosing such an alternative ensures that the ParserRuleContext returned by the calling rule will be complete and valid, and the syntax error will be reported later at a more localized location.
- If a syntactically valid path or paths reach the end of the decision rule and they are semantically valid if predicated, return the min associated alt.
- Else, if a semantically invalid but syntactically valid path or paths exist, return the minimum associated alt.
- Otherwise, return ATN.INVALID_ALT_NUMBER.
In some scenarios, the algorithm described above could predict an alternative which will result in a FailedPredicateException in the parser. Specifically, this could occur if the only configuration capable of successfully parsing to the end of the decision rule is blocked by a semantic predicate. By choosing this alternative within adaptivePredict(org.antlr.v4.runtime.TokenStream, int, org.antlr.v4.runtime.ParserRuleContext) instead of throwing a NoViableAltException, the resulting FailedPredicateException in the parser will identify the specific predicate which is preventing the parser from successfully parsing the decision rule, which helps developers identify and correct logic errors in semantic predicates.
- Parameters:
- configs - The ATN configurations which were valid immediately before the ATNSimulator.ERROR state was reached
- outerContext - This is the \gamma_0 initial parser context from the paper or the parser stack at the instant before prediction commences.
- Returns:
- The value to return from adaptivePredict(org.antlr.v4.runtime.TokenStream, int, org.antlr.v4.runtime.ParserRuleContext), or ATN.INVALID_ALT_NUMBER if a suitable alternative was not identified and adaptivePredict(org.antlr.v4.runtime.TokenStream, int, org.antlr.v4.runtime.ParserRuleContext) should report an error instead.
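The three-step selection described above can be sketched over a simplified configuration type. `FinishedAltSketch` and its `Config` class are hypothetical; each config is reduced to its alternative plus two flags, and the sentinel value 0 for an invalid alternative is an assumption of this sketch standing in for ATN.INVALID_ALT_NUMBER.

```java
import java.util.List;

final class FinishedAltSketch {
    static final int INVALID_ALT_NUMBER = 0; // assumed sentinel; real alternatives start at 1

    /** Simplified stand-in for an ATN configuration. */
    static final class Config {
        final int alt; final boolean finishedEntryRule; final boolean semanticallyValid;
        Config(int alt, boolean finishedEntryRule, boolean semanticallyValid) {
            this.alt = alt;
            this.finishedEntryRule = finishedEntryRule;
            this.semanticallyValid = semanticallyValid;
        }
    }

    static int choose(List<Config> configs) {
        // Step 1: prefer paths that finished the rule and pass their predicates.
        int valid = minFinished(configs, true);
        if (valid != INVALID_ALT_NUMBER) return valid;
        // Step 2: otherwise accept a syntactically valid but semantically invalid path.
        // Step 3: if there is none either, the sentinel falls through to the caller.
        return minFinished(configs, false);
    }

    private static int minFinished(List<Config> configs, boolean wantValid) {
        int best = INVALID_ALT_NUMBER;
        for (Config c : configs) {
            if (c.finishedEntryRule && c.semanticallyValid == wantValid
                    && (best == INVALID_ALT_NUMBER || c.alt < best)) {
                best = c.alt;
            }
        }
        return best;
    }
}
```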
-
getAltThatFinishedDecisionEntryRule
protected int getAltThatFinishedDecisionEntryRule(ATNConfigSet configs)
-
splitAccordingToSemanticValidity
protected Pair<ATNConfigSet,ATNConfigSet> splitAccordingToSemanticValidity(ATNConfigSet configs, ParserRuleContext outerContext)
Walk the list of configurations and split them according to those that have preds evaluating to true/false. If no pred, assume true pred and include in succeeded set. Returns Pair of sets. Create a new set so as not to alter the incoming parameter. Assumption: the input stream has been restored to the starting point prediction, which is where predicates need to evaluate.
-
evalSemanticContext
protected BitSet evalSemanticContext(DFAState.PredPrediction[] predPredictions, ParserRuleContext outerContext, boolean complete)
Look through a list of predicate/alt pairs, returning alts for the pairs that win. A NONE predicate indicates an alt containing an unpredicated config which behaves as "always true." If !complete then we stop at the first predicate that evaluates to true. This includes pairs with null predicates.
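A rough sketch of the pair-walking behavior, including the effect of the complete flag. `PredPairSketch` and its `Pair` class are hypothetical stand-ins for the DFAState.PredPrediction entries; a `null` predicate plays the role of NONE, and real predicates are modeled as `BooleanSupplier` values.

```java
import java.util.BitSet;
import java.util.List;
import java.util.function.BooleanSupplier;

final class PredPairSketch {
    /** Hypothetical predicate/alt pair; a null predicate means "always true". */
    static final class Pair {
        final BooleanSupplier pred;
        final int alt;
        Pair(BooleanSupplier pred, int alt) { this.pred = pred; this.alt = alt; }
    }

    static BitSet winners(List<Pair> pairs, boolean complete) {
        BitSet alts = new BitSet();
        for (Pair p : pairs) {
            boolean holds = p.pred == null || p.pred.getAsBoolean();
            if (holds) {
                alts.set(p.alt);
                if (!complete) break; // stop at the first pair that wins
            }
        }
        return alts;
    }
}
```

With `complete` false the result holds at most one alternative; with `complete` true it holds every alternative whose predicate passes.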
-
evalSemanticContext
protected boolean evalSemanticContext(SemanticContext pred, ParserRuleContext parserCallStack, int alt, boolean fullCtx)
Evaluate a semantic context within a specific parser context.
This method might not be called for every semantic context evaluated during the prediction process. In particular, we currently do not evaluate the following but it may change in the future:
- Precedence predicates (represented by SemanticContext.PrecedencePredicate) are not currently evaluated through this method.
- Operator predicates (represented by SemanticContext.AND and SemanticContext.OR) are evaluated as a single semantic context, rather than evaluating the operands individually. Implementations which require evaluation results from individual predicates should override this method to explicitly handle evaluation of the operands within operator predicates.
- Parameters:
- pred - The semantic context to evaluate
- parserCallStack - The parser context in which to evaluate the semantic context
- alt - The alternative which is guarded by pred
- fullCtx - true if the evaluation is occurring during LL prediction; otherwise, false if the evaluation is occurring during SLL prediction
- Since:
- 4.3
-
closure
protected void closure(ATNConfig config, ATNConfigSet configs, Set<ATNConfig> closureBusy, boolean collectPredicates, boolean fullCtx, boolean treatEofAsEpsilon)
-
closureCheckingStopState
protected void closureCheckingStopState(ATNConfig config, ATNConfigSet configs, Set<ATNConfig> closureBusy, boolean collectPredicates, boolean fullCtx, int depth, boolean treatEofAsEpsilon)
-
closure_
protected void closure_(ATNConfig config, ATNConfigSet configs, Set<ATNConfig> closureBusy, boolean collectPredicates, boolean fullCtx, int depth, boolean treatEofAsEpsilon)
Do the actual work of walking epsilon edges
-
canDropLoopEntryEdgeInLeftRecursiveRule
protected boolean canDropLoopEntryEdgeInLeftRecursiveRule(ATNConfig config)
Implements first-edge (loop entry) elimination as an optimization during closure operations. See antlr/antlr4#1398.
The optimization is to avoid adding the loop entry config when the exit path can only lead back to the same StarLoopEntryState after popping context at the rule end state (traversing only epsilon edges, so we're still in closure, in this same rule).
We need to detect any state that can reach loop entry on epsilon without exiting the rule. We don't have to look at FOLLOW links, just ensure that all stack tops for config refer to key states in the LR rule.
To verify we are in the right situation we must first check that closure is at a StarLoopEntryState generated during LR removal. Then we check that each stack top of context is a return state from one of these cases:
1. 'not' expr, '(' type ')' expr. The return state points at the loop entry state.
2. expr op expr. The return state is the block end of the internal block of (...)*.
3. 'between' expr 'and' expr. The return state of the 2nd expr reference. That state points at the block end of the internal block of (...)*.
4. expr '?' expr ':' expr. The return state points at the block end, which points at the loop entry state.
If any is true for each stack top, then closure does not add a config to the current config set for edge[0], the loop entry branch.
Conditions fail if any context for the current config is:
a. empty (we'd fall out of expr to do a global FOLLOW which could even be to some weird spot in expr) or,
b. lies outside of expr or,
c. lies within expr but at a state that is not the BlockEndState generated during LR removal.
Do we need to evaluate predicates ever in closure for this case? No. Predicates, including precedence predicates, are only evaluated when computing a DFA start state. I.e., only before the lookahead (but not parser) consumes a token.
There are no epsilon edges allowed in LR rule alt blocks or in the "primary" part (ID here). If closure is in StarLoopEntryState, any lookahead operation will have consumed a token, as there are no epsilon paths that lead to StarLoopEntryState. We therefore do not have to evaluate predicates if we are in the generated StarLoopEntryState of an LR rule. Note that when making a prediction starting at that decision point, decision d=2, compute-start-state performs closure starting at edges[0], edges[1] emanating from StarLoopEntryState. That means it is not performing closure on StarLoopEntryState during compute-start-state.
How do we know this always gives the same prediction answer? Without predicates, loop entry and exit paths are ambiguous upon remaining input +b (in, say, a+b). Either path leads to a valid parse. Closure can lead to consuming + immediately or by falling out of this call to expr back into expr and looping back again to StarLoopEntryState to match +b. In this special case, we choose the more efficient path, which is to take the bypass path.
The lookahead language has not changed because closure chooses one path over the other. Both paths lead to consuming the same remaining input during a lookahead operation. If the next token is an operator, lookahead will enter the choice block with operators. If it is not, lookahead will exit expr. This is the same as if closure had chosen to enter the choice block immediately.
Closure is examining one config (some loopentrystate, some alt, context), which means it is considering exactly one alt. Closure always copies the same alt to any derived configs.
How do we know this optimization doesn't mess up precedence in our parse trees? Looking through expr from the left edge of stat only has to confirm that an input, say, a+b+c; begins with any valid interpretation of an expression. The precedence actually doesn't matter when making a decision in stat seeing through expr. It is only when parsing rule expr that we must use the precedence to get the right interpretation and, hence, parse tree.
- Since:
- 4.6
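The stack-top conditions described for this method can be illustrated with a small self-contained model. This is only a sketch under stated assumptions: `LoopEntrySketch`, its `State` class, and the `onlyEpsilonTarget` field are hypothetical stand-ins invented for illustration, not the runtime's ATN types, and the check follows the four cases as written in the prose rather than the runtime's actual code.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class LoopEntrySketch {
    /** Hypothetical, simplified stand-in for an ATN state (not the runtime's ATNState). */
    static final class State {
        final int ruleIndex;
        final boolean blockEnd;
        State onlyEpsilonTarget; // target of the state's single epsilon edge, or null

        State(int ruleIndex, boolean blockEnd) {
            this.ruleIndex = ruleIndex;
            this.blockEnd = blockEnd;
        }
    }

    /**
     * Returns true when every stack top matches one of cases 1-4 from the prose.
     * A null entry in stackTops models an empty context (failure condition a).
     */
    static boolean canDropLoopEntryEdge(State loopEntry, State loopBlockEnd, List<State> stackTops) {
        if (stackTops.isEmpty()) return false;
        for (State ret : stackTops) {
            if (ret == null) return false;                          // (a) empty context
            if (ret.ruleIndex != loopEntry.ruleIndex) return false; // (b) outside the LR rule
            State target = ret.onlyEpsilonTarget;
            boolean case1 = target == loopEntry;      // return state points at loop entry
            boolean case2 = ret == loopBlockEnd;      // return state is the block end of (...)*
            boolean case3 = target == loopBlockEnd;   // return state points at that block end
            boolean case4 = target != null && target.blockEnd
                    && target.onlyEpsilonTarget == loopEntry;
            if (!(case1 || case2 || case3 || case4)) return false;  // (c) some other state in the rule
        }
        return true;
    }

    public static void main(String[] args) {
        State loopEntry = new State(0, false);
        State loopBlockEnd = new State(0, true);
        State afterSecondExpr = new State(0, false);
        afterSecondExpr.onlyEpsilonTarget = loopBlockEnd; // shaped like case 3
        State outsideRule = new State(1, false);
        outsideRule.onlyEpsilonTarget = loopEntry;

        System.out.println(canDropLoopEntryEdge(loopEntry, loopBlockEnd,
                Arrays.asList(loopBlockEnd, afterSecondExpr)));
        System.out.println(canDropLoopEntryEdge(loopEntry, loopBlockEnd,
                Arrays.asList(loopBlockEnd, outsideRule)));
        System.out.println(canDropLoopEntryEdge(loopEntry, loopBlockEnd,
                Collections.<State>emptyList()));
    }
}
```

In this model only the first call allows the loop entry edge to be dropped; the second fails on a stack top from another rule and the third on a missing context.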
-
getRuleName
public String getRuleName(int index)
-
getEpsilonTarget
protected ATNConfig getEpsilonTarget(ATNConfig config, Transition t, boolean collectPredicates, boolean inContext, boolean fullCtx, boolean treatEofAsEpsilon)
-
actionTransition
protected ATNConfig actionTransition(ATNConfig config, ActionTransition t)
-
precedenceTransition
public ATNConfig precedenceTransition(ATNConfig config, PrecedencePredicateTransition pt, boolean collectPredicates, boolean inContext, boolean fullCtx)
-
predTransition
protected ATNConfig predTransition(ATNConfig config, PredicateTransition pt, boolean collectPredicates, boolean inContext, boolean fullCtx)
-
ruleTransition
protected ATNConfig ruleTransition(ATNConfig config, RuleTransition t)
-
getConflictingAlts
protected BitSet getConflictingAlts(ATNConfigSet configs)
Gets a BitSet containing the alternatives in configs which are part of one or more conflicting alternative subsets.
- Parameters:
configs - The ATNConfigSet to analyze.
- Returns:
- The alternatives in configs which are part of one or more conflicting alternative subsets. If configs does not contain any conflicting subsets, this method returns an empty BitSet.
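As an illustration of the documented contract, the following sketch groups simplified configurations by (state, context) and unions the alternatives of every group that holds more than one alternative. `ConflictingAltsSketch` and its `Config` class are hypothetical names introduced here; this models the description above and is not the runtime's implementation, which works on ATNConfigSet.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ConflictingAltsSketch {
    /** Hypothetical, simplified configuration: (ATN state, alternative, context). */
    static final class Config {
        final int state;
        final int alt;
        final String context;

        Config(int state, int alt, String context) {
            this.state = state;
            this.alt = alt;
            this.context = context;
        }
    }

    /** Unions the alternatives of every (state, context) group with more than one alternative. */
    static BitSet conflictingAlts(List<Config> configs) {
        Map<String, BitSet> subsets = new LinkedHashMap<>();
        for (Config c : configs) {
            String key = c.state + "|" + c.context;
            subsets.computeIfAbsent(key, k -> new BitSet()).set(c.alt);
        }
        BitSet result = new BitSet();
        for (BitSet alts : subsets.values()) {
            if (alts.cardinality() > 1) {
                result.or(alts); // this group is a conflicting alternative subset
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Config> configs = Arrays.asList(
                new Config(12, 1, "[]"),
                new Config(6, 2, "[]"),
                new Config(12, 2, "[]"));
        System.out.println(conflictingAlts(configs));
    }
}
```

For the three configurations in `main`, state 12 is reached by alternatives 1 and 2 with the same context, so the sketch reports both; a set with no such group yields an empty BitSet.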
-
getConflictingAltsOrUniqueAlt
protected BitSet getConflictingAltsOrUniqueAlt(ATNConfigSet configs)
Sam pointed out a problem with the previous definition, v3, of ambiguous states. If we have another state associated with conflicting alternatives, we should keep going. For example, consider the following grammar: s : (ID | ID ID?) ';' ;
When the ATN simulation reaches the state before ';', it has a DFA state that looks like: [12|1|[], 6|2|[], 12|2|[]]. Naturally 12|1|[] and 12|2|[] conflict, but we cannot stop processing this node because alternative two has another way to continue, via [6|2|[]]. The key is that we have a single state that has configs only associated with a single alternative, 2, and crucially the state transitions among the configurations are all non-epsilon transitions. That means we don't consider any conflicts that include alternative 2. So, we ignore the conflict between alts 1 and 2. We ignore a set of conflicting alts when there is an intersection with an alternative associated with a single-alt state in the state→config-list map.
It's also the case that we might have two conflicting configurations but also a 3rd nonconflicting configuration for a different alternative: [1|1|[], 1|2|[], 8|3|[]]. This can come about from the grammar: a : A | A | A B ;
After matching input A, we reach the stop state for rule a, state 1. State 8 is the state right before B. Clearly alternatives 1 and 2 conflict and no amount of further lookahead will separate the two. However, alternative 3 will be able to continue and so we do not stop working on this state. In the previous example, we're concerned with states associated with the conflicting alternatives. Here alt 3 is not associated with the conflicting configs, but since we can continue looking for input reasonably, I don't declare the state done. We ignore a set of conflicting alts when we have an alternative that we still need to pursue.
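The rule described here (a conflict is ignored while some state is still associated with exactly one alternative) can be sketched as a predicate over simplified configurations. `ConflictStopSketch`, `Config`, and `shouldStop` are hypothetical names for illustration only; the sketch is a model of the reasoning above and makes no claim about how this method itself is implemented.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ConflictStopSketch {
    /** Hypothetical, simplified configuration: (ATN state, alternative, context). */
    static final class Config {
        final int state;
        final int alt;
        final String context;

        Config(int state, int alt, String context) {
            this.state = state;
            this.alt = alt;
            this.context = context;
        }
    }

    /** True if some (state, context) pair is reached by more than one alternative. */
    static boolean hasConflictingSubset(List<Config> configs) {
        Map<String, BitSet> byStateAndContext = new HashMap<>();
        for (Config c : configs) {
            byStateAndContext.computeIfAbsent(c.state + "|" + c.context, k -> new BitSet()).set(c.alt);
        }
        for (BitSet alts : byStateAndContext.values()) {
            if (alts.cardinality() > 1) return true;
        }
        return false;
    }

    /** True if some state is associated with exactly one alternative. */
    static boolean hasStateWithSingleAlt(List<Config> configs) {
        Map<Integer, BitSet> byState = new HashMap<>();
        for (Config c : configs) {
            byState.computeIfAbsent(c.state, k -> new BitSet()).set(c.alt);
        }
        for (BitSet alts : byState.values()) {
            if (alts.cardinality() == 1) return true;
        }
        return false;
    }

    /** Stop only when there is a conflict and no alternative is left to pursue on its own. */
    static boolean shouldStop(List<Config> configs) {
        return hasConflictingSubset(configs) && !hasStateWithSingleAlt(configs);
    }

    public static void main(String[] args) {
        // [12|1|[], 6|2|[], 12|2|[]] from grammar s : (ID | ID ID?) ';' ;
        System.out.println(shouldStop(Arrays.asList(
                new Config(12, 1, "[]"), new Config(6, 2, "[]"), new Config(12, 2, "[]"))));
        // [1|1|[], 1|2|[], 8|3|[]] from grammar a : A | A | A B ;
        System.out.println(shouldStop(Arrays.asList(
                new Config(1, 1, "[]"), new Config(1, 2, "[]"), new Config(8, 3, "[]"))));
    }
}
```

Both example sets print `false`: state 6 (alt 2 only) and state 8 (alt 3 only) keep the simulation going despite the conflict between alts 1 and 2.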
-
getTokenName
public String getTokenName(int t)
-
getLookaheadName
public String getLookaheadName(TokenStream input)
-
dumpDeadEndConfigs
public void dumpDeadEndConfigs(NoViableAltException nvae)
Used for debugging in adaptivePredict around execATN, but the call was cut out for clarity now that the algorithm works well. We can leave this "dead" code for a bit.
-
noViableAlt
protected NoViableAltException noViableAlt(TokenStream input, ParserRuleContext outerContext, ATNConfigSet configs, int startIndex)
-
getUniqueAlt
protected static int getUniqueAlt(ATNConfigSet configs)
-
addDFAEdge
protected DFAState addDFAEdge(DFA dfa, DFAState from, int t, DFAState to)
Add an edge to the DFA, if possible. This method calls addDFAState(org.antlr.v4.runtime.dfa.DFA, org.antlr.v4.runtime.dfa.DFAState) to ensure the to state is present in the DFA. If from is null, or if t is outside the range of edges that can be represented in the DFA tables, this method returns without adding the edge to the DFA.
If to is null, this method returns null. Otherwise, this method returns the DFAState returned by calling addDFAState(org.antlr.v4.runtime.dfa.DFA, org.antlr.v4.runtime.dfa.DFAState) for the to state.
- Parameters:
dfa - The DFA
from - The source state for the edge
t - The input symbol
to - The target state for the edge
- Returns:
- If to is null, this method returns null; otherwise this method returns the result of calling addDFAState(org.antlr.v4.runtime.dfa.DFA, org.antlr.v4.runtime.dfa.DFAState) on to
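The contract above can be sketched with a toy DFA. Everything in this block is an assumption made for illustration: `DfaEdgeSketch`, its `State` and `Dfa` classes, the `MAX_TOKEN_TYPE` constant, and the choice of -1..MAX_TOKEN_TYPE as the representable symbol range (stored at index t + 1) are not taken from the documentation.

```java
import java.util.HashMap;
import java.util.Map;

final class DfaEdgeSketch {
    /** Hypothetical DFA state; the string stands in for an ATN configuration set. */
    static final class State {
        final String configs;
        State[] edges; // allocated lazily; index t + 1 so that symbol -1 maps to slot 0

        State(String configs) {
            this.configs = configs;
        }
    }

    /** Hypothetical DFA that stores one canonical state per configuration set. */
    static final class Dfa {
        final Map<String, State> states = new HashMap<>();
    }

    static final int MAX_TOKEN_TYPE = 10; // assumed upper bound on representable symbols

    /** Returns the canonical instance for d, adding d if no equivalent state exists. */
    static State addState(Dfa dfa, State d) {
        State existing = dfa.states.putIfAbsent(d.configs, d);
        return existing != null ? existing : d;
    }

    static State addEdge(Dfa dfa, State from, int t, State to) {
        if (to == null) {
            return null;                 // nothing to add
        }
        to = addState(dfa, to);          // make sure the target is in the DFA
        if (from == null || t < -1 || t > MAX_TOKEN_TYPE) {
            return to;                   // edge cannot be represented; state is still returned
        }
        if (from.edges == null) {
            from.edges = new State[MAX_TOKEN_TYPE + 2];
        }
        from.edges[t + 1] = to;
        return to;
    }

    public static void main(String[] args) {
        Dfa dfa = new Dfa();
        State a = addState(dfa, new State("A"));
        State b = addEdge(dfa, a, 3, new State("B"));
        System.out.println(a.edges[4] == b);
        System.out.println(addEdge(dfa, a, 99, new State("C")).configs);
        System.out.println(dfa.states.size());
    }
}
```

Note that the out-of-range call still registers its target state in the DFA; only the edge is skipped.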
-
addDFAState
protected DFAState addDFAState(DFA dfa, DFAState D)
Add state D to the DFA if it is not already present, and return the actual instance stored in the DFA. If a state equivalent to D is already in the DFA, the existing state is returned. Otherwise this method returns D after adding it to the DFA.
If D is ATNSimulator.ERROR, this method returns ATNSimulator.ERROR and does not change the DFA.
- Parameters:
dfa - The dfa
D - The DFA state to add
- Returns:
- The state stored in the DFA. This will be either the existing state if D is already in the DFA, or D itself if the state was not already present.
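A minimal sketch of this "canonical instance" behavior follows, assuming that two states are equivalent when their configuration sets are equal. `DfaStateSketch`, its `State` and `Dfa` classes, and the `ERROR` sentinel defined here are hypothetical stand-ins, not the runtime's DFAState, DFA, or ATNSimulator.ERROR.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class DfaStateSketch {
    /** Hypothetical DFA state; equality is assumed to depend only on the configuration set. */
    static final class State {
        final Set<String> configs;
        int stateNumber = -1; // assigned when the state is stored in the DFA

        State(Set<String> configs) {
            this.configs = configs;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof State && configs.equals(((State) o).configs);
        }

        @Override
        public int hashCode() {
            return configs.hashCode();
        }
    }

    /** Hypothetical sentinel playing the role of the error state. */
    static final State ERROR = new State(Collections.<String>emptySet());

    static final class Dfa {
        final Map<State, State> states = new HashMap<>();
    }

    static State addState(Dfa dfa, State d) {
        if (d == ERROR) {
            return d;                        // never stored; the DFA is left unchanged
        }
        State existing = dfa.states.get(d);
        if (existing != null) {
            return existing;                 // an equivalent state is already present
        }
        d.stateNumber = dfa.states.size();   // number states in insertion order
        dfa.states.put(d, d);
        return d;
    }

    public static void main(String[] args) {
        Dfa dfa = new Dfa();
        State first = new State(new HashSet<>(Arrays.asList("12|1", "12|2")));
        State duplicate = new State(new HashSet<>(Arrays.asList("12|2", "12|1")));
        System.out.println(addState(dfa, first) == first);
        System.out.println(addState(dfa, duplicate) == first);
        System.out.println(addState(dfa, ERROR) == ERROR);
        System.out.println(dfa.states.size());
    }
}
```

Callers should therefore keep using the returned reference rather than the argument, since the argument may be an equivalent duplicate that was never stored.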
-
reportAttemptingFullContext
protected void reportAttemptingFullContext(DFA dfa, BitSet conflictingAlts, ATNConfigSet configs, int startIndex, int stopIndex)
-
reportContextSensitivity
protected void reportContextSensitivity(DFA dfa, int prediction, ATNConfigSet configs, int startIndex, int stopIndex)
-
reportAmbiguity
protected void reportAmbiguity(DFA dfa, DFAState D, int startIndex, int stopIndex, boolean exact, BitSet ambigAlts, ATNConfigSet configs)
If context-sensitive (full-context) parsing is in use, we know it's an ambiguity, not a conflict.
-
setPredictionMode
public final void setPredictionMode(PredictionMode mode)
-
getPredictionMode
public final PredictionMode getPredictionMode()
-
getParser
public Parser getParser()
- Since:
- 4.3
-
-