A ClassificationTask specifies a particular classification task for which we want to collect feature vectors and train a classifier.
The DTCostFunctionTrainer uses our in-house decision tree implementation (org.allenai.nlpstack.parse.poly.decisiontree) to train a StateCostFunction. Training is triggered during construction, after which the .costFunction field contains the trained StateCostFunction.
An EmbeddedClassifier wraps an org.allenai.nlpstack.parse.poly.decisiontree.ProbabilisticClassifier implementation to provide a classifier interface that maps Transitions to probabilities.
the underlying classifier
the possible outcomes of the underlying classifier
a list of the feature indices followed by their names
the number of features in the underlying classifier
A TrainingVector is a triple of the form (task, featureVector, transition), where task is the ClassificationTask associated with the feature vector (featureVector), and transition is the correct classification of the feature vector.
These labeled feature vectors are used to train classifiers.
A FeatureUnion simply merges the output of a list of features.
a list of the features we want to merge into a single feature
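As a rough illustration of the merging behaviour, here is a minimal sketch. The types `State`, `FeatureVector`, and `StateFeature` below are simplified, hypothetical stand-ins rather than the library's actual signatures, and summing the values of features that share a name is an assumption made only for this example.

```scala
object FeatureUnionSketch {
  // Hypothetical, simplified stand-ins; the real library types may differ.
  type State = String
  type FeatureVector = Map[String, Double]

  trait StateFeature {
    def apply(state: State): FeatureVector
  }

  // Merges the outputs of several features into a single feature vector.
  // Assumption: when two features emit the same name, their values are summed.
  case class FeatureUnion(features: Seq[StateFeature]) extends StateFeature {
    def apply(state: State): FeatureVector =
      features.map(f => f(state)).foldLeft(Map.empty[String, Double]) { (acc, vec) =>
        vec.foldLeft(acc) { case (merged, (name, value)) =>
          merged.updated(name, merged.getOrElse(name, 0.0) + value)
        }
      }
  }

  // Two toy features over a string-valued "state".
  val lengthFeature: StateFeature = new StateFeature {
    def apply(state: State): FeatureVector = Map("length" -> state.length.toDouble)
  }
  val firstCharFeature: StateFeature = new StateFeature {
    def apply(state: State): FeatureVector = state.headOption match {
      case Some(c) => Map(s"first=$c" -> 1.0)
      case None => Map.empty[String, Double]
    }
  }

  val union: FeatureUnion = FeatureUnion(Seq(lengthFeature, firstCharFeature))
}
```

Because the union itself behaves like a single feature, it can be passed anywhere one feature is expected, including as an element of another union.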
A StateSource that keeps all its states in memory.
A MarbleBlock is an unstructured input corresponding to a start state of a finite-state machine. The goal of the finite-state machine is to find a final state (which corresponds to a Sculpture, i.e. a structured output).
As an example, consider a transition-based parser. A MarbleBlock would be a sentence to be parsed, whereas a Sculpture would be a parse tree for that sentence.
A sequence of NbestLists.
A sequence of (scored) sculptures.
Finds the best n greedy paths through a finite-state machine.
Like the GreedyTransitionParser, except that it remembers promising transitions that were not taken from the greedy (one-best) walk and returns those to the user.
Chooses the lowest cost parse from an n-best list (according to the reranking function).
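The reranking step can be pictured with the following minimal sketch. `ScoredSculpture`, `NbestList`, and `rerank` are hypothetical, simplified names chosen for this illustration (a sculpture is reduced to a string label), so this should not be read as the library's actual API.

```scala
object RerankerSketch {
  // Hypothetical stand-ins: a sculpture is just a label here.
  case class ScoredSculpture(sculpture: String, score: Double)
  case class NbestList(scoredSculptures: Seq[ScoredSculpture])

  // Returns the candidate with the lowest cost under the reranking function,
  // ignoring the original scores; returns None for an empty n-best list.
  def rerank(nbest: NbestList, cost: String => Double): Option[String] =
    nbest.scoredSculptures
      .map(_.sculpture)
      .reduceOption((a: String, b: String) => if (cost(b) < cost(a)) b else a)

  // Toy data: the original scores prefer parseB, the reranker prefers parseA.
  val nbest: NbestList =
    NbestList(Seq(ScoredSculpture("parseA", 0.9), ScoredSculpture("parseB", 0.4)))
  val rerankingCosts: Map[String, Double] = Map("parseA" -> 0.1, "parseB" -> 0.7)
  val best: Option[String] = rerank(nbest, p => rerankingCosts(p))
}
```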
A cost function for a pre-scored parse.
A ScoredWalk attaches a score to a Walk.
the unscored Walk
the floating-point score
A Sculpture is a structured output corresponding to a final state of a finite-state machine, whose goal is to transform an unstructured input (a MarbleBlock) into a structured output.
As an example, consider a transition-based parser. A MarbleBlock would be a sentence to be parsed, whereas a Sculpture would be a parse tree for that sentence.
A SculptureFeature computes a feature vector corresponding to a given sculpture.
A state of a finite-state machine.
A StateCost maps a state to a cost.
A StateCostFunction assigns a (real-valued) cost to the Transitions that can potentially be applied to a State. Generally speaking: the lower the cost, the better the transition.
Typically, instances of StateCostFunction will compute this cost using a feature representation of the State. This is not always the case, however: see the GuidedCostFunction in org.allenai.nlpstack.parse.poly.polyparser.ArcEagerGuidedCostFunction for a cost function that derives its costs from a gold parse tree.
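To make the "lower cost is better" convention concrete, here is a minimal sketch of a cost function and a greedy selection over it. The trait shape, the `Shift` and `Reduce` transitions, the integer `State`, and the `bestTransition` helper are all assumptions made for illustration, not the library's definitions.

```scala
object CostFunctionSketch {
  // Hypothetical stand-ins for the library's State and Transition types.
  type State = Int
  sealed trait Transition
  case object Shift extends Transition
  case object Reduce extends Transition

  // Assigns a real-valued cost to each transition applicable in a state.
  trait StateCostFunction {
    def apply(state: State): Map[Transition, Double]
  }

  // Greedy choice: the transition with the lowest cost, if any is available.
  def bestTransition(costFunction: StateCostFunction, state: State): Option[Transition] =
    costFunction(state).toSeq
      .reduceOption[(Transition, Double)]((a, b) => if (b._2 < a._2) b else a)
      .map(_._1)

  // Toy cost function: prefers Shift in even states and Reduce in odd states.
  val toyCostFunction: StateCostFunction = new StateCostFunction {
    def apply(state: State): Map[Transition, Double] =
      if (state % 2 == 0) Map(Shift -> 0.2, Reduce -> 1.5)
      else Map(Shift -> 2.0, Reduce -> 0.3)
  }
}
```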
A StateCostFunctionTrainer trains a StateCostFunction from data. Training is triggered during construction, after which the .costFunction field contains the trained TransitionCostFunctionAndClassifier.
A StateFeature computes a feature vector corresponding to a given parser state.
The TaskConjunction is a conjunction of ClassificationTasks.
the tasks we want to conjoin
The TaskConjunctionIdentifier allows you to create a TaskIdentifier by conjoining existing TaskIdentifiers. You will want to do this if you want to partition feature vectors according to multiple criteria at once.
the task identifiers you want to conjoin
A TaskIdentifier identifies the ClassificationTask required to determine the next transition from a given parser state.
A TaskTree can be viewed as a tree-structured TaskConjunctionIdentifier. Recall that a TaskConjunctionIdentifier associates every TransitionParserState with a TaskConjunction.
To do this, each TaskTree is associated with an optional TaskIdentifier ident. If this is None, then it will associate every state with the trivial TaskConjunction (i.e. TaskConjunction(List())). Otherwise, it will compute the ClassificationTask returned by applying the TaskIdentifier to the state. If this task is not contained in its children map, then it will associate the state with TaskConjunction(List(ident)). Otherwise it will recursively call the child TaskTree on the state, and accumulate a TaskConjunction.
TODO: remove this code once all dependent models are gone
the (optional) TaskIdentifier associated with this tree
a mapping from ClassificationTasks (in the range of baseIdentifier) to TaskTrees
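The recursive lookup that a TaskTree performs can be sketched roughly as follows. This is one reading of the description, written with simplified, hypothetical types: `State` is a sequence of words, a `TaskIdentifier` always returns a task, and the `identify` method name is invented for the example.

```scala
object TaskTreeSketch {
  // Hypothetical, simplified stand-ins for the library's types.
  type State = Seq[String]
  case class ClassificationTask(name: String)
  case class TaskConjunction(tasks: List[ClassificationTask])

  trait TaskIdentifier {
    def apply(state: State): ClassificationTask
  }

  case class TaskTree(
      ident: Option[TaskIdentifier],
      children: Map[ClassificationTask, TaskTree]
  ) {
    def identify(state: State): TaskConjunction = TaskConjunction(tasks(state))

    private def tasks(state: State): List[ClassificationTask] = ident match {
      // No identifier: the trivial (empty) conjunction.
      case None => Nil
      case Some(identifier) =>
        val task = identifier(state)
        children.get(task) match {
          // The task has no child tree: the conjunction ends with this task.
          case None => List(task)
          // Otherwise recurse into the child tree and accumulate its tasks.
          case Some(child) => task :: child.tasks(state)
        }
    }
  }

  // Toy identifiers: one buckets states by length, one looks at the first word.
  val byLength: TaskIdentifier = new TaskIdentifier {
    def apply(state: State): ClassificationTask =
      ClassificationTask(if (state.length > 2) "long" else "short")
  }
  val byFirstWord: TaskIdentifier = new TaskIdentifier {
    def apply(state: State): ClassificationTask =
      ClassificationTask("first=" + state.headOption.getOrElse("<none>"))
  }

  // Only "long" states are refined further by their first word.
  val tree: TaskTree = TaskTree(
    Some(byLength),
    Map(ClassificationTask("long") -> TaskTree(Some(byFirstWord), Map.empty))
  )
}
```

In this toy tree, a long state is associated with a two-task conjunction, while a short state stops at the root, which is the kind of selective partitioning a flat conjunction of identifiers cannot express.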
This is a wrapper for TaskTree that implements the TaskIdentifier interface.
It exists mainly for ease of serialization. See the TaskTree documentation for more details on its functionality.
the task tree that drives this TaskIdentifier
A TransitionClassifier maps Transitions to probabilities.
A TransitionConstraint returns true if a given transition is illegal to apply in a given state.
A Walk is a walk through a finite-state machine.
the state in which we begin
the sequence of steps we take from the initial state
A WalkStep is a single step in an FSM walk.
the current state
the transition to take
the costs of the possible transitions in the current state
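How Walk, WalkStep, and ScoredWalk fit together can be sketched as plain data structures. The field names, the integer `State`, the `Add` transition, and the `finalState` helper are assumptions for illustration only; the per-step transition costs are omitted to keep the sketch short.

```scala
object WalkSketch {
  // Hypothetical stand-ins: a state is an integer, a transition maps state to state.
  type State = Int
  trait Transition {
    def apply(state: State): State
  }
  case class Add(n: Int) extends Transition {
    def apply(state: State): State = state + n
  }

  // A single step: the state we are in and the transition we take from it.
  case class WalkStep(state: State, transition: Transition)

  // A walk: the initial state plus the sequence of steps taken from it.
  case class Walk(initialState: State, steps: Seq[WalkStep]) {
    // The state reached by the last step, or the initial state for an empty walk.
    def finalState: State =
      steps.lastOption.map(step => step.transition(step.state)).getOrElse(initialState)
  }

  // A walk paired with a floating-point score.
  case class ScoredWalk(walk: Walk, score: Double)

  val walk: Walk = Walk(0, Seq(WalkStep(0, Add(2)), WalkStep(2, Add(3))))
  val scored: ScoredWalk = ScoredWalk(walk, 1.25)
}
```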
Companion class for serializing TransitionClassifier instances.
A ClassificationTask specifies a particular classification task for which we want to collect feature vectors and train a classifier.
In practice, feature vectors will be tagged with their ClassificationTask, allowing us to easily sort a mixed set of feature vectors according to their relevant ClassificationTask, before training the respective classifiers.
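The sorting step described above amounts to grouping labeled vectors by their task tag. Here is a minimal sketch, assuming simplified, hypothetical shapes for `ClassificationTask`, `FeatureVector`, and `TrainingVector` (the transition is reduced to a string label) and an invented helper name `partitionByTask`.

```scala
object TrainingVectorSketch {
  // Hypothetical, simplified stand-ins for the library's types.
  case class ClassificationTask(name: String)
  type FeatureVector = Map[String, Double]
  case class TrainingVector(
      task: ClassificationTask,
      featureVector: FeatureVector,
      transition: String
  )

  // Splits a mixed set of training vectors into one bucket per task,
  // so that a separate classifier can be trained on each bucket.
  def partitionByTask(
      vectors: Seq[TrainingVector]
  ): Map[ClassificationTask, Seq[TrainingVector]] =
    vectors.groupBy(_.task)

  val taskA: ClassificationTask = ClassificationTask("stackEmpty")
  val taskB: ClassificationTask = ClassificationTask("stackNonEmpty")
  val mixed: Seq[TrainingVector] = Seq(
    TrainingVector(taskA, Map("bufferSize" -> 3.0), "Shift"),
    TrainingVector(taskB, Map("bufferSize" -> 2.0), "Reduce"),
    TrainingVector(taskA, Map("bufferSize" -> 1.0), "Shift")
  )
  val byTask: Map[ClassificationTask, Seq[TrainingVector]] = partitionByTask(mixed)
}
```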