package org.allenai.nlpstack.parse.poly.polyparser

Type Members

  1. case class ApplicabilitySignature(shift: Boolean, reduce: Boolean, left: Boolean, right: Boolean) extends ClassificationTask with Product with Serializable

    The ApplicabilitySignature is a ClassificationTask for which we are trying to predict the next transition, given that only a subset of possible transitions are applicable.

    If we choose this as our ClassificationTask, we will train separate classifiers for parser states that have different ApplicabilitySignatures.

    shift

    true iff Shift is applicable

    reduce

    true iff Reduce is applicable

    left

    true iff LeftArc and InvertedLeftArc are both applicable (for any labeling)

    right

    true iff RightArc and InvertedRightArc are both applicable (for any labeling)

  2. case class ArcEagerForbiddenArcInterpretation(forbiddenArc: ForbiddenEdge) extends ParsingConstraintInterpretation with Product with Serializable

    The ArcEagerForbiddenArcInterpretation handles ForbiddenArc constraints for the arc-eager system. In other words, it translates these constraints into a function that returns true for any (state, transition) pair that violates the constraint.

    forbiddenArc

    the forbidden arc constraint to consider

  3. class ArcEagerGuidedCostFunction extends StateCostFunction

    The ArcEagerGuidedCostFunction uses a gold parse tree to make deterministic decisions about which transition to apply in any given state. Since the decision is uniquely determined by the gold parse, the returned map will have only a single mapping that assigns zero cost to the correct transition (all other transitions therefore have an implicit cost of infinity).

  4. case class ArcEagerLeftArc(label: Symbol = 'NONE) extends TransitionParserStateTransition with Product with Serializable

    The ArcEagerLeftArc operator creates an arc from the next buffer item to the stack top and then performs a Reduce (see above).

    label

    the label to attach to the created arc

  5. case class ArcEagerRequestedArcInterpretation(requestedArc: RequestedArc) extends ParsingConstraintInterpretation with Product with Serializable

    The ArcEagerRequestedArcInterpretation handles RequestedArc constraints for the arc-eager system. In other words, it translates these constraints into a function that returns true for any (state, transition) pair that violates the constraint.

    requestedArc

    the requested arc constraint to consider

  6. case class ArcEagerRightArc(label: Symbol = 'NONE) extends TransitionParserStateTransition with Product with Serializable

    The ArcEagerRightArc operator creates an arc from the stack top to the next buffer item and then performs a Shift (see above).

    label

    the label to attach to the created arc

  7. case class ArcEagerTransitionSystem(brownClusters: Seq[BrownClusters] = Seq()) extends DependencyParsingTransitionSystem with Product with Serializable

    The ArcEagerTransitionSystem has four transition operators: Shift, Reduce, RightArc, and LeftArc.

    The Shift operator pops the next buffer item and pushes it onto the stack.

    The Reduce operator pops the next stack item.

    The RightArc operator creates an arc from the top element of the stack to the next buffer item, then performs a Shift operation.

    The LeftArc operator creates an arc from the next buffer item to the top element of the stack, then performs a Reduce operation.

    brownClusters

    an optional set of Brown clusters to use for creating features
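
    The four operators above can be sketched with a self-contained toy model (the ToyState and ArcEagerSketch names are hypothetical; this illustrates the transition scheme only, not the library's actual API):

```scala
// Toy model of an arc-eager configuration: a stack, a buffer of token
// indices, and the set of (head, dependent) arcs created so far.
case class ToyState(stack: List[Int], buffer: List[Int], arcs: Set[(Int, Int)])

object ArcEagerSketch {
  // Shift: pop the next buffer item and push it onto the stack.
  def shift(s: ToyState): ToyState =
    ToyState(s.buffer.head :: s.stack, s.buffer.tail, s.arcs)

  // Reduce: pop the next stack item.
  def reduce(s: ToyState): ToyState =
    ToyState(s.stack.tail, s.buffer, s.arcs)

  // RightArc: arc from the stack top to the next buffer item, then Shift.
  def rightArc(s: ToyState): ToyState =
    shift(s.copy(arcs = s.arcs + ((s.stack.head, s.buffer.head))))

  // LeftArc: arc from the next buffer item to the stack top, then Reduce.
  def leftArc(s: ToyState): ToyState =
    reduce(s.copy(arcs = s.arcs + ((s.buffer.head, s.stack.head))))
}
```

    For "the cat sleeps" (nexus = 0, tokens 1-3), starting from stack List(0) and buffer List(1, 2, 3), the sequence Shift, LeftArc, Shift, LeftArc, RightArc produces the arcs 2→1, 3→2, and 0→3.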

  8. case class ArcHybridForbiddenArcInterpretation(forbiddenArc: ForbiddenEdge) extends ParsingConstraintInterpretation with Product with Serializable

    The ArcHybridForbiddenArcInterpretation handles ForbiddenArc constraints for the arc hybrid system. In other words, it translates these constraints into a function that returns true for any (state, transition) pair that violates the constraint.

    forbiddenArc

    the forbidden arc constraint to consider

  9. case class ArcHybridForbiddenArcLabelInterpretation(forbiddenArcLabel: ForbiddenArcLabel) extends ParsingConstraintInterpretation with Product with Serializable

    The ArcHybridForbiddenArcLabelInterpretation handles ForbiddenArcLabel constraints for the arc hybrid system. In other words, it translates these constraints into a function that returns true for any (state, transition) pair that violates the constraint.

    forbiddenArcLabel

    the forbidden arc label request to consider

  10. class ArcHybridGuidedCostFunction extends StateCostFunction

    The ArcHybridGuidedCostFunction uses a gold parse tree to make deterministic decisions about which transition to apply in any given state. Since the decision is uniquely determined by the gold parse, the returned map will have only a single mapping that assigns zero cost to the correct transition (all other transitions therefore have an implicit cost of infinity).

  11. case class ArcHybridLeftArc(label: Symbol = 'NONE) extends TransitionParserStateTransition with Product with Serializable

    The ArcHybridLeftArc operator creates an arc from the next buffer item to the stack top and then performs a Reduce.

    label

    the label to attach to the created arc

  12. case class ArcHybridRequestedArcInterpretation(requestedArc: RequestedArc) extends ParsingConstraintInterpretation with Product with Serializable

    The ArcHybridRequestedArcInterpretation handles RequestedArc constraints for the arc hybrid system. In other words, it translates these constraints into a function that returns true for any (state, transition) pair that violates the constraint.

    requestedArc

    the requested arc constraint to consider

  13. case class ArcHybridRightArc(label: Symbol = 'NONE) extends TransitionParserStateTransition with Product with Serializable

    The ArcHybridRightArc operator creates an arc from the stack's second element to the stack top and then performs a Reduce.

    label

    the label to attach to the created arc

  14. case class ArcHybridTransitionSystem(brownClusters: Seq[BrownClusters] = Seq()) extends DependencyParsingTransitionSystem with Product with Serializable

    An ArcHybridTransitionSystem has three transition operators: Shift, RightArc, and LeftArc.

    The Shift operator behaves the same as in the ArcEagerTransitionSystem: it pops the next buffer item and pushes it onto the stack.

    The RightArc operator creates an arc from the second element of the stack to the top element of the stack, then pops the top of the stack.

    The LeftArc operator creates an arc from the next buffer item to the top element of the stack, then pops the top of the stack.

    An important property of the ArcHybridTransitionSystem is that the only element that can get a breadcrumb via an operator is the top of the stack. Thus the stack top is the "focal point" of this transition system.

    brownClusters

    an optional set of Brown clusters to use for creating features

  15. class ArcInverter extends (PolytreeParse) ⇒ PolytreeParse

    The ArcInverter takes a PolytreeParse and inverts arcs whose labels are in the argument set inverseArcLabels. Note that this operation should only affect the children field of a PolytreeParse, since the other fields only care about the underlying undirected tree.

    The purpose of this class is to convert standard dependency parses into polytree dependency parses. For instance, we may wish to invert all arcs x ---> y for which the arc label is 'det (effectively this would invert the relationship between a determiner and its noun to say that the determiner "requires" the noun, rather than vice-versa).
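
    The inversion described above can be sketched on a simplified arc representation (a hypothetical invertArcs helper over (head, child, label) triples, not the library's PolytreeParse machinery):

```scala
// Invert every arc whose label is in inverseArcLabels, swapping head and
// child; all other arcs pass through unchanged. Arcs are (head, child, label).
def invertArcs(
  arcs: Set[(Int, Int, Symbol)],
  inverseArcLabels: Set[Symbol]
): Set[(Int, Int, Symbol)] =
  arcs.map {
    case (head, child, label) if inverseArcLabels.contains(label) =>
      (child, head, label)
    case arc => arc
  }
```

    For example, inverting Symbol("det") in { 2→1 det, 3→2 nsubj } flips only the determiner arc, leaving the subject arc intact.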

  16. case class BreadcrumbRef(index: Int) extends StateRef with Product with Serializable

    A BreadcrumbRef is a StateRef (see above) whose apply operation returns the breadcrumb of the indexth element of the stack, if it exists.

    index

    the desired stack element, counting from 0 (i.e. 0 is the stack top)

  17. case class BrownTransform(clusters: BrownClusters, k: Int, name: String) extends NeighborhoodTransform with Product with Serializable

    Maps the tokens of a neighborhood to their respective Brown clusters.

    clusters

    the Brown clusters

    k

    the maximum granularity we want to consider for a Brown cluster (i.e. the depth in the Brown cluster tree)

    name

    a label for the transform

  18. case class BufferChildrenRef(index: Int) extends StateRef with Product with Serializable

  19. case class BufferGretelsRef(index: Int) extends StateRef with Product with Serializable

  20. case class BufferLeftGretelsRef(index: Int) extends StateRef with Product with Serializable

  21. case class BufferRef(index: Int) extends StateRef with Product with Serializable

    A BufferRef is a StateRef (see above) whose apply operation returns the indexth element of the buffer, if it exists.

    index

    the desired buffer element, counting from 0 (i.e. 0 is the front of the buffer)

  22. case class BufferRightGretelsRef(index: Int) extends StateRef with Product with Serializable

  23. case class BufferWindowRef(index: Int) extends StateRef with Product with Serializable

  24. case class ConllX(useGoldPOSTags: Boolean, makePoly: Boolean = false) extends PolytreeParseFileFormat with Product with Serializable

  25. abstract class DependencyParsingTransitionSystem extends TransitionSystem

  26. case class EventStatisticFeatures(neighborhoodCounts: Seq[(String, NeighborhoodExtractor, Seq[(Neighborhood, Int)])], transforms: Seq[(String, NeighborhoodTransform)]) extends PolytreeParseFeature with Product with Serializable

    Generates a feature for each neighborhood histogram and transform in the argument list.

    neighborhoodCounts

    the neighborhood histograms

    transforms

    the neighborhood transforms

  27. class ExtractorBasedNeighborhoodSource extends NeighborhoodSource

    Iterates through all neighborhoods from all parses in a PolytreeParseSource.

  28. case class FileBasedParsePoolSource(filename: String) extends ParsePoolSource with Product with Serializable

  29. case class FileBasedPolytreeParseSource(filename: String, format: PolytreeParseFileFormat) extends PolytreeParseSource with Product with Serializable

    Creates a data source from a file of parse trees.

    filename

    the file containing the parse trees

    format

    the file format

  30. case class ForbiddenArcLabel(token1: Int, token2: Int, arcLabel: Symbol) extends TransitionConstraint with Product with Serializable

    A ForbiddenArcLabel constraint designates a transition as illegal if it would directly create an arc (in either direction) with the specified label between the tokens at the given indices. It also implicitly creates a RequestedArc constraint for the specified arc (basically it says that we DO want an arc between the specified indices, just not with this label).

    Note that argument order (of the token indices) does not matter for the constructor.

    token1

    index of the first token

    token2

    index of the second token

    arcLabel

    label that is forbidden between the two tokens

  31. case class ForbiddenEdge(token1: Int, token2: Int) extends TransitionConstraint with Product with Serializable

    A ForbiddenEdge constraint designates a transition as illegal if it would directly create an arc (in either direction) between the tokens at the given indices.

    Note that argument order does not matter for the constructor.

    token1

    index of the first token

    token2

    index of the second token

  32. case class GoldParseSource(goldParses: PolytreeParseSource, transitionSystem: TransitionSystem) extends StateSource with Product with Serializable

    A GoldParseSource reduces parse trees to states of a finite-state machine.

    goldParses

    the source for the parse trees

    transitionSystem

    the transition system to use (for generating states)

  33. case class GoldParseTrainingVectorSource(goldParses: PolytreeParseSource, transitionSystem: TransitionSystem, baseCostFunction: Option[StateCostFunction] = None) extends FSMTrainingVectorSource with Product with Serializable

    A GoldParseTrainingVectorSource reduces a gold parse tree to a set of feature vectors for classifier training.

    Essentially, we derive the 2*n parser states that lead to the gold parse. Each of these states becomes a feature vector (using the apply method of the provided TransitionParserFeature), labeled with the transition executed from that state in the gold parse.

    One of the constructor arguments is a TaskIdentifier. This will dispatch the feature vectors to train different classifiers. For instance, if taskIdentifier(state) != taskIdentifier(state2), then their respective feature vectors (i.e. feature(state) and feature(state2)) will be used to train different classifiers.

    goldParses

    the data source for the parse trees

    transitionSystem

    the transition system to use (for generating states)

    baseCostFunction

    a trained cost function to adapt (optional)

  34. case class InMemoryPolytreeParseSource(parses: Iterable[PolytreeParse]) extends PolytreeParseSource with Product with Serializable

  35. case class KeywordFeature(keywords: Set[Symbol]) extends TokenFeature with Product with Serializable

    The KeywordFeature maps a token to its word representation, if its word appears in the argument set keywords. Otherwise its apply function will return an empty set.

    See the definition of TokenFeature (above) for more details about the interface.

  36. case class KeywordTransform(keywords: Set[Symbol]) extends TokenTransform with Product with Serializable

    The KeywordTransform maps a token to its word representation, if its word appears in the argument set keywords. Otherwise its apply function will return an empty set (if the StateRef points to a valid token) or TokenTransform.noTokenHere (if the StateRef points to an invalid token).

    See the definition of TokenTransform (above) for more details about the interface.

  37. case class LabelLeftArc(label: Symbol) extends TransitionParserStateTransition with Product with Serializable

    The LabelLeftArc operator labels the most recently created left-facing arc.

  38. case class LabelRightArc(label: Symbol) extends TransitionParserStateTransition with Product with Serializable

    The LabelRightArc operator labels the most recently created right-facing arc.

  39. case class LinearParseRerankingFunction(feature: PolytreeParseFeature, linearModel: Option[LinearModel]) extends RerankingFunction with Product with Serializable

    Scores parse trees based on a linear combination of features.

  40. case class MultiPolytreeParseSource(parseSources: Iterable[PolytreeParseSource]) extends PolytreeParseSource with Product with Serializable

  41. class NbestParser extends AnyRef

    Gets the n-best greedy parses for a given sentence.

  42. case class Neighborhood(tokens: Seq[Token]) extends Product with Serializable

    A Neighborhood is a sequence of tokens, generally taken from a parse tree.

    For instance, one might want to consider neighborhoods like:
      - a node and its children
      - a node and its parents
      - a node and its breadcrumb

    tokens

    a sequence of tokens, usually associated in some way (see NeighborhoodExtractors for examples of such associations)

  43. case class NeighborhoodEventStatistic(name: String, neighborhoodCounts: Seq[(Neighborhood, Int)], eventTransform: NeighborhoodTransform) extends Product with Serializable

    Collects statistics over "neighborhood events."

    An example might help. A neighborhood is a collection of tokens, e.g. a node and its children in a dependency parse. A neighborhood event is a mapping of these tokens to a sequence of strings, e.g. we might map each token to its part-of-speech tag.

    Given a corpus of dependency parses, we might want to collect a histogram that tells us how many times each neighborhood event like (VERB, NOUN, NOUN) occurs in the corpus. This is what the NeighborhoodEventStatistic does.

    name

    a label for this object

    neighborhoodCounts

    a histogram over observed neighborhoods

    eventTransform

    a transformation from neighborhoods to events
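
    The histogram idea above can be sketched with simplified stand-in types (a hypothetical ToyToken and eventHistogram; not the library's Neighborhood or NeighborhoodTransform):

```scala
// A token carries a word and a POS tag; an event transform maps a
// neighborhood (a token sequence) to a sequence of strings, e.g. its POS tags.
case class ToyToken(word: String, pos: String)

// Group neighborhoods by the event they map to, and count each event.
def eventHistogram(
  neighborhoods: Seq[Seq[ToyToken]],
  transform: Seq[ToyToken] => Seq[String]
): Map[Seq[String], Int] =
  neighborhoods.groupBy(transform).map { case (event, group) => event -> group.size }
```

    Two distinct verb-plus-children neighborhoods such as ("ate", "cat", "fish") and ("saw", "dog", "bone") both map to the event (VERB, NOUN, NOUN), so that event gets a count of 2.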

  44. trait NeighborhoodExtractor extends (PolytreeParse) ⇒ Iterator[Neighborhood]

    Maps a parse tree to an iterator over its neighborhoods.

    Different extractors will define "neighborhood" in different ways. For instance, one might want to consider neighborhoods like:
      - a node and its children
      - a node and its parents
      - a node and its breadcrumb

    TODO: create unit tests for all inheriting instances.

  45. trait NeighborhoodSource extends AnyRef

    A data source for neighborhoods.

  46. trait NeighborhoodTransform extends (Neighborhood) ⇒ Seq[String]

    A NeighborhoodTransform maps a Neighborhood into an "event" (a sequence of strings).

    An example might help. Suppose that we have a neighborhood consisting of (node, child1, child2), i.e. three nodes of a parse tree. A transform might map these to the sequence of their POS tags, e.g. ("VERB", "NOUN", "NOUN").

  47. case class NumChildrenToTheLeft(max: Int) extends TokenTransform with Product with Serializable

    The NumChildrenToTheLeft transform maps a token to how many of its children appear to its left in the state's tokens sequence.

    It takes an argument max which allows you to specify an upper bound. For instance, if max = 3 and a token has 5 children, then applying this transform to that token will return Set(Symbol("3")), not Set(Symbol("5")).

    See the definition of TokenTransform (above) for more details about the interface.

    max

    an upper bound on the number of children (anything higher will round down to max)
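
    The capping behavior can be sketched in one line (a hypothetical cappedCount helper, not the library's own method):

```scala
// Map a raw child count to the transform's output symbol, truncating at max.
def cappedCount(numChildren: Int, max: Int): Symbol =
  Symbol(math.min(numChildren, max).toString)
```

    With max = 3, a token with 5 children yields Symbol("3"), while a token with 2 children yields Symbol("2").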

  48. case class NumChildrenToTheRight(max: Int) extends TokenTransform with Product with Serializable

    The NumChildrenToTheRight transform maps a token to how many of its children appear to its right in the state's tokens sequence. This will only be relevant for nodes on the stack (it is impossible for a buffer node to be associated with nodes to its right).

    It takes an argument max which allows you to specify an upper bound. For instance, if max = 3 and a token has 5 children to its right, then applying this transform to that token will return Set(Symbol("3")), not Set(Symbol("5")).

    See the definition of TokenTransform (above) for more details about the interface.

    max

    an upper bound on the number of children (anything higher will round down to max)

  49. case class OfflineTokenFeature(stateRef: StateRef) extends StateFeature with Product with Serializable

  50. case class OracleRerankingFunction(goldParses: Iterator[PolytreeParse]) extends RerankingFunction with Product with Serializable

  51. case class ParseCache(cachedParses: Seq[(String, PolytreeParse)], fallbackParser: TransitionParser) extends TransitionParser with Product with Serializable

  52. case class ParsePool(parses: Iterable[(PolytreeParse, Double)]) extends Product with Serializable

    A ParsePool is a collection of parse candidates for the same input sentence.

    parses

    a sequence of parse trees

  53. trait ParsePoolSource extends AnyRef

    A data source for ParsePool objects.

  54. case class ParserConfiguration(parsingCostFunction: StateCostFunction, rerankingFunction: RerankingFunction, parsingNbestSize: Int) extends Product with Serializable

    Contains the key components of a parser (for serialization purposes).

    parsingCostFunction

    the cost function for the transition parser

    rerankingFunction

    the cost function for parse reranking

    parsingNbestSize

    the nbest size to generate for reranking

  55. abstract class ParsingConstraintInterpretation extends ConstraintInterpretation

    A ParsingConstraintInterpretation is a ConstraintInterpretation that fires only on TransitionParserState objects.

  56. case class PolytreeParse(sentence: Sentence, breadcrumb: Vector[Int], children: Vector[Set[Int]], arclabels: Vector[Set[(Int, Symbol)]]) extends MarbleBlock with Sculpture with Product with Serializable

    A PolytreeParse is a polytree-structured dependency parse. A polytree is a directed graph whose undirected structure is a tree. The nodes of this graph will correspond to an indexed sequence of tokens (think the words from a sentence), whose zeroth element is a reserved 'nexus' token which does not correspond to a word in the original sentence. The nexus must be one of the roots of the directed graph (i.e. it cannot be the child of any node).

    Since the undirected structure is a tree, every node (other than the nexus) has a unique neighbor which is one step closer to the nexus than itself (this may be the nexus itself). This neighbor is referred to as the node's 'breadcrumb'.

    It has four major fields:
      - tokens is a vector of Token objects (in the order that they appear in the associated sentence). The zeroth element is assumed to be the nexus.
      - breadcrumb tells you the unique neighbor that is closer to the nexus in the undirected tree (this can be the nexus itself); for instance, if breadcrumb(5) = 3, then token 3 is one step closer to the nexus from token 5. The breadcrumb of the nexus should be -1.
      - children tells you the set of children of a node in the polytree; for instance, if children(5) = Set(3,6,7), then token 5 has three children: tokens 3, 6, and 7.
      - arclabels tells you the labeled neighbors of a node in the undirected tree; for instance, if arclabels(5) = Set((4, 'det), (7, 'amod)), then token 5 has two neighbors, reached with arcs labeled 'det and 'amod (the labels are scala Symbol objects).

    sentence

    the parsed sentence (the zeroth token of which should be the nexus)

    breadcrumb

    the breadcrumb of each token (see above definition)

    children

    the set of children of each token in the polytree

    arclabels

    the set of labeled neighbors of each token in the undirected tree
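
    The breadcrumb field can be derived from the undirected tree by breadth-first search from the nexus, sketched here with a hypothetical breadcrumbs helper (not part of the library):

```scala
import scala.collection.mutable

// Compute each node's breadcrumb (its unique neighbor one step closer to the
// nexus, node 0) from the undirected edges of the tree. breadcrumb(0) = -1.
def breadcrumbs(numNodes: Int, edges: Set[(Int, Int)]): Vector[Int] = {
  val neighbors: Int => Set[Int] = i =>
    edges.collect { case (a, b) if a == i => b } ++
      edges.collect { case (a, b) if b == i => a }
  val crumb = Array.fill(numNodes)(-1)
  val visited = mutable.Set(0)
  var frontier = List(0)
  while (frontier.nonEmpty) {
    frontier = frontier.flatMap { n =>
      (neighbors(n) -- visited).toList.map { m =>
        crumb(m) = n // n is one step closer to the nexus than m
        visited += m
        m
      }
    }
  }
  crumb.toVector
}
```

    For "the cat sleeps" (tokens 1-3) with undirected edges {0-3, 3-2, 2-1}, this yields Vector(-1, 2, 3, 0): the breadcrumb of "the" is "cat", of "cat" is "sleeps", and of "sleeps" is the nexus.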

  57. abstract class PolytreeParseFeature extends (PolytreeParse, Double) ⇒ FeatureVector

    Maps a scored parse into a feature vector.

  58. case class PolytreeParseFeatureUnion(features: Seq[PolytreeParseFeature]) extends PolytreeParseFeature with Product with Serializable

    A PolytreeParseFeatureUnion merges the output of a list of features.

    features

    a list of the features we want to merge into a single feature

  59. sealed abstract class PolytreeParseFileFormat extends AnyRef

  60. trait PolytreeParseSource extends SentenceSource

    A data source for PolytreeParse objects.

  61. case class PrefixFeature(keyprefixes: Seq[Symbol]) extends TokenFeature with Product with Serializable

    The PrefixFeature maps a token to the set of its prefixes that are contained in a set of "key" prefixes.

    See the definition of TokenFeature (above) for more details about the interface.

    keyprefixes

    the set of prefixes to treat as "key" prefixes

  62. case class PrefixTransform(keyprefixes: Set[Symbol]) extends TokenTransform with Product with Serializable

    The PrefixTransform maps a token to the set of its prefixes that are contained in a set of "key" prefixes.

    See the definition of TokenTransform (above) for more details about the interface.

    keyprefixes

    the set of prefixes to treat as "key" prefixes

  63. case class RequestedArc(token1: Int, token2: Int, arcLabel: Option[Symbol] = None) extends TransitionConstraint with Product with Serializable

    A RequestedArc constraint requests that the output parse MUST contain the requested arc.

    The arc is specified using the index of the token at the arc's head followed by the index of the token at the arc's tail.

    Note: currently this constraint does not take into account the arc direction or the arc label. It only enforces that there is some edge between the two specified tokens.

    token1

    index of the first token

    token2

    index of the second token

    arcLabel

    desired label for the arc

  64. case class RequestedCpos(tokenIndex: Int, cpos: Symbol) extends TransitionConstraint with Product with Serializable

    A RequestedCpos constraint specifies the coarse part-of-speech tag of a particular token. This means that in the returned parse, the 'cpos property for that token will correspond to the requested coarse tag.

    tokenIndex

    index of the desired token

    cpos

    desired coarse tag for the token

  65. case class RerankingTransitionParser(config: ParserConfiguration) extends TransitionParser with Product with Serializable

    Uses the parser model to create an n-best list, then chooses the best parse from this n-best list (according to the reranking function).

    config

    configuration object for the parser

  66. case class RootPathExtractor(maxPathLength: Int) extends NeighborhoodExtractor with Product with Serializable

    Extracts neighborhoods of the form (node, breadcrumb, grandcrumb, ..., root) from a parse tree.

  67. case class StackChildrenRef(index: Int) extends StateRef with Product with Serializable

  68. case class StackGretelsRef(index: Int) extends StateRef with Product with Serializable

  69. case class StackLeftGretelsRef(index: Int) extends StateRef with Product with Serializable

  70. case class StackRef(index: Int) extends StateRef with Product with Serializable

    A StackRef is a StateRef (see above) whose apply operation returns the indexth element of the stack, if it exists.

    index

    the desired stack element, counting from 0 (i.e. 0 is the stack top)

  71. case class StackRightGretelsRef(index: Int) extends StateRef with Product with Serializable

  72. case class StackWindowRef(index: Int) extends StateRef with Product with Serializable

  73. sealed abstract class StateRef extends (TransitionParserState) ⇒ Seq[Int]

    A StateRef allows you to figure out the token that corresponds to a particular aspect of a TransitionParserState.

    For instance, we may want to know what token is at the top of the stack for a given state. Applying StackRef(0) to the state will return the index of the token. More accurately, a set is returned, which will be empty if the StateRef refers to a non-existent element of the state. For instance, applying StackRef(3) to a state whose stack has 3 or fewer elements will return the empty set.

    This set of classes is used primarily to facilitate feature creation (e.g. see StateRefFeature).
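
    The empty-versus-singleton behavior described above can be sketched with a simplified stackRef function (hypothetical; not the library's StackRef class):

```scala
// Return the token index at stack position `index` as a one-element sequence,
// or an empty sequence when the stack has no such element.
def stackRef(index: Int)(stack: List[Int]): Seq[Int] =
  if (index >= 0 && index < stack.size) Seq(stack(index)) else Seq.empty
```

    For a stack List(7, 2, 0) with token 7 on top, stackRef(0) returns Seq(7), while stackRef(3) returns the empty sequence because the stack has only three elements.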

  74. case class StateRefProperty(stateRef: StateRef, property: Symbol, propertyValue: String) extends ClassificationTask with Product with Serializable

    The StateRefProperty is a ClassificationTask for which we are trying to predict the next transition, given that we know some property of a particular token of the parser state.

  75. case class StateRefPropertyIdentifier(stateRef: StateRef, property: Symbol) extends TaskIdentifier with Product with Serializable

    The StateRefPropertyIdentifier identifies the ClassificationTask of a parser state according to the coarse part-of-speech tag of a particular word of the state (as identified by a StateRef).

  76. case class SuffixFeature(keysuffixes: Seq[Symbol]) extends TokenFeature with Product with Serializable

    The SuffixFeature maps a token to the set of its suffixes that are contained in a set of "key" suffixes.

    See the definition of TokenFeature (above) for more details about the interface.

    keysuffixes

    the set of suffixes to treat as "key" suffixes

  77. case class SuffixTransform(keysuffixes: Set[Symbol]) extends TokenTransform with Product with Serializable

    The SuffixTransform maps a token to the set of its suffixes that are contained in a set of "key" suffixes.

    See the definition of TokenTransform (above) for more details about the interface.

    keysuffixes

    the set of suffixes to treat as "key" suffixes

  78. case class TokenCardinalityFeature(stateRefs: Seq[StateRef]) extends StateFeature with Product with Serializable

  79. sealed abstract class TokenFeature extends (Sentence, Int) ⇒ Seq[(FeatureName, Double)]

  80. class TokenFeatureTagger extends AnyRef

  81. case class TokenPropTransform(label: Symbol) extends NeighborhoodTransform with Product with Serializable

    Maps each token of a neighborhood to a particular property in its property map.

  82. case class TokenPropertyFeature(property: Symbol) extends TokenFeature with Product with Serializable

    The TokenPropertyFeature maps a token to one of its properties.

    See the definition of TokenFeature (above) for more details about the interface.

  83. case class TokenPropertyTransform(property: Symbol) extends TokenTransform with Product with Serializable

    The TokenPropertyTransform maps a token to one of its properties.

    See the definition of TokenTransform (above) for more details about the interface.

  84. sealed abstract class TokenTransform extends (TransitionParserState, Int) ⇒ Set[Symbol]

    A TokenTransform is a function that maps a token to a set of symbols.

    The token is described using a TransitionParserState and a StateRef (see the definition of StateRef for details). For instance, using StackRef(0) will cause the TokenTransform to operate on the token at the top of the stack in the current parser state.

    The purpose of a TokenTransform is primarily to facilitate feature creation (e.g. see StackRefFeature) by allowing us, for instance, to map the token at the top of the state's stack to its word representation. This would be achieved with:

    WordTransform(state, StackRef(0))

  85. case class TokenTransformFeature(stateRef: StateRef, tokenTransforms: Set[TokenTransform]) extends StateFeature with Product with Serializable

    A TokenTransformFeature creates a TransitionParserFeature from a TokenTransform and a StateRef.

    Essentially it simply applies the TokenTransform to the token referenced by the StateRef (see definitions of TokenTransform and StateRef for details).

    For instance, suppose we want a binary feature that gives us the word at the top of the stack. We can achieve this with TokenTransformFeature(StackRef(0), WordTransform).

    stateRef

    the StateRef that refers to our desired token

    tokenTransforms

    the transformations we want to perform on our desired token
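    The mechanics can be sketched with simplified stand-ins (MiniState, MiniTransform, and transformFeature are hypothetical names; the real classes carry more structure, such as feature names and labels): resolve the stack reference, then apply every transform to the referenced token.

```scala
// Hedged sketch of TokenTransformFeature-style composition, with toy types.
case class MiniState(stack: Vector[Int], words: Vector[String])

// A transform maps (state, token index) to a set of symbols.
type MiniTransform = (MiniState, Int) => Set[Symbol]

// Analogue of WordTransform: a token's word representation.
val wordTransform: MiniTransform = (state, i) => Set(Symbol(state.words(i)))

// Analogue of TokenTransformFeature(StackRef(n), transforms): resolve the
// reference, then apply each transform to the referenced token.
def transformFeature(stackIndex: Int, transforms: Set[MiniTransform])(
  state: MiniState
): Set[Symbol] =
  state.stack.lift(stackIndex).toSet.flatMap { (tok: Int) =>
    transforms.flatMap(t => t(state, tok))
  }

val s = MiniState(stack = Vector(1, 0), words = Vector("the", "cat"))
println(transformFeature(0, Set(wordTransform))(s)) // word at the stack top
```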

  86. abstract class TransitionParser extends AnyRef

    A TransitionParser implements a parsing algorithm for a transition-based parser.

  87. case class TransitionParserState(stack: Vector[Int], bufferPosition: Int, breadcrumb: Map[Int, Int], children: Map[Int, Set[Int]], arcLabels: Map[Set[Int], Symbol], annotatedSentence: AnnotatedSentence, previousLink: Option[(Int, Int)] = None, parserMode: Int = 0) extends State with Product with Serializable

    A TransitionParserState captures the current state of a transition-based parser (i.e. it corresponds to a partially constructed PolytreeParse). It includes the following fields:

    - the stack holds the indices of the tokens on the stack (note: the index of a token is its index in the tokens vector). It is a vector of integers; the head of the vector represents the top of the stack.
    - bufferPosition is an integer representing the index of the token currently at the front of the buffer.
    - breadcrumb maps the index of a token to its breadcrumb (see org.allenai.nlpstack.parse.poly.polyparser.PolytreeParse for the definition of breadcrumb). If a token index does not appear as a key in breadcrumb, then its breadcrumb has not yet been determined.
    - children maps the index of a token to the indices of its children (in the partially constructed polytree).
    - arcLabels maps a pair of token indices to the label of the arc between them. This presupposes that the two tokens are neighbors in the partially constructed polytree. Note that the pair of token indices is represented as a Set, so order is irrelevant.
    - tokens is the sequence of tokens in the sentence we are trying to parse. This is invariant across all states of a given parsing process.

    stack

    the indices of the tokens on the 'stack' (stack.head is the stack top)

    bufferPosition

    the index of the token at the front of the 'buffer'

    breadcrumb

    the breadcrumbs of the partially constructed PolytreeParse

    children

    the children of the partially constructed PolytreeParse

    arcLabels

    the arc labels of the partially constructed PolytreeParse

    annotatedSentence

    the sentence we want to parse
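    A small illustration of why arcLabels uses Set[Int] keys (the token indices and arc label below are made up): because Set(2, 5) == Set(5, 2), the arc between two tokens can be looked up with its endpoints in either order.

```scala
// arcLabels keys are unordered pairs of token indices (hypothetical data).
val arcLabels: Map[Set[Int], Symbol] = Map(Set(2, 5) -> Symbol("nsubj"))

println(arcLabels(Set(5, 2)))          // same entry, endpoints reversed
println(arcLabels.contains(Set(2, 3))) // false: no such arc
```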

  88. abstract class TransitionParserStateTransition extends StateTransition

Value Members

  1. object ArcEagerReduce extends TransitionParserStateTransition with Product with Serializable

    The ArcEagerReduce operator pops the top stack item.

  2. object ArcEagerShift extends TransitionParserStateTransition with Product with Serializable

    The ArcEagerShift operator pops the next buffer item and pushes it onto the stack.

  3. object ArcEagerTaskIdentifier extends TaskIdentifier

    The ArcEagerTaskIdentifier identifies the ClassificationTask associated with a particular state of the arc-eager transition system.

  4. object ArcEagerTransitionSystem extends Product with Serializable

  5. object ArcHybridLeftArc extends Serializable

  6. object ArcHybridRightArc extends Serializable

  7. object ArcHybridShift extends TransitionParserStateTransition with Product with Serializable

    The ArcHybridShift operator pops the next buffer item and pushes it onto the stack.

  8. object ArcHybridTaskIdentifier extends TaskIdentifier

    The ArcHybridTaskIdentifier identifies the ClassificationTask associated with a particular state of the arc-hybrid transition system.

  9. object ArcHybridTransitionSystem extends Product with Serializable

  10. object BaseParserScoreFeature extends PolytreeParseFeature with Product with Serializable

    Simply passes along the original score of the parse as a feature.

  11. object BreadcrumbArc extends TokenTransform with Product with Serializable

    The BreadcrumbArc transform maps a token to the label of the arc from its breadcrumb to itself.

    See the definition of TokenTransform (above) for more details about the interface.

  12. object BreadcrumbAssigned extends TokenTransform with Product with Serializable

    The BreadcrumbAssigned transform maps a token to whether its breadcrumb has been assigned.

    See the definition of TokenTransform (above) for more details about the interface.

  13. object BreadcrumbExtractor extends NeighborhoodExtractor with Product with Serializable

    Extracts neighborhoods of the form (node, breadcrumb) from a parse tree.

  14. object ChildrenExtractor extends NeighborhoodExtractor with Product with Serializable

    Extracts neighborhoods of the form (node, child1, ..., childN) from a parse tree.

  15. object DependencyParserModes

    A struct containing the "modes" of a typical transition parser.

    Specifically, the mode tells you what the next expected action is.

  16. object DependencyParsingTransitionSystem

  17. object FileBasedParsePoolSource extends Serializable

  18. object FileBasedPolytreeParseSource extends Serializable

  19. object FirstRef extends StateRef with Product with Serializable

    A FirstRef is a StateRef (see above) whose apply operation returns the first element of the sentence.

  20. object InMemoryPolytreeParseSource extends Serializable

  21. object IsBracketedTransform extends TokenTransform with Product with Serializable

    The IsBracketedTransform maps a token to a symbol which is 'yes if its word appears between a pair of parentheses, 'no if it is outside of all parentheses pairs, '( if it is a left paren and ') if it is a right paren. It will return a TokenTransform.noTokenHere if the StateRef points to an invalid token.

    See the definition of TokenTransform (above) for more details about the interface.
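    The bracketing logic can be sketched as follows. This is a toy re-implementation of the behavior described above (an assumption, not the library's code), classifying each word of a sentence by whether it lies inside a pair of parentheses.

```scala
// Toy sketch of the IsBracketedTransform classification (assumed behavior).
def bracketStatus(words: Seq[String]): Seq[Symbol] = {
  var depth = 0 // current parenthesis nesting depth
  words.map {
    case "(" => depth += 1; Symbol("(")
    case ")" => depth -= 1; Symbol(")")
    case _   => if (depth > 0) Symbol("yes") else Symbol("no")
  }
}

println(bracketStatus(Seq("a", "(", "b", ")", "c")))
```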

  22. object LastRef extends StateRef with Product with Serializable

    A LastRef is a StateRef (see above) whose apply operation returns the final element of the sentence.

  23. object LeftChildrenExtractor extends NeighborhoodExtractor with Product with Serializable

    Extracts neighborhoods of the form (node, leftChild1, ..., leftChildN) from a parse tree.

  24. object MultiWordTagger extends Product with Serializable

    A function that adds new token properties to a sentence if that token appears within a multi-word expression in the dictionary. The new properties are

    MultiWordTagger.mweSymbol -> MultiWordTagger.mweValue

    and

    MultiWordTagger.symbolFor(mwe) -> MultiWordTagger.mweValue

    The first property encodes the fact that the token appears within any MWE. The second property encodes the fact that the token appears within a particular MWE. Tokens that do not occur within a particular MWE will not be given any additional properties.
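    The two properties can be illustrated with a toy stand-in. The member names mirror the doc (mweSymbol, mweValue, symbolFor), but their concrete values here are assumptions, as is the example MWE "in spite of".

```scala
// Hypothetical stand-in for MultiWordTagger's property scheme.
object ToyMultiWordTagger {
  val mweSymbol: Symbol = Symbol("mwe")   // assumed value
  val mweValue: Symbol = Symbol("yes")    // assumed value
  def symbolFor(mwe: Seq[String]): Symbol = Symbol(mwe.mkString("_"))
}

import ToyMultiWordTagger._

// Properties attached to a token found inside the MWE "in spite of":
val tokenProps: Map[Symbol, Symbol] = Map(
  mweSymbol -> mweValue,                          // the token is in *some* MWE
  symbolFor(Seq("in", "spite", "of")) -> mweValue // ...and in this particular MWE
)

println(tokenProps(Symbol("in_spite_of")))
```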

  25. object MultiWordTransform extends TokenPropertyTransform

  26. object NbestParser

  27. object Neighborhood extends Serializable

  28. object NeighborhoodEventStatistic extends Serializable

  29. object NeighborhoodExtractor

  30. object NeighborhoodTransform

  31. object OracleReranker

  32. object ParentExtractor extends NeighborhoodExtractor with Product with Serializable

    Extracts neighborhoods of the form (node, parent1, ..., parentN) from a parse tree.

  33. object ParseCache extends Serializable

  34. object ParseFile

  35. object ParsePool extends Serializable

  36. object ParseRerankerTrainingPhaseOne

  37. object ParseRerankerTrainingPhaseTwo

  38. object Parser

  39. object ParserConfiguration extends Serializable

  40. object PolytreeParse extends Serializable

  41. object PolytreeParseFeature

  42. object PreviousLinkCrumbGretelRef extends StateRef with Product with Serializable

  43. object PreviousLinkCrumbRef extends StateRef with Product with Serializable

  44. object PreviousLinkDirection extends StateFeature with Product with Serializable

  45. object PreviousLinkGrandgretelRef extends StateRef with Product with Serializable

  46. object PreviousLinkGretelRef extends StateRef with Product with Serializable

  47. object RightChildrenExtractor extends NeighborhoodExtractor with Product with Serializable

    Extracts neighborhoods of the form (node, rightChild1, ..., rightChildN) from a parse tree.

  48. object SentenceLengthFeature extends PolytreeParseFeature with Product with Serializable

    Simply passes along the length of the sentence as a feature.

  49. object StateRef

  50. object TokenFeature

  51. object TokenPositionFeature extends TokenFeature with Product with Serializable

  52. object TokenTransform

  53. object Training

  54. object TransitionParser

  55. object WordFeature extends TokenFeature with Product with Serializable

    The WordFeature maps a token to its word representation.

    See the definition of TokenFeature (above) for more details about the interface.

  56. object WordTransform extends TokenTransform with Product with Serializable

    The WordTransform maps a token to its word representation.

    See the definition of TokenTransform (above) for more details about the interface.

  57. package labeler

Ungrouped