Package

com.johnsnowlabs.nlp.annotators.sbd

pragmatic

package pragmatic

Type Members

  1. class PragmaticContentFormatter extends AnyRef

    Rule-based formatter that adds regex rules to the different marking steps. Symbols protect ambiguous bounds from being considered splitters.

  2. class PragmaticMethod extends Serializable

    Inspired by Kevin Dias's Ruby implementation (https://github.com/diasks2/pragmatic_segmenter). This approach extracts sentence bounds by first formatting the data with RuleSymbols and then extracting bounds with strong regex-based rule application.

  3. class PragmaticSentenceExtractor extends AnyRef

    Reads through symbolized data and computes the bounds based on regex rules, following each symbol's meaning

  4. trait RuleSymbols extends AnyRef

    Base Symbols that may be extended later on. For now kept in the pragmatic scope.

  5. class SentenceDetectorModel extends AnnotatorModel[SentenceDetectorModel]

    Annotator that detects sentence boundaries using any provided approach
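
The two-step idea described for PragmaticMethod (format the data with symbols, then extract bounds with regex rules) can be illustrated with a minimal sketch. This is not the library's implementation: the object name PragmaticSketch, the placeholder symbol <DOT>, the abbreviation list, and the split regex are all assumptions made for illustration only.

```scala
object PragmaticSketch {
  // Hypothetical placeholder symbol; the real RuleSymbols values are not shown here.
  val Protected = "<DOT>"

  // Hypothetical, tiny abbreviation list standing in for a real dictionary.
  val Abbreviations = Seq("Dr", "Mr", "Mrs", "vs")

  // Step 1 (formatting): replace the period after a known abbreviation
  // with the placeholder so it is not treated as a splitter.
  def format(text: String): String =
    Abbreviations.foldLeft(text) { (acc, abbr) =>
      acc.replaceAll("\\b" + abbr + "\\.", abbr + Protected)
    }

  // Step 2 (extraction): split after ., ! or ? followed by whitespace,
  // then restore the protected periods in each sentence.
  def extract(formatted: String): List[String] =
    formatted
      .split("(?<=[.!?])\\s+")
      .map(_.replace(Protected, ".").trim)
      .filter(_.nonEmpty)
      .toList

  def sentences(text: String): List[String] = extract(format(text))
}
```

Under these assumptions, PragmaticSketch.sentences("Dr. Smith arrived. He left.") yields the two sentences "Dr. Smith arrived." and "He left.", because the period after "Dr" is protected before the split is applied.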

Value Members

  1. object PragmaticDictionaries

    A dictionary of common English abbreviations that should be considered sentence bounds

  2. object PragmaticSymbols extends RuleSymbols

    Extends RuleSymbols with specific symbols used for the pragmatic approach. Right now, the only one.

  3. object SentenceDetectorModel extends DefaultParamsReadable[SentenceDetectorModel] with Serializable
