org.clulab.processors.clu

CluProcessor

class CluProcessor extends Processor with Configured

Processor that uses only tools that are under the Apache License. Currently supports: tokenization (in-house), lemmatization (Morpha, copied in our repo to minimize dependencies), POS tagging, NER, chunking, and dependency parsing using our MTL architecture (dep parsing coming soon).

Linear Supertypes
Configured, Processor, AnyRef, Any

Instance Constructors

  1. new CluProcessor(config: Config = ConfigFactory.load("cluprocessor"), optionalNER: Option[LexiconNER] = None, seasonPathOpt: Option[String] = None)

  2. new CluProcessor(config: Config, optionalNER: Option[LexiconNER], numericEntityRecognizerOpt: Option[NumericEntityRecognizer], internStringsOpt: Option[Boolean], localTokenizerOpt: Option[Tokenizer], lemmatizerOpt: Option[Lemmatizer], mtlPosChunkSrlpOpt: Option[Metal], mtlNerOpt: Option[Metal], mtlSrlaOpt: Option[Metal], mtlDepsOpt: Option[Metal])

    Attributes
    protected

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. def annotate(doc: Document): Document

    Annotate the given document, returning an annotated document. The default implementation is an NLP pipeline of side-effecting calls.

    Definition Classes
    CluProcessor → Processor
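
    The phrase "pipeline of side-effecting calls" can be illustrated with a minimal sketch. Everything in it (DocSketch, PipelineSketch, the two steps and their order) is a hypothetical stand-in rather than the library's code. It only shows steps that mutate one document in place, after which the same instance is returned.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a document: each step fills in one annotation layer.
final class DocSketch(val words: Vector[String]) {
  val layers = mutable.LinkedHashMap.empty[String, Vector[String]]
}

object PipelineSketch {
  type Step = DocSketch => Unit

  // Illustrative steps only; the real pipeline's steps and their order may differ.
  val lemmatize: Step = doc => { doc.layers.update("lemmas", doc.words.map(_.toLowerCase)) }
  val tag: Step = doc => { doc.layers.update("tags", doc.words.map(_ => "X")) } // placeholder tags

  // Each step mutates the document in place; the same instance is returned.
  def annotate(doc: DocSketch, steps: Seq[Step]): DocSketch = {
    steps.foreach(step => step(doc))
    doc
  }
}
```

    Because the steps write into the document they receive, a caller that keeps a reference to the input sees the new layers without using the return value.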
  5. def annotate(text: String, keepText: Boolean = false): Document

    Annotate the given text string, specifying whether to retain the text in the resultant Document.

    Definition Classes
    Processor
  6. def annotateFromSentences(sentences: Iterable[String], keepText: Boolean = false): Document

    Annotate the given sentences, specifying whether to retain the text in the resultant Document.

    Definition Classes
    Processor
  7. def annotateFromTokens(sentences: Iterable[Iterable[String]], keepText: Boolean = false): Document

    Annotate the given tokens, specifying whether to retain the text in the resultant Document.

    Definition Classes
    Processor
  8. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  9. def basicSanityCheck(doc: Document): Unit

  10. def cheapLemmatize(doc: Document): Unit

    Generates cheap lemmas with the word in lower case, for languages where a lemmatizer is not available
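
    As a rough illustration of the idea, a "cheap" lemma is just the lower-cased word. The helper below is a hypothetical stand-in that works on plain strings; it is not the library's implementation, which operates on a Document in place.

```scala
// Hypothetical stand-in, not the library's implementation.
object CheapLemmaSketch {
  /** Lower-cases each word and uses the result as its lemma. */
  def cheapLemmas(words: Seq[String]): Seq[String] =
    words.map(_.toLowerCase)
}
```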

  11. def chunking(doc: Document): Unit

    Shallow parsing; modifies the document in place

    Definition Classes
    CluProcessor → Processor
  12. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  13. val config: Config

  14. def contains(argPath: String): Boolean

    Definition Classes
    Configured
  15. def copy(configOpt: Option[Config] = None, optionalNEROpt: Option[Option[LexiconNER]] = None, numericEntityRecognizerOptOpt: Option[Option[NumericEntityRecognizer]] = None, internStringsOptOpt: Option[Option[Boolean]] = None, localTokenizerOptOpt: Option[Option[Tokenizer]] = None, lemmatizerOptOpt: Option[Option[Lemmatizer]] = None, mtlPosChunkSrlpOptOpt: Option[Option[Metal]] = None, mtlNerOptOpt: Option[Option[Metal]] = None, mtlSrlaOptOpt: Option[Option[Metal]] = None, mtlDepsOptOpt: Option[Option[Metal]] = None): CluProcessor
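
    The doubled Option on every parameter suggests a "keep or replace" convention: an outer None keeps the current value, while an outer Some(v) replaces it with v, and v may itself be None. That reading is an assumption drawn from the signature alone. The sketch below shows the pattern on a hypothetical Settings class whose fields are illustrative only.

```scala
// Hypothetical stand-in class; field names are illustrative only.
final class Settings(val nerOpt: Option[String], val internOpt: Option[Boolean]) {
  // Outer None: keep the current value. Outer Some(v): replace it with v,
  // where v may itself be None to clear the field.
  def copy(
    nerOptOpt: Option[Option[String]] = None,
    internOptOpt: Option[Option[Boolean]] = None
  ): Settings = new Settings(
    nerOptOpt.getOrElse(this.nerOpt),
    internOptOpt.getOrElse(this.internOpt)
  )
}

object CopySketch {
  val base = new Settings(Some("lexicon"), Some(true))
  val unchanged = base.copy()                                // keeps both fields
  val cleared = base.copy(nerOptOpt = Some(None))            // clears only nerOpt
  val replaced = base.copy(internOptOpt = Some(Some(false))) // replaces only internOpt
}
```

    A single Option could not distinguish "leave this field alone" from "set this field to None", which is presumably why the parameters are nested.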

  16. def discourse(doc: Document): Unit

    Discourse parsing; modifies the document in place

    Definition Classes
    CluProcessor → Processor
  17. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  18. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  19. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  20. def getArgBoolean(argPath: String, defaultValue: Option[Boolean]): Boolean

    Definition Classes
    Configured
  21. def getArgFloat(argPath: String, defaultValue: Option[Float]): Float

    Definition Classes
    Configured
  22. def getArgInt(argPath: String, defaultValue: Option[Int]): Int

    Definition Classes
    Configured
  23. def getArgString(argPath: String, defaultValue: Option[String]): String

    Definition Classes
    Configured
  24. def getArgStrings(argPath: String, defaultValue: Option[Seq[String]]): Seq[String]

    Definition Classes
    Configured
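
    The getArg* signatures above suggest a lookup with a fallback: read argPath from the configuration and use defaultValue when the path is absent. What happens when neither is available is not stated on this page, so the sketch below assumes an exception. ArgsSketch is a hypothetical stand-in in which a plain Map plays the role of the Config object.

```scala
// Hypothetical stand-in: a Map plays the role of the Config object.
final class ArgsSketch(conf: Map[String, String]) {
  def contains(argPath: String): Boolean = conf.contains(argPath)

  // Configured value wins; otherwise the default; otherwise fail loudly
  // (the failure behavior is an assumption of this sketch).
  def getArgInt(argPath: String, defaultValue: Option[Int]): Int =
    conf.get(argPath).map(_.toInt)
      .orElse(defaultValue)
      .getOrElse(throw new RuntimeException(s"Parameter $argPath must be defined!"))
}
```

    Passing None as the default makes a parameter mandatory in this sketch, while Some(value) makes it optional.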
  25. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  26. def getConf: Config

    Definition Classes
    CluProcessor → Configured
  27. def getPredicateIndexes(preds: IndexedSeq[String]): IndexedSeq[Int]

    Gets the index of all predicates in this sentence
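
    A minimal sketch of this lookup follows. It assumes that preds holds one label per token and that predicate tokens carry the label "B-P"; both the label and the helper name are assumptions of the sketch, not something this page confirms.

```scala
object PredicateIndexSketch {
  // Assumption: predicate tokens carry the label "B-P"; all others differ (e.g. "O").
  val PredicateLabel = "B-P"

  /** Returns the positions of all tokens labeled as predicates. */
  def predicateIndexes(preds: IndexedSeq[String]): IndexedSeq[Int] =
    preds.indices.filter(i => preds(i) == PredicateLabel)
}
```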

  28. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  29. val internStrings: Boolean

  30. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  31. def lemmatize(doc: Document): Unit

    Lemmatization; modifies the document in place

    Definition Classes
    CluProcessor → Processor
  32. lazy val lemmatizer: Lemmatizer

  33. lazy val localTokenizer: Tokenizer

    Attributes
    protected
  34. def mkConstEmbeddings(doc: Document): Unit

  35. def mkDocument(text: String, keepText: Boolean = false): Document

    Constructs a document of tokens from free text; includes sentence splitting and tokenization

    Definition Classes
    CluProcessor → Processor
  36. def mkDocumentFromSentences(sentences: Iterable[String], keepText: Boolean = false, charactersBetweenSentences: Int = 1): Document

    Constructs a document of tokens from an array of untokenized sentences

    Definition Classes
    CluProcessor → Processor
  37. def mkDocumentFromTokens(sentences: Iterable[Iterable[String]], keepText: Boolean = false, charactersBetweenSentences: Int = 1, charactersBetweenTokens: Int = 1): Document

    Constructs a document of tokens from an array of tokenized sentences

    Definition Classes
    CluProcessor → Processor
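
    One plausible use of charactersBetweenSentences and charactersBetweenTokens is to derive character offsets for tokens that arrive without their original text. The sketch below works under that assumption. OffsetSketch and its offsets helper are hypothetical, and the sketch assumes every sentence is non-empty.

```scala
object OffsetSketch {
  /** Returns one (start, end) pair per token, grouped by sentence; end is exclusive. */
  def offsets(
    sentences: Seq[Seq[String]],
    charactersBetweenSentences: Int = 1,
    charactersBetweenTokens: Int = 1
  ): Seq[Seq[(Int, Int)]] = {
    var charOffset = 0
    sentences.map { sentence =>
      val spans = sentence.map { word =>
        val start = charOffset
        val end = start + word.length
        // Advance past the token and the gap that follows it.
        charOffset = end + charactersBetweenTokens
        (start, end)
      }
      // Swap the trailing token gap for the sentence gap.
      charOffset += charactersBetweenSentences - charactersBetweenTokens
      spans
    }
  }
}
```

    With both gaps left at 1, the offsets match the text obtained by joining all tokens with single spaces.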
  38. lazy val mtlDeps: Metal

  39. lazy val mtlNer: Metal

  40. lazy val mtlPosChunkSrlp: Metal

  41. lazy val mtlSrla: Metal

  42. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  43. def nerSentence(words: Array[String], lemmas: Option[Array[String]], tags: Array[String], startCharOffsets: Array[Int], endCharOffsets: Array[Int], docDateOpt: Option[String], embeddings: ConstEmbeddingParameters): (IndexedSeq[String], Option[IndexedSeq[String]])

    Produces NE labels for one sentence

  44. final def notify(): Unit

    Definition Classes
    AnyRef
  45. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  46. lazy val numericEntityRecognizer: NumericEntityRecognizer

  47. val optionalNER: Option[LexiconNER]

  48. def parse(doc: Document): Unit

    Syntactic parsing; modifies the document in place

    Definition Classes
    CluProcessor → Processor
  49. def parseSentence(words: IndexedSeq[String], posTags: IndexedSeq[String], nerLabels: IndexedSeq[String], embeddings: ConstEmbeddingParameters): DirectedGraph[String]

    Dependency parsing

  50. def recognizeNamedEntities(doc: Document): Unit

    NER; modifies the document in place

    Definition Classes
    CluProcessor → Processor
  51. def relationExtraction(doc: Document): Unit

    Relation extraction; modifies the document in place.

    Definition Classes
    CluProcessor → Processor
  52. def removeNumericLabels(allLabels: Array[String]): Array[String]

  53. def resolveCoreference(doc: Document): Unit

    Coreference resolution; modifies the document in place

    Definition Classes
    CluProcessor → Processor
  54. def srl(doc: Document): Unit

    Semantic role labeling

    Definition Classes
    CluProcessor → Processor
  55. def srlSentence(words: IndexedSeq[String], posTags: IndexedSeq[String], nerLabels: IndexedSeq[String], predicateIndexes: IndexedSeq[Int], embeddings: ConstEmbeddingParameters): DirectedGraph[String]

    Produces semantic role frames for one sentence

  56. def srlSentence(sent: Sentence, predicateIndexes: IndexedSeq[Int], embeddings: ConstEmbeddingParameters): DirectedGraph[String]

  57. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  58. def tagPartsOfSpeech(doc: Document): Unit

    Part of speech tagging + chunking + SRL (predicates), jointly

    Definition Classes
    CluProcessor → Processor
  59. def tagSentence(words: IndexedSeq[String], embeddings: ConstEmbeddingParameters): (IndexedSeq[String], IndexedSeq[String], IndexedSeq[String])

    Produces POS tags, chunks, and semantic role predicates for one sentence

  60. def toString(): String

    Definition Classes
    AnyRef → Any
  61. lazy val tokenizer: Tokenizer

  62. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  63. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  64. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
