TermSuitePipeline (termsuite-core 2.3.0 API)

java.lang.Object
- eu.project.ttc.tools.TermSuitePipeline

```
public class TermSuitePipeline
extends java.lang.Object
```
A collection reader and AE (analysis engine) aggregator, written as a chaining builder, that creates and runs a full pipeline.
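The chaining style this class describes can be sketched with a small stand-in. `MiniPipeline`, `addTask` and the use of `IllegalStateException` are hypothetical names chosen for illustration; they are not part of termsuite-core. The sketch only mirrors the documented shape: a static `create`, steps that return the builder, and a `run()` that fails when no collection has been declared.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for illustration only; this is NOT TermSuitePipeline.
final class MiniPipeline {
    private final String lang;
    private final List<String> tasks = new ArrayList<>();
    private String collectionPath; // stays null until a collection is declared

    private MiniPipeline(String lang) {
        this.lang = lang;
    }

    // Static factory that starts the chain, in the spirit of create(String lang).
    static MiniPipeline create(String lang) {
        return new MiniPipeline(lang);
    }

    // Configuration step: records the input path and returns this builder.
    MiniPipeline setCollection(String path) {
        this.collectionPath = path;
        return this;
    }

    // Aggregation step: appends a named task and returns this builder.
    MiniPipeline addTask(String taskName) {
        tasks.add(taskName);
        return this;
    }

    // Terminal step: refuses to run when no collection has been declared.
    MiniPipeline run() {
        if (collectionPath == null) {
            throw new IllegalStateException("no collection declared for lang " + lang);
        }
        return this;
    }

    // Defensive copy of the aggregated task names, in insertion order.
    List<String> tasks() {
        return new ArrayList<>(tasks);
    }
}
```

With the real class, the same shape would presumably start with `TermSuitePipeline.create(lang)`, continue with `setCollection(...)` and the `ae*()` / `hae*()` steps listed below, and end with `run()`.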

Method Summary

Modifier and Type	Method and Description
`TermSuitePipeline`	`addPipelineListener(PipelineListener pipelineListener)` Registers a pipeline listener.
`TermSuitePipeline`	`aeChineseTokenizer()` Tokenizer for Chinese collections.
`TermSuitePipeline`	`aeCompostSplitter()`
`TermSuitePipeline`	`aeContextualizer(int scope, boolean allTerms)` Computes the `Contextualizer` vector of all single-word terms in the term index.
`TermSuitePipeline`	`aeDocumentLogger(long nbDocument)`
`TermSuitePipeline`	`aeExtensionDetector()` Detects all inclusion/extension relations between terms that have size >= 2.
`TermSuitePipeline`	`aeFixedExpressionSpotter()` Spots fixed expressions in the CAS and creates a `FixedExpression` annotation whenever one is found.
`TermSuitePipeline`	`aeFixedExpressionTermMarker()` Iterates over the `TermIndex` and marks terms as "fixed expressions" when their lemmas are found in the `FixedExpressionResource`.
`TermSuitePipeline`	`aeGraphicalVariantGatherer()`
`TermSuitePipeline`	`aeMateTaggerLemmatizer()`
`TermSuitePipeline`	`aeMaxSizeThresholdCleaner(TermProperty property, int maxSize)`
`TermSuitePipeline`	`aeMerger()` Merges, by graphical variation, the variants of a term (only those that are extensions of the base term).
`TermSuitePipeline`	`aePrefixSplitter()` Naive morphological analysis of prefix compounds based on a prefix dictionary resource
`TermSuitePipeline`	`aePrimaryOccurrenceDetector(int detectionStrategy)`
`TermSuitePipeline`	`aeRanker(TermProperty property, boolean desc)` Sets the `Term.setRank(int)` of all terms of the `TermIndex` given a `TermProperty`.
`TermSuitePipeline`	`aeRegexSpotter()` The single-word and multi-word term spotter AE based on UIMA Tokens Regex.
`TermSuitePipeline`	`aeScorer()` Transforms the `TermIndex` into a flat one-n scored model.
`TermSuitePipeline`	`aeSpecificityComputer()` Computes `TermProperty.WR` values (and additional term properties of type `TermProperty` in the future).
`TermSuitePipeline`	`aeStemmer()`
`TermSuitePipeline`	`aeStopWordsFilter()` Removes from the term index any term having a stop word at its boundaries.
`TermSuitePipeline`	`aeSuffixDerivationDetector()`
`TermSuitePipeline`	`aeSyntacticVariantGatherer()` Gathers terms according to their syntactic structures.
`TermSuitePipeline`	`aeTermClassifier(TermProperty sortingProperty)`
`TermSuitePipeline`	`aeTermOccAnnotationImporter()` An AE that imports all `TermOccAnnotation` annotations in the CAS into a `TermIndex`.
`TermSuitePipeline`	`aeThresholdCleaner(TermProperty property, float threshold)`
`TermSuitePipeline`	`aeThresholdCleaner(TermProperty property, float threshold, boolean isPeriodic, int cleaningPeriod, int termIndexSizeTrigger)`
`TermSuitePipeline`	`aeThresholdCleanerPeriodic(TermProperty property, float threshold, int cleaningPeriod)`
`TermSuitePipeline`	`aeThresholdCleanerSizeTrigger(TermProperty property, float threshold, int termIndexSizeTrigger)`
`TermSuitePipeline`	`aeTopNCleaner(TermProperty property, int n)`
`TermSuitePipeline`	`aeTopNCleanerPeriodic(TermProperty property, int n, boolean isPeriodic, int cleaningPeriod)`
`TermSuitePipeline`	`aeTreeTagger()`
`TermSuitePipeline`	`aeUrlFilter()` Filters out URLs from CAS.
`TermSuitePipeline`	`aeWordTokenizer()`
`static TermSuitePipeline`	`create(java.lang.String lang)` Starts a chaining `TermSuitePipeline` builder.
`static TermSuitePipeline`	`create(TermIndex termIndex)`
`org.apache.uima.analysis_engine.AnalysisEngineDescription`	`createDescription()`
`TermSuitePipeline`	`customAE(org.apache.uima.analysis_engine.AnalysisEngineDescription ae, java.lang.String taskName)` Aggregates an AE to the TS pipeline.
`TermSuitePipeline`	`emptyCollection()`
`TermSuitePipeline`	`emptyTermIndex(java.lang.String name)` Creates a new in-memory `TermIndex` on which this pipeline will run.
`TermSuitePipeline`	`enableSyntacticLabels()`
`java.lang.String`	`getHistoryResourceName()`
`java.lang.Thread`	`getStreamThread()`
`TermIndex`	`getTermIndex()` Returns the term index produced (or last modified) by this pipeline.
`TermSuitePipeline`	`haeCasStatCounter(java.lang.String statName)`
`TermSuitePipeline`	`haeCompoundExporter(java.lang.String toFilePath)` Exports all compound words of the terminology to given file path.
`TermSuitePipeline`	`haeEval(java.lang.String refFileURI, java.lang.String outputFile, java.lang.String customLogHeader, java.lang.String rFile, java.lang.String evalTraceName, boolean rtlWithVariants)`
`TermSuitePipeline`	`haeEvalExporter(java.lang.String toFilePath, boolean withVariants)`
`TermSuitePipeline`	`haeExportVariationRuleExamples(java.lang.String toFilePath)` Exports examples of matching pairs for each variation rule.
`TermSuitePipeline`	`haeJsonCasExporter(java.lang.String toDirectoryPath)`
`TermSuitePipeline`	`haeJsonExporter(java.lang.String toFilePath)`
`TermSuitePipeline`	`haeLogOverlappingRules()`
`TermSuitePipeline`	`haeSpotterTSVWriter(java.lang.String toDirectoryPath)` Exports all CAS in TSV format to a given directory.
`TermSuitePipeline`	`haeTbxExporter(java.lang.String toFilePath)`
`TermSuitePipeline`	`haeTermsuiteJsonCasExporter(java.lang.String toDirectoryPath)` Exports all CAS as JSON files to a given directory.
`TermSuitePipeline`	`haeTraceTimePerf(java.lang.String toFile)` Exports time progress to TSV file.
`TermSuitePipeline`	`haeTsvExporter(java.lang.String toFilePath)` Exports the `TermIndex` in TSV format.
`TermSuitePipeline`	`haeVariantEvalExporter(java.lang.String toFilePath, int topN, int maxVariantsPerTerm)` Creates a TSV output with the occurrence list of each term and their in-text contexts.
`TermSuitePipeline`	`haeVariationExporter(java.lang.String toFilePath, VariationType... vTypes)`
`TermSuitePipeline`	`haeXmiCasExporter(java.lang.String toDirectoryPath)` Exports all CAS as XMI files to a given directory.
`TermSuitePipeline`	`linkMongoStore()` Configures the `JsonExporterAE` to not embed the occurrences in the json file, but to link the mongodb occurrence store instead.
`org.apache.uima.resource.ExternalResourceDescription`	`resHistory()`
`org.apache.uima.resource.ExternalResourceDescription`	`resObserver()`
`org.apache.uima.resource.ExternalResourceDescription`	`resSyntacticVariantRules()`
`org.apache.uima.resource.ExternalResourceDescription`	`resTermIndex()`
`TermSuitePipeline`	`run()` Runs the pipeline with `SimplePipeline` on the `CollectionReader` that must have been defined.
`TermSuitePipeline`	`run(org.apache.uima.jcas.JCas cas)` Runs the pipeline with `SimplePipeline` without requiring a `CollectionReader` to be defined.
`TermSuitePipeline`	`setAddSpottedAnnoToTermIndex(boolean addToTermIndex)` Configures `RegexSpotter`.
`TermSuitePipeline`	`setCollection(TermSuiteCollection termSuiteCollection, java.lang.String collectionPath, java.lang.String collectionEncoding)` Creates a collection reader for this pipeline.
`TermSuitePipeline`	`setCollection(TermSuiteCollection termSuiteCollection, java.lang.String collectionPath, java.lang.String collectionEncoding, java.lang.String droppedTags, java.lang.String txtTags)` Creates a collection reader of type `GenericXMLToTxtCollectionReader` for this pipeline.
`TermSuitePipeline`	`setCompostCoeffs(float alpha, float beta, float gamma, float delta)`
`TermSuitePipeline`	`setCompostMaxComponentNum(int compostMaxComponentNum)`
`TermSuitePipeline`	`setCompostMinComponentSize(int compostMinComponentSize)`
`TermSuitePipeline`	`setCompostScoreThreshold(float compostScoreThreshold)`
`TermSuitePipeline`	`setCompostSegmentSimilarityThreshold(float compostSegmentSimilarityThreshold)`
`TermSuitePipeline`	`setContextAssocRateMeasure(java.lang.String contextAssocRateMeasure)`
`TermSuitePipeline`	`setContextualizeCoTermsType(OccurrenceType contextualizeCoTermsType)`
`TermSuitePipeline`	`setContextualizeWithCoOccurrenceFrequencyThreshhold(int contextualizeWithCoOccurrenceFrequencyThreshhold)`
`TermSuitePipeline`	`setContextualizeWithTermClasses(boolean contextualizeWithTermClasses)`
`TermSuitePipeline`	`setExportJsonWithContext(boolean b)`
`TermSuitePipeline`	`setExportJsonWithOccurrences(boolean exportJsonWithOccurrences)`
`TermSuitePipeline`	`setGraphicalVariantSimilarityThreshold(float th)`
`TermSuitePipeline`	`setHistory(TermHistory history)`
`TermSuitePipeline`	`setInlineString(java.lang.String text)`
`TermSuitePipeline`	`setIstexCollection(java.lang.String apiURL, java.util.List<java.lang.String> documentsIds)`
`TermSuitePipeline`	`setKeepVariantsWhileCleaning(boolean keepVariantsWhileCleaning)`
`TermSuitePipeline`	`setMateModelPath(java.lang.String path)`
`TermSuitePipeline`	`setMongoDBOccurrenceStore(java.lang.String mongoDBUri)` Stores occurrences to MongoDB
`TermSuitePipeline`	`setPostProcessingStrategy(java.lang.String postProcessingStrategy)` Sets the post processing strategy for `RegexSpotter` analysis engine
`TermSuitePipeline`	`setResourceDir(java.lang.String resourceDir)` Invoke this method if TermSuite resources are accessible via a "file:/path/to/res/" url, i.e. they can be found locally.
`TermSuitePipeline`	`setResourceJar(java.lang.String resourceJar)`
`TermSuitePipeline`	`setResourceUrlPrefix(java.lang.String urlPrefix)`
`TermSuitePipeline`	`setSpotWithOccurrences(boolean activate)` Deprecated. Use TermSuitePipeline#setOccurrenceStoreMode instead.
`TermSuitePipeline`	`setTermIndex(TermIndex termIndex)` Sets the term index on which this pipeline will run.
`TermSuitePipeline`	`setTreeTaggerHome(java.lang.String treeTaggerPath)`
`TermSuitePipeline`	`setTsvExportProperties(TermProperty... properties)` Defines the term properties that appear in tsv export file
`TermSuitePipeline`	`setTsvShowHeaders(boolean tsvWithHeaders)` Configures tsvExporter to (not) show headers on the first line.
`TermSuitePipeline`	`setTsvShowScores(boolean tsvWithVariantScores)` Configures tsvExporter to (not) show variant scores with the "V" label
`DocumentStream`	`stream(CasConsumer consumer)`
`TermSuitePipeline`	`watch(java.lang.String... termKeys)`

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Method Detail
  - create
```
public static TermSuitePipeline create(java.lang.String lang)
```
    Starts a chaining TermSuitePipeline builder.
    
    Parameters:
    
    lang - The
    
    Returns:
    
    The chaining builder.
  - create
```
public static TermSuitePipeline create(TermIndex termIndex)
```
  - run
```
public TermSuitePipeline run()
```
    Runs the pipeline with SimplePipeline on the CollectionReader that must have been defined.
    
    Throws:
    
    TermSuitePipelineException - if no CollectionReader has been declared on this pipeline
  - stream
```
public DocumentStream stream(CasConsumer consumer)
```
  - getStreamThread
```
public java.lang.Thread getStreamThread()
```
  - addPipelineListener
```
public TermSuitePipeline addPipelineListener(PipelineListener pipelineListener)
```
    Registers a pipeline listener.
    
    Parameters:
    
    pipelineListener -
    
    Returns:
    
    This chaining TermSuitePipeline builder object
  - run
```
public TermSuitePipeline run(org.apache.uima.jcas.JCas cas)
```
    Runs the pipeline with SimplePipeline without requiring a CollectionReader to be defined.
    
    Parameters:
    
    cas - the JCas on which the pipeline operates.
    
    Returns:
    
    This chaining TermSuitePipeline builder object
  - setInlineString
```
public TermSuitePipeline setInlineString(java.lang.String text)
```
  - setIstexCollection
```
public TermSuitePipeline setIstexCollection(java.lang.String apiURL,
                                            java.util.List<java.lang.String> documentsIds)
```
  - setCollection
```
public TermSuitePipeline setCollection(TermSuiteCollection termSuiteCollection,
                                       java.lang.String collectionPath,
                                       java.lang.String collectionEncoding)
```
    Creates a collection reader for this pipeline.
    
    Parameters:
    
    termSuiteCollection -
    
    collectionPath -
    
    collectionEncoding -
    
    Returns:
    
    This chaining TermSuitePipeline builder object
  - setCollection
```
public TermSuitePipeline setCollection(TermSuiteCollection termSuiteCollection,
                                       java.lang.String collectionPath,
                                       java.lang.String collectionEncoding,
                                       java.lang.String droppedTags,
                                       java.lang.String txtTags)
```
    Creates a collection reader of type GenericXMLToTxtCollectionReader for this pipeline. Requires a list of dropped tags and txt tags for collection parsing.
    
    Parameters:
    
    termSuiteCollection -
    
    collectionPath -
    
    collectionEncoding -
    
    droppedTags -
    
    txtTags -
    
    Returns:
    
    This chaining TermSuitePipeline builder object
    
    See Also:
    
    AbstractToTxtSaxHandler
  - setResourceDir
```
public TermSuitePipeline setResourceDir(java.lang.String resourceDir)
```
    Invoke this method if TermSuite resources are accessible via a "file:/path/to/res/" url, i.e. they can be found locally.
    
    Parameters:
    
    resourceDir -
    
    Returns:
    
    This chaining TermSuitePipeline builder object
  - setResourceJar
```
public TermSuitePipeline setResourceJar(java.lang.String resourceJar)
```
  - setResourceUrlPrefix
```
public TermSuitePipeline setResourceUrlPrefix(java.lang.String urlPrefix)
```
  - setContextAssocRateMeasure
```
public TermSuitePipeline setContextAssocRateMeasure(java.lang.String contextAssocRateMeasure)
```
  - emptyCollection
```
public TermSuitePipeline emptyCollection()
```
  - createDescription
```
public org.apache.uima.analysis_engine.AnalysisEngineDescription createDescription()
```
  - setHistory
```
public TermSuitePipeline setHistory(TermHistory history)
```
  - watch
```
public TermSuitePipeline watch(java.lang.String... termKeys)
```
  - getHistoryResourceName
```
public java.lang.String getHistoryResourceName()
```
  - aeWordTokenizer
```
public TermSuitePipeline aeWordTokenizer()
```
  - aeTreeTagger
```
public TermSuitePipeline aeTreeTagger()
```
  - setMateModelPath
```
public TermSuitePipeline setMateModelPath(java.lang.String path)
```
  - aeMateTaggerLemmatizer
```
public TermSuitePipeline aeMateTaggerLemmatizer()
```
  - setTsvExportProperties
```
public TermSuitePipeline setTsvExportProperties(TermProperty... properties)
```
    Defines the term properties that appear in tsv export file
    
    Parameters:
    
    properties -
    
    Returns:
    
    This chaining TermSuitePipeline builder object
    
    See Also:
    
    haeTsvExporter(String)
  - haeTsvExporter
```
public TermSuitePipeline haeTsvExporter(java.lang.String toFilePath)
```
    Exports the TermIndex in tsv format
    
    Parameters:
    
    toFilePath -
    
    Returns:
    
    This chaining TermSuitePipeline builder object
    
    See Also:
    
    setTsvExportProperties(TermProperty...)
  - haeExportVariationRuleExamples
```
public TermSuitePipeline haeExportVariationRuleExamples(java.lang.String toFilePath)
```
    Exports examples of matching pairs for each variation rule.
    
    Parameters:
    
    toFilePath - the file path where to write the examples for each variation rules
    
    Returns:
    
    This chaining TermSuitePipeline builder object
  - haeCompoundExporter
```
public TermSuitePipeline haeCompoundExporter(java.lang.String toFilePath)
```
    Exports all compound words of the terminology to given file path.
    
    Parameters:
    
    toFilePath -
    
    Returns:
    
    This chaining TermSuitePipeline builder object
  - haeVariationExporter
```
public TermSuitePipeline haeVariationExporter(java.lang.String toFilePath,
                                              VariationType... vTypes)
```
  - haeTbxExporter
```
public TermSuitePipeline haeTbxExporter(java.lang.String toFilePath)
```
  - haeEvalExporter
```
public TermSuitePipeline haeEvalExporter(java.lang.String toFilePath,
                                         boolean withVariants)
```
  - setExportJsonWithOccurrences
```
public TermSuitePipeline setExportJsonWithOccurrences(boolean exportJsonWithOccurrences)
```
  - setExportJsonWithContext
```
public TermSuitePipeline setExportJsonWithContext(boolean b)
```
  - haeJsonExporter
```
public TermSuitePipeline haeJsonExporter(java.lang.String toFilePath)
```
  - haeVariantEvalExporter
```
public TermSuitePipeline haeVariantEvalExporter(java.lang.String toFilePath,
                                                int topN,
                                                int maxVariantsPerTerm)
```
    Creates a TSV output with:
    - the occurrence list of each term and their in-text contexts
    - a JSON structure for the evaluation of each variant
    
    Parameters:
    
    toFilePath - The output file path
    
    topN - The number of variants to keep in the file
    
    maxVariantsPerTerm - The maximum number of variants to eval for each term
    
    Returns:
    
    This chaining TermSuitePipeline builder object
  - aeStemmer
```
public TermSuitePipeline aeStemmer()
```
  - aeFixedExpressionTermMarker
```
public TermSuitePipeline aeFixedExpressionTermMarker()
```
    Iterates over the TermIndex and marks terms as "fixed expressions" when their lemmas are found in the FixedExpressionResource.
    
    Returns:
    
    This chaining TermSuitePipeline builder object
  - aeFixedExpressionSpotter
```
public TermSuitePipeline aeFixedExpressionSpotter()
```
    Spots fixed expressions in the CAS and creates a FixedExpression annotation whenever one is found.
    
    Returns:
    
    This chaining TermSuitePipeline builder object
  - aeRegexSpotter
```
public TermSuitePipeline aeRegexSpotter()
```
    The single-word and multi-word term spotter AE based on UIMA Tokens Regex.
    
    Returns:
    
    This chaining TermSuitePipeline builder object
  - aeTermOccAnnotationImporter
```
public TermSuitePipeline aeTermOccAnnotationImporter()
```
    An AE that imports all TermOccAnnotation annotations in the CAS into a TermIndex.
    
    Returns:
    
    This chaining TermSuitePipeline builder object
  - aePrefixSplitter
```
public TermSuitePipeline aePrefixSplitter()
```
    Naive morphological analysis of prefix compounds based on a prefix dictionary resource
    
    Returns:
    
    This chaining TermSuitePipeline builder object
  - aeSuffixDerivationDetector
```
public TermSuitePipeline aeSuffixDerivationDetector()
```
  - aeStopWordsFilter
```
public TermSuitePipeline aeStopWordsFilter()
```
    Removes from the term index any term having a stop word at its boundaries.
    
    Returns:
    
    This chaining TermSuitePipeline builder object
    
    See Also:
    
    TermIndexBlacklistWordFilterAE
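    The boundary rule stated above can be illustrated with a minimal sketch. It assumes terms are plain space-separated strings and that the stop list is a simple set of words; `BoundaryStopWordFilter` is a hypothetical name, and the actual AE may well compare lemmas or use other resources.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Illustrative sketch only: terms are modelled as space-separated strings.
final class BoundaryStopWordFilter {

    // True when the first or the last word of the term is a stop word.
    static boolean hasStopWordAtBoundary(String term, Set<String> stopWords) {
        String[] words = term.trim().split("\\s+");
        return stopWords.contains(words[0])
                || stopWords.contains(words[words.length - 1]);
    }

    // Returns the terms that survive the filter, in their original order.
    static List<String> filter(List<String> terms, Set<String> stopWords) {
        List<String> kept = new ArrayList<>();
        for (String term : terms) {
            if (!hasStopWordAtBoundary(term, stopWords)) {
                kept.add(term);
            }
        }
        return kept;
    }
}
```

    Note that a stop word in the middle of a term (as in "rotor of the turbine") does not cause removal under this reading; only the first and last words are checked.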
  - haeXmiCasExporter
```
public TermSuitePipeline haeXmiCasExporter(java.lang.String toDirectoryPath)
```
    Exports all CAS as XMI files to a given directory.
    
    Parameters:
    
    toDirectoryPath -
    
    Returns:
    
    This chaining TermSuitePipeline builder object
  - haeTermsuiteJsonCasExporter
```
public TermSuitePipeline haeTermsuiteJsonCasExporter(java.lang.String toDirectoryPath)
```
    Exports all CAS as JSON files to a given directory.
    
    Parameters:
    
    toDirectoryPath -
    
    Returns:
    
    This chaining TermSuitePipeline builder object
  - haeSpotterTSVWriter
```
public TermSuitePipeline haeSpotterTSVWriter(java.lang.String toDirectoryPath)
```
    Exports all CAS in TSV format to a given directory.
    
    Parameters:
    
    toDirectoryPath -
    
    Returns:
    
    This chaining TermSuitePipeline builder object
    
    See Also:
    
    SpotterTSVWriter
  - aeDocumentLogger
```
public TermSuitePipeline aeDocumentLogger(long nbDocument)
```
  - aeChineseTokenizer
```
public TermSuitePipeline aeChineseTokenizer()
```
    Tokenizer for Chinese collections.
    
    Returns:
    
    This chaining TermSuitePipeline builder object
    
    See Also:
    
    ChineseSegmenter
  - resTermIndex
```
public org.apache.uima.resource.ExternalResourceDescription resTermIndex()
```
  - resObserver
```
public org.apache.uima.resource.ExternalResourceDescription resObserver()
```
  - resHistory
```
public org.apache.uima.resource.ExternalResourceDescription resHistory()
```
  - resSyntacticVariantRules
```
public org.apache.uima.resource.ExternalResourceDescription resSyntacticVariantRules()
```
  - getTermIndex
```
public TermIndex getTermIndex()
```
    Returns the term index produced (or last modified) by this pipeline.
    
    Returns:
    
    The term index processed by this pipeline
  - setTermIndex
```
public TermSuitePipeline setTermIndex(TermIndex termIndex)
```
    Sets the term index on which this pipeline will run.
    
    Parameters:
    
    termIndex -
    
    Returns:
    
    This chaining TermSuitePipeline builder object
  - emptyTermIndex
```
public TermSuitePipeline emptyTermIndex(java.lang.String name)
```
    Creates a new in-memory TermIndex on which this pipeline will run.
    
    Parameters:
    
    name - the name of the new term index
    
    Returns:
    
    This chaining TermSuitePipeline builder object
  - aeSpecificityComputer
```
public TermSuitePipeline aeSpecificityComputer()
```
    Computes TermProperty.WR values (and additional term properties of type TermProperty in the future).
    
    Returns:
    
    This chaining TermSuitePipeline builder object
    
    See Also:
    
    TermSpecificityComputer, TermProperty
  - setContextualizeCoTermsType
```
public TermSuitePipeline setContextualizeCoTermsType(OccurrenceType contextualizeCoTermsType)
```
  - setContextualizeWithTermClasses
```
public TermSuitePipeline setContextualizeWithTermClasses(boolean contextualizeWithTermClasses)
```
  - setContextualizeWithCoOccurrenceFrequencyThreshhold
```
public TermSuitePipeline setContextualizeWithCoOccurrenceFrequencyThreshhold(int contextualizeWithCoOccurrenceFrequencyThreshhold)
```
  - aeContextualizer
```
public TermSuitePipeline aeContextualizer(int scope,
                                          boolean allTerms)
```
    Computes the Contextualizer vector of all single-word terms in the term index.
    
    Parameters:
    
    scope -
    
    allTerms -
    
    Returns:
    
    This chaining TermSuitePipeline builder object
    
    See Also:
    
    Contextualizer
  - aeMaxSizeThresholdCleaner
```
public TermSuitePipeline aeMaxSizeThresholdCleaner(TermProperty property,
                                                   int maxSize)
```
  - aeThresholdCleaner
```
public TermSuitePipeline aeThresholdCleaner(TermProperty property,
                                            float threshold,
                                            boolean isPeriodic,
                                            int cleaningPeriod,
                                            int termIndexSizeTrigger)
```
  - aePrimaryOccurrenceDetector
```
public TermSuitePipeline aePrimaryOccurrenceDetector(int detectionStrategy)
```
  - aeThresholdCleanerPeriodic
```
public TermSuitePipeline aeThresholdCleanerPeriodic(TermProperty property,
                                                    float threshold,
                                                    int cleaningPeriod)
```
    Parameters:
    
    property -
    
    threshold -
    
    cleaningPeriod -
    
    Returns:
    
    This chaining TermSuitePipeline builder object
  - aeThresholdCleanerSizeTrigger
```
public TermSuitePipeline aeThresholdCleanerSizeTrigger(TermProperty property,
                                                       float threshold,
                                                       int termIndexSizeTrigger)
```
  - setKeepVariantsWhileCleaning
```
public TermSuitePipeline setKeepVariantsWhileCleaning(boolean keepVariantsWhileCleaning)
```
  - aeThresholdCleaner
```
public TermSuitePipeline aeThresholdCleaner(TermProperty property,
                                            float threshold)
```
  - aeTopNCleaner
```
public TermSuitePipeline aeTopNCleaner(TermProperty property,
                                       int n)
```
  - aeTopNCleanerPeriodic
```
public TermSuitePipeline aeTopNCleanerPeriodic(TermProperty property,
                                               int n,
                                               boolean isPeriodic,
                                               int cleaningPeriod)
```
    Parameters:
    
    property -
    
    n -
    
    isPeriodic -
    
    cleaningPeriod -
    
    Returns:
    
    This chaining TermSuitePipeline builder object
  - setGraphicalVariantSimilarityThreshold
```
public TermSuitePipeline setGraphicalVariantSimilarityThreshold(float th)
```
  - aeGraphicalVariantGatherer
```
public TermSuitePipeline aeGraphicalVariantGatherer()
```
  - aeUrlFilter
```
public TermSuitePipeline aeUrlFilter()
```
    Filters out URLs from CAS.
    
    Returns:
    
    This chaining TermSuitePipeline builder object
  - aeSyntacticVariantGatherer
```
public TermSuitePipeline aeSyntacticVariantGatherer()
```
    Gathers terms according to their syntactic structures.
    
    Returns:
    
    This chaining TermSuitePipeline builder object
  - aeExtensionDetector
```
public TermSuitePipeline aeExtensionDetector()
```
    Detects all inclusion/extension relations between terms that have size >= 2.
    
    Returns:
    
    This chaining TermSuitePipeline builder object
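    One possible reading of the inclusion/extension relation is sketched below. It assumes, without confirmation from this page, that a term extends a base term when the base has at least two words and its word sequence appears contiguously inside the longer term. `ExtensionSketch` and `isExtensionOf` are hypothetical names.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative sketch only: terms are modelled as space-separated strings.
final class ExtensionSketch {

    // True when `longer` strictly contains the word sequence of `base`,
    // and `base` has at least two words (the size >= 2 condition).
    static boolean isExtensionOf(String longer, String base) {
        List<String> longerWords = Arrays.asList(longer.trim().split("\\s+"));
        List<String> baseWords = Arrays.asList(base.trim().split("\\s+"));
        return baseWords.size() >= 2
                && longerWords.size() > baseWords.size()
                && Collections.indexOfSubList(longerWords, baseWords) >= 0;
    }
}
```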
  - aeScorer
```
public TermSuitePipeline aeScorer()
```
    Transforms the TermIndex into a flat one-n scored model.
    
    Returns:
    
    This chaining TermSuitePipeline builder object
  - aeMerger
```
public TermSuitePipeline aeMerger()
```
    Merges, by graphical variation, the variants of a term (only those that are extensions of the base term).
    
    Returns:
    
    This chaining TermSuitePipeline builder object
  - aeRanker
```
public TermSuitePipeline aeRanker(TermProperty property,
                                  boolean desc)
```
    Sets the Term.setRank(int) of all terms of the TermIndex given a TermProperty.
    
    Parameters:
    
    property -
    
    desc -
    
    Returns:
    
    This chaining TermSuitePipeline builder object
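    A minimal sketch of ranking by a property follows. It assumes that `desc` means "highest value first" and that ranks start at 1; neither is stated on this page. Property values are modelled as a plain map from term to number, and `RankSketch` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: property values are modelled as term -> value.
final class RankSketch {

    // Returns term -> rank, where rank 1 is the first term in the requested order.
    static Map<String, Integer> rank(Map<String, Double> propertyValues, boolean desc) {
        List<Map.Entry<String, Double>> entries = new ArrayList<>(propertyValues.entrySet());
        Comparator<Map.Entry<String, Double>> byValue = Map.Entry.comparingByValue();
        entries.sort(desc ? byValue.reversed() : byValue);

        Map<String, Integer> ranks = new LinkedHashMap<>();
        int rank = 1; // assumption: ranks are 1-based
        for (Map.Entry<String, Double> entry : entries) {
            ranks.put(entry.getKey(), rank++);
        }
        return ranks;
    }
}
```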
  - setTreeTaggerHome
```
public TermSuitePipeline setTreeTaggerHome(java.lang.String treeTaggerPath)
```
  - haeLogOverlappingRules
```
public TermSuitePipeline haeLogOverlappingRules()
```
  - enableSyntacticLabels
```
public TermSuitePipeline enableSyntacticLabels()
```
  - setCompostCoeffs
```
public TermSuitePipeline setCompostCoeffs(float alpha,
                                          float beta,
                                          float gamma,
                                          float delta)
```
  - setCompostMaxComponentNum
```
public TermSuitePipeline setCompostMaxComponentNum(int compostMaxComponentNum)
```
  - setCompostMinComponentSize
```
public TermSuitePipeline setCompostMinComponentSize(int compostMinComponentSize)
```
  - setCompostScoreThreshold
```
public TermSuitePipeline setCompostScoreThreshold(float compostScoreThreshold)
```
  - setCompostSegmentSimilarityThreshold
```
public TermSuitePipeline setCompostSegmentSimilarityThreshold(float compostSegmentSimilarityThreshold)
```
  - aeCompostSplitter
```
public TermSuitePipeline aeCompostSplitter()
```
  - haeCasStatCounter
```
public TermSuitePipeline haeCasStatCounter(java.lang.String statName)
```
  - haeTraceTimePerf
```
public TermSuitePipeline haeTraceTimePerf(java.lang.String toFile)
```
    Exports time progress to a TSV file. Columns are:
    - elapsed time from initialization in milliseconds
    - number of docs processed
    - cumulated size of data processed
    - number of terms in term index
    - number of WordAnnotation processed
    
    Parameters:
    
    toFile -
    
    Returns:
    
    This chaining TermSuitePipeline builder object
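    A consumer of the trace file could parse each line along these lines. The sketch assumes the five columns appear in the order listed above, are tab-separated, and hold plain integers with no header row; none of that is confirmed here beyond the column list. `TraceRow` is a hypothetical name.

```java
// Illustrative sketch only: one parsed line of the time-performance trace.
final class TraceRow {
    final long elapsedMillis;
    final long docsProcessed;
    final long cumulatedDataSize;
    final long termsInIndex;
    final long wordAnnotations;

    private TraceRow(long[] values) {
        this.elapsedMillis = values[0];
        this.docsProcessed = values[1];
        this.cumulatedDataSize = values[2];
        this.termsInIndex = values[3];
        this.wordAnnotations = values[4];
    }

    // Parses one tab-separated line; rejects lines without exactly five columns.
    static TraceRow parse(String line) {
        String[] fields = line.split("\t");
        if (fields.length != 5) {
            throw new IllegalArgumentException("expected 5 columns, got " + fields.length);
        }
        long[] values = new long[5];
        for (int i = 0; i < fields.length; i++) {
            values[i] = Long.parseLong(fields[i].trim());
        }
        return new TraceRow(values);
    }
}
```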
  - aeTermClassifier
```
public TermSuitePipeline aeTermClassifier(TermProperty sortingProperty)
```
    Parameters:
    
    sortingProperty - the term property used to order terms before they are classified. The first term of a class appearing given this order will be considered as the head of the class.
    
    Returns:
    
    This chaining TermSuitePipeline builder object
    
    See Also:
    
    TermClassifier
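    The head-selection rule described for `sortingProperty` can be sketched as follows. The sketch assumes the terms have already been sorted by that property and that class membership is given by some function from term to class key; how TermClassifier actually forms its classes is not described on this page. `ClassHeadSketch` is a hypothetical name.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Illustrative sketch only: picks the head of each class from an ordered list.
final class ClassHeadSketch {

    // orderedTerms must already be sorted by the sorting property.
    // The first term met for a class key becomes the head of that class.
    static <C> Map<C, String> heads(List<String> orderedTerms, Function<String, C> classOf) {
        Map<C, String> heads = new LinkedHashMap<>();
        for (String term : orderedTerms) {
            heads.putIfAbsent(classOf.apply(term), term);
        }
        return heads;
    }
}
```

    Changing the sort order changes which member leads each class, which is why the choice of sorting property matters.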
  - haeEval
```
public TermSuitePipeline haeEval(java.lang.String refFileURI,
                                 java.lang.String outputFile,
                                 java.lang.String customLogHeader,
                                 java.lang.String rFile,
                                 java.lang.String evalTraceName,
                                 boolean rtlWithVariants)
```
    Parameters:
    
    refFileURI - The path to the reference terminology
    
    outputFile - The path to output log file
    
    customLogHeader - A custom string to add in the header of the output log file
    
    rFile - The path to the output R file
    
    evalTraceName - The name of the eval trace
    
    rtlWithVariants - true if variants of the reference terminology should be kept during the eval
    
    Returns:
    
    This chaining TermSuitePipeline builder object
  - setMongoDBOccurrenceStore
```
public TermSuitePipeline setMongoDBOccurrenceStore(java.lang.String mongoDBUri)
```
    Stores occurrences to MongoDB
    
    Parameters:
    
    mongoDBUri - the MongoDB connection URI
    
    Returns:
    
    This chaining TermSuitePipeline builder object
  - setSpotWithOccurrences
```
@Deprecated
public TermSuitePipeline setSpotWithOccurrences(boolean activate)
```
    Deprecated. Use TermSuitePipeline#setOccurrenceStoreMode instead.
    
    Parameters:
    
    activate -
    
    Returns:
    
    This chaining TermSuitePipeline builder object
  - setAddSpottedAnnoToTermIndex
```
public TermSuitePipeline setAddSpottedAnnoToTermIndex(boolean addToTermIndex)
```
    Configures RegexSpotter. If true, adds all spotted occurrences to the TermIndex.
    
    Parameters:
    
    addToTermIndex - the value of the parameter
    
    Returns:
    
    This chaining TermSuitePipeline builder object
    
    See Also:
    
    aeRegexSpotter()
  - setPostProcessingStrategy
```
public TermSuitePipeline setPostProcessingStrategy(java.lang.String postProcessingStrategy)
```
    Sets the post processing strategy for RegexSpotter analysis engine
    
    Parameters:
    
    postProcessingStrategy -
    
    Returns:
    
    This chaining TermSuitePipeline builder object
    
    See Also:
    
    aeRegexSpotter(), OccurrenceBuffer.NO_CLEANING, OccurrenceBuffer.KEEP_PREFIXES, OccurrenceBuffer.KEEP_SUFFIXES
  - setTsvShowHeaders
```
public TermSuitePipeline setTsvShowHeaders(boolean tsvWithHeaders)
```
    Configures tsvExporter to (not) show headers on the first line.
    
    Parameters:
    
    tsvWithHeaders - the flag
    
    Returns:
    
    This chaining TermSuitePipeline builder object
  - setTsvShowScores
```
public TermSuitePipeline setTsvShowScores(boolean tsvWithVariantScores)
```
    Configures tsvExporter to (not) show variant scores with the "V" label
    
    Parameters:
    
    tsvWithVariantScores - the flag
    
    Returns:
    
    This chaining TermSuitePipeline builder object
  - haeJsonCasExporter
```
public TermSuitePipeline haeJsonCasExporter(java.lang.String toDirectoryPath)
```
  - linkMongoStore
```
public TermSuitePipeline linkMongoStore()
```
    Configures the JsonExporterAE to not embed the occurrences in the json file, but to link the mongodb occurrence store instead.
    
    Returns:
    
    This chaining TermSuitePipeline builder object
    
    See Also:
    
    haeJsonExporter(String)
  - customAE
```
public TermSuitePipeline customAE(org.apache.uima.analysis_engine.AnalysisEngineDescription ae,
                                  java.lang.String taskName)
```
    Aggregates an AE to the TS pipeline.
    
    Parameters:
    
    ae - the AE description of the added pipeline.
    
    taskName - a user-readable name for the AE task (intended to be displayed in progress views)
    
    Returns:
    
    This chaining TermSuitePipeline builder object
