public class TermOccurrenceUtils
extends java.lang.Object
Helper methods for TermOccurrence objects and collections.

Modifier and Type | Field and Description |
---|---|
static java.util.Comparator<TermOccurrence> | uimaNaturalOrder |
Constructor and Description |
---|
TermOccurrenceUtils() |
Modifier and Type | Method and Description |
---|---|
static boolean | areOffsetsOverlapping(TermOccurrence a, TermOccurrence b) True if two TermOccurrence offsets overlap strictly. |
static boolean | areOverlapping(TermOccurrence a, TermOccurrence b) Returns true if two occurrences are in the same document and their offsets overlap. |
static boolean | hasOverlappingOffsets(TermOccurrence theOcc, java.util.Collection<TermOccurrence> theOccCollection) True if an occurrence set contains any element overlapping with the param occurrence. |
static void | markPrimaryOccurrence(java.util.Collection<TermOccurrence> occs, TermMeasure measure) Given a strategy, detects all primary occurrences in a collection of TermOccurrence. |
static java.util.Iterator<java.util.List<TermOccurrence>> | occurrenceChunkIterator(java.util.Collection<TermOccurrence> occurrences) Returns a virtual iterator on chunks of an occurrence collection. |
static void | removeOverlaps(java.util.Collection<TermOccurrence> referenceSet, java.util.Collection<TermOccurrence> occurrenceSet) Removes from an occurrence set all occurrences that overlap at least one occurrence in a reference occurrence set. |
public static java.util.Comparator<TermOccurrence> uimaNaturalOrder
public static void markPrimaryOccurrence(java.util.Collection<TermOccurrence> occs, TermMeasure measure)
Given a strategy, detects all primary occurrences in a collection of TermOccurrence.
What defines an occurrence's primary/secondary status is the fact that, in a Document, two primary occurrences cannot overlap.
E.g. in the text "offshore wind energy", the sequence of term occurrences "offshore" and "wind energy" is a primary sequence, but the sequence of term occurrences "offshore wind" and "wind energy" is not, because the occurrences overlap.
Parameters:
occs - the occurrence collection
measure - the measure for detecting primary occurrences
See Also:
markPrimaryOccurrence(Collection, TermMeasure), TermOccurrence.isPrimaryOccurrence()
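The non-overlap rule above can be illustrated with a small, self-contained sketch. It is not the library's implementation: how TermMeasure ranks occurrences is not described here, so the sketch assumes a plain numeric score per occurrence (higher wins) and uses a simple greedy selection. The PrimarySketch and Occ names, the score field, and the primary flag are all hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class PrimarySketch {
    /** Hypothetical stand-in for TermOccurrence: [begin, end) offsets plus an assumed score. */
    static final class Occ {
        final String text;
        final int begin;
        final int end;
        final double score;   // assumption: stands in for whatever TermMeasure provides
        boolean primary;

        Occ(String text, int begin, int end, double score) {
            this.text = text;
            this.begin = begin;
            this.end = end;
            this.score = score;
        }
    }

    /** Greedy sketch: visit occurrences by descending score, keep those that overlap no kept one. */
    static void markPrimary(List<Occ> occs) {
        List<Occ> byScore = new ArrayList<>(occs);
        byScore.sort(Comparator.comparingDouble((Occ o) -> o.score).reversed());
        List<Occ> kept = new ArrayList<>();
        for (Occ o : byScore) {
            boolean clash = false;
            for (Occ k : kept) {
                // strict overlap: touching spans (o.end == k.begin) do not clash
                if (o.begin < k.end && k.begin < o.end) {
                    clash = true;
                    break;
                }
            }
            o.primary = !clash;
            if (!clash) {
                kept.add(o);
            }
        }
    }
}
```

With the text "offshore wind energy" and scores that favour "wind energy" and "offshore" over "offshore wind", the sketch marks the first two as primary and leaves "offshore wind" as secondary, which matches the example in the description.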
public static java.util.Iterator<java.util.List<TermOccurrence>> occurrenceChunkIterator(java.util.Collection<TermOccurrence> occurrences)
Returns a virtual iterator on chunks of a TermOccurrence collection. Every time there is a gap between two occurrences (i.e. they do not overlap), a new chunk is created.
Parameters:
occurrences
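As a rough illustration of the chunking rule, the sketch below groups occurrences into chunks and starts a new chunk whenever the next occurrence begins at or after the end of everything seen in the current chunk. Several points are assumptions rather than documented behaviour: the occurrences all belong to one document, they are sorted by begin offset first, and the chunks are built eagerly as a list rather than exposed through a lazy iterator. ChunkSketch and Occ are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;

final class ChunkSketch {
    /** Hypothetical stand-in for TermOccurrence: [begin, end) offsets in a single document. */
    static final class Occ {
        final int begin;
        final int end;

        Occ(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }
    }

    /** Splits occurrences into chunks; a gap (no overlap with the current chunk) opens a new chunk. */
    static List<List<Occ>> chunks(Collection<Occ> occurrences) {
        List<Occ> sorted = new ArrayList<>(occurrences);
        sorted.sort(Comparator.comparingInt((Occ o) -> o.begin).thenComparingInt(o -> o.end));

        List<List<Occ>> result = new ArrayList<>();
        List<Occ> current = null;
        int chunkEnd = 0; // largest end offset seen in the current chunk
        for (Occ o : sorted) {
            if (current == null || o.begin >= chunkEnd) {
                // gap: this occurrence overlaps nothing in the current chunk
                current = new ArrayList<>();
                result.add(current);
                chunkEnd = o.end;
            } else {
                chunkEnd = Math.max(chunkEnd, o.end);
            }
            current.add(o);
        }
        return result;
    }
}
```

Tracking the largest end offset, rather than comparing only neighbouring occurrences, keeps a short occurrence nested inside a longer one from splitting a chunk early; whether the library makes the same choice is not stated in this page.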
public static void removeOverlaps(java.util.Collection<TermOccurrence> referenceSet, java.util.Collection<TermOccurrence> occurrenceSet)
Removes from an occurrence set all occurrences that overlap at least one occurrence in a reference occurrence set.
Parameters:
referenceSet - the reference set, not modified by this method
occurrenceSet - the occurrence set to analyze, modified by this method
public static boolean hasOverlappingOffsets(TermOccurrence theOcc, java.util.Collection<TermOccurrence> theOccCollection)
True if an occurrence set contains any element overlapping with the param occurrence.
Parameters:
theOcc
theOccCollection
public static boolean areOffsetsOverlapping(TermOccurrence a, TermOccurrence b)
True if two TermOccurrence offsets overlap strictly. Sharing exactly one offset (e.g. a.end == b.begin) is not considered an overlap.
Parameters:
a
b
public static boolean areOverlapping(TermOccurrence a, TermOccurrence b)
Returns true if two occurrences are in the same document and their offsets overlap.
Parameters:
a
b
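The overlap predicates described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's code: Occ is a hypothetical stand-in for TermOccurrence that carries a document identifier and half-open [begin, end) offsets, and removeOverlaps here compares offsets only, which may differ from what the real method does across documents.

```java
import java.util.Collection;

final class OverlapSketch {
    /** Hypothetical stand-in for TermOccurrence: document id plus [begin, end) offsets. */
    static final class Occ {
        final String doc;
        final int begin;
        final int end;

        Occ(String doc, int begin, int end) {
            this.doc = doc;
            this.begin = begin;
            this.end = end;
        }
    }

    /** Strict overlap: false when the spans merely touch (a.end == b.begin). */
    static boolean areOffsetsOverlapping(Occ a, Occ b) {
        return a.begin < b.end && b.begin < a.end;
    }

    /** Same document and strictly overlapping offsets. */
    static boolean areOverlapping(Occ a, Occ b) {
        return a.doc.equals(b.doc) && areOffsetsOverlapping(a, b);
    }

    /** True if any element of the collection strictly overlaps the given occurrence. */
    static boolean hasOverlappingOffsets(Occ occ, Collection<Occ> others) {
        for (Occ other : others) {
            if (areOffsetsOverlapping(occ, other)) {
                return true;
            }
        }
        return false;
    }

    /** Removes from occurrenceSet every element overlapping something in referenceSet (offsets only). */
    static void removeOverlaps(Collection<Occ> referenceSet, Collection<Occ> occurrenceSet) {
        // referenceSet is only read; occurrenceSet must be a modifiable collection
        occurrenceSet.removeIf(o -> hasOverlappingOffsets(o, referenceSet));
    }
}
```

In "offshore wind energy", "offshore" at [0, 8) and "wind energy" at [9, 20) do not overlap under this test, while "offshore wind" at [0, 13) overlaps both.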