org.clulab.wm.eidos.extraction
Annotate the processors Sentence in place with the matches found in the gazetteers. The annotations are placed in the entities field, using BIO notation (i.e., B-X, I-X, O).
Sentence to be annotated
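The BIO scheme described above can be illustrated with a minimal, self-contained sketch. It does not use the processors LexiconNER; the object name BioSketch, the Gazetteer type, and the labels WEATHER and LOCATION are all hypothetical, and the case-sensitive, longest-match-first strategy is an assumption made only for illustration.

```scala
object BioSketch {
  // Hypothetical gazetteer shape: label -> entries, each entry a token sequence.
  type Gazetteer = Map[String, Seq[Seq[String]]]

  /** Returns one BIO tag per token, using exact (case-sensitive) token matching. */
  def bioTags(tokens: Seq[String], gazetteer: Gazetteer): List[String] = {
    val tags = Array.fill(tokens.length)("O")

    // Flatten to (label, entry) pairs and try longer entries first (assumption).
    val entries = for {
      (label, es) <- gazetteer.toSeq
      e <- es if e.nonEmpty
    } yield (label, e)
    val sorted = entries.sortBy { case (_, e) => -e.length }

    var i = 0
    while (i < tokens.length) {
      sorted.find { case (_, e) => tokens.startsWith(e, i) } match {
        case Some((label, e)) =>
          tags(i) = s"B-$label"                       // first token of the match
          for (j <- i + 1 until i + e.length)
            tags(j) = s"I-$label"                     // remaining tokens of the match
          i += e.length
        case None =>
          i += 1                                      // token stays "O"
      }
    }
    tags.toList
  }
}
```

With the tokens heavy, rainfall, hit, South, Sudan and a gazetteer holding "rainfall" under WEATHER and "South Sudan" under LOCATION, this sketch yields O, B-WEATHER, O, B-LOCATION, I-LOCATION.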
Find mentions corresponding to gazetteers (using string matching). The label of the mention will match the name of the gazetteer file which matches it. The matching is done using the LexiconNER class in Processors.
Processors Document, already annotated. The named entities field does not need to be populated.
the mentions corresponding to the gazetteer items
The GazetteerEntityFinder finds mentions of gazetteer elements. The matching uses a processors LexiconNER, so the lexicons are provided as paths to the csv files (stored in resources) and matching is based on exact string match. The found mentions are odin TextBoundMentions, where the Mention label is the same as the base name of the gazetteer that matched it.
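As a rough illustration of how a label can follow from the gazetteer's base name and how BIO tags map to token spans, here is a small standalone sketch. It is not the GazetteerEntityFinder implementation: Span is a hypothetical stand-in for an odin TextBoundMention, the path lexicons/Quantifier.csv mentioned below is invented, and dropping a stray I- tag is an assumption of this sketch.

```scala
import scala.collection.mutable.ListBuffer

object GazetteerSpanSketch {
  // Hypothetical stand-in for a mention: label plus token interval, end exclusive.
  final case class Span(label: String, start: Int, end: Int)

  /** Base name of a path: directory and extension removed. */
  def baseName(path: String): String = {
    val file = path.substring(path.lastIndexOf('/') + 1)
    val dot = file.lastIndexOf('.')
    if (dot > 0) file.substring(0, dot) else file
  }

  /** Collects one Span per B-X tag and its following I-X tags. */
  def spans(tags: Seq[String]): List[Span] = {
    val out = ListBuffer.empty[Span]
    var open: Option[(String, Int)] = None            // (label, start) of the span in progress

    def close(end: Int): Unit = {
      open.foreach { case (label, start) => out += Span(label, start, end) }
      open = None
    }

    for ((tag, i) <- tags.zipWithIndex) {
      if (tag.startsWith("B-")) {
        close(i)                                      // a new B- ends any open span
        open = Some((tag.drop(2), i))
      } else if (tag.startsWith("I-") && open.exists(_._1 == tag.drop(2))) {
        ()                                            // continues the open span
      } else {
        close(i)                                      // "O" or a stray I- tag ends the span
      }
    }
    close(tags.length)
    out.toList
  }
}
```

Under these assumptions, baseName("lexicons/Quantifier.csv") gives Quantifier, which would serve as the label X in the B-X and I-X tags.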