Class PhraseMatcher


  • public class PhraseMatcher
    extends java.lang.Object
    Detects query phrases using an automaton. This class is thread safe.
    Author:
    bratseth
    • Nested Class Summary

      Nested Classes 
      Modifier and Type Class Description
      static class  PhraseMatcher.Phrase
      Points to a collection of word items (one or more) which matches a complete listing in an automaton
    • Constructor Summary

      Constructors 
      Constructor Description
      PhraseMatcher​(com.yahoo.fsa.FSA phraseAutomatonFSA, boolean ignorePluralForm)
      Creates a phrase matcher
      PhraseMatcher​(java.lang.String phraseAutomatonFile)
      Creates a phrase matcher.
      PhraseMatcher​(java.lang.String phraseAutomatonFile, boolean ignorePluralForm)
      Creates a phrase matcher
    • Method Summary

      Modifier and Type Method Description
      static PhraseMatcher getNullMatcher()
      Returns a phrase matcher which (quickly) never matches anything
      boolean isEmpty()  
      java.util.List<PhraseMatcher.Phrase> matchPhrases​(Item queryItem)
      Finds all phrases (word sequences of length 1 or more) within the same index, excluding negative items of a notitem, which constitute a complete entry in the automaton of this matcher
      void setIgnorePluralForm​(boolean ignorePluralForm)
      Sets whether we should ignore plural/singular form when matching
      void setMatchAll​(boolean matchAll)
      Sets whether to return the longest matching phrase when there are overlapping matches (default), or all matching phrases
      void setMatchPhraseItems​(boolean matchPhraseItems)
      Sets whether to match words contained in phrase items as well.
      void setMatchSingleItems​(boolean matchSingleItems)
      Sets whether single items should be matched and returned as phrase matches.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • PhraseMatcher

        public PhraseMatcher​(java.lang.String phraseAutomatonFile)
        Creates a phrase matcher. This matcher does not ignore plural/singular form differences when matching.
        Parameters:
        phraseAutomatonFile - the file containing phrases to match
        Throws:
        java.lang.IllegalArgumentException - if the file is not found
      • PhraseMatcher

        public PhraseMatcher​(java.lang.String phraseAutomatonFile,
                             boolean ignorePluralForm)
        Creates a phrase matcher
        Parameters:
        phraseAutomatonFile - the file containing phrases to match
        ignorePluralForm - whether to ignore differences between plural and singular forms when matching
        Throws:
        java.lang.IllegalArgumentException - if the file is not found
      • PhraseMatcher

        public PhraseMatcher​(com.yahoo.fsa.FSA phraseAutomatonFSA,
                             boolean ignorePluralForm)
        Creates a phrase matcher
        Parameters:
        phraseAutomatonFSA - the fsa containing phrases to match
        ignorePluralForm - whether to ignore differences between plural and singular forms when matching
        Throws:
        java.lang.IllegalArgumentException - if FSA is null
    • Method Detail

      • isEmpty

        public boolean isEmpty()
      • setMatchPhraseItems

        public void setMatchPhraseItems​(boolean matchPhraseItems)
        Sets whether to match words contained in phrase items as well. The default is false: words contained in phrase items are not matched.
      • setMatchSingleItems

        public void setMatchSingleItems​(boolean matchSingleItems)
        Sets whether single items should be matched and returned as phrase matches. Default is false.
      • setIgnorePluralForm

        public void setIgnorePluralForm​(boolean ignorePluralForm)
        Sets whether we should ignore plural/singular form when matching
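
        This page does not say how plural and singular forms are reconciled. The sketch below shows one naive possibility, purely as an illustration: it drops a single trailing "s" before comparing. The class PluralSketch and its methods are hypothetical and are not part of PhraseMatcher.

```java
// Hypothetical illustration only; the real normalization used by PhraseMatcher is not documented here.
class PluralSketch {

    /** Naive English-only normalization: drops one trailing "s" from words longer than one character. */
    static String singular(String word) {
        return word.length() > 1 && word.endsWith("s")
                ? word.substring(0, word.length() - 1)
                : word;
    }

    /** Compares two words, optionally treating singular and plural forms as equal. */
    static boolean sameWord(String a, String b, boolean ignorePluralForm) {
        return ignorePluralForm ? singular(a).equals(singular(b)) : a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println(sameWord("hotels", "hotel", true));
        System.out.println(sameWord("hotels", "hotel", false));
    }
}
```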
      • setMatchAll

        public void setMatchAll​(boolean matchAll)
        Sets whether to return the longest matching phrase when there are overlapping matches (default), or all matching phrases
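
        To make the difference between the two modes concrete, here is a minimal sketch of how overlapping matches can be resolved. It assumes a plain set of space-joined phrases in place of the automaton and a greedy left-to-right scan for the longest match; the class OverlapSketch and both methods are hypothetical, and the actual PhraseMatcher may resolve overlaps differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in: a Set of space-joined phrases replaces the phrase automaton.
class OverlapSketch {

    /** Returns every contiguous word span that is a complete entry in the phrase set. */
    static List<String> matchAll(List<String> words, Set<String> phrases) {
        List<String> found = new ArrayList<>();
        for (int start = 0; start < words.size(); start++) {
            for (int end = start + 1; end <= words.size(); end++) {
                String candidate = String.join(" ", words.subList(start, end));
                if (phrases.contains(candidate)) found.add(candidate);
            }
        }
        return found;
    }

    /** Greedy scan: keeps the longest entry starting at each position, then continues after it. */
    static List<String> matchLongest(List<String> words, Set<String> phrases) {
        List<String> found = new ArrayList<>();
        int start = 0;
        while (start < words.size()) {
            int bestEnd = -1;
            for (int end = start + 1; end <= words.size(); end++) {
                String candidate = String.join(" ", words.subList(start, end));
                if (phrases.contains(candidate)) bestEnd = end; // later hits are longer
            }
            if (bestEnd < 0) {
                start++; // nothing starts here, move on one word
            } else {
                found.add(String.join(" ", words.subList(start, bestEnd)));
                start = bestEnd; // skip the words consumed by the match
            }
        }
        return found;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("new", "york", "city", "hotels");
        Set<String> phrases = new HashSet<>(Arrays.asList("new york", "new york city", "york city"));
        System.out.println(matchAll(words, phrases));     // all overlapping entries
        System.out.println(matchLongest(words, phrases)); // only the longest one
    }
}
```

        With the sample input, the first call reports "new york", "new york city" and "york city", while the second reports only "new york city".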
      • matchPhrases

        public java.util.List<PhraseMatcher.Phrase> matchPhrases​(Item queryItem)
        Finds all phrases (word sequences of length 1 or more) within the same index, excluding negative items of a notitem, which constitute a complete entry in the automaton of this matcher
        Parameters:
        queryItem - the root query item in which to match phrases
        Returns:
        the matched phrases, or null if there were no matches
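
        Because matchPhrases returns null rather than an empty list when nothing matches, callers need a null check before iterating. The sketch below shows one way to centralize that check; the helper orEmpty and the class NullSafeSketch are hypothetical names, and a plain null variable stands in for the result of a call that found no phrases.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical helper; not part of PhraseMatcher.
class NullSafeSketch {

    /** Maps a null result to an empty list so callers can iterate unconditionally. */
    static <T> List<T> orEmpty(List<T> result) {
        return result == null ? Collections.<T>emptyList() : result;
    }

    public static void main(String[] args) {
        List<String> none = null; // stands in for a matchPhrases result with no matches
        for (String phrase : orEmpty(none)) {
            System.out.println(phrase); // never reached for a null result
        }
        System.out.println(orEmpty(none).size());
    }
}
```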
      • getNullMatcher

        public static PhraseMatcher getNullMatcher()
        Returns a phrase matcher which (quickly) never matches anything
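
        A matcher that never matches is an instance of the null object pattern: callers that have no automaton configured can still hold a matcher and skip the "is it configured" branch. The sketch below illustrates the pattern in isolation. The WordMatcher interface and NullMatcherSketch class are hypothetical, and this stand-in returns an empty list, whereas the real matchPhrases is documented to return null when there are no matches.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical interface for illustration only; not the real PhraseMatcher API.
interface WordMatcher {
    List<String> match(List<String> words);
}

class NullMatcherSketch {

    // One shared instance that returns immediately without inspecting its input.
    private static final WordMatcher NULL_MATCHER = words -> Collections.emptyList();

    /** Returns the shared matcher that never matches anything. */
    static WordMatcher getNullMatcher() {
        return NULL_MATCHER;
    }

    public static void main(String[] args) {
        WordMatcher matcher = getNullMatcher();
        System.out.println(matcher.match(Arrays.asList("new", "york")).isEmpty());
    }
}
```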