Class RewriterFeatures

java.lang.Object
com.yahoo.search.query.rewrite.RewriterFeatures

public class RewriterFeatures extends Object
Contains commonly used rewriter features
Author:
Karen Sze Wing Lee
  • Constructor Details

    • RewriterFeatures

      public RewriterFeatures()
  • Method Details

    • addUnitToOriginalQuery

      public static Query addUnitToOriginalQuery(Query query, String boostingQuery, boolean keepOriginalQuery) throws RuntimeException

      Add proximity boosting to original query by modifying the query tree directly

      e.g. original Query Tree: (AND aa bb)
      boostingQuery: aa bb
      if keepOriginalQuery: true
      new Query Tree: (OR (AND aa bb) "aa bb")
      if keepOriginalQuery: false
      new Query Tree: "aa bb"

      original Query Tree: (OR (AND aa bb) (AND cc dd))
      boostingQuery: cc dd
      if keepOriginalQuery: true
      new Query Tree: (OR (AND aa bb) (AND cc dd) "cc dd")
      if keepOriginalQuery: false
      new Query Tree: (OR (AND aa bb) "cc dd")
      Parameters:
      query - Query object from searcher
      boostingQuery - query to be boosted
      keepOriginalQuery - whether to keep original unboosted query as equiv
      Returns:
      The modified Query object, or the original Query object on error
      Throws:
      RuntimeException
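
      The tree transformation in the examples above can be sketched without the search library. This is a minimal illustration, not the actual implementation: it assumes the query is a list of OR-ed groups of AND-ed terms and renders the result as text. The class ProximityBoostSketch and its boost method are hypothetical names, and placing the phrase directly next to the matched group is an assumption that happens to agree with the documented examples.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the query tree: OR-ed groups of AND-ed terms, rendered as text.
final class ProximityBoostSketch {

    /** Returns the rendered tree after boosting the group whose terms equal boostingQuery. */
    static String boost(List<List<String>> andGroups, String boostingQuery, boolean keepOriginalQuery) {
        List<String> target = Arrays.asList(boostingQuery.trim().split("\\s+"));
        String phrase = "\"" + String.join(" ", target) + "\"";
        List<String> items = new ArrayList<>();
        boolean boosted = false;
        for (List<String> group : andGroups) {
            String rendered = "(AND " + String.join(" ", group) + ")";
            if (!boosted && group.equals(target)) {
                // Keep the unboosted AND group only when asked to, then add the phrase.
                if (keepOriginalQuery) {
                    items.add(rendered);
                }
                items.add(phrase);
                boosted = true;
            } else {
                items.add(rendered);
            }
        }
        // A single remaining item needs no enclosing OR; an unmatched query is rendered unchanged.
        return items.size() == 1 ? items.get(0) : "(OR " + String.join(" ", items) + ")";
    }
}
```

      Calling boost with the groups (aa bb) and (cc dd), the boosting query "cc dd" and keepOriginalQuery set to false yields the same rendering as the last example above.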
    • addRewritesAsEquiv

      public static Query addRewritesAsEquiv(Query query, String matchingStr, String rewrites, boolean addUnitToRewrites, int maxNumRewrites) throws RuntimeException

      Add query expansion to the query tree

      e.g. origQuery: aa bb
      matchingStr: aa bb
      rewrites: cc dd, ee ff
      if addUnitToRewrites: false
      new query tree: (OR (AND aa bb) (AND cc dd) (AND ee ff))
      if addUnitToRewrites: true
      new query tree: (OR (AND aa bb) "cc dd" "ee ff")
      Parameters:
      query - Query object from searcher
      matchingStr - string used to retrieve the rewrite
      rewrites - The rewrite string retrieved from dictionary
      addUnitToRewrites - Whether to add unit to rewrites
      maxNumRewrites - Max number of rewrites to be added, 0 if no limit
      Returns:
      The modified Query object, or the original Query object on error
      Throws:
      RuntimeException
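
      The effect of addUnitToRewrites and maxNumRewrites can be illustrated with a small text-only sketch. It is not the library code: the class RewriteExpansionSketch and its expand method are hypothetical, and it assumes the rewrites arrive as one tab-separated string, as in the dictionary listings elsewhere on this page.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical text-only model of adding rewrites next to the original query.
final class RewriteExpansionSketch {

    /** Renders the expanded tree; maxNumRewrites of 0 means no limit. */
    static String expand(String origQuery, String rewrites, boolean addUnitToRewrites, int maxNumRewrites) {
        List<String> items = new ArrayList<>();
        items.add(and(origQuery));
        int added = 0;
        for (String rewrite : rewrites.split("\t")) { // assumption: tab-separated rewrites
            if (maxNumRewrites > 0 && added >= maxNumRewrites) {
                break;
            }
            String text = rewrite.trim();
            if (text.isEmpty()) {
                continue;
            }
            // A "unit" is rendered as a quoted phrase, otherwise as an AND of the words.
            items.add(addUnitToRewrites ? "\"" + text + "\"" : and(text));
            added++;
        }
        return items.size() == 1 ? items.get(0) : "(OR " + String.join(" ", items) + ")";
    }

    private static String and(String text) {
        String[] words = text.trim().split("\\s+");
        return words.length == 1 ? words[0] : "(AND " + String.join(" ", words) + ")";
    }
}
```

      With the original query "aa bb" and the rewrites "cc dd" and "ee ff", the sketch renders the two trees shown in the example above, depending on addUnitToRewrites.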
    • getNonOverlappingFullPhraseMatches

      public static Set<PhraseMatcher.Phrase> getNonOverlappingFullPhraseMatches(PhraseMatcher phraseMatcher, Query query) throws RuntimeException

      Retrieve the longest non-overlapping full phrase substrings in the query, from left to right, based on the FSA dictionary

      e.g. query: ((modern AND new AND york AND city AND travel) OR travel) AND ((sunny AND travel AND agency) OR nyc)
      dictionary:
      mny\tmodern new york
      mo\tmodern
      modern\tn/a
      modern new york\tn/a
      new york\tn/a
      new york city\tn/a
      new york city travel\tn/a
      new york company\tn/a
      ny\tnew york
      nyc\tnew york city\tnew york company
      nyct\tnew york city travel
      ta\ttravel agency
      travel agency\tn/a
      return: nyc
      Parameters:
      phraseMatcher - PhraseMatcher object loaded with FSA dict
      query - Query object from the searcher
      Returns:
      Matching phrases
      Throws:
      RuntimeException
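
      The example suggests that a full phrase match requires an entire AND group (or a single term) to be a dictionary key, which is why only nyc is returned. The sketch below encodes that reading. It is an interpretation of the example, not the library implementation: the class FullPhraseMatchSketch is hypothetical, and a plain Set of keys stands in for the FSA dictionary.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model: each inner list is one AND group (or a single term) of the query.
final class FullPhraseMatchSketch {

    /** Returns the groups whose complete text is a dictionary key, in query order. */
    static Set<String> fullMatches(List<List<String>> andGroups, Set<String> dictionaryKeys) {
        Set<String> result = new LinkedHashSet<>();
        for (List<String> group : andGroups) {
            String phrase = String.join(" ", group);
            if (dictionaryKeys.contains(phrase)) {
                result.add(phrase);
            }
        }
        return result;
    }
}
```

      For the groups (modern new york city travel), (travel), (sunny travel agency) and (nyc), only nyc is a key as a whole, so it is the only match.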
    • getNonOverlappingPartialPhraseMatches

      public static Set<PhraseMatcher.Phrase> getNonOverlappingPartialPhraseMatches(PhraseMatcher phraseMatcher, Query query) throws RuntimeException

      Retrieve the longest non-overlapping partial phrase substrings in the query, from left to right, based on the FSA dictionary

      e.g. query: ((modern AND new AND york AND city AND travel) OR travel) AND ((sunny AND travel AND agency) OR nyc)
      dictionary:
      mny\tmodern new york
      mo\tmodern
      modern\tn/a
      modern new york\tn/a
      new york\tn/a
      new york city\tn/a
      new york city travel\tn/a
      new york company\tn/a
      ny\tnew york
      nyc\tnew york city\tnew york company
      nyct\tnew york city travel
      ta\ttravel agency
      travel agency\tn/a
      return:
      modern
      new york city travel
      travel agency
      nyc
      Parameters:
      phraseMatcher - PhraseMatcher object loaded with FSA dict
      query - Query object from the searcher
      Returns:
      Matching phrases
      Throws:
      RuntimeException
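
      Before overlaps are resolved, every contiguous run of terms inside an AND group that is a dictionary key is a candidate partial match. The following sketch enumerates those candidates for one group. It is illustrative only: the class CandidateSpanSketch is hypothetical, and a plain Set of keys replaces the FSA lookup.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical enumeration of candidate matches within one AND group.
final class CandidateSpanSketch {

    /** Returns every contiguous run of terms that is a dictionary key, ordered by start position. */
    static List<String> candidates(List<String> terms, Set<String> dictionaryKeys) {
        List<String> found = new ArrayList<>();
        for (int start = 0; start < terms.size(); start++) {
            for (int end = start + 1; end <= terms.size(); end++) {
                String span = String.join(" ", terms.subList(start, end));
                if (dictionaryKeys.contains(span)) {
                    found.add(span);
                }
            }
        }
        return found;
    }
}
```

      For the group (modern new york city travel) this produces modern, modern new york, new york, new york city and new york city travel. Removing overlaps in favor of the longest candidates then leaves modern and new york city travel, as in the return list above.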
    • getNonOverlappingMatchesInAndItem

      public static List<PhraseMatcher.Phrase> getNonOverlappingMatchesInAndItem(List<PhraseMatcher.Phrase> allMatches, Query query) throws RuntimeException

      Retrieve the longest non-overlapping substrings in an AndItem, from left to right, based on the FSA dictionary

      e.g. subtree: (modern AND new AND york AND city AND travel)
      dictionary:
      mny\tmodern new york
      mo\tmodern
      modern\tn/a
      modern new york\tn/a
      new york\tn/a
      new york city\tn/a
      new york city travel\tn/a
      new york company\tn/a
      ny\tnew york
      nyc\tnew york city\tnew york company
      nyct\tnew york city travel
      allMatches:
      modern
      modern new york
      new york
      new york city
      new york city travel
      return:
      modern
      new york city travel
      Parameters:
      allMatches - All matches within the subtree
      query - Query object from the searcher
      Returns:
      Matching phrases
      Throws:
      RuntimeException
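
      One way to obtain the return list in the example is to prefer longer matches, break ties by the leftmost start, and drop anything that overlaps an already selected match. The sketch below follows that rule. The ordering is inferred from the example rather than taken from the implementation, and the class NonOverlapSketch with its Span type is hypothetical (positions are word offsets within the AND group).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical selection of the longest non-overlapping matches.
final class NonOverlapSketch {

    /** A candidate match: word offset, word count and matched text. */
    static final class Span {
        final int start;
        final int length;
        final String text;

        Span(int start, int length, String text) {
            this.start = start;
            this.length = length;
            this.text = text;
        }

        int end() {
            return start + length; // exclusive end offset
        }
    }

    /** Returns the texts of the selected matches, ordered left to right. */
    static List<String> select(List<Span> allMatches) {
        List<Span> byPriority = new ArrayList<>(allMatches);
        // Longest first; among equal lengths, the leftmost first.
        byPriority.sort(Comparator.comparingInt((Span s) -> -s.length).thenComparingInt(s -> s.start));
        List<Span> kept = new ArrayList<>();
        for (Span candidate : byPriority) {
            boolean overlaps = false;
            for (Span k : kept) {
                if (candidate.start < k.end() && k.start < candidate.end()) {
                    overlaps = true;
                    break;
                }
            }
            if (!overlaps) {
                kept.add(candidate);
            }
        }
        kept.sort(Comparator.comparingInt((Span s) -> s.start));
        List<String> texts = new ArrayList<>();
        for (Span s : kept) {
            texts.add(s.text);
        }
        return texts;
    }
}
```

      Given the five matches listed under allMatches, new york city travel is selected first because it is the longest; modern is the only remaining match that does not overlap it.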
    • addExpansions

      public static Query addExpansions(Query query, Set<PhraseMatcher.Phrase> matches, String expandIndex, int maxNumRewrites, boolean removeOriginal, boolean addUnitToRewrites) throws RuntimeException

      Add Expansions to the matching phrases

      e.g. Query: nyc travel agency
      matching phrases:
      nyc\tnew york city\tnew york company
      travel agency\tn/a
      if expandIndex is not null and removeOriginal is true
      New Query: ((new york city) OR ([expandIndex]:new york city) OR (new york company) OR ([expandIndex]:new york company)) AND ((travel agency) OR ([expandIndex]:travel agency))
      if expandIndex is null and removeOriginal is true
      New Query: ((new york city) OR (new york company)) AND travel agency
      if expandIndex is null and removeOriginal is false
      New Query: (nyc OR (new york city) OR (new york company)) AND travel agency
      Parameters:
      query - Query object from searcher
      matches - Set of longest non-overlapping matches
      expandIndex - Name of expansion index or null if default index
      maxNumRewrites - Max number of rewrites to be added, 0 if no limit
      removeOriginal - Whether to remove the original matching phrase
      addUnitToRewrites - Whether to add rewrite as phrase
      Throws:
      RuntimeException
    • convertMatchToString

      public static String convertMatchToString(PhraseMatcher.Phrase phrase)
      Convert Match to String
      Parameters:
      phrase - Match from PhraseMatcher
      Returns:
      String format of the phrase