Class NGramSearcher

java.lang.Object
com.yahoo.component.AbstractComponent
com.yahoo.component.chain.ChainedComponent
com.yahoo.processing.Processor
com.yahoo.search.Searcher
com.yahoo.search.querytransform.NGramSearcher
All Implemented Interfaces:
com.yahoo.component.Component, com.yahoo.component.Deconstructable, Comparable<com.yahoo.component.Component>

public class NGramSearcher extends Searcher
Handles n-gram indexes by splitting query terms that target them into grams, and by combining summary field values from such fields back into the original text.

This searcher declares that it must be placed after the Juniper searchers, because it assumes that Juniper token separators (which are returned on bolding) have not yet been replaced by highlight tags when it runs. Note that "after" means "before" from the point of view of the result.

Author:
bratseth
  • Constructor Details

    • NGramSearcher

      public NGramSearcher(com.yahoo.language.Linguistics linguistics)
  • Method Details

    • search

      public Result search(Query query, Execution execution)
      Description copied from class: Searcher
      Override this to implement your searcher.

      Searcher implementation subclasses will, depending on their type of logic, do one of the following:

      • Query processors: Access the query, then call execution.search and return the result
      • Result processors: Call execution.search to get the result, access it and return
      • Sources (which produce results): Create a result, add the desired hits and return it.
      • Federators (which forward the search to multiple subchains): Call search on the desired subchains in parallel and get the results. Combine the results into one and return it.
      • Workflows: Call execution.search as many times as desired, using different queries. Eventually return a result.

      Hits come in two kinds: concrete hits are actual content of the kind requested by the user, while meta hits provide information about the collection of hits, the query, the service and so on.

      The query specifies a window into a larger result list, defined by hits and offset, that the searcher must return. Searchers that return a list of hits at the top level of the result must return at least hits number of hits (or, if that is impossible, all that are available), starting at the given offset. In addition, searchers are allowed to return any number of meta hits (although this number is expected to be low). For hits contained in nested hit groups, the concept of a window defined by hits and offset is not well defined and does not apply.
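
      The window rule above can be illustrated with a minimal sketch on a plain list. This is not the Vespa API: the class WindowSketch and the method window are hypothetical names, the list of strings stands in for concrete top-level hits, and meta hits are ignored.

```java
import java.util.List;

public class WindowSketch {

    // Returns the part of allHits selected by offset and hits.
    // If fewer hits are available than requested, all available hits from offset are returned.
    public static <T> List<T> window(List<T> allHits, int offset, int hits) {
        int from = Math.min(Math.max(offset, 0), allHits.size());
        // Use long arithmetic so a very large hits value cannot overflow
        int to = (int) Math.min((long) from + Math.max(hits, 0), allHits.size());
        return allHits.subList(from, to);
    }

    public static void main(String[] args) {
        List<String> all = List.of("a", "b", "c", "d", "e");
        System.out.println(window(all, 1, 3));  // prints [b, c, d]
        System.out.println(window(all, 3, 10)); // prints [d, e]
    }
}
```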

      Error handling in searchers:

      • Unexpected events: Throw any RuntimeException. The query will fail with the exception message, and the error will be logged.
      • Expected events: Create (new Result(Query, ErrorMessage)) or add (result.setErrorIfNoOtherErrors(ErrorMessage)) an error message to the Result.
      • Recoverable user errors: Add a FeedbackHit explaining the condition and how to correct it.
      Specified by:
      search in class Searcher
      Parameters:
      query - the query
      Returns:
      the result of making this query
    • fill

      public void fill(Result result, String summaryClass, Execution execution)
      Description copied from class: Searcher
      Fill hit properties with data using the given summary class. Calling this on already filled results has no cost.

      This needs to be overridden by federating searchers, which must contact the search sources again by propagating the fill call down through the search chain, and by source searchers that talk to fill-capable backends, which must request the data to be filled. Other searchers do not need to override this.

      Overrides:
      fill in class Searcher
      Parameters:
      result - the result to fill
      summaryClass - the name of the collection of fields to fetch the values of, or null to use the default
    • splitToGrams

      protected Item splitToGrams(Item term, String text, int gramSize, Query query)
      Splits the given item into n-grams and adds them as a CompositeItem containing WordItems searching the index of the input term. If the result is a single gram, that single WordItem is returned rather than the AndItem.
      Parameters:
      term - the term to split, must be an item which implements the IndexedItem and BlockItem "mixins"
      text - the text of the item, just stringValue() if the item is a TermItem
      gramSize - the gram size to split to
      query - the query in which this rewriting is done
      Returns:
      the root of the query subtree produced by this, containing the split items
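
      The splitting described above can be sketched in plain Java without the Vespa types. This is an illustration under stated assumptions, not the actual implementation: GramSketch, grams and render are hypothetical names, grams are taken as overlapping character windows, text shorter than the gram size is assumed to yield the text itself as one gram, and the string rendering only mimics the shape of the query subtree (a single word, or an AND over words in the term's index).

```java
import java.util.ArrayList;
import java.util.List;

public class GramSketch {

    // Splits text into overlapping grams of gramSize characters.
    // Assumption: text no longer than gramSize becomes a single gram.
    public static List<String> grams(String text, int gramSize) {
        if (gramSize < 1) throw new IllegalArgumentException("gramSize must be positive");
        List<String> result = new ArrayList<>();
        if (text.isEmpty()) return result;
        if (text.length() <= gramSize) {
            result.add(text);
            return result;
        }
        for (int i = 0; i + gramSize <= text.length(); i++) {
            result.add(text.substring(i, i + gramSize));
        }
        return result;
    }

    // Renders the grams as a string: a single gram is returned alone,
    // several grams are wrapped in an AND over the same index.
    public static String render(String index, String text, int gramSize) {
        List<String> grams = grams(text, gramSize);
        if (grams.size() == 1) return index + ":" + grams.get(0);
        StringBuilder sb = new StringBuilder("AND(");
        for (int i = 0; i < grams.size(); i++) {
            if (i > 0) sb.append(' ');
            sb.append(index).append(':').append(grams.get(i));
        }
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        System.out.println(render("title", "engine", 3)); // prints AND(title:eng title:ngi title:gin title:ine)
        System.out.println(render("title", "go", 3));     // prints title:go
    }
}
```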
    • getGramSplitter

      protected final com.yahoo.language.process.GramSplitter getGramSplitter()
      Returns the (thread-safe) object to use to split the query text into grams.
    • createGramRoot

      protected CompositeItem createGramRoot(HasIndexItem term, Query query)
      Creates the root of the query subtree which will contain the grams to match, called by splitToGrams(com.yahoo.prelude.query.Item, java.lang.String, int, com.yahoo.search.Query). This hook is provided to make it easy to create a subclass which matches grams using a different composite item, e.g. an OrItem.

      This default implementation returns createGramRoot(query).

      Parameters:
      term - the term item this gram root is replacing in the query tree, typically used to access the index name of the term when that is required by the new gram root (such as in PhraseItem)
      query - the input query, to make it possible to return a different composite item type depending on the query content
      Returns:
      the composite item to add the gram items to in splitToGrams(com.yahoo.prelude.query.Item, java.lang.String, int, com.yahoo.search.Query)
    • createGramRoot

      protected CompositeItem createGramRoot(Query query)
      Creates the root of the query subtree without access to the term being replaced.
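
      The createGramRoot hook follows the template method pattern: the splitting logic asks an overridable method for the root item and then adds the grams to it. The sketch below shows only that pattern. All types in it (Composite, BaseSplitter, OrSplitter) are hypothetical stand-ins defined for the example and are not the Vespa classes; a real subclass would extend NGramSearcher and return a CompositeItem such as an OrItem.

```java
import java.util.ArrayList;
import java.util.List;

public class GramRootHookSketch {

    // Hypothetical stand-in for a composite query item
    static class Composite {
        final String operator;
        final List<String> children = new ArrayList<>();
        Composite(String operator) { this.operator = operator; }
        @Override public String toString() { return operator + children; }
    }

    // Stand-in for the searcher: split() calls the overridable hook
    static class BaseSplitter {
        // Default hook: combine grams with AND
        protected Composite createGramRoot() { return new Composite("AND"); }

        Composite split(String text, int gramSize) {
            Composite root = createGramRoot();
            for (int i = 0; i + gramSize <= text.length(); i++) {
                root.children.add(text.substring(i, i + gramSize));
            }
            return root;
        }
    }

    // Subclass that only swaps the root, leaving the splitting untouched
    static class OrSplitter extends BaseSplitter {
        @Override protected Composite createGramRoot() { return new Composite("OR"); }
    }

    public static String render(boolean useOr, String text, int gramSize) {
        BaseSplitter splitter = useOr ? new OrSplitter() : new BaseSplitter();
        return splitter.split(text, gramSize).toString();
    }

    public static void main(String[] args) {
        System.out.println(render(false, "engine", 3)); // prints AND[eng, ngi, gin, ine]
        System.out.println(render(true, "engine", 3));  // prints OR[eng, ngi, gin, ine]
    }
}
```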