Class RAGSearcher

java.lang.Object
com.yahoo.component.AbstractComponent
com.yahoo.component.chain.ChainedComponent
com.yahoo.processing.Processor
All Implemented Interfaces:
com.yahoo.component.Component, com.yahoo.component.Deconstructable, Comparable<com.yahoo.component.Component>

public class RAGSearcher extends LLMSearcher
An LLM searcher that uses retrieval-augmented generation (RAG) to generate completions. Prompts are built from the search result, which serves as the context. By default, the context is a concatenation of the fields of the search result hits.
Author:
lesters
  • Constructor Details

    • RAGSearcher

      @Inject public RAGSearcher(LlmSearcherConfig config, com.yahoo.component.provider.ComponentRegistry<ai.vespa.llm.LanguageModel> languageModels)
  • Method Details

    • search

      public Result search(Query query, Execution execution)
      Description copied from class: Searcher
      Override this to implement your searcher.

      Searcher implementation subclasses will, depending on their type of logic, do one of the following:

      • Query processors: Access the query, then call execution.search and return the result
      • Result processors: Call execution.search to get the result, access it and return
      • Sources (which produce results): Create a result, add the desired hits and return it.
      • Federators (which forward the search to multiple subchains): Call search on the desired subchains in parallel and get the results. Combine the results into one and return it.
      • Workflows: Call execution.search as many times as desired, using different queries. Eventually return a result.
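
      The first three kinds in the list above can be illustrated with a minimal, self-contained sketch. It does not use the Vespa API: SketchSearcher, SketchExecution, SketchChain and the three searcher classes are hypothetical stand-ins, where a query is a plain string and a result is a list of hit strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-ins, NOT the Vespa API: a query is a string, a result is a list of hit strings.
interface SketchExecution {
    List<String> search(String query);
}

interface SketchSearcher {
    List<String> search(String query, SketchExecution execution);
}

// Query processor: access (here, normalize) the query, then pass it down the chain.
final class LowercasingSearcher implements SketchSearcher {
    @Override
    public List<String> search(String query, SketchExecution execution) {
        return execution.search(query.trim().toLowerCase(Locale.ROOT));
    }
}

// Result processor: get the result from the rest of the chain, then modify it.
final class BlankHitRemover implements SketchSearcher {
    @Override
    public List<String> search(String query, SketchExecution execution) {
        List<String> hits = new ArrayList<>(execution.search(query));
        hits.removeIf(hit -> hit.trim().isEmpty());
        return hits;
    }
}

// Source: produce a result without calling further down the chain.
final class EchoSource implements SketchSearcher {
    @Override
    public List<String> search(String query, SketchExecution execution) {
        return List.of("hit for " + query, " ");
    }
}

final class SketchChain {
    // Runs the searchers in order; each searcher's execution invokes the next searcher.
    static List<String> run(List<SketchSearcher> searchers, String query) {
        return invoke(searchers, 0, query);
    }

    private static List<String> invoke(List<SketchSearcher> searchers, int index, String query) {
        if (index == searchers.size()) return List.of(); // end of chain: empty result
        return searchers.get(index).search(query, q -> invoke(searchers, index + 1, q));
    }
}
```

      Running SketchChain.run with the three searchers in the order shown passes the normalized query down to the source, and the blank hit is removed on the way back up.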

      Hits come in two kinds: concrete hits are actual content of the kind requested by the user, while meta hits provide information about the collection of hits, the query, the service and so on.

      The query specifies, through hits and offset, a window into a larger result list that the searcher must return. Searchers which return a list of hits at the top level of the result must return at least hits number of hits (or, if that is impossible, all that are available), starting at the given offset. In addition, searchers may return any number of meta hits (although this number is expected to be low). For hits contained in nested hit groups, the concept of a window defined by hits and offset is not well defined and does not apply.
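
      The window rule for top-level concrete hits can be sketched as a small helper. This is an illustration only: HitWindowSketch and its window method are hypothetical names, and the sketch works on a plain list rather than on a Vespa Result.

```java
import java.util.ArrayList;
import java.util.List;

final class HitWindowSketch {
    // Returns up to `hits` entries starting at `offset`, or fewer if not enough are available.
    static <T> List<T> window(List<T> allHits, int offset, int hits) {
        if (offset < 0 || hits < 0) {
            throw new IllegalArgumentException("offset and hits must be non-negative");
        }
        int from = Math.min(offset, allHits.size());
        // Use long arithmetic so that a very large `hits` value cannot overflow.
        int to = (int) Math.min((long) from + hits, allHits.size());
        return new ArrayList<>(allHits.subList(from, to));
    }
}
```

      With five hits available, an offset of 1 and hits set to 2 selects the second and third hit. An offset beyond the end of the list yields an empty window.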

      Error handling in searchers:

      • Unexpected events: Throw any RuntimeException. The query will fail with the exception message, and the error will be logged.
      • Expected events: Create a Result with an error message (new Result(Query, ErrorMessage)) or add an error message to an existing Result (result.setErrorIfNoOtherErrors(ErrorMessage)).
      • Recoverable user errors: Add a FeedbackHit explaining the condition and how to correct it.
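
      The three strategies above can be sketched side by side. The types below are hypothetical stand-ins, not the Vespa classes: SketchResult only imitates the idea of a result carrying hits and at most one error, and a feedback hit is modeled as a plain string.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a result; NOT the Vespa Result class.
final class SketchResult {
    final List<String> hits = new ArrayList<>();
    String error; // null when no error has been set

    // Imitates "set the error only if none is set already".
    void setErrorIfNoOtherErrors(String message) {
        if (error == null) error = message;
    }
}

final class ErrorHandlingSketch {
    static SketchResult search(String query, boolean backendAvailable) {
        // Unexpected event (a programming error here): throw a RuntimeException.
        if (query == null) throw new IllegalStateException("query must not be null");

        SketchResult result = new SketchResult();
        if (!backendAvailable) {
            // Expected event: report it as an error message on the result.
            result.setErrorIfNoOtherErrors("backend unavailable");
            return result;
        }
        if (query.trim().isEmpty()) {
            // Recoverable user error: add a feedback hit explaining how to correct it.
            result.hits.add("feedback: the query is empty, supply at least one term");
            return result;
        }
        result.hits.add("hit for " + query);
        return result;
    }
}
```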
      Overrides:
      search in class LLMSearcher
      Parameters:
      query - the query
      Returns:
      the result of making this query
    • buildPrompt

      protected ai.vespa.llm.completion.Prompt buildPrompt(Query query, Result result)
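
      The class description says that the default context is a concatenation of the fields of the search result hits. A minimal sketch of that idea follows. It is not the actual implementation: RagPromptSketch is a hypothetical class, hits are modeled as maps from field name to value, and both the "name: value" line format and the {context} placeholder are assumptions made for this example.

```java
import java.util.List;
import java.util.Map;

final class RagPromptSketch {
    // Concatenates every field of every hit as "name: value" lines, with a blank line after each hit.
    static String buildContext(List<Map<String, String>> hits) {
        StringBuilder context = new StringBuilder();
        for (Map<String, String> hit : hits) {
            for (Map.Entry<String, String> field : hit.entrySet()) {
                context.append(field.getKey()).append(": ").append(field.getValue()).append('\n');
            }
            context.append('\n');
        }
        return context.toString();
    }

    // Substitutes the context into a template; the "{context}" placeholder is an assumption of this sketch.
    static String buildPrompt(String template, List<Map<String, String>> hits) {
        return template.replace("{context}", buildContext(hits));
    }
}
```

      A subclass that needs a different context format could override buildPrompt(Query, Result) and apply similar logic to the hits of the given Result.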