Interface InferenceService

All Superinterfaces:
AutoCloseable, Closeable

public interface InferenceService extends Closeable
  • Method Details

    • init

      default void init(Client client)
    • name

      String name()
    • parseRequestConfig

      void parseRequestConfig(String modelId, TaskType taskType, Map<String,Object> config, ActionListener<Model> parsedModelListener)
      Parse model configuration from the config map of a request and return the parsed Model. This requires that both the secrets and service settings be contained in the service_settings field. This function modifies the config map: fields are removed from the map as they are read.

      If the map contains an unrecognized configuration option, an ElasticsearchStatusException is thrown.

      Parameters:
      modelId - Model Id
      taskType - The model task type
      config - Configuration options including the secrets
      parsedModelListener - A listener which will handle the resulting model or failure
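      A minimal sketch of an implementation following the remove-as-read contract above. The MyServiceModel class and the api_key and url settings are hypothetical; only the shape of the pattern is shown.

        @Override
        public void parseRequestConfig(
            String modelId,
            TaskType taskType,
            Map<String, Object> config,
            ActionListener<Model> parsedModelListener
        ) {
            try {
                // In a request, secrets and service settings arrive together
                // under the service_settings field.
                @SuppressWarnings("unchecked")
                Map<String, Object> serviceSettings = (Map<String, Object>) config.remove("service_settings");
                String apiKey = (String) serviceSettings.remove("api_key"); // hypothetical setting
                String url = (String) serviceSettings.remove("url");        // hypothetical setting

                // Any key still present was not recognized: reject the request.
                if (serviceSettings.isEmpty() == false) {
                    throw new ElasticsearchStatusException(
                        "Unrecognized settings {}",
                        RestStatus.BAD_REQUEST,
                        serviceSettings.keySet()
                    );
                }
                parsedModelListener.onResponse(new MyServiceModel(modelId, taskType, url, apiKey));
            } catch (Exception e) {
                parsedModelListener.onFailure(e);
            }
        }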
    • parsePersistedConfigWithSecrets

      Model parsePersistedConfigWithSecrets(String modelId, TaskType taskType, Map<String,Object> config, Map<String,Object> secrets)
      Parse model configuration from a config map read from persisted storage and return the parsed Model. This requires that the secrets and service settings be in two separate maps. This function modifies the config map: fields are removed from the map as they are read. If the map contains unrecognized configuration options, no error is thrown.
      Parameters:
      modelId - Model Id
      taskType - The model task type
      config - Configuration options
      secrets - Sensitive configuration options (e.g. api key)
      Returns:
      The parsed Model
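      In contrast to request parsing, a sketch of the persisted variant: secrets arrive in their own map, and leftover keys are tolerated. MyServiceModel and the field names are again hypothetical.

        @Override
        public Model parsePersistedConfigWithSecrets(
            String modelId,
            TaskType taskType,
            Map<String, Object> config,
            Map<String, Object> secrets
        ) {
            @SuppressWarnings("unchecked")
            Map<String, Object> serviceSettings = (Map<String, Object>) config.remove("service_settings");
            String url = (String) serviceSettings.remove("url"); // hypothetical setting
            String apiKey = (String) secrets.remove("api_key");  // hypothetical secret
            // No leftover-key check: unrecognized persisted options are ignored,
            // so configurations written by a newer version can still be read.
            return new MyServiceModel(modelId, taskType, url, apiKey);
        }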
    • parsePersistedConfig

      Model parsePersistedConfig(String modelId, TaskType taskType, Map<String,Object> config)
      Parse model configuration from a config map read from persisted storage and return the parsed Model. This function modifies the config map: fields are removed from the map as they are read. If the map contains unrecognized configuration options, no error is thrown.
      Parameters:
      modelId - Model Id
      taskType - The model task type
      config - Configuration options
      Returns:
      The parsed Model
    • getConfiguration

      InferenceServiceConfiguration getConfiguration()
    • hideFromConfigurationApi

      default Boolean hideFromConfigurationApi()
      Whether this service should be hidden from the configuration API. Should be used for services that are not yet ready to be used.
    • supportedTaskTypes

      EnumSet<TaskType> supportedTaskTypes()
      The task types supported by the service
      Returns:
      The set of supported task types.
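      For example, a service supporting text embedding and reranking might advertise (the particular task types are an assumption for illustration):

        @Override
        public EnumSet<TaskType> supportedTaskTypes() {
            return EnumSet.of(TaskType.TEXT_EMBEDDING, TaskType.RERANK);
        }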
    • infer

      void infer(Model model, @Nullable String query, List<String> input, boolean stream, Map<String,Object> taskSettings, InputType inputType, TimeValue timeout, ActionListener<InferenceServiceResults> listener)
      Perform inference on the model.
      Parameters:
      model - The model
      query - Inference query, mainly for re-ranking
      input - Inference input
      stream - Stream inference results
      taskSettings - Settings in the request to override the model's defaults
      inputType - The input type, e.g. search or ingest
      timeout - The timeout for the request
      listener - Inference result listener
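      A sketch of a non-streaming call from the caller's side; the service and model instances are assumed to have been obtained elsewhere:

        service.infer(
            model,
            null,                            // no query: not a rerank task
            List.of("a sentence to embed"),  // inference input
            false,                           // do not stream results
            Map.of(),                        // no task setting overrides
            InputType.INGEST,                // input produced by an ingest flow
            TimeValue.timeValueSeconds(30),
            ActionListener.wrap(
                results -> System.out.println("results: " + results),
                e -> System.err.println("inference failed: " + e)
            )
        );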
    • chunkedInfer

      void chunkedInfer(Model model, @Nullable String query, List<String> input, Map<String,Object> taskSettings, InputType inputType, ChunkingOptions chunkingOptions, TimeValue timeout, ActionListener<List<ChunkedInferenceServiceResults>> listener)
      Chunk long text according to chunkingOptions or the model defaults if chunkingOptions contains unset values.
      Parameters:
      model - The model
      query - Inference query, mainly for re-ranking
      input - Inference input
      taskSettings - Settings in the request to override the model's defaults
      inputType - The input type, e.g. search or ingest
      chunkingOptions - The window and span options to apply
      timeout - The timeout for the request
      listener - Chunked Inference result listener
    • start

      void start(Model model, TimeValue timeout, ActionListener<Boolean> listener)
      Start or prepare the model for use.
      Parameters:
      model - The model
      timeout - Start timeout
      listener - The listener
    • stop

      default void stop(UnparsedModel unparsedModel, ActionListener<Boolean> listener)
      Stop the model deployment. The default action does nothing except acknowledge the request (true).
      Parameters:
      unparsedModel - The unparsed model configuration
      listener - The listener
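      A default consistent with that description simply acknowledges the request:

        default void stop(UnparsedModel unparsedModel, ActionListener<Boolean> listener) {
            listener.onResponse(true); // acknowledge without stopping anything
        }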
    • checkModelConfig

      default void checkModelConfig(Model model, ActionListener<Model> listener)
      Optionally test the new model configuration in the inference service. This function should be called when the model is first created; the default action is to do nothing.
      Parameters:
      model - The new model
      listener - The listener
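      One plausible reading of the do-nothing default is to hand the model back without validation (a sketch, not necessarily the actual default body):

        default void checkModelConfig(Model model, ActionListener<Model> listener) {
            listener.onResponse(model); // no validation performed
        }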
    • updateModelWithEmbeddingDetails

      default Model updateModelWithEmbeddingDetails(Model model, int embeddingSize)
      Update a text embedding model's dimensions based on a provided embedding size and set the default similarity if required. The default behaviour is to just return the model.
      Parameters:
      model - The original model without updated embedding details
      embeddingSize - The embedding size to update the model with
      Returns:
      The model with updated embedding details
    • updateModelWithChatCompletionDetails

      default Model updateModelWithChatCompletionDetails(Model model)
      Update a chat completion model's max tokens if required. The default behaviour is to just return the model.
      Parameters:
      model - The original model without updated chat completion details
      Returns:
      The model with updated chat completion details
    • getMinimalSupportedVersion

      TransportVersion getMinimalSupportedVersion()
      Defines the version required across all clusters to use this service
      Returns:
      TransportVersion specifying the version
    • supportedStreamingTasks

      default Set<TaskType> supportedStreamingTasks()
      The set of task types for which this service supports using the streaming API.
      Returns:
      set of supported task types. Defaults to empty.
    • canStream

      default boolean canStream(TaskType taskType)
      Checks the task type against the set of supported streaming tasks returned by supportedStreamingTasks().
      Parameters:
      taskType - the task type to check
      Returns:
      true if the taskType is supported
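      A default matching that description would be:

        default boolean canStream(TaskType taskType) {
            return supportedStreamingTasks().contains(taskType);
        }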
    • defaultConfigIds

      default List<InferenceService.DefaultConfigId> defaultConfigIds()
      Get the Ids and task types of any default configurations provided by this service
      Returns:
      The default configuration Ids and task types
    • defaultConfigs

      default void defaultConfigs(ActionListener<List<Model>> defaultsListener)
      Call the listener with the default model configurations defined by the service
      Parameters:
      defaultsListener - The listener
    • updateModelsWithDynamicFields

      default void updateModelsWithDynamicFields(List<Model> model, ActionListener<List<Model>> listener)