- All Superinterfaces:
AutoCloseable, Closeable
-
Nested Class Summary
Method Summary
default boolean canStream(TaskType taskType)
    Checks the task type against the set of supported streaming tasks returned by supportedStreamingTasks().

default void checkModelConfig(Model model, ActionListener<Model> listener)
    Optionally test the new model configuration in the inference service.

void chunkedInfer(Model model, String query, List<String> input, Map<String, Object> taskSettings, InputType inputType, ChunkingOptions chunkingOptions, TimeValue timeout, ActionListener<List<ChunkedInferenceServiceResults>> listener)
    Chunk long text according to chunkingOptions or the model defaults if chunkingOptions contains unset values.

default List<InferenceService.DefaultConfigId> defaultConfigIds()
    Get the Ids and task type of any default configurations provided by this service.

default void defaultConfigs(ActionListener<List<Model>> defaultsListener)
    Call the listener with the default model configurations defined by the service.

InferenceServiceConfiguration getConfiguration()

TransportVersion getMinimalSupportedVersion()
    Defines the version required across all clusters to use this service.

default Boolean hideFromConfigurationApi()
    Whether this service should be hidden from the API.

void infer(Model model, String query, List<String> input, boolean stream, Map<String, Object> taskSettings, InputType inputType, TimeValue timeout, ActionListener<InferenceServiceResults> listener)
    Perform inference on the model.

default void init

String name()

Model parsePersistedConfig(String modelId, TaskType taskType, Map<String, Object> config)
    Parse model configuration from the config map from persisted storage and return the parsed Model.

Model parsePersistedConfigWithSecrets(String modelId, TaskType taskType, Map<String, Object> config, Map<String, Object> secrets)
    Parse model configuration from the config map from persisted storage and return the parsed Model.

void parseRequestConfig(String modelId, TaskType taskType, Map<String, Object> config, ActionListener<Model> parsedModelListener)
    Parse model configuration from the config map from a request and return the parsed Model.

void start(Model model, TimeValue timeout, ActionListener<Boolean> listener)
    Start or prepare the model for use.

default void stop(UnparsedModel unparsedModel, ActionListener<Boolean> listener)
    Stop the model deployment.

supportedStreamingTasks()
    The set of tasks where this service provider supports using the streaming API.

supportedTaskTypes()
    The task types supported by the service.

default void updateModelsWithDynamicFields(List<Model> model, ActionListener<List<Model>> listener)

default Model updateModelWithChatCompletionDetails(Model model)
    Update a chat completion model's max tokens if required.

default Model updateModelWithEmbeddingDetails(Model model, int embeddingSize)
    Update a text embedding model's dimensions based on a provided embedding size and set the default similarity if required.
-
Method Details
-
init
-
name
String name()
-
parseRequestConfig
void parseRequestConfig(String modelId, TaskType taskType, Map<String, Object> config, ActionListener<Model> parsedModelListener)

Parse model configuration from the config map from a request and return the parsed Model. This requires that both the secrets and service settings be contained in the service_settings field. This function modifies config map: fields are removed from the map as they are read. If the map contains unrecognized configuration options, an ElasticsearchStatusException is thrown.
- Parameters:
modelId - Model Id
taskType - The model task type
config - Configuration options including the secrets
parsedModelListener - A listener which will handle the resulting model or failure
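For illustration, a minimal sketch of how an implementation might honour this contract. MyServiceModel and its fromRequest factory are hypothetical helpers, the enclosing class and imports are omitted, and error handling is reduced to the essentials.

@Override
@SuppressWarnings("unchecked")
public void parseRequestConfig(
    String modelId,
    TaskType taskType,
    Map<String, Object> config,
    ActionListener<Model> parsedModelListener
) {
    try {
        // Reading removes the field from the map, as the contract requires
        Map<String, Object> serviceSettings = (Map<String, Object>) config.remove("service_settings");
        Model model = MyServiceModel.fromRequest(modelId, taskType, serviceSettings); // hypothetical factory
        if (config.isEmpty() == false) {
            // Anything left over is an unrecognized option and must be rejected
            throw new ElasticsearchStatusException(
                "Unknown configuration options {}",
                RestStatus.BAD_REQUEST,
                config.keySet()
            );
        }
        parsedModelListener.onResponse(model);
    } catch (Exception e) {
        parsedModelListener.onFailure(e);
    }
}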
-
parsePersistedConfigWithSecrets
Model parsePersistedConfigWithSecrets(String modelId, TaskType taskType, Map<String, Object> config, Map<String, Object> secrets)

Parse model configuration from the config map from persisted storage and return the parsed Model. This requires that secrets and service settings be in two separate maps. This function modifies config map: fields are removed from the map as they are read. If the map contains unrecognized configuration options, no error is thrown.
- Parameters:
modelId - Model Id
taskType - The model task type
config - Configuration options
secrets - Sensitive configuration options (e.g. api key)
- Returns:
The parsed Model
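For contrast with the request-time path, a hedged caller-side sketch. The endpoint id, the TaskType.TEXT_EMBEDDING constant, and the two helper methods that load the persisted maps are illustrative only.

Map<String, Object> persistedConfig = loadConfigFromIndex("my-endpoint");   // illustrative helper
Map<String, Object> persistedSecrets = loadSecretsFromIndex("my-endpoint"); // illustrative helper

// Config and secrets arrive as two separate maps; unknown keys are ignored rather than rejected
Model model = service.parsePersistedConfigWithSecrets(
    "my-endpoint",
    TaskType.TEXT_EMBEDDING,   // assumed enum constant
    persistedConfig,
    persistedSecrets
);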
-
parsePersistedConfig
Parse model configuration from the config map from persisted storage and return the parsed Model. This function modifies config map: fields are removed from the map as they are read. If the map contains unrecognized configuration options, no error is thrown.
- Parameters:
modelId - Model Id
taskType - The model task type
config - Configuration options
- Returns:
The parsed Model
-
getConfiguration
InferenceServiceConfiguration getConfiguration()
-
hideFromConfigurationApi
Whether this service should be hidden from the API. Should be used for services that are not ready to be used.
-
supportedTaskTypes
The task types supported by the service.
- Returns:
Set of supported task types.
-
infer
void infer(Model model, @Nullable String query, List<String> input, boolean stream, Map<String, Object> taskSettings, InputType inputType, TimeValue timeout, ActionListener<InferenceServiceResults> listener)

Perform inference on the model.
- Parameters:
model - The model
query - Inference query, mainly for re-ranking
input - Inference input
stream - Stream inference results
taskSettings - Settings in the request to override the model's defaults
inputType - For search, ingest etc.
timeout - The timeout for the request
listener - Inference result listener
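A hedged caller-side sketch of a non-streaming call. The "service" and "model" variables are assumed to be resolved already, InputType.SEARCH is an assumed enum constant, and the result/failure handlers are illustrative.

service.infer(
    model,
    null,                                   // query: only meaningful for re-ranking
    List.of("how do I reset my password?"),
    false,                                  // stream = false
    Map.of(),                               // no per-request task setting overrides
    InputType.SEARCH,                       // assumed enum constant
    TimeValue.timeValueSeconds(30),
    ActionListener.wrap(
        results -> handleResults(results),  // illustrative handler
        e -> handleFailure(e)               // illustrative handler
    )
);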
-
chunkedInfer
void chunkedInfer(Model model, @Nullable String query, List<String> input, Map<String, Object> taskSettings, InputType inputType, ChunkingOptions chunkingOptions, TimeValue timeout, ActionListener<List<ChunkedInferenceServiceResults>> listener)

Chunk long text according to chunkingOptions or the model defaults if chunkingOptions contains unset values.
- Parameters:
model - The model
query - Inference query, mainly for re-ranking
input - Inference input
taskSettings - Settings in the request to override the model's defaults
inputType - For search, ingest etc.
chunkingOptions - The window and span options to apply
timeout - The timeout for the request
listener - Chunked Inference result listener
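A hedged sketch of a chunked call over a long document. The two-argument ChunkingOptions shape (window size, then span) is an assumption based on the parameter description above, and the input variable and handlers are illustrative; passing unset chunking options falls back to the model defaults.

service.chunkedInfer(
    model,
    null,                                    // query: only meaningful for re-ranking
    List.of(veryLongDocumentText),           // illustrative variable
    Map.of(),
    InputType.INGEST,                        // assumed enum constant
    new ChunkingOptions(250, 100),           // assumed constructor: window size and span
    TimeValue.timeValueSeconds(60),
    ActionListener.wrap(
        chunkedResults -> indexChunks(chunkedResults),  // illustrative handler
        e -> handleFailure(e)                           // illustrative handler
    )
);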
-
start
Start or prepare the model for use.
- Parameters:
model - The model
timeout - Start timeout
listener - The listener
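A hedged usage sketch: ensure the model is started before issuing the first inference request. The timeout value and failure handler are illustrative.

service.start(model, TimeValue.timeValueSeconds(30), ActionListener.wrap(
    started -> {
        if (started) {
            // safe to call infer(...) or chunkedInfer(...) from here
        }
    },
    e -> handleFailure(e)   // illustrative handler
));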
-
stop
Stop the model deployment. The default action does nothing except acknowledge the request (true).
- Parameters:
unparsedModel - The unparsed model configuration
listener - The listener
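A sketch of what the documented default behaviour amounts to: acknowledge the stop request without doing any work. A service that manages real deployments would tear them down here instead.

@Override
public void stop(UnparsedModel unparsedModel, ActionListener<Boolean> listener) {
    // Default behaviour per the description above: acknowledge with true
    listener.onResponse(true);
}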
-
checkModelConfig
Optionally test the new model configuration in the inference service. This function should be called when the model is first created; the default action is to do nothing.
- Parameters:
model - The new model
listener - The listener
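A hypothetical override sketch: validate a newly created model by running a single throwaway inference call and only reporting success if it completes. The test input, timeout, and InputType.INGEST constant are illustrative assumptions.

@Override
public void checkModelConfig(Model model, ActionListener<Model> listener) {
    infer(
        model,
        null,
        List.of("validation input"),        // illustrative probe text
        false,
        Map.of(),
        InputType.INGEST,                   // assumed enum constant
        TimeValue.timeValueSeconds(10),
        ActionListener.wrap(results -> listener.onResponse(model), listener::onFailure)
    );
}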
-
updateModelWithEmbeddingDetails
Update a text embedding model's dimensions based on a provided embedding size and set the default similarity if required. The default behaviour is to just return the model.
- Parameters:
model - The original model without updated embedding details
embeddingSize - The embedding size to update the model with
- Returns:
The model with updated embedding details
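A hypothetical override sketch: rebuild the model with the discovered embedding size. MyServiceModel and its copy-with method are illustrative types, not part of the interface.

@Override
public Model updateModelWithEmbeddingDetails(Model model, int embeddingSize) {
    if (model instanceof MyServiceModel myModel) {
        return myModel.withEmbeddingSize(embeddingSize);   // illustrative copy-with method
    }
    return model;   // documented default behaviour: return the model unchanged
}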
-
updateModelWithChatCompletionDetails
Update a chat completion model's max tokens if required. The default behaviour is to just return the model.
- Parameters:
model - The original model without updated chat completion details
- Returns:
The model with updated chat completion details
-
getMinimalSupportedVersion
TransportVersion getMinimalSupportedVersion()

Defines the version required across all clusters to use this service.
- Returns:
TransportVersion specifying the version
-
supportedStreamingTasks
The set of tasks where this service provider supports using the streaming API.
- Returns:
Set of supported task types. Defaults to empty.
-
canStream
Checks the task type against the set of supported streaming tasks returned by supportedStreamingTasks().
- Parameters:
taskType - the task type to check
- Returns:
true if the taskType is supported
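A hypothetical sketch of advertising streaming support. The Set<TaskType> return type and the TaskType.COMPLETION constant are assumptions (java.util.Set and EnumSet are used); the canStream check then simply consults this set.

@Override
public Set<TaskType> supportedStreamingTasks() {
    // Advertise streaming for completion-style tasks only
    return EnumSet.of(TaskType.COMPLETION);
}

// Elsewhere, a caller can gate the streaming code path:
if (service.canStream(taskType)) {
    // use the streaming API
}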
-
defaultConfigIds
Get the Ids and task type of any default configurations provided by this service.
- Returns:
Defaults
-
defaultConfigs
Call the listener with the default model configurations defined by the service.
- Parameters:
defaultsListener - The listener
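A hypothetical sketch for a service that ships one preconfigured endpoint. The endpoint id and the MyServiceModel.defaultEndpoint factory are illustrative only.

@Override
public void defaultConfigs(ActionListener<List<Model>> defaultsListener) {
    // Report the service's built-in default endpoint(s) to the listener
    defaultsListener.onResponse(List.of(MyServiceModel.defaultEndpoint(".my-default-endpoint")));
}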
-
updateModelsWithDynamicFields
-