@Generated public class ServingEndpointsAPI extends Object
You can use a serving endpoint to serve models from the Databricks Model Registry or from Unity Catalog. Endpoints expose the underlying models as scalable REST API endpoints using serverless compute. This means the endpoints and associated compute resources are fully managed by Databricks and will not appear in your cloud account. A serving endpoint can consist of one or more MLflow models from the Databricks Model Registry, called served models. A serving endpoint can have at most ten served models. You can configure traffic settings to define how requests should be routed to your served models behind an endpoint. Additionally, you can configure the scale of resources that should be applied to each served model.
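The served-model limit and traffic routing described above can be illustrated with a small, self-contained sketch. It does not call the SDK: `TrafficSplitSketch`, `validate`, and `route` are hypothetical names, and both the rule that traffic percentages sum to 100 and the bucket-based routing are assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for an endpoint's traffic settings; not an SDK class.
class TrafficSplitSketch {
    static final int MAX_SERVED_MODELS = 10; // "at most ten served models"

    // Checks the served-model count and (by assumption) that percentages sum to 100.
    static void validate(Map<String, Integer> trafficPercentByModel) {
        if (trafficPercentByModel.isEmpty()
                || trafficPercentByModel.size() > MAX_SERVED_MODELS) {
            throw new IllegalArgumentException("expected 1 to 10 served models");
        }
        int total = 0;
        for (int percent : trafficPercentByModel.values()) {
            if (percent < 0) {
                throw new IllegalArgumentException("negative percentage");
            }
            total += percent;
        }
        if (total != 100) {
            throw new IllegalArgumentException("percentages sum to " + total + ", expected 100");
        }
    }

    // Maps a bucket in [0, 100) to a served model by walking cumulative percentages.
    static String route(Map<String, Integer> trafficPercentByModel, int bucket) {
        validate(trafficPercentByModel);
        if (bucket < 0 || bucket >= 100) {
            throw new IllegalArgumentException("bucket out of range");
        }
        int cumulative = 0;
        for (Map.Entry<String, Integer> entry : trafficPercentByModel.entrySet()) {
            cumulative += entry.getValue();
            if (bucket < cumulative) {
                return entry.getKey();
            }
        }
        throw new IllegalStateException("unreachable after validation");
    }

    public static void main(String[] args) {
        Map<String, Integer> split = new LinkedHashMap<>();
        split.put("model-a", 90);
        split.put("model-b", 10);
        System.out.println(route(split, 42)); // prints "model-a"
        System.out.println(route(split, 95)); // prints "model-b"
    }
}
```

In a real caller the bucket would typically come from a random draw or a hash of the request, so that roughly 90% of requests reach `model-a` in this configuration.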
Constructor | Description
---|---
ServingEndpointsAPI(ApiClient apiClient) | Regular-use constructor
ServingEndpointsAPI(ServingEndpointsService mock) | Constructor for mocks
public ServingEndpointsAPI(ApiClient apiClient)
public ServingEndpointsAPI(ServingEndpointsService mock)
public ServingEndpointDetailed waitGetServingEndpointNotUpdating(String name) throws TimeoutException
Throws: TimeoutException
public ServingEndpointDetailed waitGetServingEndpointNotUpdating(String name, Duration timeout, Consumer<ServingEndpointDetailed> callback) throws TimeoutException
Throws: TimeoutException
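The shape of the wait methods above (a name to poll, a timeout, and a progress callback) can be sketched without the SDK. This is only an illustration of the general polling pattern, not the SDK's implementation: `WaitSketch`, `waitUntilNotUpdating`, the `pollInterval` parameter, and the `"IN_PROGRESS"` / `"NOT_UPDATING"` state strings are all assumptions introduced for the example.

```java
import java.time.Duration;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.TimeoutException;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical polling helper; the SDK's actual wait logic may differ.
class WaitSketch {
    // Polls fetchState until it returns something other than "IN_PROGRESS",
    // invoking callback after every poll, and throws once timeout has elapsed.
    static String waitUntilNotUpdating(Supplier<String> fetchState, Duration timeout,
            Duration pollInterval, Consumer<String> callback) throws TimeoutException {
        long start = System.nanoTime();
        while (true) {
            String state = fetchState.get();
            if (callback != null) {
                callback.accept(state);
            }
            if (!"IN_PROGRESS".equals(state)) {
                return state;
            }
            if (System.nanoTime() - start >= timeout.toNanos()) {
                throw new TimeoutException("still updating after " + timeout);
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
    }

    public static void main(String[] args) throws TimeoutException {
        // Simulated sequence of states standing in for repeated get(name) calls.
        Iterator<String> states = List.of("IN_PROGRESS", "IN_PROGRESS", "NOT_UPDATING").iterator();
        String last = waitUntilNotUpdating(states::next, Duration.ofSeconds(5),
                Duration.ofMillis(10), s -> System.out.println("polled: " + s));
        System.out.println("final state: " + last);
    }
}
```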
public BuildLogsResponse buildLogs(String name, String servedModelName)
public BuildLogsResponse buildLogs(BuildLogsRequest request)
Retrieves the build logs associated with the provided served model.
public Wait<ServingEndpointDetailed,ServingEndpointDetailed> create(String name, EndpointCoreConfigInput config)
public Wait<ServingEndpointDetailed,ServingEndpointDetailed> create(CreateServingEndpoint request)
public void delete(String name)
public void delete(DeleteServingEndpointRequest request)
public void exportMetrics(String name)
public void exportMetrics(ExportMetricsRequest request)
Retrieves the metrics associated with the provided serving endpoint in either Prometheus or OpenMetrics exposition format.
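For readers unfamiliar with the Prometheus text exposition format, the sketch below reads a small sample of it. It is a simplified reader under stated assumptions, not a full parser and not part of the SDK: `MetricsTextSketch` is a hypothetical class, the metric and label names in the sample are made up, and escaping rules and OpenMetrics-specific features are ignored.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified reader for Prometheus-style text exposition; sample names are made up.
class MetricsTextSketch {
    // Returns each series (metric name plus optional label set) mapped to its sample value.
    static Map<String, Double> parse(String text) {
        Map<String, Double> samples = new LinkedHashMap<>();
        for (String raw : text.split("\n")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // skip blank lines and # HELP / # TYPE / comment lines
            }
            int labelsEnd = line.lastIndexOf('}');
            int split = labelsEnd >= 0 ? labelsEnd + 1 : firstWhitespace(line);
            if (split < 0) {
                throw new IllegalArgumentException("no value on line: " + line);
            }
            String series = line.substring(0, split).trim();
            String[] rest = line.substring(split).trim().split("\\s+");
            // rest[0] is the value; rest[1], if present, is an optional timestamp.
            samples.put(series, parseValue(rest[0]));
        }
        return samples;
    }

    private static int firstWhitespace(String line) {
        for (int i = 0; i < line.length(); i++) {
            if (Character.isWhitespace(line.charAt(i))) {
                return i;
            }
        }
        return -1;
    }

    private static double parseValue(String token) {
        switch (token) {
            case "+Inf":
            case "Inf":
                return Double.POSITIVE_INFINITY;
            case "-Inf":
                return Double.NEGATIVE_INFINITY;
            default:
                return Double.parseDouble(token); // also accepts "NaN"
        }
    }

    public static void main(String[] args) {
        String sample = "# TYPE request_count_total counter\n"
                + "request_count_total{served_model=\"model-a\"} 42\n"
                + "cpu_usage_percentage 12.5 1700000000000\n";
        parse(sample).forEach((series, value) -> System.out.println(series + " = " + value));
    }
}
```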
public ServingEndpointDetailed get(String name)
public ServingEndpointDetailed get(GetServingEndpointRequest request)
Retrieves the details for a single serving endpoint.
public GetServingEndpointPermissionLevelsResponse getPermissionLevels(String servingEndpointId)
public GetServingEndpointPermissionLevelsResponse getPermissionLevels(GetServingEndpointPermissionLevelsRequest request)
Gets the permission levels that a user can have on an object.
public ServingEndpointPermissions getPermissions(String servingEndpointId)
public ServingEndpointPermissions getPermissions(GetServingEndpointPermissionsRequest request)
Gets the permissions of a serving endpoint. Serving endpoints can inherit permissions from their root object.
public Iterable<ServingEndpoint> list()
public ServerLogsResponse logs(String name, String servedModelName)
public ServerLogsResponse logs(LogsRequest request)
Retrieves the service logs associated with the provided served model.
public Iterable<EndpointTag> patch(String name)
public Iterable<EndpointTag> patch(PatchServingEndpointTags request)
Adds and deletes tags on a serving endpoint in a single API call.
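The effect of a combined add-and-delete tag request can be sketched on a plain map. This models only the idea of the batch operation: `TagPatchSketch`, `apply`, `addTags`, and `deleteTagKeys` are hypothetical names, and applying deletions before additions is an assumption of the example, not documented behavior.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a batch tag patch; not an SDK class.
class TagPatchSketch {
    // Returns a new tag map with deleteTagKeys removed and addTags added or overwritten.
    // Deletions are applied first (an assumption); the input map is left unchanged.
    static Map<String, String> apply(Map<String, String> current,
            Map<String, String> addTags, List<String> deleteTagKeys) {
        Map<String, String> result = new LinkedHashMap<>(current);
        for (String key : deleteTagKeys) {
            result.remove(key);
        }
        result.putAll(addTags);
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> current = new LinkedHashMap<>();
        current.put("team", "ml");
        current.put("env", "dev");

        Map<String, String> add = new LinkedHashMap<>();
        add.put("env", "prod");
        add.put("owner", "alice");

        // One call both removes "team" and sets "env" and "owner".
        System.out.println(apply(current, add, List.of("team"))); // prints {env=prod, owner=alice}
    }
}
```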
public QueryEndpointResponse query(String name)
public QueryEndpointResponse query(QueryEndpointInput request)
public ServingEndpointPermissions setPermissions(String servingEndpointId)
public ServingEndpointPermissions setPermissions(ServingEndpointPermissionsRequest request)
Sets permissions on a serving endpoint. Serving endpoints can inherit permissions from their root object.
public Wait<ServingEndpointDetailed,ServingEndpointDetailed> updateConfig(String name, Collection<ServedModelInput> servedModels)
public Wait<ServingEndpointDetailed,ServingEndpointDetailed> updateConfig(EndpointCoreConfigInput request)
Updates any combination of the serving endpoint's served models, the compute configuration of those served models, and the endpoint's traffic config. An endpoint that already has an update in progress cannot be updated until the current update completes or fails.
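The one-update-at-a-time rule above can be pictured as a small state guard. This is a local model for illustration only: `UpdateGuardSketch`, its methods, and the `ConfigUpdate` enum values are hypothetical, and how the service actually reports a rejected update is not covered here.

```java
// Hypothetical model of the "no concurrent updates" rule; not an SDK class.
class UpdateGuardSketch {
    enum ConfigUpdate { NOT_UPDATING, IN_PROGRESS, UPDATE_FAILED }

    private ConfigUpdate state = ConfigUpdate.NOT_UPDATING;

    // Starts an update, rejecting the call if another update is still running.
    synchronized void startUpdate() {
        if (state == ConfigUpdate.IN_PROGRESS) {
            throw new IllegalStateException("update already in progress");
        }
        state = ConfigUpdate.IN_PROGRESS;
    }

    // Marks the running update as completed or failed, which unblocks new updates.
    synchronized void finishUpdate(boolean succeeded) {
        if (state != ConfigUpdate.IN_PROGRESS) {
            throw new IllegalStateException("no update in progress");
        }
        state = succeeded ? ConfigUpdate.NOT_UPDATING : ConfigUpdate.UPDATE_FAILED;
    }

    synchronized ConfigUpdate state() {
        return state;
    }

    public static void main(String[] args) {
        UpdateGuardSketch endpoint = new UpdateGuardSketch();
        endpoint.startUpdate();
        try {
            endpoint.startUpdate(); // second update while the first is running
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        endpoint.finishUpdate(false); // the first update fails
        endpoint.startUpdate();       // a new update is accepted again
        System.out.println(endpoint.state()); // prints IN_PROGRESS
    }
}
```

A caller that needs to apply several changes in sequence would wait for the current update to complete or fail before issuing the next one.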
public ServingEndpointPermissions updatePermissions(String servingEndpointId)
public ServingEndpointPermissions updatePermissions(ServingEndpointPermissionsRequest request)
Updates the permissions on a serving endpoint. Serving endpoints can inherit permissions from their root object.
public ServingEndpointsService impl()
Copyright © 2023. All rights reserved.