@Generated public class WorkspaceClient extends Object

Constructor and Description |
---|
WorkspaceClient() |
WorkspaceClient(boolean mock) Constructor for mocks |
WorkspaceClient(boolean mock, ApiClient apiClient) Constructor for mocks |
WorkspaceClient(DatabricksConfig config) |
Modifier and Type | Method and Description |
---|---|
AccountAccessControlProxyAPI | accountAccessControlProxy() These APIs manage access rules on resources in an account. |
AlertsAPI | alerts() The alerts API can be used to perform CRUD operations on alerts. |
ApiClient | apiClient() |
AppsAPI | apps() Lakehouse Apps run directly on a customer’s Databricks instance, integrate with their data, use and extend Databricks services, and enable users to interact through single sign-on. |
ArtifactAllowlistsAPI | artifactAllowlists() In Databricks Runtime 13.3 and above, you can add libraries and init scripts to the `allowlist` in UC so that users can leverage these artifacts on compute configured with shared access mode. |
CatalogsAPI | catalogs() A catalog is the first layer of Unity Catalog’s three-level namespace. |
CleanRoomsAPI | cleanRooms() A clean room is a secure, privacy-protecting environment where two or more parties can share sensitive enterprise data, including customer data, for measurements, insights, activation and other use cases. |
ClusterPoliciesAPI | clusterPolicies() You can use cluster policies to control users' ability to configure clusters based on a set of rules. |
ClustersExt | clusters() The Clusters API allows you to create, start, edit, list, terminate, and delete clusters. |
CommandExecutionAPI | commandExecution() This API allows execution of Python, Scala, SQL, or R commands on running Databricks clusters. |
DatabricksConfig | config() |
ConnectionsAPI | connections() Connections allow for creating a connection to an external data source. |
CredentialsManagerAPI | credentialsManager() Credentials manager interacts with Identity Providers to perform token exchanges using stored credentials and refresh tokens. |
CurrentUserAPI | currentUser() This API allows retrieving information about the currently authenticated user or service principal. |
DashboardsAPI | dashboards() In general, there is little need to modify dashboards using the API. |
DashboardWidgetsAPI | dashboardWidgets() This is an evolving API that facilitates the addition and removal of widgets from existing dashboards within the Databricks Workspace. |
DataSourcesAPI | dataSources() This API is provided to assist you in making new query objects. |
DbfsExt | dbfs() The DBFS API makes it simple to interact with various data sources without having to include a user's credentials every time to read a file. |
DbsqlPermissionsAPI | dbsqlPermissions() The SQL Permissions API is similar to the endpoints of the :method:permissions/set. |
ExperimentsAPI | experiments() Experiments are the primary unit of organization in MLflow; all MLflow runs belong to an experiment. |
ExternalLocationsAPI | externalLocations() An external location is an object that combines a cloud storage path with a storage credential that authorizes access to the cloud storage path. |
FilesAPI | files() The Files API allows you to read, write, and delete files and directories in Unity Catalog volumes. |
FunctionsAPI | functions() Functions implement User-Defined Functions (UDFs) in Unity Catalog. |
GitCredentialsAPI | gitCredentials() Registers a personal access token for Databricks to do operations on behalf of the user. |
GlobalInitScriptsAPI | globalInitScripts() The Global Init Scripts API enables Workspace administrators to configure global initialization scripts for their workspace. |
GrantsAPI | grants() In Unity Catalog, data is secure by default. |
GroupsAPI | groups() Groups simplify identity management, making it easier to assign access to Databricks workspace, data, and other securable objects. |
InstancePoolsAPI | instancePools() The Instance Pools API is used to create, edit, delete, and list instance pools using ready-to-use cloud instances, which reduces cluster start and auto-scaling times. |
InstanceProfilesAPI | instanceProfiles() The Instance Profiles API allows admins to add, list, and remove instance profiles that users can launch clusters with. |
IpAccessListsAPI | ipAccessLists() IP Access List enables admins to configure IP access lists. |
JobsAPI | jobs() The Jobs API allows you to create, edit, and delete jobs. |
LibrariesAPI | libraries() The Libraries API allows you to install and uninstall libraries and get the status of libraries on a cluster. |
MetastoresAPI | metastores() A metastore is the top-level container of objects in Unity Catalog. |
ModelRegistryAPI | modelRegistry() Note: This API reference documents APIs for the Workspace Model Registry. |
ModelVersionsAPI | modelVersions() Databricks provides a hosted version of MLflow Model Registry in Unity Catalog. |
PermissionsAPI | permissions() The Permissions API is used to create read, write, edit, update and manage access for various users on different objects and endpoints. |
PipelinesAPI | pipelines() The Delta Live Tables API allows you to create, edit, delete, start, and view details about pipelines. |
PolicyFamiliesAPI | policyFamilies() View available policy families. |
ProvidersAPI | providers() A data provider is an object representing the organization in the real world who shares the data. |
QueriesAPI | queries() These endpoints are used for CRUD operations on query definitions. |
QueryHistoryAPI | queryHistory() Access the history of queries through SQL warehouses. |
QueryVisualizationsAPI | queryVisualizations() This is an evolving API that facilitates the addition and removal of visualizations from existing queries within the Databricks Workspace. |
RecipientActivationAPI | recipientActivation() The Recipient Activation API is only applicable in the open sharing model where the recipient object has the authentication type of `TOKEN`. |
RecipientsAPI | recipients() A recipient is an object you create using :method:recipients/create to represent an organization to which you want to allow access to shares. |
RegisteredModelsAPI | registeredModels() Databricks provides a hosted version of MLflow Model Registry in Unity Catalog. |
ReposAPI | repos() The Repos API allows users to manage their Git repos. |
SchemasAPI | schemas() A schema (also called a database) is the second layer of Unity Catalog’s three-level namespace. |
SecretsExt | secrets() The Secrets API allows you to manage secrets, secret scopes, and access permissions. |
ServicePrincipalsAPI | servicePrincipals() Identities for use with jobs, automated tools, and systems such as scripts, apps, and CI/CD platforms. |
ServingEndpointsAPI | servingEndpoints() The Serving Endpoints API allows you to create, update, and delete model serving endpoints. |
SettingsAPI | settings() The default namespace setting API allows users to configure the default namespace for a Databricks workspace. |
SharesAPI | shares() A share is a container instantiated with :method:shares/create. |
StatementExecutionAPI | statementExecution() The Databricks SQL Statement Execution API can be used to execute SQL statements on a SQL warehouse and fetch the result. |
StorageCredentialsAPI | storageCredentials() A storage credential represents an authentication and authorization mechanism for accessing data stored on your cloud tenant. |
SystemSchemasAPI | systemSchemas() A system schema is a schema that lives within the system catalog. |
TableConstraintsAPI | tableConstraints() Primary key and foreign key constraints encode relationships between fields in tables. |
TablesAPI | tables() A table resides in the third layer of Unity Catalog’s three-level namespace. |
TokenManagementAPI | tokenManagement() Enables administrators to get all tokens and delete tokens for other users. |
TokensAPI | tokens() The Token API allows you to create, list, and revoke tokens that can be used to authenticate and access Databricks REST APIs. |
UsersAPI | users() User identities recognized by Databricks and represented by email addresses. |
VolumesAPI | volumes() Volumes are a Unity Catalog (UC) capability for accessing, storing, governing, organizing and processing files. |
WarehousesAPI | warehouses() A SQL warehouse is a compute resource that lets you run SQL commands on data objects within Databricks SQL. |
WorkspaceClient | withAccountAccessControlProxyImpl(AccountAccessControlProxyService accountAccessControlProxy) Replace AccountAccessControlProxyAPI implementation with mock |
WorkspaceClient | withAlertsImpl(AlertsService alerts) Replace AlertsAPI implementation with mock |
WorkspaceClient | withAppsImpl(AppsService apps) Replace AppsAPI implementation with mock |
WorkspaceClient | withArtifactAllowlistsImpl(ArtifactAllowlistsService artifactAllowlists) Replace ArtifactAllowlistsAPI implementation with mock |
WorkspaceClient | withCatalogsImpl(CatalogsService catalogs) Replace CatalogsAPI implementation with mock |
WorkspaceClient | withCleanRoomsImpl(CleanRoomsService cleanRooms) Replace CleanRoomsAPI implementation with mock |
WorkspaceClient | withClusterPoliciesImpl(ClusterPoliciesService clusterPolicies) Replace ClusterPoliciesAPI implementation with mock |
WorkspaceClient | withClustersImpl(ClustersService clusters) Replace ClustersAPI implementation with mock |
WorkspaceClient | withCommandExecutionImpl(CommandExecutionService commandExecution) Replace CommandExecutionAPI implementation with mock |
WorkspaceClient | withConnectionsImpl(ConnectionsService connections) Replace ConnectionsAPI implementation with mock |
WorkspaceClient | withCredentialsManagerImpl(CredentialsManagerService credentialsManager) Replace CredentialsManagerAPI implementation with mock |
WorkspaceClient | withCurrentUserImpl(CurrentUserService currentUser) Replace CurrentUserAPI implementation with mock |
WorkspaceClient | withDashboardsImpl(DashboardsService dashboards) Replace DashboardsAPI implementation with mock |
WorkspaceClient | withDashboardWidgetsImpl(DashboardWidgetsService dashboardWidgets) Replace DashboardWidgetsAPI implementation with mock |
WorkspaceClient | withDataSourcesImpl(DataSourcesService dataSources) Replace DataSourcesAPI implementation with mock |
WorkspaceClient | withDbfsImpl(DbfsService dbfs) Replace DbfsAPI implementation with mock |
WorkspaceClient | withDbsqlPermissionsImpl(DbsqlPermissionsService dbsqlPermissions) Replace DbsqlPermissionsAPI implementation with mock |
WorkspaceClient | withExperimentsImpl(ExperimentsService experiments) Replace ExperimentsAPI implementation with mock |
WorkspaceClient | withExternalLocationsImpl(ExternalLocationsService externalLocations) Replace ExternalLocationsAPI implementation with mock |
WorkspaceClient | withFilesImpl(FilesService files) Replace FilesAPI implementation with mock |
WorkspaceClient | withFunctionsImpl(FunctionsService functions) Replace FunctionsAPI implementation with mock |
WorkspaceClient | withGitCredentialsImpl(GitCredentialsService gitCredentials) Replace GitCredentialsAPI implementation with mock |
WorkspaceClient | withGlobalInitScriptsImpl(GlobalInitScriptsService globalInitScripts) Replace GlobalInitScriptsAPI implementation with mock |
WorkspaceClient | withGrantsImpl(GrantsService grants) Replace GrantsAPI implementation with mock |
WorkspaceClient | withGroupsImpl(GroupsService groups) Replace GroupsAPI implementation with mock |
WorkspaceClient | withInstancePoolsImpl(InstancePoolsService instancePools) Replace InstancePoolsAPI implementation with mock |
WorkspaceClient | withInstanceProfilesImpl(InstanceProfilesService instanceProfiles) Replace InstanceProfilesAPI implementation with mock |
WorkspaceClient | withIpAccessListsImpl(IpAccessListsService ipAccessLists) Replace IpAccessListsAPI implementation with mock |
WorkspaceClient | withJobsImpl(JobsService jobs) Replace JobsAPI implementation with mock |
WorkspaceClient | withLibrariesImpl(LibrariesService libraries) Replace LibrariesAPI implementation with mock |
WorkspaceClient | withMetastoresImpl(MetastoresService metastores) Replace MetastoresAPI implementation with mock |
WorkspaceClient | withModelRegistryImpl(ModelRegistryService modelRegistry) Replace ModelRegistryAPI implementation with mock |
WorkspaceClient | withModelVersionsImpl(ModelVersionsService modelVersions) Replace ModelVersionsAPI implementation with mock |
WorkspaceClient | withPermissionsImpl(PermissionsService permissions) Replace PermissionsAPI implementation with mock |
WorkspaceClient | withPipelinesImpl(PipelinesService pipelines) Replace PipelinesAPI implementation with mock |
WorkspaceClient | withPolicyFamiliesImpl(PolicyFamiliesService policyFamilies) Replace PolicyFamiliesAPI implementation with mock |
WorkspaceClient | withProvidersImpl(ProvidersService providers) Replace ProvidersAPI implementation with mock |
WorkspaceClient | withQueriesImpl(QueriesService queries) Replace QueriesAPI implementation with mock |
WorkspaceClient | withQueryHistoryImpl(QueryHistoryService queryHistory) Replace QueryHistoryAPI implementation with mock |
WorkspaceClient | withQueryVisualizationsImpl(QueryVisualizationsService queryVisualizations) Replace QueryVisualizationsAPI implementation with mock |
WorkspaceClient | withRecipientActivationImpl(RecipientActivationService recipientActivation) Replace RecipientActivationAPI implementation with mock |
WorkspaceClient | withRecipientsImpl(RecipientsService recipients) Replace RecipientsAPI implementation with mock |
WorkspaceClient | withRegisteredModelsImpl(RegisteredModelsService registeredModels) Replace RegisteredModelsAPI implementation with mock |
WorkspaceClient | withReposImpl(ReposService repos) Replace ReposAPI implementation with mock |
WorkspaceClient | withSchemasImpl(SchemasService schemas) Replace SchemasAPI implementation with mock |
WorkspaceClient | withSecretsImpl(SecretsService secrets) Replace SecretsAPI implementation with mock |
WorkspaceClient | withServicePrincipalsImpl(ServicePrincipalsService servicePrincipals) Replace ServicePrincipalsAPI implementation with mock |
WorkspaceClient | withServingEndpointsImpl(ServingEndpointsService servingEndpoints) Replace ServingEndpointsAPI implementation with mock |
WorkspaceClient | withSettingsImpl(SettingsService settings) Replace SettingsAPI implementation with mock |
WorkspaceClient | withSharesImpl(SharesService shares) Replace SharesAPI implementation with mock |
WorkspaceClient | withStatementExecutionImpl(StatementExecutionService statementExecution) Replace StatementExecutionAPI implementation with mock |
WorkspaceClient | withStorageCredentialsImpl(StorageCredentialsService storageCredentials) Replace StorageCredentialsAPI implementation with mock |
WorkspaceClient | withSystemSchemasImpl(SystemSchemasService systemSchemas) Replace SystemSchemasAPI implementation with mock |
WorkspaceClient | withTableConstraintsImpl(TableConstraintsService tableConstraints) Replace TableConstraintsAPI implementation with mock |
WorkspaceClient | withTablesImpl(TablesService tables) Replace TablesAPI implementation with mock |
WorkspaceClient | withTokenManagementImpl(TokenManagementService tokenManagement) Replace TokenManagementAPI implementation with mock |
WorkspaceClient | withTokensImpl(TokensService tokens) Replace TokensAPI implementation with mock |
WorkspaceClient | withUsersImpl(UsersService users) Replace UsersAPI implementation with mock |
WorkspaceClient | withVolumesImpl(VolumesService volumes) Replace VolumesAPI implementation with mock |
WorkspaceClient | withWarehousesImpl(WarehousesService warehouses) Replace WarehousesAPI implementation with mock |
WorkspaceClient | withWorkspaceBindingsImpl(WorkspaceBindingsService workspaceBindings) Replace WorkspaceBindingsAPI implementation with mock |
WorkspaceClient | withWorkspaceConfImpl(WorkspaceConfService workspaceConf) Replace WorkspaceConfAPI implementation with mock |
WorkspaceClient | withWorkspaceImpl(WorkspaceService workspace) Replace WorkspaceAPI implementation with mock |
WorkspaceAPI | workspace() The Workspace API allows you to list, import, export, and delete notebooks and folders. |
WorkspaceBindingsAPI | workspaceBindings() A securable in Databricks can be configured as __OPEN__ or __ISOLATED__. |
WorkspaceConfAPI | workspaceConf() This API allows updating known workspace settings for advanced users. |
public WorkspaceClient()
public WorkspaceClient(DatabricksConfig config)
public WorkspaceClient(boolean mock)
public WorkspaceClient(boolean mock, ApiClient apiClient)
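A minimal sketch of how these constructors are typically used. The default constructor resolves credentials from the environment or a configuration profile, the `DatabricksConfig` constructor takes explicit settings, and the mock constructor combined with the `with*Impl` methods swaps in stubbed services for tests. The host/token values, environment-variable behavior, and the use of Mockito are illustrative assumptions, not part of this reference.

```java
import com.databricks.sdk.WorkspaceClient;
import com.databricks.sdk.core.DatabricksConfig;
import com.databricks.sdk.service.jobs.JobsService;
import static org.mockito.Mockito.mock;

public class ClientSetup {
  public static void main(String[] args) {
    // Default constructor: authentication is resolved from the environment
    // (for example DATABRICKS_HOST / DATABRICKS_TOKEN) or a config profile.
    WorkspaceClient w = new WorkspaceClient();

    // Explicit configuration (placeholder host and token).
    DatabricksConfig cfg = new DatabricksConfig()
        .setHost("https://<workspace-host>")
        .setToken("<personal-access-token>");
    WorkspaceClient configured = new WorkspaceClient(cfg);

    // Smoke test: print the authenticated identity.
    System.out.println(w.currentUser().me().getUserName());

    // For unit tests: the mock constructor plus with*Impl replaces a service
    // implementation (Mockito is only one way to produce the stub).
    WorkspaceClient testClient = new WorkspaceClient(true)
        .withJobsImpl(mock(JobsService.class));
  }
}
```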
public AccountAccessControlProxyAPI accountAccessControlProxy()
public AlertsAPI alerts()
public AppsAPI apps()
public ArtifactAllowlistsAPI artifactAllowlists()
public CatalogsAPI catalogs()
In Unity Catalog, admins and data stewards manage users and their access to data centrally across all of the workspaces in a Databricks account. Users in different workspaces can share access to the same data, depending on privileges granted centrally in Unity Catalog.
public CleanRoomsAPI cleanRooms()
To create clean rooms, you must be a metastore admin or a user with the **CREATE_CLEAN_ROOM** privilege.
public ClusterPoliciesAPI clusterPolicies()
With cluster policies, you can:
- Auto-install cluster libraries on the next restart by listing them in the policy's "libraries" field.
- Limit users to creating clusters with the prescribed settings.
- Simplify the user interface, enabling more users to create clusters, by fixing and hiding some fields.
- Manage costs by setting limits on attributes that impact the hourly rate.

Cluster policy permissions limit which policies a user can select in the Policy drop-down when the user creates a cluster:
- A user who has unrestricted cluster create permission can select the Unrestricted policy and create fully-configurable clusters.
- A user who has both unrestricted cluster create permission and access to cluster policies can select the Unrestricted policy and the policies they have access to.
- A user who has access only to cluster policies can select the policies they have access to.
If no policies exist in the workspace, the Policy drop-down doesn't appear. Only admin users can create, edit, and delete policies. Admin users also have access to all policies.
public ClustersExt clusters()
Databricks maps cluster node instance types to compute units known as DBUs. See the instance type pricing page for a list of the supported instance types and their corresponding DBUs.
A Databricks cluster is a set of computation resources and configurations on which you run data engineering, data science, and data analytics workloads, such as production ETL pipelines, streaming analytics, ad-hoc analytics, and machine learning.
You run these workloads as a set of commands in a notebook or as an automated job. Databricks makes a distinction between all-purpose clusters and job clusters. You use all-purpose clusters to analyze data collaboratively using interactive notebooks. You use job clusters to run fast and robust automated jobs.
You can create an all-purpose cluster using the UI, CLI, or REST API. You can manually terminate and restart an all-purpose cluster. Multiple users can share such clusters to do collaborative interactive analysis.
IMPORTANT: Databricks retains cluster configuration information for up to 200 all-purpose clusters terminated in the last 30 days and up to 30 job clusters recently terminated by the job scheduler. To keep an all-purpose cluster configuration even after it has been terminated for more than 30 days, an administrator can pin a cluster to the cluster list.
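A minimal sketch of listing clusters through this client. The getter `clusters()` is documented above; the `ListClustersRequest` and `ClusterDetails` class names and their accessors are assumptions based on the SDK's generated compute models and may differ between SDK versions.

```java
import com.databricks.sdk.WorkspaceClient;
import com.databricks.sdk.service.compute.ClusterDetails;
import com.databricks.sdk.service.compute.ListClustersRequest;

public class ListClusters {
  public static void main(String[] args) {
    WorkspaceClient w = new WorkspaceClient();
    // Iterate over all clusters visible to the caller; the SDK pages through
    // results transparently.
    for (ClusterDetails c : w.clusters().list(new ListClustersRequest())) {
      System.out.printf("%s (%s): %s%n",
          c.getClusterName(), c.getClusterId(), c.getState());
    }
  }
}
```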
public CommandExecutionAPI commandExecution()
public ConnectionsAPI connections()
A connection is an abstraction of an external data source that can be connected from Databricks Compute. Creating a connection object is the first step to managing external data sources within Unity Catalog, with the second step being creating a data object (catalog, schema, or table) using the connection. Data objects derived from a connection can be written to or read from similar to other Unity Catalog data objects based on cloud storage. Users may create different types of connections with each connection having a unique set of configuration options to support credential management and other settings.
public CredentialsManagerAPI credentialsManager()
public CurrentUserAPI currentUser()
public DashboardWidgetsAPI dashboardWidgets()
public DashboardsAPI dashboards()
public DataSourcesAPI dataSources()
This API does not support searches. It returns the full list of SQL warehouses in your workspace. We advise you to use any text editor, REST client, or `grep` to search the response from this API for the name of your SQL warehouse as it appears in Databricks SQL.
public DbfsExt dbfs()
public DbsqlPermissionsAPI dbsqlPermissions()
There are three levels of permission:
- `CAN_VIEW`: Allows read-only access
- `CAN_RUN`: Allows read access and run access (superset of `CAN_VIEW`)
- `CAN_MANAGE`: Allows all actions: read, run, edit, delete, modify permissions (superset of `CAN_RUN`)
public ExperimentsAPI experiments()
Experiments are located in the workspace file tree. You manage experiments using the same tools you use to manage other workspace objects such as folders, notebooks, and libraries.
public ExternalLocationsAPI externalLocations()
Databricks recommends using external locations rather than using storage credentials directly.
To create external locations, you must be a metastore admin or a user with the **CREATE_EXTERNAL_LOCATION** privilege.
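A hedged sketch of creating an external location with this client, assuming a storage credential already exists. The `CreateExternalLocation` and `ExternalLocationInfo` class names, and the example bucket, name, and credential, are illustrative assumptions rather than guaranteed API shapes.

```java
import com.databricks.sdk.WorkspaceClient;
import com.databricks.sdk.service.catalog.CreateExternalLocation;
import com.databricks.sdk.service.catalog.ExternalLocationInfo;

public class CreateLocation {
  public static void main(String[] args) {
    WorkspaceClient w = new WorkspaceClient();
    // Requires metastore admin or the CREATE_EXTERNAL_LOCATION privilege, per above.
    ExternalLocationInfo loc = w.externalLocations().create(
        new CreateExternalLocation()
            .setName("sales_raw")                          // placeholder name
            .setUrl("s3://my-bucket/sales-raw")            // placeholder path
            .setCredentialName("my_storage_credential"));  // existing credential
    System.out.println("Created external location: " + loc.getName());
  }
}
```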
public FilesAPI files()
public FunctionsAPI functions()
The function implementation can be any SQL expression or Query, and it can be invoked wherever a table reference is allowed in a query. In Unity Catalog, a function resides at the same level as a table, so it can be referenced with the form __catalog_name__.__schema_name__.__function_name__.
public GitCredentialsAPI gitCredentials()
See [more info].
[more info]: https://docs.databricks.com/repos/get-access-tokens-from-git-provider.html
public GlobalInitScriptsAPI globalInitScripts()
**Important:** Existing clusters must be restarted to pick up any changes made to global init scripts. Global init scripts are run in order. If the init script returns with a bad exit code, the Apache Spark container fails to launch and init scripts with later position are skipped. If enough containers fail, the entire cluster fails with a `GLOBAL_INIT_SCRIPT_FAILURE` error code.
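A hedged sketch of registering a global init script through this client. The `GlobalInitScriptCreateRequest` class name, the Base64 encoding of the script body, and the response accessor are assumptions modeled on the REST API; the script content and name are placeholders. Keeping a new script disabled until verified avoids the launch failures described above.

```java
import java.util.Base64;
import com.databricks.sdk.WorkspaceClient;
import com.databricks.sdk.service.compute.GlobalInitScriptCreateRequest;

public class AddGlobalInitScript {
  public static void main(String[] args) {
    WorkspaceClient w = new WorkspaceClient();
    // The script body is sent Base64-encoded.
    String script = Base64.getEncoder()
        .encodeToString("#!/bin/bash\necho hello >> /tmp/init.log\n".getBytes());
    var created = w.globalInitScripts().create(
        new GlobalInitScriptCreateRequest()
            .setName("hello-init")   // placeholder name
            .setScript(script)
            .setEnabled(false));     // leave disabled until verified
    System.out.println("Created script id: " + created.getScriptId());
  }
}
```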
public GrantsAPI grants()
Securable objects in Unity Catalog are hierarchical and privileges are inherited downward. This means that granting a privilege on the catalog automatically grants the privilege to all current and future objects within the catalog. Similarly, privileges granted on a schema are inherited by all current and future objects within that schema.
public GroupsAPI groups()
It is best practice to assign access to workspaces and access-control policies in Unity Catalog to groups, instead of to users individually. All Databricks workspace identities can be assigned as members of groups, and members inherit permissions that are assigned to their group.
public InstancePoolsAPI instancePools()
Databricks pools reduce cluster start and auto-scaling times by maintaining a set of idle, ready-to-use instances. When a cluster is attached to a pool, cluster nodes are created using the pool’s idle instances. If the pool has no idle instances, the pool expands by allocating a new instance from the instance provider in order to accommodate the cluster’s request. When a cluster releases an instance, it returns to the pool and is free for another cluster to use. Only clusters attached to a pool can use that pool’s idle instances.
You can specify a different pool for the driver node and worker nodes, or use the same pool for both.
Databricks does not charge DBUs while instances are idle in the pool. Instance provider billing does apply. See pricing.
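A hedged sketch of creating a small pool with this client. The `CreateInstancePool` class name, its setters, and the response accessor are assumptions based on the REST API fields; the node type is an AWS-flavoured placeholder and is cloud-specific.

```java
import com.databricks.sdk.WorkspaceClient;
import com.databricks.sdk.service.compute.CreateInstancePool;

public class CreatePool {
  public static void main(String[] args) {
    WorkspaceClient w = new WorkspaceClient();
    var pool = w.instancePools().create(
        new CreateInstancePool()
            .setInstancePoolName("analytics-pool")          // placeholder name
            .setNodeTypeId("i3.xlarge")                     // cloud-specific node type
            .setMinIdleInstances(1L)                        // keep one warm instance
            .setIdleInstanceAutoterminationMinutes(15L));   // release extra idle capacity
    System.out.println("Pool id: " + pool.getInstancePoolId());
  }
}
```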
public InstanceProfilesAPI instanceProfiles()
[Secure access to S3 buckets]: https://docs.databricks.com/administration-guide/cloud-configurations/aws/instance-profiles.html
public IpAccessListsAPI ipAccessLists()
IP access lists affect web application access and REST API access to this workspace only. If the feature is disabled for a workspace, all access is allowed for this workspace. There is support for allow lists (inclusion) and block lists (exclusion).
When a connection is attempted:
1. **First, all block lists are checked.** If the connection IP address matches any block list, the connection is rejected.
2. **If the connection was not rejected by block lists**, the IP address is compared with the allow lists.
If there is at least one allow list for the workspace, the connection is allowed only if the IP address matches an allow list. If there are no allow lists for the workspace, all IP addresses are allowed.
For all allow lists and block lists combined, the workspace supports a maximum of 1000 IP/CIDR values, where one CIDR counts as a single value.
After changes to the IP access list feature, it can take a few minutes for changes to take effect.
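A hedged sketch of adding an allow list via this client. The `CreateIpAccessList` class, the `ListType` enum, and the response accessors are assumptions based on the REST API model names; the label and CIDR are placeholders. One CIDR block counts as a single value toward the 1000-value limit noted above.

```java
import java.util.List;
import com.databricks.sdk.WorkspaceClient;
import com.databricks.sdk.service.settings.CreateIpAccessList;
import com.databricks.sdk.service.settings.ListType;

public class AllowOfficeIps {
  public static void main(String[] args) {
    WorkspaceClient w = new WorkspaceClient();
    var created = w.ipAccessLists().create(
        new CreateIpAccessList()
            .setLabel("office")                           // placeholder label
            .setListType(ListType.ALLOW)
            .setIpAddresses(List.of("203.0.113.0/24")));  // placeholder CIDR
    System.out.println("List id: " + created.getIpAccessList().getListId());
  }
}
```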
public JobsAPI jobs()
You can use a Databricks job to run a data processing or data analysis task in a Databricks cluster with scalable resources. Your job can consist of a single task or can be a large, multi-task workflow with complex dependencies. Databricks manages the task orchestration, cluster management, monitoring, and error reporting for all of your jobs. You can run your jobs immediately or periodically through an easy-to-use scheduling system. You can implement job tasks using notebooks, JARs, Delta Live Tables pipelines, or Python, Scala, Spark submit, and Java applications.
You should never hard code secrets or store them in plain text. Use the [Secrets CLI] to manage secrets in the [Databricks CLI]. Use the [Secrets utility] to reference secrets in notebooks and jobs.
[Databricks CLI]: https://docs.databricks.com/dev-tools/cli/index.html
[Secrets CLI]: https://docs.databricks.com/dev-tools/cli/secrets-cli.html
[Secrets utility]: https://docs.databricks.com/dev-tools/databricks-utils.html#dbutils-secrets
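A minimal sketch of enumerating jobs through this client. The `ListJobsRequest` and `BaseJob` class names and their accessors are assumptions based on the SDK's generated jobs models.

```java
import com.databricks.sdk.WorkspaceClient;
import com.databricks.sdk.service.jobs.BaseJob;
import com.databricks.sdk.service.jobs.ListJobsRequest;

public class ListJobs {
  public static void main(String[] args) {
    WorkspaceClient w = new WorkspaceClient();
    // Page through all jobs in the workspace; the SDK handles pagination.
    for (BaseJob job : w.jobs().list(new ListJobsRequest())) {
      System.out.printf("%d: %s%n", job.getJobId(), job.getSettings().getName());
    }
  }
}
```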
public LibrariesAPI libraries()
To make third-party or custom code available to notebooks and jobs running on your clusters, you can install a library. Libraries can be written in Python, Java, Scala, and R. You can upload Java, Scala, and Python libraries and point to external packages in PyPI, Maven, and CRAN repositories.
Cluster libraries can be used by all notebooks running on a cluster. You can install a cluster library directly from a public repository such as PyPI or Maven, using a previously installed workspace library, or using an init script.
When you install a library on a cluster, a notebook already attached to that cluster will not immediately see the new library. You must first detach and then reattach the notebook to the cluster.
When you uninstall a library from a cluster, the library is removed only when you restart the cluster. Until you restart the cluster, the status of the uninstalled library appears as Uninstall pending restart.
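A hedged sketch of installing a PyPI package on a running cluster with this client. The `InstallLibraries`, `Library`, and `PythonPyPiLibrary` class names mirror the REST API models but are assumptions here; the cluster ID and package are placeholders.

```java
import java.util.List;
import com.databricks.sdk.WorkspaceClient;
import com.databricks.sdk.service.compute.InstallLibraries;
import com.databricks.sdk.service.compute.Library;
import com.databricks.sdk.service.compute.PythonPyPiLibrary;

public class InstallPypiLibrary {
  public static void main(String[] args) {
    WorkspaceClient w = new WorkspaceClient();
    String clusterId = "0123-456789-abcdefgh"; // placeholder cluster ID
    // Request installation of a PyPI package; notebooks already attached to the
    // cluster must be detached and reattached before they see it (see above).
    w.libraries().install(new InstallLibraries()
        .setClusterId(clusterId)
        .setLibraries(List.of(
            new Library().setPypi(new PythonPyPiLibrary().setPackage("requests")))));
  }
}
```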
public MetastoresAPI metastores()
Each metastore is configured with a root storage location in a cloud storage account. This storage location is used for metadata and managed tables data.
NOTE: This metastore is distinct from the metastore included in Databricks workspaces created before Unity Catalog was released. If your workspace includes a legacy Hive metastore, the data in that metastore is available in a catalog named hive_metastore.
public ModelRegistryAPI modelRegistry()
The Workspace Model Registry is a centralized model repository and a UI and set of APIs that enable you to manage the full lifecycle of MLflow Models.
public ModelVersionsAPI modelVersions()
This API reference documents the REST endpoints for managing model versions in Unity Catalog. For more details, see the [registered models API docs](/api/workspace/registeredmodels).
public PermissionsAPI permissions()
* **[Cluster permissions](:service:clusters)** — Manage which users can manage, restart, or attach to clusters.
* **[Cluster policy permissions](:service:clusterpolicies)** — Manage which users can use cluster policies.
* **[Delta Live Tables pipeline permissions](:service:pipelines)** — Manage which users can view, manage, run, cancel, or own a Delta Live Tables pipeline.
* **[Job permissions](:service:jobs)** — Manage which users can view, manage, trigger, cancel, or own a job.
* **[MLflow experiment permissions](:service:experiments)** — Manage which users can read, edit, or manage MLflow experiments.
* **[MLflow registered model permissions](:service:modelregistry)** — Manage which users can read, edit, or manage MLflow registered models.
* **[Password permissions](:service:users)** — Manage which users can use password login when SSO is enabled.
* **[Instance Pool permissions](:service:instancepools)** — Manage which users can manage or attach to pools.
* **[Repo permissions](repos)** — Manage which users can read, run, edit, or manage a repo.
* **[Serving endpoint permissions](:service:servingendpoints)** — Manage which users can view, query, or manage a serving endpoint.
* **[SQL warehouse permissions](:service:warehouses)** — Manage which users can use or manage SQL warehouses.
* **[Token permissions](:service:tokenmanagement)** — Manage which users can create or use tokens.
* **[Workspace object permissions](:service:workspace)** — Manage which users can read, run, edit, or manage directories, files, and notebooks.
For the mapping of the required permissions for specific actions or abilities and other important information, see [Access Control].
[Access Control]: https://docs.databricks.com/security/auth-authz/access-control/index.html
public PipelinesAPI pipelines()
Delta Live Tables is a framework for building reliable, maintainable, and testable data processing pipelines. You define the transformations to perform on your data, and Delta Live Tables manages task orchestration, cluster management, monitoring, data quality, and error handling.
Instead of defining your data pipelines using a series of separate Apache Spark tasks, Delta Live Tables manages how your data is transformed based on a target schema you define for each processing step. You can also enforce data quality with Delta Live Tables expectations. Expectations allow you to define expected data quality and specify how to handle records that fail those expectations.
public PolicyFamiliesAPI policyFamilies()
Databricks manages and provides policy families for several common cluster use cases. You cannot create, edit, or delete policy families.
Policy families cannot be used directly to create clusters. Instead, you create cluster policies using a policy family. Cluster policies created using a policy family inherit the policy family's policy definition.
public ProvidersAPI providers()
public QueriesAPI queries()
public QueryHistoryAPI queryHistory()
public QueryVisualizationsAPI queryVisualizations()
public RecipientActivationAPI recipientActivation()
Note that you can download the credential file only once. Recipients should treat the downloaded credential as a secret and must not share it outside of their organization.
public RecipientsAPI recipients()
- For recipients with access to a Databricks workspace that is enabled for Unity Catalog, you can create a recipient object along with a unique sharing identifier you get from the recipient. The sharing identifier is the key identifier that enables the secure connection. This sharing mode is called **Databricks-to-Databricks sharing**.
- For recipients without access to a Databricks workspace that is enabled for Unity Catalog, when you create a recipient object, Databricks generates an activation link you can send to the recipient. The recipient follows the activation link to download the credential file, and then uses the credential file to establish a secure connection to receive the shared data. This sharing mode is called **open sharing**.
public RegisteredModelsAPI registeredModels()
An MLflow registered model resides in the third layer of Unity Catalog’s three-level namespace. Registered models contain model versions, which correspond to actual ML models (MLflow models). Creating new model versions currently requires use of the MLflow Python client. Once model versions are created, you can load them for batch inference using MLflow Python client APIs, or deploy them for real-time serving using Databricks Model Serving.
All operations on registered models and model versions require USE_CATALOG permissions on the enclosing catalog and USE_SCHEMA permissions on the enclosing schema. In addition, the following additional privileges are required for various operations:
* To create a registered model, users must additionally have the CREATE_MODEL permission on the target schema.
* To view registered model or model version metadata, model version data files, or invoke a model version, users must additionally have the EXECUTE permission on the registered model.
* To update registered model or model version tags, users must additionally have APPLY TAG permissions on the registered model.
* To update other registered model or model version metadata (comments, aliases), create a new model version, or update permissions on the registered model, users must be owners of the registered model.
Note: The securable type for models is "FUNCTION". When using REST APIs (e.g. tagging, grants) that specify a securable type, use "FUNCTION" as the securable type.
public ReposAPI repos()
Databricks Repos is a visual Git client in Databricks. It supports common Git operations such as cloning a repository, committing and pushing, pulling, branch management, and visual comparison of diffs when committing.
Within Repos you can develop code in notebooks or other files and follow data science and engineering code development best practices using Git for version control, collaboration, and CI/CD.
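A hedged sketch of cloning a Git repository into the workspace with this client. The `CreateRepo` class name, its package, the provider string, and the repository URL and path are assumptions and placeholders, not confirmed by this reference.

```java
import com.databricks.sdk.WorkspaceClient;
import com.databricks.sdk.service.workspace.CreateRepo;

public class CloneRepo {
  public static void main(String[] args) {
    WorkspaceClient w = new WorkspaceClient();
    // Clone a Git repository into a Repos path (placeholder URL and path).
    var repo = w.repos().create(new CreateRepo()
        .setUrl("https://github.com/databricks/databricks-sdk-java")
        .setProvider("gitHub")
        .setPath("/Repos/someone@example.com/databricks-sdk-java"));
    System.out.println("Repo id: " + repo.getId());
  }
}
```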
public SchemasAPI schemas()
public SecretsExt secrets()
Sometimes accessing data requires that you authenticate to external data sources through JDBC. Instead of directly entering your credentials into a notebook, use Databricks secrets to store your credentials and reference them in notebooks and jobs.
Administrators, secret creators, and users granted permission can read Databricks secrets. While Databricks makes an effort to redact secret values that might be displayed in notebooks, it is not possible to prevent such users from reading secrets.
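A hedged sketch of storing a JDBC credential as a secret with this client, so notebooks can reference it instead of hard-coding it. The `CreateScope` and `PutSecret` class names and their package are assumptions based on the SDK's generated workspace models; the scope, key, and value are placeholders.

```java
import com.databricks.sdk.WorkspaceClient;
import com.databricks.sdk.service.workspace.CreateScope;
import com.databricks.sdk.service.workspace.PutSecret;

public class StoreJdbcPassword {
  public static void main(String[] args) {
    WorkspaceClient w = new WorkspaceClient();
    // Create a scope and store a credential in it; a notebook can then read it
    // with dbutils.secrets.get("jdbc", "password") rather than embedding it.
    w.secrets().createScope(new CreateScope().setScope("jdbc"));
    w.secrets().putSecret(new PutSecret()
        .setScope("jdbc")
        .setKey("password")
        .setStringValue("s3cr3t")); // placeholder value
  }
}
```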
public ServicePrincipalsAPI servicePrincipals()
public ServingEndpointsAPI servingEndpoints()
You can use a serving endpoint to serve models from the Databricks Model Registry or from Unity Catalog. Endpoints expose the underlying models as scalable REST API endpoints using serverless compute. This means the endpoints and associated compute resources are fully managed by Databricks and will not appear in your cloud account. A serving endpoint can consist of one or more MLflow models from the Databricks Model Registry, called served models. A serving endpoint can have at most ten served models. You can configure traffic settings to define how requests should be routed to your served models behind an endpoint. Additionally, you can configure the scale of resources that should be applied to each served model.
public SettingsAPI settings()
Through this API, users can retrieve, set, or modify the default namespace used when queries do not reference a fully qualified three-level name. For example, if you use the API to set 'retail_prod' as the default catalog, then a query 'SELECT * FROM myTable' would reference the object 'retail_prod.default.myTable' (the schema 'default' is always assumed).
This setting requires a restart of clusters and SQL warehouses to take effect. Additionally, the default namespace only applies when using Unity Catalog-enabled compute.
public SharesAPI shares()
public StatementExecutionAPI statementExecution()
**Getting started**
We suggest beginning with the [Databricks SQL Statement Execution API tutorial].
**Overview of statement execution and result fetching**
Statement execution begins by issuing a :method:statementexecution/executeStatement request with a valid SQL statement and warehouse ID, along with optional parameters such as the data catalog and output format. If no other parameters are specified, the server will wait for up to 10s before returning a response. If the statement has completed within this timespan, the response will include the result data as a JSON array and metadata. Otherwise, if no result is available after the 10s timeout expired, the response will provide the statement ID that can be used to poll for results by using a :method:statementexecution/getStatement request.
You can specify whether the call should behave synchronously, asynchronously or start synchronously with a fallback to asynchronous execution. This is controlled with the `wait_timeout` and `on_wait_timeout` settings. If `wait_timeout` is set between 5-50 seconds (default: 10s), the call waits for results up to the specified timeout; when set to `0s`, the call is asynchronous and responds immediately with a statement ID. The `on_wait_timeout` setting specifies what should happen when the timeout is reached while the statement execution has not yet finished. This can be set to either `CONTINUE`, to fallback to asynchronous mode, or it can be set to `CANCEL`, which cancels the statement.
In summary:
- Synchronous mode - `wait_timeout=30s` and `on_wait_timeout=CANCEL` - The call waits up to 30 seconds; if the statement execution finishes within this time, the result data is returned directly in the response. If the execution takes longer than 30 seconds, the execution is canceled and the call returns with a `CANCELED` state.
- Asynchronous mode - `wait_timeout=0s` (`on_wait_timeout` is ignored) - The call doesn't wait for the statement to finish but returns directly with a statement ID. The status of the statement execution can be polled by issuing :method:statementexecution/getStatement with the statement ID. Once the execution has succeeded, this call also returns the result and metadata in the response.
- Hybrid mode (default) - `wait_timeout=10s` and `on_wait_timeout=CONTINUE` - The call waits for up to 10 seconds; if the statement execution finishes within this time, the result data is returned directly in the response. If the execution takes longer than 10 seconds, a statement ID is returned. The statement ID can be used to fetch status and results in the same way as in the asynchronous mode.
Depending on the size, the result can be split into multiple chunks. If the statement execution is successful, the statement response contains a manifest and the first chunk of the result. The manifest contains schema information and provides metadata for each chunk in the result. Result chunks can be retrieved by index with :method:statementexecution/getStatementResultChunkN which may be called in any order and in parallel. For sequential fetching, each chunk, apart from the last, also contains a `next_chunk_index` and `next_chunk_internal_link` that point to the next chunk.
A statement can be canceled with :method:statementexecution/cancelExecution.
**Fetching result data: format and disposition**
To specify the format of the result data, use the `format` field, which can be set to one of the following options: `JSON_ARRAY` (JSON), `ARROW_STREAM` ([Apache Arrow Columnar]), or `CSV`.
There are two ways to receive statement results, controlled by the `disposition` setting, which can be either `INLINE` or `EXTERNAL_LINKS`:
- `INLINE`: In this mode, the result data is directly included in the response. It's best suited for smaller results. This mode can only be used with the `JSON_ARRAY` format.
- `EXTERNAL_LINKS`: In this mode, the response provides links that can be used to download the result data in chunks separately. This approach is ideal for larger results and offers higher throughput. This mode can be used with all the formats: `JSON_ARRAY`, `ARROW_STREAM`, and `CSV`.
By default, the API uses `format=JSON_ARRAY` and `disposition=INLINE`.
**Limits and limitations**
Note: The byte limit for INLINE disposition is based on internal storage metrics and will not exactly match the byte count of the actual payload.
- Statements with `disposition=INLINE` are limited to 25 MiB and will fail when this limit is exceeded.
- Statements with `disposition=EXTERNAL_LINKS` are limited to 100 GiB. Result sets larger than this limit will be truncated. Truncation is indicated by the `truncated` field in the result manifest.
- The maximum query text size is 16 MiB.
- Cancelation might silently fail. A successful response from a cancel request indicates that the cancel request was successfully received and sent to the processing engine. However, an outstanding statement might have already completed execution when the cancel request arrives. Polling for status until a terminal state is reached is a reliable way to determine the final state.
- Wait timeouts are approximate, occur server-side, and cannot account for things such as caller delays and network latency from caller to service.
- The system will auto-close a statement after one hour if the client stops polling and thus you must poll at least once an hour.
- The results are only available for one hour after success; polling does not extend this.

[Apache Arrow Columnar]: https://arrow.apache.org/overview/
[Databricks SQL Statement Execution API tutorial]: https://docs.databricks.com/sql/api/sql-execution-tutorial.html
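A hedged sketch of a hybrid-style statement execution through this client: wait up to 30 seconds, then fall back to polling via the statement ID as described above. The `ExecuteStatementRequest` class name, its setters, and the response accessors are assumptions based on the SDK's generated SQL models; the warehouse ID is a placeholder.

```java
import com.databricks.sdk.WorkspaceClient;
import com.databricks.sdk.service.sql.ExecuteStatementRequest;

public class RunStatement {
  public static void main(String[] args) {
    WorkspaceClient w = new WorkspaceClient();
    var response = w.statementExecution().executeStatement(new ExecuteStatementRequest()
        .setWarehouseId("abcdef1234567890")   // placeholder SQL warehouse ID
        .setStatement("SELECT 1 AS one")
        .setWaitTimeout("30s"));              // on_wait_timeout defaults to CONTINUE
    System.out.println("Statement " + response.getStatementId()
        + " state: " + response.getStatus().getState());
  }
}
```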
public StorageCredentialsAPI storageCredentials()
Databricks recommends using external locations rather than using storage credentials directly.
To create storage credentials, you must be a Databricks account admin. The account admin who creates the storage credential can delegate ownership to another user or group to manage permissions on it.
public SystemSchemasAPI systemSchemas()
public TableConstraintsAPI tableConstraints()
Primary and foreign keys are informational only and are not enforced. Foreign keys must reference a primary key in another table. This primary key is the parent constraint of the foreign key and the table this primary key is on is the parent table of the foreign key. Similarly, the foreign key is the child constraint of its referenced primary key; the table of the foreign key is the child table of the primary key.
You can declare primary keys and foreign keys as part of the table specification during table creation. You can also add or drop constraints on existing tables.
public TablesAPI tables()
A table can be managed or external. From an API perspective, a __VIEW__ is a particular kind of table (rather than a managed or external table).
public TokenManagementAPI tokenManagement()
public TokensAPI tokens()
public UsersAPI users()
Databricks recommends using SCIM provisioning to sync users and groups automatically from your identity provider to your Databricks workspace. SCIM streamlines onboarding a new employee or team by using your identity provider to create users and groups in Databricks workspace and give them the proper level of access. When a user leaves your organization or no longer needs access to Databricks workspace, admins can terminate the user in your identity provider and that user’s account will also be removed from Databricks workspace. This ensures a consistent offboarding process and prevents unauthorized users from accessing sensitive data.
public VolumesAPI volumes()
public WarehousesAPI warehouses()
public WorkspaceAPI workspace()
A notebook is a web-based interface to a document that contains runnable code, visualizations, and explanatory text.
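A hedged sketch of listing the notebooks and folders under a workspace path with this client. The `ListWorkspaceRequest` and `ObjectInfo` class names and their accessors are assumptions based on the SDK's generated workspace models; the path is a placeholder.

```java
import com.databricks.sdk.WorkspaceClient;
import com.databricks.sdk.service.workspace.ListWorkspaceRequest;
import com.databricks.sdk.service.workspace.ObjectInfo;

public class ListNotebooks {
  public static void main(String[] args) {
    WorkspaceClient w = new WorkspaceClient();
    String path = "/Users/someone@example.com"; // placeholder workspace path
    // List the objects (notebooks, folders, files) directly under the path.
    for (ObjectInfo obj : w.workspace().list(new ListWorkspaceRequest().setPath(path))) {
      System.out.println(obj.getObjectType() + "\t" + obj.getPath());
    }
  }
}
```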
public WorkspaceBindingsAPI workspaceBindings()
NOTE: The __isolation_mode__ is configured for the securable itself (using its Update method) and the workspace bindings are only consulted when the securable's __isolation_mode__ is set to __ISOLATED__.
A securable's workspace bindings can be configured by a metastore admin or the owner of the securable.
The original path (/api/2.1/unity-catalog/workspace-bindings/catalogs/{name}) is deprecated. Please use the new path (/api/2.1/unity-catalog/bindings/{securable_type}/{securable_name}) which introduces the ability to bind a securable in READ_ONLY mode (catalogs only).
Securables that support binding: - catalog
public WorkspaceConfAPI workspaceConf()
public WorkspaceClient withAccountAccessControlProxyImpl(AccountAccessControlProxyService accountAccessControlProxy)
public WorkspaceClient withAlertsImpl(AlertsService alerts)
public WorkspaceClient withAppsImpl(AppsService apps)
public WorkspaceClient withArtifactAllowlistsImpl(ArtifactAllowlistsService artifactAllowlists)
public WorkspaceClient withCatalogsImpl(CatalogsService catalogs)
public WorkspaceClient withCleanRoomsImpl(CleanRoomsService cleanRooms)
public WorkspaceClient withClusterPoliciesImpl(ClusterPoliciesService clusterPolicies)
public WorkspaceClient withClustersImpl(ClustersService clusters)
public WorkspaceClient withCommandExecutionImpl(CommandExecutionService commandExecution)
public WorkspaceClient withConnectionsImpl(ConnectionsService connections)
public WorkspaceClient withCredentialsManagerImpl(CredentialsManagerService credentialsManager)
public WorkspaceClient withCurrentUserImpl(CurrentUserService currentUser)
public WorkspaceClient withDashboardWidgetsImpl(DashboardWidgetsService dashboardWidgets)
public WorkspaceClient withDashboardsImpl(DashboardsService dashboards)
public WorkspaceClient withDataSourcesImpl(DataSourcesService dataSources)
public WorkspaceClient withDbfsImpl(DbfsService dbfs)
public WorkspaceClient withDbsqlPermissionsImpl(DbsqlPermissionsService dbsqlPermissions)
public WorkspaceClient withExperimentsImpl(ExperimentsService experiments)
public WorkspaceClient withExternalLocationsImpl(ExternalLocationsService externalLocations)
public WorkspaceClient withFilesImpl(FilesService files)
public WorkspaceClient withFunctionsImpl(FunctionsService functions)
public WorkspaceClient withGitCredentialsImpl(GitCredentialsService gitCredentials)
public WorkspaceClient withGlobalInitScriptsImpl(GlobalInitScriptsService globalInitScripts)
public WorkspaceClient withGrantsImpl(GrantsService grants)
public WorkspaceClient withGroupsImpl(GroupsService groups)
public WorkspaceClient withInstancePoolsImpl(InstancePoolsService instancePools)
public WorkspaceClient withInstanceProfilesImpl(InstanceProfilesService instanceProfiles)
public WorkspaceClient withIpAccessListsImpl(IpAccessListsService ipAccessLists)
public WorkspaceClient withJobsImpl(JobsService jobs)
public WorkspaceClient withLibrariesImpl(LibrariesService libraries)
public WorkspaceClient withMetastoresImpl(MetastoresService metastores)
public WorkspaceClient withModelRegistryImpl(ModelRegistryService modelRegistry)
public WorkspaceClient withModelVersionsImpl(ModelVersionsService modelVersions)
public WorkspaceClient withPermissionsImpl(PermissionsService permissions)
public WorkspaceClient withPipelinesImpl(PipelinesService pipelines)
public WorkspaceClient withPolicyFamiliesImpl(PolicyFamiliesService policyFamilies)
public WorkspaceClient withProvidersImpl(ProvidersService providers)
public WorkspaceClient withQueriesImpl(QueriesService queries)
public WorkspaceClient withQueryHistoryImpl(QueryHistoryService queryHistory)
public WorkspaceClient withQueryVisualizationsImpl(QueryVisualizationsService queryVisualizations)
public WorkspaceClient withRecipientActivationImpl(RecipientActivationService recipientActivation)
public WorkspaceClient withRecipientsImpl(RecipientsService recipients)
public WorkspaceClient withRegisteredModelsImpl(RegisteredModelsService registeredModels)
public WorkspaceClient withReposImpl(ReposService repos)
public WorkspaceClient withSchemasImpl(SchemasService schemas)
public WorkspaceClient withSecretsImpl(SecretsService secrets)
public WorkspaceClient withServicePrincipalsImpl(ServicePrincipalsService servicePrincipals)
public WorkspaceClient withServingEndpointsImpl(ServingEndpointsService servingEndpoints)
public WorkspaceClient withSettingsImpl(SettingsService settings)
public WorkspaceClient withSharesImpl(SharesService shares)
public WorkspaceClient withStatementExecutionImpl(StatementExecutionService statementExecution)
public WorkspaceClient withStorageCredentialsImpl(StorageCredentialsService storageCredentials)
public WorkspaceClient withSystemSchemasImpl(SystemSchemasService systemSchemas)
public WorkspaceClient withTableConstraintsImpl(TableConstraintsService tableConstraints)
public WorkspaceClient withTablesImpl(TablesService tables)
public WorkspaceClient withTokenManagementImpl(TokenManagementService tokenManagement)
public WorkspaceClient withTokensImpl(TokensService tokens)
public WorkspaceClient withUsersImpl(UsersService users)
public WorkspaceClient withVolumesImpl(VolumesService volumes)
public WorkspaceClient withWarehousesImpl(WarehousesService warehouses)
public WorkspaceClient withWorkspaceImpl(WorkspaceService workspace)
public WorkspaceClient withWorkspaceBindingsImpl(WorkspaceBindingsService workspaceBindings)
public WorkspaceClient withWorkspaceConfImpl(WorkspaceConfService workspaceConf)
public ApiClient apiClient()
public DatabricksConfig config()