Thursday, 15 January 2026

Databricks REST API – Complete Enterprise Automation Guide (Python + AWS)

This guide covers the most commonly used Databricks REST API endpoints, with Python examples for enterprise automation on AWS.


0️⃣ Authentication & Base Configuration

Account-Level APIs

Base URL: https://accounts.cloud.databricks.com
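Auth: Account-level OAuth access token for an account admin user or service principal (personal access tokens are workspace-scoped and are not accepted by account-level APIs)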

Workspace-Level APIs

Base URL: https://dbc-xxxx.cloud.databricks.com
Auth: Workspace PAT

import requests

# Placeholders: replace with your own account ID, hosts, and tokens.
ACCOUNT_ID = "xxxx"
ACCOUNT_HOST = "https://accounts.cloud.databricks.com"
WORKSPACE_HOST = "https://dbc-xxxx.cloud.databricks.com"

# Headers for account-level calls
ACCOUNT_HEADERS = {
    "Authorization": "Bearer ACCOUNT_TOKEN",
    "Content-Type": "application/json"
}

# Headers for workspace-level calls
WORKSPACE_HEADERS = {
    "Authorization": "Bearer WORKSPACE_TOKEN",
    "Content-Type": "application/json"
}
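Hard-coding tokens in source files is risky in shared repositories. A minimal sketch of building the same headers from environment variables follows; the helper names and the variable name DATABRICKS_WORKSPACE_TOKEN are illustrative assumptions, not names defined by Databricks or by this guide.

```python
import os


def bearer_headers(token: str) -> dict:
    """Build the JSON request headers used throughout this guide."""
    if not token:
        raise ValueError("token must be a non-empty string")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }


def headers_from_env(var_name: str, environ=os.environ) -> dict:
    """Read a token from an environment variable (name chosen by the caller)."""
    token = environ.get(var_name, "")
    if not token:
        raise RuntimeError(f"environment variable {var_name} is not set")
    return bearer_headers(token)
```

For example, WORKSPACE_HEADERS = headers_from_env("DATABRICKS_WORKSPACE_TOKEN") would replace the literal dictionary, assuming that variable has been exported beforehand.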

1️⃣ Identity & SCIM APIs

Endpoint                       Purpose
POST /scim/v2/Users            Create user
GET /scim/v2/Users             List users
POST /scim/v2/Groups           Create group
PATCH /scim/v2/Groups/{id}     Add/remove members

Create User

url = f"{ACCOUNT_HOST}/api/2.0/accounts/{ACCOUNT_ID}/scim/v2/Users"
payload = {
  "userName": "alice@company.com",
  "displayName": "Alice",
  "active": True  # Python boolean; requests serializes it to JSON true
}
requests.post(url, headers=ACCOUNT_HEADERS, json=payload).raise_for_status()

Create Group

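url = f"{ACCOUNT_HOST}/api/2.0/accounts/{ACCOUNT_ID}/scim/v2/Groups"
payload = {"displayName": "data-engineers"}
resp = requests.post(url, headers=ACCOUNT_HEADERS, json=payload)
resp.raise_for_status()  # fail fast instead of parsing an error body
group = resp.json()      # group["id"] identifies the group in later calls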
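The endpoint table lists PATCH /scim/v2/Groups/{id} for membership changes, but there is no example for it. The sketch below builds SCIM PatchOp bodies in the shape defined by the SCIM 2.0 protocol; the helper names are hypothetical, and the user IDs are assumed to be the `id` values returned when the users were created.

```python
SCIM_PATCH_SCHEMA = "urn:ietf:params:scim:api:messages:2.0:PatchOp"


def add_members_patch(user_ids):
    """Body for a PATCH request that adds the given user IDs to a group."""
    return {
        "schemas": [SCIM_PATCH_SCHEMA],
        "Operations": [
            {
                "op": "add",
                "value": {"members": [{"value": str(uid)} for uid in user_ids]},
            }
        ],
    }


def remove_member_patch(user_id):
    """Body for a PATCH request that removes one user ID from a group."""
    return {
        "schemas": [SCIM_PATCH_SCHEMA],
        "Operations": [
            {"op": "remove", "path": f'members[value eq "{user_id}"]'}
        ],
    }
```

A call would then look like requests.patch(f"{ACCOUNT_HOST}/api/2.0/accounts/{ACCOUNT_ID}/scim/v2/Groups/{group['id']}", headers=ACCOUNT_HEADERS, json=add_members_patch(["123"])), where "123" is a placeholder user ID.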

2️⃣ Workspace (Account-Level) APIs

Endpoint                                                                        Description
POST /workspaces                                                                Create workspace
GET /workspaces                                                                 List workspaces
PUT /workspaces/{workspace_id}/permissionassignments/principals/{principal_id}  Assign groups to workspace

Create Workspace

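url = f"{ACCOUNT_HOST}/api/2.0/accounts/{ACCOUNT_ID}/workspaces"
payload = {
  "workspace_name": "prod",
  "aws_region": "us-east-1",
  # Placeholder IDs: use the IDs of your own credential, storage, and network configurations
  "credentials_id": "cred-123",
  "storage_configuration_id": "storage-123",
  "network_id": "network-123"
}
resp = requests.post(url, headers=ACCOUNT_HEADERS, json=payload)
resp.raise_for_status()
workspace = resp.json()  # provisioning continues asynchronously after this call returns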
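Assigning a group to a workspace is listed in the table but has no example. The following is a sketch only: it assumes the workspace assignment endpoint is called with PUT on a per-principal path and accepts a `permissions` list of "USER" or "ADMIN", and that the principal ID is the numeric `id` of the group returned by the SCIM API. The helper name is hypothetical.

```python
def permission_assignment_request(account_host, account_id, workspace_id,
                                  principal_id, permissions=("USER",)):
    """Return (url, json_body) for assigning a principal to a workspace."""
    allowed = {"USER", "ADMIN"}
    unknown = set(permissions) - allowed
    if unknown:
        raise ValueError(f"unsupported permissions: {sorted(unknown)}")
    url = (
        f"{account_host}/api/2.0/accounts/{account_id}"
        f"/workspaces/{workspace_id}"
        f"/permissionassignments/principals/{principal_id}"
    )
    return url, {"permissions": list(permissions)}
```

With the values from the earlier sections, the call would be url, body = permission_assignment_request(ACCOUNT_HOST, ACCOUNT_ID, workspace["workspace_id"], group["id"]) followed by requests.put(url, headers=ACCOUNT_HEADERS, json=body).raise_for_status().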

3️⃣ Cluster APIs

Endpoint                   Description
POST /clusters/create      Create cluster
GET /clusters/list         List clusters
POST /clusters/start       Start cluster
POST /clusters/delete      Terminate cluster (use /clusters/permanent-delete to remove it)

Create Cluster

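url = f"{WORKSPACE_HOST}/api/2.0/clusters/create"
payload = {
  "cluster_name": "engineering",
  "spark_version": "13.3.x-scala2.12",
  "node_type_id": "m5.xlarge",
  "num_workers": 2
}
resp = requests.post(url, headers=WORKSPACE_HEADERS, json=payload)
resp.raise_for_status()  # otherwise a failed call surfaces later as a KeyError on cluster_id
cluster = resp.json()    # contains the new cluster_id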

Set Cluster Permissions

url = f"{WORKSPACE_HOST}/api/2.0/permissions/clusters/{cluster['cluster_id']}"
payload = {
  "access_control_list": [
    {
      "group_name": "data-engineers",
      "permission_level": "CAN_ATTACH_TO"
    }
  ]
}
# PATCH adds to the existing permissions; it does not replace them
requests.patch(url, headers=WORKSPACE_HEADERS, json=payload).raise_for_status()
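Cluster creation returns before the cluster is usable, so automation usually has to wait for it. Below is a minimal polling sketch; the function name, the timeout defaults, and the choice of failure states are assumptions, and the state lookup is passed in as a callable so the loop itself does not depend on any HTTP library.

```python
import time


def wait_for_state(get_state, target="RUNNING",
                   failed=("TERMINATED", "ERROR", "UNKNOWN"),
                   timeout_s=1200, interval_s=15,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll get_state() until it returns `target`, a failed state, or times out."""
    deadline = clock() + timeout_s
    while True:
        state = get_state()
        if state == target:
            return state
        if state in failed:
            raise RuntimeError(f"cluster entered state {state}")
        if clock() >= deadline:
            raise TimeoutError(f"cluster still {state} after {timeout_s}s")
        sleep(interval_s)
```

In this guide's setting, get_state could be lambda: requests.get(f"{WORKSPACE_HOST}/api/2.0/clusters/get", headers=WORKSPACE_HEADERS, params={"cluster_id": cluster["cluster_id"]}).json()["state"].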

4️⃣ Jobs API

Endpoint               Purpose
POST /jobs/create      Create job
POST /jobs/run-now     Run job
GET /jobs/list         List jobs

Create Job

url = f"{WORKSPACE_HOST}/api/2.0/jobs/create"
payload = {
  "name": "etl-job",
  # Job cluster created for each run and terminated afterwards
  "new_cluster": {
    "spark_version": "13.3.x-scala2.12",
    "node_type_id": "m5.large",
    "num_workers": 2
  },
  "notebook_task": {
    "notebook_path": "/Shared/etl"
  }
}
resp = requests.post(url, headers=WORKSPACE_HEADERS, json=payload)
resp.raise_for_status()
job = resp.json()  # contains the new job_id
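The table also lists /jobs/run-now, which has no example above. The sketch below builds a run-now body and interprets the `state` object that GET /jobs/runs/get returns for a run; the helper names are hypothetical, and the set of terminal life-cycle states reflects my understanding of the Jobs API rather than anything stated in this guide.

```python
# Life-cycle states after which a run will not change any further (assumed set).
TERMINAL_STATES = {"TERMINATED", "SKIPPED", "INTERNAL_ERROR"}


def run_now_payload(job_id, notebook_params=None):
    """Body for POST /jobs/run-now, optionally passing notebook parameters."""
    payload = {"job_id": job_id}
    if notebook_params:
        payload["notebook_params"] = dict(notebook_params)
    return payload


def run_succeeded(run):
    """Return True or False once the run has finished, None while it is in progress."""
    state = run.get("state", {})
    if state.get("life_cycle_state") not in TERMINAL_STATES:
        return None
    return state.get("result_state") == "SUCCESS"
```

A run would be triggered with requests.post(f"{WORKSPACE_HOST}/api/2.0/jobs/run-now", headers=WORKSPACE_HEADERS, json=run_now_payload(job["job_id"])), and the returned run_id can then be polled through /jobs/runs/get.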

5️⃣ SQL & Warehouses API

Endpoint                 Description
POST /sql/warehouses     Create SQL warehouse
POST /sql/statements     Execute SQL

Execute SQL

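url = f"{WORKSPACE_HOST}/api/2.0/sql/statements"
payload = {
  "statement": "SELECT current_user(), current_date()",
  "warehouse_id": "wh-123"  # placeholder warehouse ID
}
resp = requests.post(url, headers=WORKSPACE_HEADERS, json=payload)
resp.raise_for_status()
result = resp.json()  # statement_id and status; rows are included once the statement has finished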
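The JSON returned by the statement endpoint keeps column names and row values in separate places. A sketch of joining them is shown below, assuming the default inline JSON result format, where column metadata sits under `manifest.schema.columns` and rows under `result.data_array`; the function name is hypothetical.

```python
def rows_as_dicts(response):
    """Convert a finished statement response into a list of {column: value} dicts."""
    state = response.get("status", {}).get("state")
    if state != "SUCCEEDED":
        raise RuntimeError(f"statement did not finish successfully: {state}")
    columns = [col["name"] for col in response["manifest"]["schema"]["columns"]]
    rows = response.get("result", {}).get("data_array", [])
    return [dict(zip(columns, row)) for row in rows]
```

Values arrive as strings in this format, so any numeric or date conversion is left to the caller. A statement that is still PENDING or RUNNING raises here and should be polled by its statement_id first.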

6️⃣ DBFS & Workspace APIs

Endpoint                   Description
POST /dbfs/put             Upload file
GET /workspace/list        List notebooks
POST /workspace/import     Import notebook

Upload File to DBFS

url = f"{WORKSPACE_HOST}/api/2.0/dbfs/put"
payload = {
  "path": "/tmp/data.txt",
  "contents": "SGVsbG8="  # file contents must be base64-encoded; this decodes to "Hello"
}
requests.post(url, headers=WORKSPACE_HEADERS, json=payload).raise_for_status()
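Because `contents` has to be base64-encoded, a small helper avoids encoding by hand. This is a sketch under two assumptions: that inline uploads through `contents` are limited to about 1 MB, and that the optional `overwrite` flag is accepted by the endpoint. The helper name is hypothetical.

```python
import base64

# Assumed upper bound for inline uploads via the "contents" field.
MAX_INLINE_BYTES = 1024 * 1024


def dbfs_put_payload(path, data: bytes, overwrite=False):
    """Build the JSON body for POST /dbfs/put from raw bytes."""
    if len(data) > MAX_INLINE_BYTES:
        raise ValueError("data too large for an inline upload; use a streaming upload instead")
    return {
        "path": path,
        "contents": base64.b64encode(data).decode("ascii"),
        "overwrite": overwrite,
    }
```

For instance, dbfs_put_payload("/tmp/data.txt", b"Hello") produces the same `contents` value as the hand-written payload above.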

7️⃣ Unity Catalog APIs (Most Used)

Endpoint                              Description
POST /unity-catalog/catalogs          Create catalog
POST /unity-catalog/schemas           Create schema
POST /unity-catalog/tables            Create table
PATCH /unity-catalog/permissions      Grant access

Create Catalog

url = f"{WORKSPACE_HOST}/api/2.1/unity-catalog/catalogs"
payload = {"name": "finance"}
requests.post(url, headers=WORKSPACE_HEADERS, json=payload).raise_for_status()

Grant Table Access

# Path format: /permissions/{securable_type}/{full_name}
url = f"{WORKSPACE_HOST}/api/2.1/unity-catalog/permissions/table/finance.payments.txns"
payload = {
  "changes": [{
    "principal": "data-scientists",
    "add": ["SELECT"]
  }]
}
requests.patch(url, headers=WORKSPACE_HEADERS, json=payload).raise_for_status()
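The table name finance.payments.txns presupposes a `payments` schema inside the `finance` catalog, which the guide does not create. A sketch of the schema request follows, assuming the endpoint takes `name` and `catalog_name` in the body with an optional `comment`; the helper name is hypothetical.

```python
def create_schema_request(workspace_host, catalog, schema, comment=None):
    """Return (url, json_body) for creating a Unity Catalog schema."""
    url = f"{workspace_host}/api/2.1/unity-catalog/schemas"
    payload = {"name": schema, "catalog_name": catalog}
    if comment:
        payload["comment"] = comment
    return url, payload
```

The call itself would be url, body = create_schema_request(WORKSPACE_HOST, "finance", "payments") followed by requests.post(url, headers=WORKSPACE_HEADERS, json=body).raise_for_status().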

8️⃣ Tokens, Secrets, Repos

Endpoint                         Use
POST /token/create               Create PAT
POST /secrets/scopes/create      Create secret scope
POST /repos                      Create repo
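These three endpoints have no examples in the guide, so the request bodies are sketched here. The field names reflect my understanding of the workspace APIs (`lifetime_seconds` and `comment` for tokens, `scope` for secret scopes, `url`, `provider`, and `path` for repos); the helper names and all example values are hypothetical.

```python
# Paths relative to the workspace host, matching the table above.
ENDPOINTS = {
    "token": "/api/2.0/token/create",
    "secret_scope": "/api/2.0/secrets/scopes/create",
    "repo": "/api/2.0/repos",
}


def token_create_payload(comment, lifetime_seconds):
    """Body for creating a personal access token with a limited lifetime."""
    if lifetime_seconds <= 0:
        raise ValueError("lifetime_seconds must be positive")
    return {"comment": comment, "lifetime_seconds": lifetime_seconds}


def secret_scope_payload(scope):
    """Body for creating a secret scope."""
    return {"scope": scope}


def repo_create_payload(git_url, provider, workspace_path):
    """Body for linking a Git repository into the workspace."""
    return {"url": git_url, "provider": provider, "path": workspace_path}
```

Each body is sent with requests.post(WORKSPACE_HOST + ENDPOINTS[...], headers=WORKSPACE_HEADERS, json=...). The token response includes the token value only once, so it should be stored securely right away.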

9️⃣ Enterprise Best Practices

  • Terraform for bootstrap & security
  • Python APIs for day-2 operations
  • Unity Catalog for ALL data access
  • No direct IAM-based data access; govern it through Unity Catalog instead

This API-first approach is common in regulated banks, fintech companies, and other large enterprises.

Related Topics

  • Databricks CI/CD pipelines
  • API error handling & retries
  • Zero-trust data architecture
  • Cross-account Unity Catalog sharing
