Monday, 26 January 2026

Databricks APIs – Architecture, Types, and Python Examples

Databricks provides a comprehensive set of REST APIs to automate platform setup, workspace administration, data governance, compute management, and analytics workflows. These APIs are commonly used for infrastructure automation, CI/CD pipelines, and application onboarding.


Common Python Setup


import requests
import json

DATABRICKS_HOST = "https://<databricks-instance>"
TOKEN = "<DATABRICKS_TOKEN>"

HEADERS = {
    "Authorization": f"Bearer {TOKEN}",
    "Content-Type": "application/json"
}
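
Every example below follows the same request/response pattern. A small wrapper, sketched here as an illustration (the helper name and 30-second timeout are choices, not part of any Databricks SDK), centralizes authentication and surfaces API error messages early via `raise_for_status()`:

```python
import requests

def databricks_request(host, token, method, path, payload=None):
    """Call a Databricks REST endpoint and return the parsed JSON body.

    Raises requests.HTTPError on any non-2xx response, so a bad payload
    fails loudly at the call site instead of silently downstream.
    """
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    response = requests.request(
        method, f"{host}{path}", headers=headers, json=payload, timeout=30
    )
    response.raise_for_status()
    return response.json()
```

For example, `databricks_request(DATABRICKS_HOST, TOKEN, "GET", "/api/2.0/clusters/list")` would list clusters in the workspace.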

1. Account API

Purpose: Manage Databricks accounts and workspaces.

Documentation: Databricks Account API

Create a Workspace


# Note: the Account API is served from the account console host
# (e.g. https://accounts.cloud.databricks.com on AWS), not a workspace URL.
url = f"{DATABRICKS_HOST}/api/2.0/accounts/<ACCOUNT_ID>/workspaces"

payload = {
    "workspace_name": "dev-workspace",
    "aws_region": "us-east-1",
    "credentials_id": "cred-id",
    "storage_configuration_id": "storage-id",
    "network_id": "network-id"
}

response = requests.post(url, headers=HEADERS, json=payload)
print(response.json())
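
Workspace creation is asynchronous: the response includes a `workspace_id` whose provisioning status can be fetched separately. A sketch, assuming the same setup as above (the helper name is illustrative; the status comes from the documented `workspace_status` field):

```python
import requests

def get_workspace_status(account_host, headers, account_id, workspace_id):
    """Return a workspace's provisioning status, e.g. PROVISIONING or RUNNING."""
    url = f"{account_host}/api/2.0/accounts/{account_id}/workspaces/{workspace_id}"
    response = requests.get(url, headers=headers)
    response.raise_for_status()
    return response.json().get("workspace_status")
```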

2. SCIM API

Purpose: Manage users, groups, and service principals.

Documentation: Databricks SCIM API

Create a Service Principal


url = f"{DATABRICKS_HOST}/api/2.0/preview/scim/v2/ServicePrincipals"

payload = {
    "displayName": "my-app-sp"
}

response = requests.post(url, headers=HEADERS, json=payload)
print(response.json())
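
To retrieve the service principal just created, for example to capture its `applicationId` for later permission grants, the SCIM endpoint accepts standard SCIM filters. A sketch, assuming the common setup above (the function name is illustrative):

```python
import requests

def find_service_principals(host, headers, display_name):
    """Look up service principals by displayName using a SCIM filter."""
    url = f"{host}/api/2.0/preview/scim/v2/ServicePrincipals"
    params = {"filter": f'displayName eq "{display_name}"'}
    response = requests.get(url, headers=headers, params=params)
    response.raise_for_status()
    # SCIM list responses wrap matches in a "Resources" array
    return response.json().get("Resources", [])
```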

3. Unity Catalog API

Purpose: Centralized data governance for catalogs, schemas, and tables.

Documentation: Unity Catalog API

Create a Catalog


url = f"{DATABRICKS_HOST}/api/2.1/unity-catalog/catalogs"

payload = {
    "name": "sales_catalog",
    "comment": "Catalog for sales domain"
}

response = requests.post(url, headers=HEADERS, json=payload)
print(response.json())

Grant Catalog Permission


url = f"{DATABRICKS_HOST}/api/2.1/unity-catalog/permissions/catalog/sales_catalog"

payload = {
    "changes": [
        {
            "principal": "data_analysts",
            "add": ["USE_CATALOG"]
        }
    ]
}

response = requests.patch(url, headers=HEADERS, json=payload)
print(response.json())
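
Catalogs contain schemas, which in turn contain tables. Creating a schema under an existing catalog follows the same pattern against the `/api/2.1/unity-catalog/schemas` endpoint; a sketch, with illustrative names:

```python
import requests

def create_schema(host, headers, catalog_name, schema_name, comment=""):
    """Create a schema inside an existing Unity Catalog catalog."""
    url = f"{host}/api/2.1/unity-catalog/schemas"
    payload = {
        "name": schema_name,
        "catalog_name": catalog_name,
        "comment": comment,
    }
    response = requests.post(url, headers=headers, json=payload)
    response.raise_for_status()
    return response.json()
```

For example, `create_schema(DATABRICKS_HOST, HEADERS, "sales_catalog", "orders")` would create `sales_catalog.orders`.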

4. Workspace API

Purpose: Manage clusters, jobs, notebooks, and workspace objects.

Documentation: Workspace API

Create a Cluster


url = f"{DATABRICKS_HOST}/api/2.0/clusters/create"

payload = {
    "cluster_name": "demo-cluster",
    "spark_version": "13.3.x-scala2.12",
    "node_type_id": "Standard_DS3_v2",
    "num_workers": 1,
    "autotermination_minutes": 30
}

response = requests.post(url, headers=HEADERS, json=payload)
print(response.json())
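
Cluster creation is also asynchronous: the call returns a `cluster_id` while the cluster is still provisioning. A polling sketch, assuming the common setup above (the helper name and 30-second interval are illustrative; the states are the documented cluster states):

```python
import time
import requests

def wait_for_cluster(host, headers, cluster_id, poll_seconds=30):
    """Poll /clusters/get until the cluster leaves the PENDING state."""
    while True:
        response = requests.get(
            f"{host}/api/2.0/clusters/get",
            headers=headers,
            params={"cluster_id": cluster_id},
        )
        response.raise_for_status()
        state = response.json()["state"]
        if state != "PENDING":  # e.g. RUNNING, TERMINATED, ERROR
            return state
        time.sleep(poll_seconds)
```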

5. Jobs API

Purpose: Orchestrate batch and streaming workloads.

Documentation: Jobs API

Create a Job


url = f"{DATABRICKS_HOST}/api/2.1/jobs/create"

payload = {
    "name": "sample-job",
    "tasks": [
        {
            "task_key": "run_notebook",
            "notebook_task": {
                "notebook_path": "/Shared/sample_notebook"
            },
            "new_cluster": {
                "spark_version": "13.3.x-scala2.12",
                "node_type_id": "Standard_DS3_v2",
                "num_workers": 1
            }
        }
    ]
}

response = requests.post(url, headers=HEADERS, json=payload)
print(response.json())
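
Creating a job only registers its definition; triggering and monitoring a run uses the `/jobs/run-now` and `/jobs/runs/get` endpoints. A sketch, assuming the common setup above (the helper name and polling interval are illustrative):

```python
import time
import requests

def run_job_and_wait(host, headers, job_id, poll_seconds=15):
    """Trigger a job run and block until it reaches a terminal state."""
    response = requests.post(
        f"{host}/api/2.1/jobs/run-now", headers=headers, json={"job_id": job_id}
    )
    response.raise_for_status()
    run_id = response.json()["run_id"]
    while True:
        run = requests.get(
            f"{host}/api/2.1/jobs/runs/get",
            headers=headers,
            params={"run_id": run_id},
        ).json()
        state = run["state"]
        if state.get("life_cycle_state") in ("TERMINATED", "SKIPPED", "INTERNAL_ERROR"):
            return state
        time.sleep(poll_seconds)
```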

6. Repos API

Purpose: Integrate Git repositories.

Documentation: Repos API

Create a Repo
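
A minimal sketch against the `/api/2.0/repos` endpoint, assuming the common setup above (the Git URL and workspace path are placeholders; `gitHub` is one of the documented provider values):

```python
import requests

def create_repo(host, headers, git_url, path, provider="gitHub"):
    """Clone a Git repository into the workspace via the Repos API."""
    url = f"{host}/api/2.0/repos"
    payload = {"url": git_url, "provider": provider, "path": path}
    response = requests.post(url, headers=headers, json=payload)
    response.raise_for_status()
    return response.json()
```

For example: `create_repo(DATABRICKS_HOST, HEADERS, "https://github.com/<org>/<repo>.git", "/Repos/<user>/my-repo")`.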
