Thursday, 15 January 2026

Enterprise Databricks Automation on AWS – Identity, RBAC & Security as Code

Enterprise Databricks Automation on AWS

SCIM, Unity Catalog RBAC, Row-Level Security & Security-as-Code


Where This Fits in the Enterprise Series

Post     Topic
Step 0   Identity setup (SSO + SCIM)
Step 1   Workspace strategy & environment isolation
Step 2   Unity Catalog metastore & data isolation
Step 3   Identity, RBAC & data security as code (this post)
Step 4   CI/CD & promotion pipelines

1️⃣ SCIM Group Automation with Terraform

Why SCIM Automation Is Mandatory

  • No manual user or group creation
  • Identity source of truth = IdP
  • Permissions change automatically with group membership

Provider Configuration (Account Level)


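# Account-level APIs do not accept personal access tokens,
# so authenticate with a service principal (OAuth M2M).
provider "databricks" {
  host          = var.databricks_account_host # https://accounts.cloud.databricks.com on AWS
  account_id    = var.databricks_account_id
  client_id     = var.databricks_client_id
  client_secret = var.databricks_client_secret
}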
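
The provider above targets the account. The Unity Catalog grants later in this post go through a workspace endpoint, so a second, aliased provider is usually needed next to it. A minimal sketch, assuming a hypothetical variable `databricks_workspace_host` and credentials supplied through the environment:

```hcl
terraform {
  required_providers {
    databricks = {
      source = "databricks/databricks"
    }
  }
}

# Hypothetical variable: URL of one workspace attached to the metastore
variable "databricks_workspace_host" {
  type        = string
  description = "Workspace URL, for example https://<deployment-name>.cloud.databricks.com"
}

# Aliased provider for workspace-level resources such as databricks_grants.
# Credentials are left to the environment here, for example the
# DATABRICKS_CLIENT_ID and DATABRICKS_CLIENT_SECRET variables.
provider "databricks" {
  alias = "workspace"
  host  = var.databricks_workspace_host
}
```

A resource opts into this provider with `provider = databricks.workspace`.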

Create Groups via Terraform


resource "databricks_group" "prod_engineers" {
  display_name = "group_prod_engineers"
}

resource "databricks_group" "dev_engineers" {
  display_name = "group_dev_engineers"
}
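
As the number of groups grows, one resource block per group gets repetitive. A possible alternative to the two blocks above (use one form or the other, not both) is a single resource driven by `for_each`. The local name `engineer_groups` is an assumption for illustration:

```hcl
locals {
  # Hypothetical set; the two names match the groups defined above
  engineer_groups = toset(["group_prod_engineers", "group_dev_engineers"])
}

resource "databricks_group" "engineers" {
  for_each     = local.engineer_groups
  display_name = each.value
}
```

Individual groups are then addressed as `databricks_group.engineers["group_prod_engineers"]`.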

Add Users to Groups


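# user_a is looked up, not created; var.user_a_email is a placeholder login name
data "databricks_user" "user_a" {
  user_name = var.user_a_email
}

resource "databricks_group_member" "prod_user" {
  group_id  = databricks_group.prod_engineers.id
  member_id = data.databricks_user.user_a.id
}

In real enterprises, users are synced automatically from the IdP via SCIM, so Terraform never creates them. Terraform manages only group-level logic.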
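
When the IdP also syncs the groups themselves, Terraform can reference them instead of creating them. A minimal sketch, assuming a group named `group_prod_engineers` already exists in the account:

```hcl
# Look up a group that SCIM has already provisioned
data "databricks_group" "prod_engineers" {
  display_name = "group_prod_engineers"
}

# Expose the ID so other modules can reference the group
output "prod_engineers_group_id" {
  value = data.databricks_group.prod_engineers.id
}
```

This keeps the IdP as the single source of truth while still letting grants refer to the group by name.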

2️⃣ Unity Catalog RBAC as Code (grants.tf)

Why RBAC as Code Matters

  • Auditable permissions
  • No UI drift
  • Consistent across environments

grants.tf – Catalog-Level Access


resource "databricks_grants" "catalog_usage" {
  catalog = "prod_catalog"

  # Unity Catalog uses USE_CATALOG; USAGE is the legacy privilege name
  grant {
    principal  = "group_prod_engineers"
    privileges = ["USE_CATALOG"]
  }

  grant {
    principal  = "group_dev_engineers"
    privileges = ["USE_CATALOG"]
  }
}

Schema-Level RBAC


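resource "databricks_grants" "sales_schema" {
  schema = "prod_catalog.sales"

  # USE_SCHEMA is required before any object in the schema can be accessed;
  # CREATE_TABLE replaces the legacy CREATE privilege
  grant {
    principal  = "group_prod_engineers"
    privileges = ["USE_SCHEMA", "CREATE_TABLE", "SELECT"]
  }

  grant {
    principal  = "group_dev_engineers"
    privileges = ["USE_SCHEMA", "SELECT"]
  }
}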

Table-Level RBAC


resource "databricks_grants" "customers_table" {
  table = "prod_catalog.sales.customers"

  grant {
    principal  = "group_prod_engineers"
    privileges = ["SELECT", "MODIFY"]
  }

  grant {
    principal  = "group_dev_engineers"
    privileges = ["SELECT"]
  }
}
Permissions are enforced by Unity Catalog at query execution time.
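
The table-level grants above can also be driven from a map, which keeps the principal-to-privilege matrix in one reviewable place. This is a sketch of an alternative to the `customers_table` resource above (not an addition to it); the local name `customers_table_grants` is hypothetical:

```hcl
locals {
  # Hypothetical map: principal => privileges on the customers table
  customers_table_grants = {
    group_prod_engineers = ["SELECT", "MODIFY"]
    group_dev_engineers  = ["SELECT"]
  }
}

resource "databricks_grants" "customers_table" {
  table = "prod_catalog.sales.customers"

  # One grant block is generated per map entry
  dynamic "grant" {
    for_each = local.customers_table_grants
    content {
      principal  = grant.key
      privileges = grant.value
    }
  }
}
```

Note that `databricks_grants` is authoritative for its securable: grants on that object that are not declared in the resource are removed on apply, which is what prevents UI drift.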

3️⃣ Row-Level Security (Dynamic Views)

Use Case

User Group          Allowed Country
group_us_analysts   USA
group_eu_analysts   EU

Base Table (Restricted)


-- Unity Catalog has no PUBLIC principal; `account users` is the built-in all-users group
REVOKE ALL PRIVILEGES ON TABLE prod_catalog.sales.customers FROM `account users`;

Dynamic View with RLS


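-- is_account_group_member() checks account-level groups (the ones SCIM syncs);
-- is_member() only evaluates workspace-local groups
CREATE VIEW prod_catalog.sales.customers_secure AS
SELECT *
FROM prod_catalog.sales.customers
WHERE
  (is_account_group_member('group_us_analysts') AND country = 'USA')
  OR
  (is_account_group_member('group_eu_analysts') AND country = 'EU');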
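
To keep the view itself under version control, it can be declared in Terraform rather than created by hand. The following is a sketch, assuming the provider's `databricks_sql_table` resource with `table_type = "VIEW"` and a hypothetical variable `sql_warehouse_id` for the warehouse that runs the DDL. It uses `is_account_group_member()`, which evaluates account-level groups:

```hcl
# Hypothetical variable: ID of a SQL warehouse used to run the DDL
variable "sql_warehouse_id" {
  type = string
}

resource "databricks_sql_table" "customers_secure" {
  catalog_name = "prod_catalog"
  schema_name  = "sales"
  name         = "customers_secure"
  table_type   = "VIEW"
  warehouse_id = var.sql_warehouse_id

  # Same row filter as the SQL view in this section
  view_definition = <<-SQL
    SELECT *
    FROM prod_catalog.sales.customers
    WHERE
      (is_account_group_member('group_us_analysts') AND country = 'USA')
      OR
      (is_account_group_member('group_eu_analysts') AND country = 'EU')
  SQL
}
```

Declare the view either here or in SQL, not both, so that only one definition exists.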

Grant Access Only to the View


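-- GRANT takes one principal per statement
GRANT SELECT ON VIEW prod_catalog.sales.customers_secure TO `group_us_analysts`;
GRANT SELECT ON VIEW prod_catalog.sales.customers_secure TO `group_eu_analysts`;

Row-level security is enforced automatically based on group membership; no application changes are required.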
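
The same view-level access can be expressed in Terraform so that it follows the grants.tf pattern from section 2. A minimal sketch; views are addressed through the `table` argument of `databricks_grants`:

```hcl
resource "databricks_grants" "customers_secure_view" {
  table = "prod_catalog.sales.customers_secure"

  grant {
    principal  = "group_us_analysts"
    privileges = ["SELECT"]
  }

  grant {
    principal  = "group_eu_analysts"
    privileges = ["SELECT"]
  }
}
```

To actually query the view, the analyst groups would also need catalog and schema usage privileges on `prod_catalog` and `prod_catalog.sales`, added to the catalog and schema grants shown earlier.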

4️⃣ End-to-End Access Example

User     Group                  Result
User A   group_prod_engineers   Read + Write all rows
User B   group_dev_engineers    Read-only
User C   group_us_analysts      USA rows only

5️⃣ CI/CD Flow (Security Included)

Git Commit
  ↓
Terraform Apply
  ↓
SCIM Groups + Workspaces + UC Grants
  ↓
Users Login
  ↓
Access Automatically Enforced
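
For the "Terraform Apply" step to run from a pipeline rather than a laptop, state has to live somewhere shared. One common choice on AWS is an S3 backend. This is only a sketch; the bucket name, key, and region are placeholders, not values from this series:

```hcl
terraform {
  required_version = ">= 1.3"

  backend "s3" {
    bucket  = "example-terraform-state"               # placeholder bucket name
    key     = "databricks/security/terraform.tfstate" # placeholder state path
    region  = "us-east-1"                             # placeholder region
    encrypt = true
  }
}
```

With remote state in place, every pipeline run plans against the same view of groups and grants that the previous run applied.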

6️⃣ Common Enterprise Anti-Patterns

  • Granting permissions to users instead of groups
  • Direct access to base tables (no views)
  • Mixing Dev and Prod users in same workspace
  • Manual permission changes via UI
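
The first anti-pattern can be caught before apply with a variable validation. A sketch, assuming the `group_` naming convention used throughout this post and a hypothetical variable `grant_principals`; `startswith` requires Terraform 1.3 or later:

```hcl
variable "grant_principals" {
  type        = list(string)
  description = "Principals that receive Unity Catalog grants"

  validation {
    # Reject anything that does not follow the group naming convention,
    # such as an individual user's email address
    condition     = alltrue([for p in var.grant_principals : startswith(p, "group_")])
    error_message = "Grant only to groups: every principal must start with \"group_\"."
  }
}
```

A plan that passes a user name instead of a group then fails in CI, before any permission changes.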

7️⃣ Why Auditors Love This Setup

  • All access is code-reviewed
  • Clear separation of duties
  • Full traceability in Git
  • Zero manual overrides

8️⃣ Enterprise Databricks Blog Series Roadmap

Post     Description
Part 1   Identity, SSO & SCIM architecture
Part 2   Workspace isolation & networking
Part 3   Unity Catalog & RBAC as code
Part 4   Row-level & column-level security
Part 5   CI/CD promotion Dev → Prod
Part 6   Operating Databricks at scale

Final Takeaway

This approach gives you:

  • Enterprise-grade security by design
  • Zero-touch onboarding
  • Strong compliance posture
  • Infrastructure and data security as code

This is how Databricks is run in regulated enterprises.
