Thursday, 15 January 2026

Step 1 – Workspace Strategy & Environment Isolation in Databricks

After completing Step 0: Identity Setup, the next critical task in enterprise onboarding is designing a workspace strategy and isolating environments from each other. Workspaces in Databricks are execution boundaries that control compute, job execution, clusters, secrets, and repos. A sound strategy supports safe deployment, governance, and compliance.


Why Workspace Strategy Matters

Poor workspace design can lead to:

  • Accidental production data access
  • Shared clusters across teams
  • Inconsistent governance and auditing

This step defines:

  • How many workspaces are needed
  • How environments (Dev/QA/Prod) are isolated
  • How identities and access flow across workspaces
---

Workspace vs Environment Responsibilities

Responsibility          Workspace   Unity Catalog
User login              ✅          ❌
Cluster configuration   ✅          ❌
Job execution           ✅          ❌
Data access             ❌          ✅
Table-level security    ❌          ✅
---

Step 1.1 – Decide Workspace Topology

Databricks Account
├── Dev Workspace
├── QA Workspace
└── Prod Workspace

Start simple: one workspace per environment. Avoid creating per-team or per-user workspaces initially. Keeping each environment in its own workspace means a change reaches Prod only after it has run in Dev and QA, which keeps Dev → QA → Prod promotion safe.

---

Step 1.2 – Create Workspaces

Steps:

  1. Log in to the Databricks Account Console (on Azure, the workspace resource itself is created through the Azure portal)
  2. Create the workspace with the required region, VNet, private endpoints, and storage account
  3. Repeat for Dev, QA, and Prod

Example naming convention:

Dev  → dbx-dev-us-east
QA   → dbx-qa-us-east
Prod → dbx-prod-us-east
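
The naming convention above can be generated rather than typed by hand, which keeps names consistent when a region or environment is added. The following is a minimal sketch; `workspace_name`, its parameters, and the `ENVIRONMENTS` tuple are hypothetical helpers for illustration, not part of any Databricks API.

```python
# Hypothetical helper: reproduces the "dbx-<env>-<region>" pattern shown above.
ENVIRONMENTS = ("dev", "qa", "prod")


def workspace_name(env: str, region: str = "us-east", prefix: str = "dbx") -> str:
    """Build a workspace name such as 'dbx-dev-us-east'."""
    env = env.lower()
    if env not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {env!r}")
    return f"{prefix}-{env}-{region}"


if __name__ == "__main__":
    for env in ENVIRONMENTS:
        print(f"{env:<5}-> {workspace_name(env)}")
```

Rejecting unknown environment names early is a deliberate choice here: a typo such as "prd" fails loudly instead of producing a fourth, unplanned workspace name.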
---

Step 1.3 – Define Environment-Specific Azure AD Groups

Combine role + environment in group names:

dbx-dev-admins
dbx-dev-engineers
dbx-dev-analysts

dbx-prod-admins
dbx-prod-engineers
dbx-prod-users

This enables:

  • The same person can have Dev access but limited or no Prod access
  • A clear audit trail and separation of duties
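
Because the group name encodes both role and environment, it can be built and parsed mechanically. The sketch below assumes the `dbx-<env>-<role>` pattern from the examples above; `group_name`, `parse_group`, and `ROLES_BY_ENV` are illustrative names only.

```python
from __future__ import annotations

# Roles per environment, copied from the example group names above.
ROLES_BY_ENV = {
    "dev": ["admins", "engineers", "analysts"],
    "prod": ["admins", "engineers", "users"],
}


def group_name(env: str, role: str, prefix: str = "dbx") -> str:
    """Combine role and environment, e.g. 'dbx-dev-engineers'."""
    return f"{prefix}-{env.lower()}-{role.lower()}"


def parse_group(name: str) -> tuple[str, str]:
    """Split 'dbx-prod-engineers' into ('prod', 'engineers')."""
    _prefix, env, role = name.split("-", 2)
    return env, role


if __name__ == "__main__":
    for env, roles in ROLES_BY_ENV.items():
        for role in roles:
            print(group_name(env, role))
```

Being able to recover the environment from the name is what makes audits simple: any group whose parsed environment is "prod" can be reviewed separately from the rest.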
---

Step 1.4 – Assign Groups to Workspaces

Workspace access is granted via the Databricks Account Console:

Dev Workspace Example

Group                Permission
dbx-dev-admins       Workspace Admin
dbx-dev-engineers    Workspace User
dbx-prod-engineers   ❌ No Access

Prod Workspace Example

Group                Permission
dbx-prod-admins      Workspace Admin
dbx-prod-engineers   Workspace User
dbx-dev-engineers    ❌ No Access
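
The two assignment tables can also be kept as data, so they can be reviewed in version control before anyone touches the console. This is only a model of the tables above, not a call into Databricks; `ASSIGNMENTS` and `permission_for` are hypothetical names, and a group that is not listed for a workspace is treated as "No Access".

```python
from typing import Optional

# Workspace -> {group: permission}, mirroring the Dev and Prod examples above.
ASSIGNMENTS = {
    "dev": {
        "dbx-dev-admins": "Workspace Admin",
        "dbx-dev-engineers": "Workspace User",
    },
    "prod": {
        "dbx-prod-admins": "Workspace Admin",
        "dbx-prod-engineers": "Workspace User",
    },
}


def permission_for(workspace: str, group: str) -> Optional[str]:
    """Return the group's permission on a workspace, or None for no access."""
    return ASSIGNMENTS.get(workspace, {}).get(group)


if __name__ == "__main__":
    for ws in ASSIGNMENTS:
        for grp in ("dbx-dev-engineers", "dbx-prod-engineers"):
            print(ws, grp, permission_for(ws, grp) or "No Access")
```

Modelling "no access" as the absence of an entry matches the tables: cross-environment access is never granted explicitly, so it does not exist.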
---

Step 1.5 – Authentication & Access Flow

User logs in
   |
   v
Azure AD SSO
   |
   v
Databricks Account checks:
    - Is user in a group assigned to this workspace?
        |
        +-- YES → Access granted
        +-- NO  → Workspace invisible
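
The decision at the bottom of this flow is a set-membership check. The sketch below simulates it in plain Python, assuming the user's groups arrive from Azure AD as a list of names; `workspace_access` is an illustrative function, not a Databricks API.

```python
from typing import Iterable, Set


def workspace_access(user_groups: Iterable[str], assigned_groups: Set[str]) -> str:
    """Apply the check from the flow: any shared group grants access."""
    if set(user_groups) & assigned_groups:
        return "access granted"
    return "workspace invisible"


if __name__ == "__main__":
    prod_groups = {"dbx-prod-admins", "dbx-prod-engineers"}
    print(workspace_access(["dbx-dev-engineers"], prod_groups))
    print(workspace_access(["dbx-dev-engineers", "dbx-prod-engineers"], prod_groups))
```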
---

Step 1.6 – Workspace Admin vs Account Admin

Role              Scope
Account Admin     All workspaces, identity, global settings (2–3 people max)
Workspace Admin   Single workspace (clusters, jobs, repos)
---

Step 1.7 – Cluster & Job Isolation

Cluster policies per workspace:

  • Dev: small nodes, auto-termination, permissive libraries
  • Prod: fixed nodes, restricted libraries, no interactive clusters
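
As a rough illustration, the two policies might be expressed as cluster policy definitions like the ones below. The attribute paths and the "fixed", "range", and "allowlist" types follow the Databricks cluster policy definition format as I understand it; the VM sizes, worker count, and time limits are placeholder assumptions, and library restrictions are configured separately from these definitions, so they are not shown.

```python
import json

# Dev: small nodes and mandatory auto-termination (values are placeholders).
DEV_POLICY = {
    "node_type_id": {
        "type": "allowlist",
        "values": ["Standard_DS3_v2"],
        "defaultValue": "Standard_DS3_v2",
    },
    "autotermination_minutes": {"type": "range", "maxValue": 60, "defaultValue": 30},
}

# Prod: fixed node type and size, job clusters only (no interactive clusters).
PROD_POLICY = {
    "node_type_id": {"type": "fixed", "value": "Standard_DS4_v2"},
    "num_workers": {"type": "fixed", "value": 4},
    "cluster_type": {"type": "fixed", "value": "job"},
}

if __name__ == "__main__":
    # A policy definition is submitted as a JSON string.
    for name, policy in (("dev", DEV_POLICY), ("prod", PROD_POLICY)):
        print(name, json.dumps(policy, indent=2))
```

Check each attribute against the current policy reference before applying it, since the set of supported attributes changes between platform releases.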

Jobs are workspace-bound:

Git → Dev Workspace Job
        ↓
     QA Workspace Job
        ↓
     Prod Workspace Job

Secrets are workspace-scoped to ensure Dev/Prod isolation:

dev-kv/snowflake-password
prod-kv/snowflake-password
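
One way to keep code identical across environments is to resolve the secret scope from the environment name, so only the scope changes between Dev and Prod. In this sketch `SCOPE_BY_ENV` and `read_secret` are hypothetical; the `getter` argument stands in for the notebook call `dbutils.secrets.get(scope=..., key=...)` so the function can be exercised outside Databricks.

```python
from typing import Callable

# Environment -> secret scope, matching the scope names shown above.
SCOPE_BY_ENV = {"dev": "dev-kv", "prod": "prod-kv"}


def read_secret(env: str, key: str, getter: Callable[[str, str], str]) -> str:
    """Look up `key` in the scope that belongs to `env`."""
    scope = SCOPE_BY_ENV[env]  # KeyError for an unknown environment
    return getter(scope, key)


if __name__ == "__main__":
    # Stand-in for a real secret store, for local illustration only.
    fake_store = {("dev-kv", "snowflake-password"): "dev-placeholder"}
    print(read_secret("dev", "snowflake-password", lambda s, k: fake_store[(s, k)]))
    # In a notebook, the getter would wrap dbutils instead:
    #   read_secret("dev", "snowflake-password",
    #               lambda s, k: dbutils.secrets.get(scope=s, key=k))
```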
---

Step 1.8 – What Workspaces Do NOT Control

  • Table-level access
  • Row-level security
  • Column masking

These are handled by Unity Catalog in Step 2.

---

Step 1.9 – Common Mistakes to Avoid

  • Giving Dev engineers access to the Prod workspace
  • Making everyone a Workspace Admin
  • Using one workspace with folders to separate environments
  • Relying on notebook naming for isolation
---

Step 1.10 – Validation Checklist

  • Dev user logs in → sees Dev workspace only
  • Dev user tries Prod URL → access denied
  • Prod user logs in → sees Prod workspace only
  • Removing user from Azure AD group → access disappears automatically
  • No manual Databricks changes required
---

What Step 1 Enables Next

Because workspaces are properly isolated:
  • Unity Catalog can safely share data across workspaces
  • Prod data can be exposed to Dev as read-only
  • Cluster RBAC becomes enforceable
  • Auditors can validate separation of duties and compliance
---

Next Step: Step 2 – Unity Catalog Metastore & Data Isolation (Catalogs, schemas, table-level RBAC, cross-workspace sharing)
