Enterprise Databricks on AWS using Terraform
This guide explains how to deploy Databricks on AWS using Terraform in an enterprise environment with separate AWS accounts for Dev, UAT and Production.
The document includes architecture, governance strategy, and complete Terraform code to create your first workspace.
1. Enterprise Cloud Governance
Large enterprises manage accounts using AWS Organizations. Each Line of Business (LOB) receives its own Organizational Unit (OU) so that different Service Control Policies (SCPs) can be applied.
AWS Organization (Root)
│
├── Security OU
│   ├── Log Archive
│   └── Security Tooling
│
├── Shared Services OU
│   ├── CI/CD
│   ├── Terraform State
│   └── Networking
│
├── Retail Banking OU
│   ├── Dev Account
│   ├── UAT Account
│   └── Prod Account
│
├── Capital Markets OU
│   ├── Dev Account
│   ├── UAT Account
│   └── Prod Account
│
└── Sandbox OU
Benefits of LOB separation:
- Independent Service Control Policies (SCP)
- Cost isolation
- Security boundaries
- Compliance governance
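SCPs can themselves be managed as code. The sketch below attaches an illustrative policy (denying member accounts the ability to leave the organization) to a LOB OU; the OU ID and policy content are placeholders, not values from this guide.

```hcl
# Illustrative SCP attached to a LOB OU (policy content and OU id
# are placeholders — adapt to your organization's guardrails).
resource "aws_organizations_policy" "deny_leave_org" {
  name = "deny-leave-organization"
  type = "SERVICE_CONTROL_POLICY"

  content = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Deny"
      Action   = "organizations:LeaveOrganization"
      Resource = "*"
    }]
  })
}

resource "aws_organizations_policy_attachment" "retail_banking" {
  policy_id = aws_organizations_policy.deny_leave_org.id
  target_id = "ou-xxxx-xxxxxxxx" # placeholder OU id
}
```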
2. Databricks Multi-Account Strategy
Retail Banking DEV Account
└── Databricks Workspace

Retail Banking UAT Account
└── Databricks Workspace

Retail Banking PROD Account
└── Databricks Workspace
| Feature | Benefit |
|---|---|
| Isolation | Dev cannot impact production |
| Security | Different IAM roles per environment |
| Cost | Clear billing separation |
| Compliance | Production governance |
3. Required Infrastructure Components
Before creating a workspace, you must provision:

- AWS Account
- VPC
- S3 Root Storage
- Cross Account IAM Role
- Databricks Account Console
4. Databricks Architecture
Databricks Control Plane
│
│ Secure API
│
Customer AWS Account
│
├── EC2 Cluster Nodes
├── S3 Data Lake
├── DBFS Storage
└── Notebooks / Jobs
5. Create Root Storage (S3)
Each workspace needs a root S3 bucket, which stores DBFS data and workspace system files.
```hcl
resource "aws_s3_bucket" "databricks_root" {
  bucket = "lob1-databricks-dev-root"
}

resource "aws_s3_bucket_versioning" "versioning" {
  bucket = aws_s3_bucket.databricks_root.id

  versioning_configuration {
    status = "Enabled"
  }
}
```
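The root bucket should also be hardened. A minimal sketch, assuming default SSE-KMS encryption is acceptable for your environment:

```hcl
# Hardening sketch for the root bucket: block all public access and
# enable default server-side encryption.
resource "aws_s3_bucket_public_access_block" "databricks_root" {
  bucket                  = aws_s3_bucket.databricks_root.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

resource "aws_s3_bucket_server_side_encryption_configuration" "databricks_root" {
  bucket = aws_s3_bucket.databricks_root.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "aws:kms" # or "AES256" for S3-managed keys
    }
  }
}
```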
6. Create Databricks VPC
```hcl
resource "aws_vpc" "databricks_vpc" {
  cidr_block = "10.20.0.0/16"

  tags = {
    Name = "databricks-dev-vpc"
  }
}
```
7. Create Private Subnets
```hcl
resource "aws_subnet" "private1" {
  vpc_id            = aws_vpc.databricks_vpc.id
  cidr_block        = "10.20.1.0/24"
  availability_zone = "us-east-1a"
}

resource "aws_subnet" "private2" {
  vpc_id            = aws_vpc.databricks_vpc.id
  cidr_block        = "10.20.2.0/24"
  availability_zone = "us-east-1b"
}
```
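Cluster nodes in these private subnets still need an outbound path to the Databricks control plane (unless you use AWS PrivateLink). A minimal sketch: a public subnet with an Internet Gateway and a NAT Gateway, plus route tables; the CIDR and AZ below are assumptions.

```hcl
# Outbound path for private subnets (sketch): IGW + NAT Gateway.
resource "aws_internet_gateway" "igw" {
  vpc_id = aws_vpc.databricks_vpc.id
}

resource "aws_subnet" "public" {
  vpc_id            = aws_vpc.databricks_vpc.id
  cidr_block        = "10.20.3.0/24" # assumed, must not overlap private subnets
  availability_zone = "us-east-1a"
}

resource "aws_eip" "nat" {
  domain = "vpc"
}

resource "aws_nat_gateway" "nat" {
  allocation_id = aws_eip.nat.id
  subnet_id     = aws_subnet.public.id
}

# Public subnet routes to the Internet Gateway.
resource "aws_route_table" "public" {
  vpc_id = aws_vpc.databricks_vpc.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.igw.id
  }
}

resource "aws_route_table_association" "public" {
  subnet_id      = aws_subnet.public.id
  route_table_id = aws_route_table.public.id
}

# Private subnets route outbound traffic through the NAT Gateway.
resource "aws_route_table" "private" {
  vpc_id = aws_vpc.databricks_vpc.id

  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.nat.id
  }
}

resource "aws_route_table_association" "private1" {
  subnet_id      = aws_subnet.private1.id
  route_table_id = aws_route_table.private.id
}

resource "aws_route_table_association" "private2" {
  subnet_id      = aws_subnet.private2.id
  route_table_id = aws_route_table.private.id
}
```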
8. Create Security Group
```hcl
resource "aws_security_group" "databricks_sg" {
  name   = "databricks-cluster-sg"
  vpc_id = aws_vpc.databricks_vpc.id

  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["10.20.0.0/16"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```
9. Create Cross-Account IAM Role
Databricks needs permission to launch compute clusters in your AWS account.
```hcl
resource "aws_iam_role" "databricks_role" {
  name = "databricks-cross-account-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Principal = {
        AWS = "arn:aws:iam::414351767826:root" # Databricks-owned AWS account
      }
      Action = "sts:AssumeRole"
      # Databricks requires an ExternalId condition set to your
      # Databricks account ID to prevent confused-deputy access.
      Condition = {
        StringEquals = {
          "sts:ExternalId" = var.databricks_account_id
        }
      }
    }]
  })
}
```
Attach a policy. `AdministratorAccess` is shown here only for brevity; in production, use the least-privilege cross-account policy published in the Databricks documentation.

```hcl
resource "aws_iam_role_policy_attachment" "policy" {
  role       = aws_iam_role.databricks_role.name
  policy_arn = "arn:aws:iam::aws:policy/AdministratorAccess"
}
```
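For a tighter alternative, an inline policy can be scoped to what Databricks actually needs. The sketch below is deliberately abbreviated; the complete action list is in the Databricks cross-account IAM documentation and should be copied verbatim for production use.

```hcl
# Abbreviated least-privilege sketch — NOT the full Databricks policy.
resource "aws_iam_role_policy" "databricks_policy" {
  name = "databricks-cross-account-policy"
  role = aws_iam_role.databricks_role.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Action = [
        "ec2:RunInstances",
        "ec2:TerminateInstances",
        "ec2:DescribeInstances",
        "ec2:CreateTags",
        # ...remaining EC2/IAM actions from the published Databricks policy...
      ]
      Resource = "*"
    }]
  })
}
```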
10. Terraform Providers
```hcl
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
    databricks = {
      source = "databricks/databricks"
    }
  }
}
```
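In practice you would also pin versions and configure the AWS provider. A minimal sketch, with the region and version constraints as assumptions:

```hcl
# Version pins and AWS provider config (sketch — adjust versions and
# region to your environment).
terraform {
  required_version = ">= 1.3"
}

provider "aws" {
  region = "us-east-1"
}
```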
11. Configure Databricks Provider
```hcl
provider "databricks" {
  host       = "https://accounts.cloud.databricks.com"
  account_id = var.databricks_account_id
  username   = var.username
  password   = var.password
}
```

Note that username/password authentication is being phased out; for automation, prefer a service principal with OAuth (the provider's `client_id`/`client_secret` arguments).
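The provider block above references variables that must be declared somewhere in the configuration. A minimal sketch:

```hcl
# Variable declarations assumed by the provider configuration.
variable "databricks_account_id" {
  type        = string
  description = "Databricks account ID from the account console"
}

variable "username" {
  type      = string
  sensitive = true
}

variable "password" {
  type      = string
  sensitive = true
}
```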
12. Register Credentials
```hcl
resource "databricks_mws_credentials" "credentials" {
  account_id       = var.databricks_account_id
  credentials_name = "dev-databricks-credentials"
  role_arn         = aws_iam_role.databricks_role.arn
}
```
13. Register Storage
```hcl
resource "databricks_mws_storage_configurations" "storage" {
  account_id                 = var.databricks_account_id
  storage_configuration_name = "dev-root-storage"
  bucket_name                = aws_s3_bucket.databricks_root.bucket
}
```
14. Register Network
```hcl
resource "databricks_mws_networks" "network" {
  account_id   = var.databricks_account_id
  network_name = "dev-databricks-network"
  vpc_id       = aws_vpc.databricks_vpc.id

  subnet_ids = [
    aws_subnet.private1.id,
    aws_subnet.private2.id
  ]

  security_group_ids = [
    aws_security_group.databricks_sg.id
  ]
}
```
15. Create Databricks Workspace
```hcl
resource "databricks_mws_workspaces" "workspace" {
  account_id               = var.databricks_account_id
  workspace_name           = "lob1-dev-databricks"
  aws_region               = "us-east-1"
  credentials_id           = databricks_mws_credentials.credentials.credentials_id
  storage_configuration_id = databricks_mws_storage_configurations.storage.storage_configuration_id
  network_id               = databricks_mws_networks.network.network_id
}
```
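It is convenient to surface the workspace URL once the apply completes:

```hcl
# Output the URL of the newly created workspace.
output "databricks_workspace_url" {
  value = databricks_mws_workspaces.workspace.workspace_url
}
```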
16. Terraform Repository Structure
databricks-platform-infra
├── modules
│   ├── network
│   ├── workspace
│   └── storage
│
├── environments
│   ├── dev
│   ├── uat
│   └── prod
│
└── global
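With multiple environments and engineers, state should live in a shared remote backend rather than on a laptop. A sketch using an S3 backend in the Shared Services account; the bucket and lock table names are placeholders.

```hcl
# Remote state backend (sketch — bucket and table names are placeholders,
# typically provisioned in the Shared Services / Terraform State account).
terraform {
  backend "s3" {
    bucket         = "lob1-terraform-state"
    key            = "databricks/dev/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-locks"
    encrypt        = true
  }
}
```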
17. CI/CD Deployment Flow
GitHub / GitLab
│
Terraform Plan
│
Terraform Apply
│
Databricks Workspace Created
18. Security Best Practices
- Private VPC networking
- Encryption at rest
- Secrets manager integration
- Cluster policies
- Unity Catalog governance
19. Enterprise Deployment Flow
Step 1: Create AWS Organization
Step 2: Create LOB OU
Step 3: Create Dev/UAT/Prod accounts
Step 4: Deploy networking
Step 5: Deploy S3 root storage
Step 6: Deploy IAM roles
Step 7: Create the workspace with Terraform
Step 8: Configure clusters
Step 9: Deploy notebooks/jobs
Conclusion
You now have a complete enterprise framework to deploy Databricks on AWS using Terraform with proper governance, networking and workspace automation.