Wednesday, 18 March 2026

S3 Bucket Security for Databricks on AWS – Do You Need a Bucket Policy?

Short Answer

Yes. Without a bucket policy (or some other explicit deny), any IAM principal in the same AWS account whose identity-based policy grants s3:* (or sufficient S3 permissions on the bucket) can access the bucket, including the data used by Databricks.


Why This Happens

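For requests made from within the account that owns the bucket, AWS policy evaluation follows this rule:

  • Access is allowed if at least one applicable policy (identity-based or resource-based) contains an ALLOW and no policy contains an explicit DENY. With no ALLOW at all, the request is implicitly denied.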

So if:

  • An IAM role/user has s3:*
  • The bucket has no restrictive bucket policy

Then access is granted.
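The rule above can be sketched in a few lines of Python. This is a deliberately simplified model, not the real AWS evaluation engine: it ignores principals, conditions, SCPs, and permission boundaries, and it approximates IAM wildcards with shell-style globbing. The names Statement, matches, and decide are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Statement:
    effect: str    # "Allow" or "Deny"
    action: str    # action pattern, e.g. "s3:*" or "s3:GetObject"
    resource: str  # resource ARN pattern, e.g. "arn:aws:s3:::my-data-bucket/*"


def matches(stmt: Statement, action: str, resource: str) -> bool:
    # Approximation: IAM wildcards are treated as shell-style globs.
    return fnmatchcase(action, stmt.action) and fnmatchcase(resource, stmt.resource)


def decide(statements: list[Statement], action: str, resource: str) -> str:
    applicable = [s for s in statements if matches(s, action, resource)]
    if any(s.effect == "Deny" for s in applicable):
        return "Deny"   # an explicit deny always wins
    if any(s.effect == "Allow" for s in applicable):
        return "Allow"  # at least one allow and no explicit deny
    return "Deny"       # nothing matched: implicit deny


# Another team's role with s3:* and no bucket policy in the way.
other_team_role = [Statement("Allow", "s3:*", "*")]
obj = "arn:aws:s3:::my-data-bucket/sales/part-0.parquet"
print(decide(other_team_role, "s3:GetObject", obj))  # Allow

# The same request once an explicit deny applies to that role.
bucket_deny = Statement("Deny", "s3:*", "arn:aws:s3:::my-data-bucket/*")
print(decide(other_team_role + [bucket_deny], "s3:GetObject", obj))  # Deny
```

The second call shows why an explicit deny on the bucket is the only thing that overrides a broad allow attached to some other role.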


What This Means

Without Bucket Policy

  • Databricks role → ✅ Access (expected)
  • Any other IAM role with S3 permissions → ❗ Also has access

This includes:

  • Admin roles
  • Other application roles
  • Over-permissioned users

Security Risk

  • ❌ No data isolation
  • ❌ Violates zero-trust principles
  • ❌ Compliance risk (PII, GDPR, etc.)

Example: Another team’s EC2 role with s3:* can read your Databricks data.


Recommended Fix – Bucket Policy

Allow Only Databricks Role

{
  "Effect": "Allow",
  "Principal": {
    "AWS": "arn:aws:iam::<ACCOUNT_ID>:role/<UC_ROLE>"
  },
  "Action": "s3:*",
  "Resource": [
    "arn:aws:s3:::my-data-bucket",
    "arn:aws:s3:::my-data-bucket/*"
  ]
}

Deny Everyone Else (Critical)

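{
  "Effect": "Deny",
  "Principal": "*",
  "Action": "s3:*",
  "Resource": [
    "arn:aws:s3:::my-data-bucket",
    "arn:aws:s3:::my-data-bucket/*"
  ],
  "Condition": {
    "ArnNotEquals": {
      "aws:PrincipalArn": "arn:aws:iam::<ACCOUNT_ID>:role/<UC_ROLE>"
    }
  }
}

Note: this statement uses "Principal": "*" with an aws:PrincipalArn condition instead of NotPrincipal. With NotPrincipal and Deny, requests made through the role's assumed-role session can still be denied unless the session ARN is listed as well, and AWS recommends against that combination. The deny also applies to administrators, so before applying it, add to the condition any role that must still manage the bucket or its policy.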
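The two statements are fragments. To be applied, they have to sit inside a complete policy document with "Version" and "Statement" keys. The following is a minimal sketch of assembling that document in Python and, optionally, attaching it with boto3's put_bucket_policy. The function names, the Sid values, and the example role ARN are hypothetical. The deny statement here uses "Principal": "*" with an aws:PrincipalArn condition rather than NotPrincipal, since AWS recommends against combining NotPrincipal with Deny.

```python
import json


def build_bucket_policy(bucket: str, role_arn: str) -> dict:
    """Return a full bucket policy that allows one role and denies all others."""
    resources = [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowDatabricksRole",  # hypothetical Sid
                "Effect": "Allow",
                "Principal": {"AWS": role_arn},
                "Action": "s3:*",
                "Resource": resources,
            },
            {
                "Sid": "DenyEveryoneElse",  # hypothetical Sid
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": resources,
                # Deny every caller whose role ARN is not the allowed one.
                "Condition": {"ArnNotEquals": {"aws:PrincipalArn": role_arn}},
            },
        ],
    }


def apply_bucket_policy(bucket: str, role_arn: str) -> None:
    """Attach the policy to the bucket; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so building the policy needs no AWS SDK

    policy = build_bucket_policy(bucket, role_arn)
    boto3.client("s3").put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy))


# Placeholder account ID and role name, for illustration only.
policy = build_bucket_policy("my-data-bucket", "arn:aws:iam::123456789012:role/uc-role")
print(json.dumps(policy, indent=2))
```

Printing and reviewing the document before calling apply_bucket_policy is worthwhile: once the deny is in place, it also blocks the role that applied it unless that role is exempted in the condition.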

Security Model


IAM Role Policy → Defines WHAT actions are allowed
+
Bucket Policy → Defines WHO can access the bucket

Key Insight

  • IAM policy answers: “What can this role do?”
  • Bucket policy answers: “Who is allowed to access this bucket?”

Without a bucket policy, you lose resource-level protection.
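One way to find buckets that currently lack this protection is to ask S3 for each bucket's policy and collect the ones that have none. The sketch below assumes a boto3 S3 client, whose get_bucket_policy call raises a ClientError with the code NoSuchBucketPolicy when no policy is attached. The function name buckets_without_policy is hypothetical, and the error code is read from the exception's response attribute so the function does not import botocore directly.

```python
def buckets_without_policy(s3_client, bucket_names):
    """Return the names of buckets that have no bucket policy attached."""
    missing = []
    for name in bucket_names:
        try:
            s3_client.get_bucket_policy(Bucket=name)
        except Exception as exc:
            # botocore's ClientError exposes the error code in exc.response.
            code = getattr(exc, "response", {}).get("Error", {}).get("Code")
            if code != "NoSuchBucketPolicy":
                raise  # access denied, missing bucket, etc. are real errors
            missing.append(name)
    return missing
```

With credentials configured, a call such as buckets_without_policy(boto3.client("s3"), ["my-data-bucket"]) would return the subset of the listed buckets that have no policy. A bucket that does have a policy still needs its statements reviewed, since the check only detects the complete absence of one.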


Final Answer

  • ✔ Yes: without a bucket policy, any IAM role in the same account with s3:* can access your Databricks S3 data
  • ✔ Use a bucket policy to restrict access
  • ✔ Allow only the Unity Catalog role
  • ✔ Add explicit deny for all other principals

This gives the bucket resource-level isolation, which is a core part of a least-privilege, zero-trust setup for Databricks on AWS.
