S3 Bucket Security for Databricks on AWS – Do You Need a Bucket Policy?
Short Answer
Yes. Without a bucket policy (or some other explicit deny), any IAM principal in the same AWS account whose identity policy grants s3:* (or sufficient S3 permissions) on the bucket can access it, including the data used by Databricks.
Why This Happens
AWS authorization follows this rule:
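- Within a single account, access is allowed if at least one applicable policy (identity policy or bucket policy) has an ALLOW and no applicable policy has an explicit DENY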
So if:
- An IAM role/user has s3:*
- The bucket has no restrictive bucket policy
Then access is granted.
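The rule above can be sketched in a few lines of Python. This is a deliberately simplified model, not AWS's actual evaluation engine: it ignores principals, conditions, SCPs, and permission boundaries, and the helper names (`is_allowed`, `other_team_role`) and the object key are hypothetical. It only shows that a single ALLOW is enough when nothing denies, and that an explicit DENY always wins.

```python
from fnmatch import fnmatchcase


def _matches(patterns, value):
    """Return True if value matches any IAM-style wildcard pattern (simplified)."""
    if isinstance(patterns, str):
        patterns = [patterns]
    return any(fnmatchcase(value, p) for p in patterns)


def is_allowed(statements, action, resource):
    """Simplified same-account rule: at least one Allow and no explicit Deny.

    `statements` is every policy statement that applies to the caller
    (identity policy plus bucket policy), already filtered by principal.
    """
    applicable = [
        s for s in statements
        if _matches(s["Action"], action) and _matches(s["Resource"], resource)
    ]
    if any(s["Effect"] == "Deny" for s in applicable):
        return False  # an explicit deny overrides any allow
    return any(s["Effect"] == "Allow" for s in applicable)


# Hypothetical identity policy of another team's role: s3:* on everything.
other_team_role = [{"Effect": "Allow", "Action": "s3:*", "Resource": "*"}]
obj = "arn:aws:s3:::my-data-bucket/sales/part-0001.parquet"

# No bucket policy: the identity-policy Allow is enough.
no_bucket_policy = is_allowed(other_team_role, "s3:GetObject", obj)

# A bucket-policy Deny that applies to this caller blocks the same request.
bucket_deny = [
    {"Effect": "Deny", "Action": "s3:*", "Resource": "arn:aws:s3:::my-data-bucket/*"}
]
with_deny = is_allowed(other_team_role + bucket_deny, "s3:GetObject", obj)

print(no_bucket_policy, with_deny)  # True False
```

With no statements at all the function returns False, which mirrors the implicit deny that applies when nothing grants access.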
What This Means
Without Bucket Policy
- Databricks role → ✅ Access (expected)
- Any other IAM role with S3 permissions → ❗ Also has access
This includes:
- Admin roles
- Other application roles
- Over-permissioned users
Security Risk
- ❌ No data isolation
- ❌ Violates zero-trust principles
- ❌ Compliance risk (PII, GDPR, etc.)
Example: Another team’s EC2 role with s3:* can read your Databricks data.
Recommended Fix – Bucket Policy
Allow Only Databricks Role
{
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::<ACCOUNT_ID>:role/<UC_ROLE>"
},
"Action": "s3:*",
"Resource": [
"arn:aws:s3:::my-data-bucket",
"arn:aws:s3:::my-data-bucket/*"
]
}
Deny Everyone Else (Critical)
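{
"Effect": "Deny",
"Principal": "*",
"Action": "s3:*",
"Resource": [
"arn:aws:s3:::my-data-bucket",
"arn:aws:s3:::my-data-bucket/*"
],
"Condition": {
"ArnNotEquals": {
"aws:PrincipalArn": "arn:aws:iam::<ACCOUNT_ID>:role/<UC_ROLE>"
}
}
}
The deny uses Principal "*" with an aws:PrincipalArn condition rather than NotPrincipal, because a Deny with NotPrincipal that lists only the role ARN can also block the role's own assumed-role sessions. Add an administrative role to the same condition as well; otherwise only the account root user will be able to change the bucket policy later.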
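The two statements above are fragments; S3 expects them wrapped in one policy document with a Version and a Statement list. The following is a minimal sketch of assembling that document in Python, not a definitive implementation. The function names, Sid values, account ID, and role name are hypothetical placeholders. For the deny it uses Principal "*" with an aws:PrincipalArn condition, a form AWS documentation suggests in place of NotPrincipal with Deny. `apply_bucket_policy` assumes boto3 is installed and credentials are configured, and it is not called here.

```python
import json


def build_bucket_policy(bucket, allowed_role_arn):
    """Build a bucket policy that allows one role and denies every other principal."""
    resources = [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowUnityCatalogRole",  # hypothetical Sid
                "Effect": "Allow",
                "Principal": {"AWS": allowed_role_arn},
                "Action": "s3:*",
                "Resource": resources,
            },
            {
                "Sid": "DenyEveryoneElse",  # hypothetical Sid
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": resources,
                # Deny applies to every caller whose role ARN differs.
                "Condition": {
                    "ArnNotEquals": {"aws:PrincipalArn": allowed_role_arn}
                },
            },
        ],
    }


def apply_bucket_policy(bucket, policy):
    """Attach the policy to the bucket (requires boto3 and AWS credentials)."""
    import boto3  # third-party; imported lazily so the builder runs without it

    boto3.client("s3").put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy))


# Placeholder account ID and role name, for illustration only.
role_arn = "arn:aws:iam::123456789012:role/uc-role"
policy = build_bucket_policy("my-data-bucket", role_arn)
print(json.dumps(policy, indent=2))
```

Review the printed document before applying it: once the deny is in place, any principal not named in the condition, administrators included, loses access to the bucket and to its policy.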
Security Model
IAM Role Policy → Defines WHAT actions are allowed
+
Bucket Policy → Defines WHO can access the bucket
Key Insight
- IAM policy answers: “What can this role do?”
- Bucket policy answers: “Who is allowed to access this bucket?”
Without a bucket policy, you lose resource-level protection.
Final Answer
- ✔ Yes: without a bucket policy, any IAM role with s3:* can access your Databricks S3 data
- ✔ Use a bucket policy to restrict access
- ✔ Allow only the Unity Catalog role
- ✔ Add explicit deny for all other principals
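This restricts the bucket to the Unity Catalog role regardless of what other IAM policies in the account allow, which is the data isolation a zero-trust setup for Databricks on AWS depends on.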