Wednesday, 31 December 2025

GKE + Google Managed Prometheus Policy Enforcement with Terraform & OPA

GKE + Google Managed Prometheus Guardrails using Terraform & OPA

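This guide shows how to enforce a monitoring policy on GKE clusters at plan time, before anything is applied. The policy allows:

  • No monitoring configuration at all
  • Google Managed Prometheus (GMP) only

And denies:

  • Legacy monitoring (the monitoring_service attribute)
  • Extra observability settings inside monitoring_config
  • Mixed setups that combine GMP with anything else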

📌 Architecture Overview

Terraform → Plan JSON → OPA/Rego → Policy Decision


📁 Project Structure

gke-gmp-policy-lab/
│
├── case1-no-monitoring/
├── case2-gmp-only/
├── case3-invalid-monitoring/
├── policy/gke_gmp_only.rego
└── variables.tf

🔧 variables.tf

variable "project_id" {
  type = string # required, no default
}

variable "region" {
  type    = string
  default = "us-east1"
}
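Since project_id has no default, Terraform needs a value for it on every plan. Besides the -var flag, Terraform also reads input variables from TF_VAR_<name> environment variables. A minimal sketch, with YOUR_PROJECT as a placeholder:

```shell
# Terraform picks up TF_VAR_<name> from the environment for input variables.
# YOUR_PROJECT is a placeholder, not a real project ID.
export TF_VAR_project_id="YOUR_PROJECT"

# region already has a default; set it only to override that default.
export TF_VAR_region="us-east1"
```

With these exported, terraform plan can be run without the -var flag.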

🧪 Case 1: No Monitoring (Allowed)


resource "google_container_cluster" "cluster" {
  name     = "case1-no-monitoring"
  location = var.region

  remove_default_node_pool = true
  initial_node_count       = 1
}

✔ Result: Allowed


🧪 Case 2: Only GMP (Allowed)


resource "google_container_cluster" "cluster" {
  name     = "case2-gmp-only"
  location = var.region

  monitoring_config {
    managed_prometheus {
      enabled = true
    }
  }
}

✔ Result: Allowed


🧪 Case 3: Invalid Monitoring (Denied)


resource "google_container_cluster" "cluster" {
  name     = "case3-invalid-monitoring"
  location = var.region

  monitoring_config {
    managed_prometheus { enabled = true }

    advanced_datapath_observability_config {
      enable_metrics = true
    }
  }
}

❌ Result: Denied


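🛡️ Rego Policy: Allow None or Only GMP


package gke.monitoring

import rego.v1

# Deny the legacy monitoring_service attribute when the plan sets it.
deny contains msg if {
  rc := input.resource_changes[_]
  rc.type == "google_container_cluster"
  rc.change.after.monitoring_service != null
  msg := sprintf("Cluster '%s' uses legacy monitoring_service", [rc.name])
}

# Deny any monitoring_config setting other than managed_prometheus.
# Plan JSON encodes nested blocks as lists, hence the [_] iteration.
deny contains msg if {
  rc := input.resource_changes[_]
  rc.type == "google_container_cluster"
  mc := rc.change.after.monitoring_config[_]

  some key, value in mc
  key != "managed_prometheus"
  is_set(value)

  msg := sprintf("Cluster '%s' has unsupported monitoring block: %s", [rc.name, key])
}

# Deny any managed_prometheus field other than "enabled".
deny contains msg if {
  rc := input.resource_changes[_]
  rc.type == "google_container_cluster"
  mp := rc.change.after.monitoring_config[_].managed_prometheus[_]

  some key, value in mp
  key != "enabled"
  is_set(value)

  msg := sprintf("Cluster '%s' has invalid GMP field: %s", [rc.name, key])
}

# A value counts as set unless it is null or an empty list.
is_set(value) if {
  value != null
  value != []
}

default allow := false

allow if count(deny) == 0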
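To iterate on the policy without GCP credentials, one option is a hand-written input that mimics the shape of terraform show -json output. The sketch below makes two assumptions: nested blocks such as monitoring_config appear as single-element lists in plan JSON, and the trimmed document carries only the fields the policy reads. The helper name write_sample_plan and the file name sample-plan.json are hypothetical.

```shell
#!/bin/sh
# Write a trimmed, hand-made stand-in for `terraform show -json` output.
# Assumption: list-nested blocks (monitoring_config, managed_prometheus)
# are encoded as single-element JSON arrays in the plan.
write_sample_plan() {
  cat > "$1" <<'EOF'
{
  "resource_changes": [
    {
      "address": "google_container_cluster.cluster",
      "type": "google_container_cluster",
      "name": "cluster",
      "change": {
        "actions": ["create"],
        "after": {
          "name": "case3-invalid-monitoring",
          "monitoring_config": [
            {
              "managed_prometheus": [{ "enabled": true }],
              "advanced_datapath_observability_config": [
                { "enable_metrics": true }
              ]
            }
          ]
        }
      }
    }
  ]
}
EOF
}

# Hypothetical output file name.
write_sample_plan sample-plan.json
```

The resulting file can be passed to opa eval with --input sample-plan.json in place of tfplan.json. A real plan contains many more fields, so this is only a starting point for quick checks.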

🚀 How to Test

Step 1 — Terraform Plan


terraform init
terraform plan -var="project_id=YOUR_PROJECT" -out=tfplan
terraform show -json tfplan > tfplan.json

Step 2 — Evaluate Policy


opa eval \
  --data policy/gke_gmp_only.rego \
  --input tfplan.json \
  "data.gke.monitoring"

✅ Expected Outputs

Case            Result
No Monitoring   ALLOW
Only GMP        ALLOW
GMP + Extra     DENY
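
An ALLOW or DENY verdict can be derived from OPA's exit status instead of reading the JSON result. The sketch below assumes the policy path and package name used in this guide. check_plan is a hypothetical helper name, and it queries deny[_] so that an empty deny set leaves the query undefined.

```shell
#!/bin/sh
# Map the exit status of `opa eval --fail-defined` to an ALLOW/DENY verdict.
# check_plan is a hypothetical helper; the policy path and query follow
# the layout used in this guide.
check_plan() {
  plan_json="$1"
  # deny[_] is undefined when the deny set is empty, so --fail-defined
  # makes opa exit non-zero only when at least one violation exists.
  # OPA's own output goes to stderr so stdout carries just the verdict.
  if opa eval --fail-defined --format pretty \
       --data policy/gke_gmp_only.rego \
       --input "$plan_json" \
       "data.gke.monitoring.deny[_]" >&2; then
    echo "ALLOW"
  else
    # A non-zero status can also mean a parse or evaluation error.
    echo "DENY"
    return 1
  fi
}
```

Calling check_plan tfplan.json prints ALLOW or DENY and returns a matching exit status, so the same helper can gate a local script or a CI step.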

🔎 Example Deny Output


Cluster 'cluster' has unsupported monitoring block: advanced_datapath_observability_config

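🔁 CI/CD Integration (GitHub Actions)


- name: Terraform Plan
  run: |
    terraform init
    terraform plan -var="project_id=YOUR_PROJECT" -out=tfplan
    terraform show -json tfplan > tfplan.json

- name: OPA Policy Check
  run: |
    # deny[_] is undefined when no rule fires, so --fail-defined
    # fails the job only when at least one violation exists.
    opa eval --fail-defined \
      --data policy/gke_gmp_only.rego \
      --input tfplan.json \
      "data.gke.monitoring.deny[_]"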

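🎯 Why This Matters

  • Catches misconfigured monitoring at plan time, before anything is applied
  • Enforces a single platform standard for cluster monitoring
  • Supports security and compliance reviews with a policy kept in version control
  • Scales governance across teams through a shared CI check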

🚀 Next Steps

  • Require Workload Identity
  • Block public clusters
  • Add cost control policies
  • Build a full GKE security baseline

Author

Cloud & Platform Engineering Guide — Terraform + OPA + GKE
