Sunday, 3 August 2025

Debugging a Pod in Kubernetes

Kubernetes Pod Debugging Guide

🔍 Step-by-Step: Debugging a Crashing or Problematic Pod in Kubernetes

✅ 1. Check Pod Status

kubectl get pods -n <namespace>

STATUS: Look for CrashLoopBackOff, Error, ImagePullBackOff, Pending, or any other state that is not Running or Completed.
RESTARTS: A high or climbing count shows how often the container is failing.
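In a busy namespace it can help to filter the listing down to the pods worth a closer look. The following is a minimal sketch, assuming the default column layout of kubectl get pods (NAME, READY, STATUS, RESTARTS, AGE); unhealthy_pods is a hypothetical helper name, not part of kubectl.

```shell
# Hypothetical helper: keep the header row plus any pod that is not fully ready.
# Assumes the default `kubectl get pods` columns: NAME READY STATUS RESTARTS AGE.
unhealthy_pods() {
  awk '
    NR == 1 { print; next }            # header row
    $3 == "Completed" { next }         # finished Jobs are not failures
    {
      split($2, ready, "/")            # READY looks like "1/2"
      if ($3 != "Running" || ready[1] != ready[2]) print
    }
  '
}

# Usage (needs cluster access):
#   kubectl get pods -n <namespace> | unhealthy_pods
```

The READY check also catches pods that report Running while one of their containers is still failing its readiness probe.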

✅ 2. Describe the Pod

kubectl describe pod <pod-name> -n <namespace>

Check Events at the bottom: scheduling issues, volume mount errors, etc.
Check each container's State and Last State (Waiting or Terminated), along with the Reason and Exit Code
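The describe output is long, and the Events section sits at the very end. A small sketch for jumping straight to it, assuming the section begins on a line starting with "Events:" and runs to the end of the output; describe_events is a hypothetical helper name.

```shell
# Hypothetical helper: print only the Events section of `kubectl describe pod` output.
# Assumes the section starts at a line beginning with "Events:" and runs to the end.
describe_events() {
  sed -n '/^Events:/,$p'
}

# Usage (needs cluster access):
#   kubectl describe pod <pod-name> -n <namespace> | describe_events
```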

✅ 3. Get Pod Logs

kubectl logs <pod-name> -n <namespace>

For a pod with more than one container, select the container with -c:

kubectl logs <pod-name> -n <namespace> -c <container-name>

Use --previous if the pod restarted and you want logs from the prior container:

kubectl logs --previous <pod-name> -n <namespace>
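When the previous container's log is long, a quick keyword filter can surface the likely failure line. This is only a sketch: the keyword list is an assumption about what the application logs, and log_errors is a hypothetical helper name.

```shell
# Hypothetical helper: show numbered log lines that mention common failure keywords.
# The keyword list is a guess; adjust it to match what the application actually logs.
log_errors() {
  grep -inE 'error|exception|fatal|panic'
}

# Usage (needs cluster access):
#   kubectl logs --previous <pod-name> -n <namespace> --tail=200 | log_errors
```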

✅ 4. Common Pod Failure States

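CrashLoopBackOff: The container keeps exiting shortly after it starts, and the kubelet waits longer before each restart
ImagePullBackOff / ErrImagePull: The image name or tag is wrong, or the registry credentials are missing or invalid
OOMKilled: The container exceeded its memory limit; check resource limits
ContainerCreating: If the pod stays in this state, suspect volume, network (CNI), or node issues
Completed: The container exited successfully (normal for Jobs)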
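The states above can also be read per container straight from the pod's status, which is useful when a pod has several containers. A minimal sketch using kubectl's jsonpath output; container_reasons is a hypothetical helper name, and it only covers regular containers (init containers are reported under initContainerStatuses).

```shell
# Hypothetical helper: one line per container with its name, its current waiting
# reason (e.g. CrashLoopBackOff) and its last termination reason (e.g. OOMKilled).
# Fields that do not apply are printed as empty columns.
container_reasons() {
  pod="$1"
  namespace="$2"
  kubectl get pod "$pod" -n "$namespace" \
    -o jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.state.waiting.reason}{"\t"}{.lastState.terminated.reason}{"\n"}{end}'
}

# Usage (needs cluster access):
#   container_reasons <pod-name> <namespace>
```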

✅ 5. Exec Into the Pod (If Running)

kubectl exec -it <pod-name> -n <namespace> -- /bin/sh

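Inspect log files, mounted config, and environment variables from inside the container. For a multi-container pod, add -c <container-name> before the -- separator.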
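For a quick look at the environment without an interactive session, a one-shot exec is often enough. A sketch, assuming the image ships an env binary; pod_env is a hypothetical helper name. Note that the output may include secrets injected as environment variables.

```shell
# Hypothetical helper: dump the container's environment variables, sorted by name.
# Assumes the container image includes the `env` command.
pod_env() {
  kubectl exec "$1" -n "$2" -- env | sort
}

# Usage (needs cluster access):
#   pod_env <pod-name> <namespace>
```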

✅ 6. Check Events at Namespace Level

kubectl get events -n <namespace> --sort-by=.metadata.creationTimestamp
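Normal events usually outnumber the interesting ones, so it can be worth restricting the listing to warnings. A small sketch using the type field selector that kubectl supports for events; warning_events is a hypothetical helper name.

```shell
# Hypothetical helper: list only Warning events in a namespace, oldest first.
warning_events() {
  kubectl get events -n "$1" \
    --field-selector type=Warning \
    --sort-by=.metadata.creationTimestamp
}

# Usage (needs cluster access):
#   warning_events <namespace>
```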

✅ 7. Look for Liveness/Readiness Probe Failures

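kubectl describe pod <pod-name> -n <namespace>

Check whether probes are misconfigured (wrong path or port, or timing that is too tight). A failing liveness probe restarts the container; a failing readiness probe only removes the pod from Service endpoints.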
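The probe configuration and any probe failures appear in different parts of the describe output. A sketch that pulls both together, assuming the usual "Liveness:", "Readiness:" and "Startup:" labels and the "Unhealthy" event reason; probe_lines is a hypothetical helper name.

```shell
# Hypothetical helper: show configured probes and any probe-failure events
# from `kubectl describe pod` output.
probe_lines() {
  grep -E 'Liveness:|Readiness:|Startup:|Unhealthy'
}

# Usage (needs cluster access):
#   kubectl describe pod <pod-name> -n <namespace> | probe_lines
```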

✅ 8. Resource Limits

Check if the container is being OOMKilled (killed for exceeding its memory limit)

kubectl describe pod <pod-name> -n <namespace>

Look for Last State: Terminated with Reason: OOMKilled (exit code 137)
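Both checks can be scripted: one to detect the OOMKilled reason in the describe output, one to print the memory limit each container is running with. A minimal sketch; was_oomkilled and memory_limits are hypothetical helper names.

```shell
# Hypothetical helper: succeed (exit 0) if describe output reports an OOMKilled container.
was_oomkilled() {
  grep -Eq 'Reason:[[:space:]]+OOMKilled'
}

# Hypothetical helper: print each container's name and its memory limit
# (the second column is empty when no limit is set).
memory_limits() {
  kubectl get pod "$1" -n "$2" \
    -o jsonpath='{range .spec.containers[*]}{.name}{"\t"}{.resources.limits.memory}{"\n"}{end}'
}

# Usage (needs cluster access):
#   kubectl describe pod <pod-name> -n <namespace> | was_oomkilled && memory_limits <pod-name> <namespace>
```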

✅ 9. Pod Stuck in Pending

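The scheduler found no suitable node: none is available, no node has enough free CPU or memory for the pod's requests, or the nodeSelector/tolerations match no node. The FailedScheduling event states which.

kubectl describe pod <pod-name> -n <namespace>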
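For a Pending pod, the relevant parts of the describe output are the node selector, the tolerations, and the scheduler's failure message. A sketch that extracts them, assuming the usual "Node-Selectors:" and "Tolerations:" labels; scheduling_clues is a hypothetical helper name, and it shows only the first line of a multi-line tolerations list.

```shell
# Hypothetical helper: show scheduling-related lines from `kubectl describe pod` output.
scheduling_clues() {
  grep -E 'Node-Selectors:|Tolerations:|FailedScheduling'
}

# Usage (needs cluster access):
#   kubectl describe pod <pod-name> -n <namespace> | scheduling_clues
```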

✅ 10. Look at Node or DaemonSet Logs (for CNI/containerd issues)

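If the pod is never created or stays stuck in ContainerCreating, the cause may be the CNI plugin or the container runtime on the node.

kubectl logs takes a pod name, not a node name, so first find the CNI pod running on the affected node (aws-node is the AWS VPC CNI DaemonSet on EKS), then read its logs:

kubectl get pods -n kube-system -o wide --field-selector spec.nodeName=<node-name>

kubectl logs <aws-node-pod-name> -n kube-system -c aws-node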
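Since kubectl logs needs a pod name, the CNI pod for a given node has to be looked up first. A sketch for EKS clusters, assuming the aws-node DaemonSet pods carry the label k8s-app=aws-node (verify with kubectl get ds aws-node -n kube-system --show-labels); cni_pod_on_node is a hypothetical helper name.

```shell
# Hypothetical helper: print the aws-node (AWS VPC CNI) pod scheduled on a given node.
# Assumes the DaemonSet pods are labelled k8s-app=aws-node.
cni_pod_on_node() {
  kubectl get pods -n kube-system -l k8s-app=aws-node \
    --field-selector "spec.nodeName=$1" -o name
}

# Usage (needs cluster access); `-o name` prints pod/<name>, which kubectl logs accepts:
#   kubectl logs "$(cni_pod_on_node <node-name>)" -n kube-system -c aws-node
```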

🛠️ Optional: Use Stern to Tail Pod Logs Across Containers

stern <pod-name> -n <namespace>

📌 TL;DR - Common Fixes for Crashing Pods

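CrashLoopBackOff: Application error or misconfigured command/args; read the --previous logs and fix the cause
OOMKilled: Memory limit too low or the app uses too much memory; raise the limit or reduce usage
ImagePullBackOff: Wrong image name/tag or no access to a private registry; fix the image reference or add an imagePullSecret
Pending: No schedulable nodes or insufficient resources; adjust requests or selectors, or add capacity
Probe failures: Health check misconfigured; correct the path, port, or timing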
