Steven Rosales

Kubernetes Troubleshooting Commands for DevOps and Support

This guide documents common kubectl commands, troubleshooting workflows, productivity shortcuts, and investigation patterns for diagnosing real Kubernetes issues.



1. Cluster Information

Use these commands to understand the Kubernetes cluster context and basic cluster health.

Check current context

kubectl config current-context

View all contexts

kubectl config get-contexts

Switch context

kubectl config use-context <context-name>

Cluster information

kubectl cluster-info

Check Kubernetes version

kubectl version

Check API resources

kubectl api-resources

Troubleshooting logic


2. Daily Kubectl Productivity Setup

These commands make daily Kubernetes work faster, safer, and easier for engineers who use clusters every day.


Create a short alias for kubectl

Instead of typing kubectl every time, create an alias called k.

Temporary alias for the current terminal session:

alias k=kubectl

Now you can run:

k get pods -A
k get nodes
k describe pod <pod-name> -n <namespace>

To make it permanent, add it to your shell profile.

For Bash:

echo 'alias k=kubectl' >> ~/.bashrc
source ~/.bashrc

For Zsh on macOS:

echo 'alias k=kubectl' >> ~/.zshrc
source ~/.zshrc

Common daily examples:

k get pods -A
k get nodes -o wide
k get svc -A
k get events -A --sort-by=.metadata.creationTimestamp
k logs <pod-name> -n <namespace>
k describe pod <pod-name> -n <namespace>

Enable kubectl autocomplete

Autocomplete helps you type Kubernetes commands faster.

For Bash:

source <(kubectl completion bash)
echo 'source <(kubectl completion bash)' >> ~/.bashrc

If using the k alias (skip the alias line if it is already in ~/.bashrc):

echo 'alias k=kubectl' >> ~/.bashrc
echo 'complete -o default -F __start_kubectl k' >> ~/.bashrc
source ~/.bashrc

For Zsh on macOS:

source <(kubectl completion zsh)
echo 'source <(kubectl completion zsh)' >> ~/.zshrc

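If using the k alias (skip the alias line if it is already in ~/.zshrc):

echo 'alias k=kubectl' >> ~/.zshrc
echo 'compdef __start_kubectl k' >> ~/.zshrc
source ~/.zshrc

If Zsh reports "command not found: compdef", add this line near the top of ~/.zshrc, before the completion line:

autoload -Uz compinit && compinit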

Check current Kubernetes context

The context tells you which cluster you are connected to.

kubectl config current-context

Using alias:

k config current-context

List all contexts:

kubectl config get-contexts

Switch context:

kubectl config use-context <context-name>

Example:

k config use-context dev-cluster

Important: Always confirm the context before running commands in production.
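
The context check can also be scripted. The sketch below is a hypothetical helper, not part of kubectl: require_context is an invented name, and it assumes you know the exact context name you expect to be using.

```shell
# Hypothetical helper (not part of kubectl): stop unless the active
# context matches the one you expect to be working in.
require_context() {
  local expected="$1" current
  current="$(kubectl config current-context)" || return 1
  if [ "$current" != "$expected" ]; then
    echo "Refusing to continue: context is '$current', expected '$expected'" >&2
    return 1
  fi
  echo "Context OK: $current"
}
```

Example: require_context dev-cluster && kubectl get pods -A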


Set a default namespace

Instead of adding -n <namespace> every time, set a default namespace for your current context.

kubectl config set-context --current --namespace=<namespace>

Example:

kubectl config set-context --current --namespace=production

Using alias:

k config set-context --current --namespace=production

Check current namespace (no output means the context uses the default namespace):

kubectl config view --minify | grep namespace

This helps avoid mistakes when working across multiple namespaces.
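
A small function can print the active namespace and fall back to "default" when none is set. This is a sketch: current_namespace is a hypothetical name, and the JSONPath expression assumes the namespace is stored in the current context entry of your kubeconfig.

```shell
# Hypothetical helper: print the namespace of the current context.
# Prints "default" when the context does not set a namespace.
current_namespace() {
  local ns
  ns="$(kubectl config view --minify -o jsonpath='{..namespace}')" || return 1
  echo "${ns:-default}"
}
```

Example: echo "Working in namespace: $(current_namespace)"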


Create useful kubectl aliases

You can add these aliases to ~/.bashrc or ~/.zshrc.

alias k=kubectl
alias kgp='kubectl get pods'
alias kgpa='kubectl get pods -A'
alias kgn='kubectl get nodes -o wide'
alias kgs='kubectl get svc'
alias kgsa='kubectl get svc -A'
alias kge='kubectl get events --sort-by=.metadata.creationTimestamp'
alias kgea='kubectl get events -A --sort-by=.metadata.creationTimestamp'
alias kd='kubectl describe'
alias kdp='kubectl describe pod'
alias kl='kubectl logs'
alias klf='kubectl logs -f'
alias kaf='kubectl apply -f'
alias kdel='kubectl delete -f'

Reload shell:

source ~/.zshrc

or:

source ~/.bashrc

Examples:

kgpa
kgn
kgea
kdp <pod-name> -n <namespace>
kl <pod-name> -n <namespace>

Manage kubeconfig files

The kubeconfig file stores cluster connection information.

Default location:

~/.kube/config

Check which kubeconfig is being used (empty output means kubectl falls back to ~/.kube/config):

echo "$KUBECONFIG"

Set a specific kubeconfig file:

export KUBECONFIG=~/.kube/config

Use a custom kubeconfig:

export KUBECONFIG=~/Downloads/dev-cluster-kubeconfig.yaml

Make it permanent for Zsh:

echo 'export KUBECONFIG=~/.kube/config' >> ~/.zshrc
source ~/.zshrc

Make it permanent for Bash:

echo 'export KUBECONFIG=~/.kube/config' >> ~/.bashrc
source ~/.bashrc

Merge multiple kubeconfig files

If you have multiple cluster config files, you can merge them.

First, back up your current kubeconfig:

cp ~/.kube/config ~/.kube/config.backup

Merge multiple kubeconfig files, then point KUBECONFIG back at the single merged file:

export KUBECONFIG=~/.kube/config:~/Downloads/dev-kubeconfig.yaml:~/Downloads/prod-kubeconfig.yaml
kubectl config view --flatten > ~/.kube/merged-config
mv ~/.kube/merged-config ~/.kube/config
chmod 600 ~/.kube/config
export KUBECONFIG=~/.kube/config

Validate:

kubectl config get-contexts

Important: Always back up your kubeconfig before merging.
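
The backup, merge, and permission steps can be wrapped in one function so that none of them is skipped. This is a minimal sketch: merge_kubeconfigs is a hypothetical name, and it sets KUBECONFIG only for the single kubectl call so your shell environment is left unchanged.

```shell
# Hypothetical helper: merge extra kubeconfig files into a target file.
# Usage: merge_kubeconfigs <target> <extra-file>...
merge_kubeconfigs() {
  local target="$1" tmp list f
  shift
  tmp="$(mktemp)" || return 1
  # Keep a timestamped backup of the existing target, if there is one.
  if [ -f "$target" ]; then
    cp "$target" "$target.backup-$(date +%Y%m%d-%H%M%S)" || return 1
  fi
  # Build the colon-separated list that kubectl expects in KUBECONFIG.
  list="$target"
  for f in "$@"; do
    list="$list:$f"
  done
  # Replace the target only if the merge itself succeeded.
  if KUBECONFIG="$list" kubectl config view --flatten > "$tmp"; then
    mv "$tmp" "$target" && chmod 600 "$target"
  else
    rm -f "$tmp"
    return 1
  fi
}
```

Example: merge_kubeconfigs ~/.kube/config ~/Downloads/dev-kubeconfig.yaml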


Secure kubeconfig permissions

Kubeconfig files may contain sensitive authentication information.

Recommended permission:

chmod 600 ~/.kube/config

Check permissions:

ls -l ~/.kube/config

Avoid committing kubeconfig files to GitHub.


View cluster credentials and users

View kubeconfig details:

kubectl config view

View only current context details:

kubectl config view --minify

View users configured in kubeconfig:

kubectl config get-users

View clusters:

kubectl config get-clusters

Save a certificate from a Kubernetes Secret

TLS certificates are often stored in Kubernetes Secrets.

List secrets:

kubectl get secrets -n <namespace>

Check a TLS secret:

kubectl describe secret <secret-name> -n <namespace>

Save the TLS certificate to a file:

kubectl get secret <secret-name> -n <namespace> -o jsonpath='{.data.tls\.crt}' | base64 -d > tls.crt

Save the TLS private key to a file:

kubectl get secret <secret-name> -n <namespace> -o jsonpath='{.data.tls\.key}' | base64 -d > tls.key

Check certificate details:

openssl x509 -in tls.crt -text -noout

Check certificate expiration date:

openssl x509 -in tls.crt -noout -enddate

Important: Be careful with private keys. Do not commit them to GitHub.
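
To check expiry without writing the certificate to disk, the pipeline can feed openssl directly. The sketch below assumes a standard TLS secret with a tls.crt key; tls_secret_valid_for_days is a hypothetical name. It relies on openssl's -checkend option, which exits 0 when the certificate is still valid after the given number of seconds.

```shell
# Hypothetical helper: succeed if the certificate in a TLS secret is
# still valid <days> days from now, fail if it expires before then.
# Usage: tls_secret_valid_for_days <secret-name> <namespace> <days>
tls_secret_valid_for_days() {
  local secret="$1" ns="$2" days="$3"
  kubectl get secret "$secret" -n "$ns" -o jsonpath='{.data.tls\.crt}' \
    | base64 -d \
    | openssl x509 -noout -checkend "$(( days * 86400 ))"
}
```

Example: tls_secret_valid_for_days app-tls production 30 || echo "Renew app-tls soon"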


Create a TLS Secret

Create a Kubernetes TLS secret from a certificate and key:

kubectl create secret tls <secret-name> \
  --cert=tls.crt \
  --key=tls.key \
  -n <namespace>

Example:

kubectl create secret tls app-tls \
  --cert=app.crt \
  --key=app.key \
  -n production

Create a Docker registry pull secret

Use this when Kubernetes needs credentials to pull images from a private registry.

kubectl create secret docker-registry <secret-name> \
  --docker-server=<registry-server> \
  --docker-username=<username> \
  --docker-password=<password> \
  --docker-email=<email> \
  -n <namespace>

Example:

kubectl create secret docker-registry regcred \
  --docker-server=docker.io \
  --docker-username=myuser \
  --docker-password=mypassword \
  --docker-email=user@example.com \
  -n production

Reference it in the Deployment's pod template, under spec.template.spec:

imagePullSecrets:
  - name: regcred

Important: Avoid saving passwords in terminal history when possible.


Decode a Kubernetes Secret value

Secret values are base64-encoded, not encrypted, so anyone who can read the Secret can decode it.

View secret keys and their encoded values:

kubectl get secret <secret-name> -n <namespace> -o yaml

Decode a specific key:

kubectl get secret <secret-name> -n <namespace> -o jsonpath='{.data.<key-name>}' | base64 -d

Example:

kubectl get secret app-secret -n production -o jsonpath='{.data.DB_PASSWORD}' | base64 -d

Important: Do not paste decoded secrets into tickets, GitHub, or public chats.


Save pod logs to a local file

Useful for ticket evidence and incident investigation.

kubectl logs <pod-name> -n <namespace> > pod.log

Save logs with timestamp:

kubectl logs <pod-name> -n <namespace> > "pod-logs-$(date +%F-%H%M%S).log"

Save previous crashed container logs:

kubectl logs <pod-name> -n <namespace> --previous > "pod-previous-logs-$(date +%F-%H%M%S).log"

For multi-container pods:

kubectl logs <pod-name> -c <container-name> -n <namespace> > container.log
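
For tickets it is often easier to capture everything about a pod in one step. The following is a sketch only: collect_pod_evidence and the evidence-... directory layout are hypothetical, and the function writes into the current directory.

```shell
# Hypothetical helper: save describe output, logs, previous logs, and
# namespace events for one pod into a timestamped directory.
# Usage: collect_pod_evidence <pod-name> <namespace>
collect_pod_evidence() {
  local pod="$1" ns="$2" out
  out="evidence-$pod-$(date +%Y%m%d-%H%M%S)"
  mkdir -p "$out" || return 1
  kubectl describe pod "$pod" -n "$ns" > "$out/describe.txt" 2>&1
  kubectl logs "$pod" -n "$ns" --all-containers > "$out/logs.txt" 2>&1
  # --previous fails when no container has restarted; that is expected.
  kubectl logs "$pod" -n "$ns" --all-containers --previous \
    > "$out/logs-previous.txt" 2>&1 || true
  kubectl get events -n "$ns" --sort-by=.metadata.creationTimestamp \
    > "$out/events.txt" 2>&1
  # Print the directory name so callers can archive or attach it.
  echo "$out"
}
```

Review the files before attaching them to a ticket, because logs and describe output can contain sensitive values.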

Export Kubernetes resources to YAML

Useful for documentation, backup, and troubleshooting.

Export a Deployment:

kubectl get deployment <deployment-name> -n <namespace> -o yaml > deployment.yaml

Export a Service:

kubectl get svc <service-name> -n <namespace> -o yaml > service.yaml

Export an Ingress:

kubectl get ingress <ingress-name> -n <namespace> -o yaml > ingress.yaml

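Export common resources in a namespace:

kubectl get all -n <namespace> -o yaml > namespace-resources.yaml

Note: despite the name, "all" covers only common workload resources such as Pods, Services, Deployments, ReplicaSets, StatefulSets, DaemonSets, and Jobs. ConfigMaps, Secrets, Ingresses, and PVCs must be exported separately.

Important: Review files before committing because YAML exports may contain sensitive data.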


Dry-run before creating resources

Dry-run helps validate commands without applying changes.

Generate YAML without creating the resource:

kubectl create deployment nginx --image=nginx --dry-run=client -o yaml

Save generated YAML:

kubectl create deployment nginx --image=nginx --dry-run=client -o yaml > nginx-deployment.yaml

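Validate an apply without changing the cluster (client-side checks only; use --dry-run=server to have the API server validate the request as well):

kubectl apply -f deployment.yaml --dry-run=client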
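
A fuller pre-flight check combines a server-side dry run with kubectl diff, which shows what would change in the live cluster. This is a sketch: preview_apply is a hypothetical name, and it assumes your account is allowed to run server-side dry runs.

```shell
# Hypothetical helper: validate a manifest against the API server and
# show the pending changes, without modifying anything.
# Usage: preview_apply <manifest-file>
preview_apply() {
  local file="$1" rc
  # Server-side dry run: the API server validates but persists nothing.
  kubectl apply -f "$file" --dry-run=server || return 1
  # kubectl diff exits 0 when nothing would change, 1 when there are
  # differences, and greater than 1 on error.
  kubectl diff -f "$file" && rc=0 || rc=$?
  [ "$rc" -le 1 ]
}
```

Example: preview_apply deployment.yaml && kubectl apply -f deployment.yaml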

Useful output formatting

Wide output:

kubectl get pods -o wide

YAML output:

kubectl get pod <pod-name> -n <namespace> -o yaml

JSON output:

kubectl get pod <pod-name> -n <namespace> -o json

Custom columns:

kubectl get pods -A -o custom-columns=NAMESPACE:.metadata.namespace,NAME:.metadata.name,STATUS:.status.phase,NODE:.spec.nodeName

JSONPath example:

kubectl get pod <pod-name> -n <namespace> -o jsonpath='{.status.podIP}'

Quick daily Kubernetes checks

k config current-context
k get nodes -o wide
k get pods -A
k get svc -A
k get ingress -A
k get events -A --sort-by=.metadata.creationTimestamp

Quick namespace check:

k get all -n <namespace>

Quick pod investigation:

k describe pod <pod-name> -n <namespace>
k logs <pod-name> -n <namespace>
k logs <pod-name> -n <namespace> --previous

3. Nodes

Nodes are the machines, virtual or physical, where Kubernetes runs workloads.

List nodes

kubectl get nodes

List nodes with more details

kubectl get nodes -o wide

Describe a node

kubectl describe node <node-name>

Check node labels

kubectl get nodes --show-labels

Check pods running on a specific node

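kubectl get pods -A -o wide --field-selector spec.nodeName=<node-name>

The field selector matches the node name exactly, whereas piping to grep also matches partial names.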

Common node states

Ready               Node is healthy
NotReady            Node has a problem
SchedulingDisabled  Node is cordoned
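
To list only the nodes that need attention, the STATUS column can be filtered. A minimal sketch, assuming the default column layout of kubectl get nodes (NAME first, STATUS second); problem_nodes is a hypothetical name.

```shell
# Hypothetical helper: print nodes whose status is anything other than
# plain "Ready" (for example NotReady or Ready,SchedulingDisabled).
problem_nodes() {
  kubectl get nodes --no-headers | awk '$2 != "Ready" { print $1 ": " $2 }'
}
```

No output means every node reports Ready and is schedulable.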

Troubleshooting logic


4. Namespaces

Namespaces separate Kubernetes resources logically.

List namespaces

kubectl get namespaces

Short version:

kubectl get ns

List common resources in a namespace

kubectl get all -n <namespace>

Note: "all" does not include ConfigMaps, Secrets, Ingresses, or PVCs.

Set default namespace for current context

kubectl config set-context --current --namespace=<namespace>

Troubleshooting logic


5. Pods

Pods are the smallest deployable units in Kubernetes.

List pods in current namespace

kubectl get pods

List pods in a specific namespace

kubectl get pods -n <namespace>

List pods across all namespaces

kubectl get pods -A

Show more pod details

kubectl get pods -n <namespace> -o wide

Describe a pod

kubectl describe pod <pod-name> -n <namespace>

Get pod YAML

kubectl get pod <pod-name> -n <namespace> -o yaml

Watch pods live

kubectl get pods -n <namespace> -w

Common pod statuses

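Running             Pod is bound to a node and at least one container is running
Pending             Pod is accepted but not running yet (unscheduled or still pulling images)
CrashLoopBackOff    Container keeps crashing and Kubernetes is backing off between restarts
ImagePullBackOff    Image pull keeps failing and Kubernetes is backing off between retries
ErrImagePull        Image pull failed
Completed           All containers exited successfully
Evicted             Pod was removed from a node, usually because of resource pressure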
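
These statuses can be used to filter a cluster-wide pod listing down to the pods worth investigating. The sketch below assumes the default columns of kubectl get pods -A (NAMESPACE, NAME, READY, STATUS); unhealthy_pods is a hypothetical name.

```shell
# Hypothetical helper: list pods whose STATUS column is neither
# Running nor Completed, as namespace/name: status.
unhealthy_pods() {
  kubectl get pods -A --no-headers \
    | awk '$4 != "Running" && $4 != "Completed" { print $1 "/" $2 ": " $4 }'
}
```

A pod can show Running while its containers are not ready, so also check the READY column when a service misbehaves.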

Troubleshooting logic


6. Pod Logs

Logs are critical for troubleshooting application and container issues.

View pod logs

kubectl logs <pod-name> -n <namespace>

Follow logs live

kubectl logs -f <pod-name> -n <namespace>

View logs from a specific container

kubectl logs <pod-name> -c <container-name> -n <namespace>

View logs from the previous container instance (after a crash or restart)

kubectl logs <pod-name> -n <namespace> --previous

View last 100 lines

kubectl logs <pod-name> -n <namespace> --tail=100

Logs since a relative time (use --since-time for an RFC3339 timestamp)

kubectl logs <pod-name> -n <namespace> --since=1h

Troubleshooting logic


7. Events

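Events show what Kubernetes is doing behind the scenes: scheduling decisions, image pulls, probe failures, and restarts. By default the API server keeps them for only about an hour, so check them soon after an incident.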

Get events in a namespace

kubectl get events -n <namespace>

Get events across all namespaces

kubectl get events -A

Sort events by time

kubectl get events -A --sort-by=.metadata.creationTimestamp

Watch events live

kubectl get events -n <namespace> -w

Troubleshooting logic


8. Deployments

Deployments manage application replicas and rolling updates.

List deployments

kubectl get deployments -n <namespace>

Short version:

kubectl get deploy -n <namespace>

Describe deployment

kubectl describe deployment <deployment-name> -n <namespace>

Get deployment YAML

kubectl get deployment <deployment-name> -n <namespace> -o yaml

Scale deployment

kubectl scale deployment <deployment-name> --replicas=3 -n <namespace>

Restart deployment

kubectl rollout restart deployment <deployment-name> -n <namespace>

Check rollout status

kubectl rollout status deployment <deployment-name> -n <namespace>

Troubleshooting logic


9. ReplicaSets

ReplicaSets maintain the desired number of pod replicas.

List ReplicaSets

kubectl get replicasets -n <namespace>

Short version:

kubectl get rs -n <namespace>

Describe ReplicaSet

kubectl describe rs <replicaset-name> -n <namespace>

Troubleshooting logic


10. Services

Services expose pods internally or externally.

List services

kubectl get services -n <namespace>

Short version:

kubectl get svc -n <namespace>

Describe service

kubectl describe svc <service-name> -n <namespace>

Get service YAML

kubectl get svc <service-name> -n <namespace> -o yaml

Common service types

ClusterIP      Internal-only virtual IP (the default type)
NodePort       Exposes the service on a static port of every node
LoadBalancer   Exposes the service through an external load balancer, typically from a cloud provider
ExternalName   Maps the service to an external DNS name

Troubleshooting logic


11. Endpoints

Endpoints show which pods are backing a service.

Check endpoints

kubectl get endpoints -n <namespace>

For a specific service:

kubectl get endpoints <service-name> -n <namespace>

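Kubernetes 1.21 and later also provide EndpointSlices:

kubectl get endpointslices -n <namespace>

For a specific service:

kubectl get endpointslices -n <namespace> -l kubernetes.io/service-name=<service-name>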

Troubleshooting logic


12. Ingress

Ingress exposes HTTP/HTTPS services outside the cluster.

List ingress resources

kubectl get ingress -n <namespace>

Short version:

kubectl get ing -n <namespace>

Describe ingress

kubectl describe ingress <ingress-name> -n <namespace>

Get ingress YAML

kubectl get ingress <ingress-name> -n <namespace> -o yaml

Troubleshooting logic


13. ConfigMaps

ConfigMaps store non-sensitive configuration.

List ConfigMaps

kubectl get configmaps -n <namespace>

Short version:

kubectl get cm -n <namespace>

View ConfigMap

kubectl describe cm <configmap-name> -n <namespace>

Get ConfigMap YAML

kubectl get cm <configmap-name> -n <namespace> -o yaml

Troubleshooting logic


14. Secrets

Secrets store sensitive data such as passwords, tokens, or certificates.

List secrets

kubectl get secrets -n <namespace>

Describe secret

kubectl describe secret <secret-name> -n <namespace>

View secret YAML

kubectl get secret <secret-name> -n <namespace> -o yaml

Decode a secret value

kubectl get secret <secret-name> -n <namespace> -o jsonpath='{.data.<key-name>}' | base64 -d

Troubleshooting logic


15. Persistent Volumes and Claims

Persistent Volumes and Persistent Volume Claims provide storage to pods.

List PVCs

kubectl get pvc -n <namespace>

List PVs

kubectl get pv

Describe PVC

kubectl describe pvc <pvc-name> -n <namespace>

Describe PV

kubectl describe pv <pv-name>

Common PVC statuses

Bound     PVC is connected to storage
Pending   PVC is waiting for storage
Lost      PVC lost its backing volume

Troubleshooting logic


16. Resource Usage

Use these commands to check CPU and memory usage.

Check node usage

kubectl top nodes

Check pod usage

kubectl top pods -n <namespace>

Check pod usage across all namespaces

kubectl top pods -A

Note: kubectl top requires metrics-server to be installed in the cluster.

Check resource requests and limits

kubectl describe pod <pod-name> -n <namespace>

Troubleshooting logic


17. Exec Into Pods

Use exec to run commands inside a container.

Open shell inside a pod

kubectl exec -it <pod-name> -n <namespace> -- /bin/bash

If Bash is not available:

kubectl exec -it <pod-name> -n <namespace> -- /bin/sh

Run a single command

kubectl exec <pod-name> -n <namespace> -- env

Test connectivity from inside the pod (requires curl in the container image)

kubectl exec -it <pod-name> -n <namespace> -- curl http://<service-name>:<port>

Troubleshooting logic


18. Port Forwarding

Port forwarding allows local access to a Kubernetes service or pod.

Port forward to a pod (the format is <local-port>:<pod-port>)

kubectl port-forward pod/<pod-name> 8080:8080 -n <namespace>

Port forward to a service

kubectl port-forward svc/<service-name> 8080:80 -n <namespace>

Then test locally:

curl http://localhost:8080

Troubleshooting logic


19. Rollouts and Rollbacks

Use these commands to track Deployment rollouts and roll back failed releases.

Check rollout status

kubectl rollout status deployment <deployment-name> -n <namespace>

View rollout history

kubectl rollout history deployment <deployment-name> -n <namespace>

Restart deployment

kubectl rollout restart deployment <deployment-name> -n <namespace>

Roll back to previous version

kubectl rollout undo deployment <deployment-name> -n <namespace>

Roll back to specific revision

kubectl rollout undo deployment <deployment-name> -n <namespace> --to-revision=<revision-number>
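
The status and undo commands can be chained so that a rollout that does not finish in time is rolled back automatically. This is a sketch, not a recommended production policy: rollout_or_undo and the 120s default are assumptions, and it relies on the --timeout flag of kubectl rollout status.

```shell
# Hypothetical helper: wait for a rollout and undo it if it does not
# complete within the timeout.
# Usage: rollout_or_undo <deployment-name> <namespace> [timeout]
rollout_or_undo() {
  local deploy="$1" ns="$2" timeout="${3:-120s}"
  if kubectl rollout status "deployment/$deploy" -n "$ns" --timeout="$timeout"; then
    echo "Rollout of $deploy succeeded"
  else
    echo "Rollout of $deploy did not complete within $timeout; rolling back" >&2
    kubectl rollout undo "deployment/$deploy" -n "$ns"
  fi
}
```

Example: rollout_or_undo web production 300s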

Troubleshooting logic


20. Common Kubernetes Issues

CrashLoopBackOff

The container starts, crashes, and Kubernetes keeps restarting it.

Commands:

kubectl describe pod <pod-name> -n <namespace>
kubectl logs <pod-name> -n <namespace>
kubectl logs <pod-name> -n <namespace> --previous
kubectl get events -n <namespace> --sort-by=.metadata.creationTimestamp

Common causes:

Application error
Bad environment variable
Missing ConfigMap
Missing Secret
Bad command or entrypoint
Failed health check
Permission issue
Dependency unavailable

ImagePullBackOff / ErrImagePull

Kubernetes cannot pull the container image.

Commands:

kubectl describe pod <pod-name> -n <namespace>
kubectl get events -n <namespace> --sort-by=.metadata.creationTimestamp

Common causes:

Wrong image name
Wrong image tag
Image does not exist
Private registry authentication issue
Network issue reaching registry
Image pull secret missing

Pending Pod

The pod is waiting to be scheduled.

Commands:

kubectl describe pod <pod-name> -n <namespace>
kubectl get nodes
kubectl describe node <node-name>

Common causes:

Insufficient CPU
Insufficient memory
Node selector mismatch
Taints and tolerations issue
PVC not bound
No available nodes

OOMKilled

The container was killed because it exceeded its memory limit, or because the node itself ran out of memory.

Commands:

kubectl describe pod <pod-name> -n <namespace>
kubectl logs <pod-name> -n <namespace> --previous
kubectl top pod <pod-name> -n <namespace>

Common causes:

Memory leak
Memory limit too low
Unexpected traffic spike
Large batch process
Inefficient application behavior
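
To confirm an OOM kill without reading the full describe output, the last termination reason can be read with JSONPath. A minimal sketch: was_oomkilled is a hypothetical name, and it only looks at regular containers, not init containers.

```shell
# Hypothetical helper: succeed if any container in the pod was last
# terminated with reason OOMKilled.
# Usage: was_oomkilled <pod-name> <namespace>
was_oomkilled() {
  local reasons
  reasons="$(kubectl get pod "$1" -n "$2" \
    -o jsonpath='{.status.containerStatuses[*].lastState.terminated.reason}')" || return 1
  # JSONPath prints one reason per container, separated by spaces.
  case " $reasons " in
    *" OOMKilled "*) return 0 ;;
    *) return 1 ;;
  esac
}
```

Example: was_oomkilled <pod-name> <namespace> && echo "Check memory limits"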

Service Has No Endpoints

The service exists, but no pods are behind it.

Commands:

kubectl get svc <service-name> -n <namespace>
kubectl get endpoints <service-name> -n <namespace>
kubectl get pods -n <namespace> --show-labels
kubectl describe svc <service-name> -n <namespace>

Common causes:

Service selector does not match pod labels
Pods are not ready
Readiness probe failing
Wrong namespace
Pods are not running
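
A selector mismatch is easiest to spot by querying pods with the service's own selector. The sketch below is hypothetical (pods_for_service is an invented name) and assumes the service uses a plain label selector; it builds the selector string with a go-template and passes it to kubectl get pods -l.

```shell
# Hypothetical helper: list the pods that match a service's selector.
# Usage: pods_for_service <service-name> <namespace>
pods_for_service() {
  local sel
  # Render the selector map as key=value pairs separated by commas.
  sel="$(kubectl get svc "$1" -n "$2" \
    -o go-template='{{range $k, $v := .spec.selector}}{{$k}}={{$v}},{{end}}')" || return 1
  sel="${sel%,}"   # drop the trailing comma
  if [ -z "$sel" ]; then
    echo "Service $1 has no selector" >&2
    return 1
  fi
  kubectl get pods -n "$2" -l "$sel" -o wide
}
```

If this prints no pods, the selector does not match any pod labels in that namespace. If it prints pods that are not ready, look at the readiness probe instead.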

21. Real Troubleshooting Scenarios

Scenario 1: Pod is in CrashLoopBackOff

Commands:

kubectl get pods -n <namespace>
kubectl describe pod <pod-name> -n <namespace>
kubectl logs <pod-name> -n <namespace>
kubectl logs <pod-name> -n <namespace> --previous
kubectl get events -n <namespace> --sort-by=.metadata.creationTimestamp

What to check:

Possible fixes:


Scenario 2: Application is not reachable through Service

Commands:

kubectl get svc -n <namespace>
kubectl describe svc <service-name> -n <namespace>
kubectl get endpoints <service-name> -n <namespace>
kubectl get pods -n <namespace> --show-labels

What to check:

Possible fixes:


Scenario 3: Ingress is not working

Commands:

kubectl get ingress -n <namespace>
kubectl describe ingress <ingress-name> -n <namespace>
kubectl get svc -n <namespace>
kubectl get endpoints -n <namespace>
kubectl logs -n <ingress-controller-namespace> <ingress-controller-pod>

What to check:

Possible fixes:


Scenario 4: Pod is Pending

Commands:

kubectl describe pod <pod-name> -n <namespace>
kubectl get nodes
kubectl describe node <node-name>
kubectl get pvc -n <namespace>

What to check:

Possible fixes:


Scenario 5: Deployment failed after a new release

Commands:

kubectl rollout status deployment <deployment-name> -n <namespace>
kubectl rollout history deployment <deployment-name> -n <namespace>
kubectl get pods -n <namespace>
kubectl logs <pod-name> -n <namespace>
kubectl describe pod <pod-name> -n <namespace>

What to check:

Possible fixes:


Scenario 6: Pod cannot pull image

Commands:

kubectl get pods -n <namespace>
kubectl describe pod <pod-name> -n <namespace>
kubectl get events -n <namespace> --sort-by=.metadata.creationTimestamp
kubectl get secrets -n <namespace>

What to check:

Possible fixes:


Scenario 7: TLS certificate issue

Commands:

kubectl get secrets -n <namespace>
kubectl describe secret <tls-secret-name> -n <namespace>
kubectl get secret <tls-secret-name> -n <namespace> -o jsonpath='{.data.tls\.crt}' | base64 -d > tls.crt
openssl x509 -in tls.crt -text -noout
openssl x509 -in tls.crt -noout -enddate
kubectl describe ingress <ingress-name> -n <namespace>

What to check:

Possible fixes: