Part 5: Troubleshooting - Cumulative Quiz
30% of CKA Exam | 35 Questions | Target: 85%+
This quiz covers all troubleshooting topics from Part 5. Test yourself before moving to mock exams.
Instructions
- Answer each question before revealing the solution
- Track your score: ___/35
- Review any topics where you score below 80%
- Retake after reviewing weak areas
Section 1: Troubleshooting Methodology (5 questions)
Q1: First Steps
A user reports “the application isn’t working.” What’s your first troubleshooting action?
Answer
Identify the symptom specifically. Ask clarifying questions:
- Is the pod running? (`k get pods`)
- Is the service accessible? (`k get svc`, `k get endpoints`)
- What error are they seeing?
Then follow the framework: Identify → Isolate → Diagnose → Fix
Q2: Describe vs Logs
Why should you check `kubectl describe pod` before `kubectl logs`?
Answer
The Events section in describe often reveals the problem immediately without needing logs:
- Scheduling failures
- Image pull errors
- Volume mount issues
- Configuration errors
Logs are useful for application-level issues, but many problems are caught at the Kubernetes level first.
Q3: Event Retention
You’re investigating an issue that happened 3 hours ago. Events show nothing. Why?
Answer
Events expire after 1 hour by default. The evidence is gone. This is why it’s important to:
- Check events immediately after incidents
- Have a log aggregation solution for historical data
- Note event messages when you see them
Q4: Exit Codes
Container exit code is 137. What does this indicate?
Answer
Exit code 137 = 128 + 9 (SIGKILL). Usually means:
- OOMKilled - Container exceeded memory limit
- Process was killed by the system
Check: `k describe pod <pod> | grep -i oom` and verify memory limits.
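The 128-plus-signal arithmetic generalizes to other exit codes. A small shell sketch (the `decode_exit` helper is hypothetical, for illustration only, not a kubectl feature):

```shell
# Exit codes above 128 mean the process died from signal (code - 128).
# decode_exit is a hypothetical helper for illustration.
decode_exit() {
  code=$1
  if [ "$code" -gt 128 ]; then
    sig=$((code - 128))
    echo "exit $code = 128 + $sig (SIG$(kill -l $sig))"
  else
    echo "exit $code = application exit status"
  fi
}

decode_exit 137   # exit 137 = 128 + 9 (SIGKILL), typically OOMKilled
decode_exit 143   # exit 143 = 128 + 15 (SIGTERM), graceful termination
```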
Q5: Troubleshooting Order
List the correct troubleshooting order for a pod stuck in Pending:
Answer
1. `k describe pod <pod>` and check the Events section for scheduling messages
2. Check node availability: `k get nodes`
3. Check node resources: `k describe nodes | grep -A 5 "Allocated resources"`
4. Check taints: `k get nodes -o custom-columns='NAME:.metadata.name,TAINTS:.spec.taints[*].key'`
5. Check the pod’s nodeSelector/affinity: `k get pod <pod> -o yaml`
Section 2: Application Failures (6 questions)
Q6: CrashLoopBackOff
Pod is in CrashLoopBackOff. What’s the maximum backoff time between restarts?
Answer
5 minutes (300 seconds)
Backoff doubles: 10s → 20s → 40s → 80s → 160s → 300s (max)
After 10 minutes of running successfully, the counter resets.
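The progression above can be reproduced with a few lines of shell arithmetic (a sketch of the doubling-with-cap behavior, not kubelet’s actual code):

```shell
# CrashLoopBackOff delay: starts at 10s, doubles each restart, capped at 300s.
delay=10
sequence=""
for restart in 1 2 3 4 5 6; do
  sequence="$sequence ${delay}s"
  delay=$((delay * 2))
  if [ "$delay" -gt 300 ]; then delay=300; fi
done
echo "backoff:$sequence"   # backoff: 10s 20s 40s 80s 160s 300s
```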
Q7: Image Pull Failure
Pod shows ImagePullBackOff. List 3 possible causes.
Answer
- Image doesn’t exist - Wrong name or tag
- Registry authentication failed - Missing or wrong imagePullSecrets
- Registry unreachable - Network issues or firewall
- Rate limited - Docker Hub pull limits exceeded
- Private registry not configured - Missing registry credentials
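For the authentication case, a sketch of wiring up registry credentials (placeholder values; requires a live cluster and real credentials):

```shell
# Create a docker-registry secret (placeholder values)
k create secret docker-registry regcred \
  --docker-server=<registry> \
  --docker-username=<user> \
  --docker-password=<password>

# Reference it in the pod spec (spec.imagePullSecrets),
# or attach it to the namespace's default ServiceAccount:
k patch serviceaccount default \
  -p '{"imagePullSecrets":[{"name":"regcred"}]}'
```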
Q8: Missing ConfigMap
Pod is stuck in ContainerCreating. Events show “configmap ‘app-config’ not found”. Fix it.
Answer
```shell
# Create the missing ConfigMap
k create configmap app-config --from-literal=key=value

# Or if you have the data
k create configmap app-config --from-file=config.yaml

# Verify the pod starts
k get pods -w
```
Q9: Previous Logs
How do you view logs from a container that has crashed?
Answer
```shell
k logs <pod> --previous

# For a multi-container pod
k logs <pod> -c <container> --previous
```
This shows logs from the previous container instance before it died.
Q10: Deployment Rollback
Deployment rollout is stuck with new pods failing. What’s the fastest fix?
Answer
```shell
k rollout undo deployment/<name>
```
This immediately rolls back to the previous working version. Investigate the issue later.
Q11: OOMKilled
Pod keeps getting OOMKilled. How do you verify and fix?
Answer
```shell
# Verify
k describe pod <pod> | grep -i oom
k get pod <pod> -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'

# Check the current limit
k get pod <pod> -o jsonpath='{.spec.containers[0].resources.limits.memory}'

# Fix by increasing the limit
k patch deployment <name> -p '{"spec":{"template":{"spec":{"containers":[{"name":"<container>","resources":{"limits":{"memory":"512Mi"}}}]}}}}'
```
Section 3: Control Plane Failures (5 questions)
Q12: Static Pod Location
Where are control plane static pod manifests stored in kubeadm clusters?
Answer
`/etc/kubernetes/manifests/`

Contains:
- kube-apiserver.yaml
- kube-scheduler.yaml
- kube-controller-manager.yaml
- etcd.yaml
Q13: API Server Down
kubectl commands are timing out. You SSH to the control plane. What do you check first?
Answer
```shell
# Check if the API server container is running
crictl ps | grep kube-apiserver

# If not running
crictl ps -a | grep kube-apiserver       # See if it exists
journalctl -u kubelet | grep apiserver   # Check kubelet logs

# Check the manifest
cat /etc/kubernetes/manifests/kube-apiserver.yaml
```
Q14: Scheduler vs Controller Manager
New pods stay Pending but Deployments show correct replica count. Which component is failing?
Answer
Scheduler
- Controller manager creates ReplicaSets (working - correct replica count)
- Scheduler assigns pods to nodes (failing - pods stuck Pending)
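To confirm, a quick check of the scheduler itself (assuming a kubeadm cluster, where the static pod carries the `component=kube-scheduler` label; requires a live cluster):

```shell
k -n kube-system get pods -l component=kube-scheduler
k -n kube-system logs -l component=kube-scheduler --tail=50
```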
Q15: etcd Health
Write the command to check etcd cluster health.
Answer
```shell
ETCDCTL_API=3 etcdctl endpoint health \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key
```
Q16: Certificate Expiry
How do you check if Kubernetes certificates are expired?
Answer
```shell
kubeadm certs check-expiration
```
To renew:
```shell
kubeadm certs renew all
```
Section 4: Worker Node Failures (5 questions)
Q17: Node NotReady
A node shows NotReady status. What’s your SSH troubleshooting sequence?
Answer
```shell
ssh <node>

# 1. Check kubelet
sudo systemctl status kubelet
sudo journalctl -u kubelet -n 50

# 2. Check the container runtime
sudo systemctl status containerd
sudo crictl ps

# 3. Check network to the API server
curl -k https://<api-server>:6443/healthz

# 4. Check disk space
df -h
```
Q18: kubelet Not Running
How do you start kubelet and ensure it starts on boot?
Answer
```shell
sudo systemctl start kubelet
sudo systemctl enable kubelet
sudo systemctl status kubelet
```
Q19: crictl vs kubectl
When do you use crictl instead of kubectl?
Answer
Use crictl when:
- kubelet or API server is down
- kubectl won’t work
- Debugging at container runtime level
- Node is NotReady
crictl talks directly to the container runtime (e.g. containerd), bypassing the Kubernetes API.
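A few common crictl invocations when kubectl is unavailable (run on the node; requires the runtime to be up):

```shell
sudo crictl ps                    # running containers
sudo crictl ps -a                 # include exited containers
sudo crictl logs <container-id>   # container logs straight from the runtime
sudo crictl inspect <container-id>
sudo crictl pods                  # pod sandboxes known to the runtime
```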
Q20: Node Drain
What’s the command to safely drain a node for maintenance?
Answer
```shell
k drain <node> --ignore-daemonsets --delete-emptydir-data
```
After maintenance:
```shell
k uncordon <node>
```
Q21: MemoryPressure
Node shows MemoryPressure=True. What are the effects?
Answer
- No new pods scheduled to this node
- Existing pods may be evicted (starting with BestEffort, then Burstable)
- Node marked as having pressure in conditions
Fix: Free memory by evicting pods, killing processes, or adding capacity.
Section 5: Network Troubleshooting (6 questions)
Q22: DNS Resolution Test
How do you test DNS resolution from inside a pod?
Answer
```shell
# Test cluster DNS
k exec <pod> -- nslookup kubernetes

# Test service DNS
k exec <pod> -- nslookup <service-name>

# Test external DNS
k exec <pod> -- nslookup google.com

# Check DNS config
k exec <pod> -- cat /etc/resolv.conf
```
Q23: CoreDNS Troubleshooting
All DNS queries fail. What do you check?
Answer
```shell
# Check CoreDNS pods
k -n kube-system get pods -l k8s-app=kube-dns

# Check CoreDNS logs
k -n kube-system logs -l k8s-app=kube-dns

# Check the kube-dns service
k -n kube-system get svc kube-dns
k -n kube-system get endpoints kube-dns
```
Q24: Empty Endpoints
Service exists but `k get endpoints <svc>` shows `<none>`. Cause?
Answer
Selector mismatch - service selector doesn’t match any pod labels, or matching pods aren’t Ready.
```shell
# Check the selector
k get svc <svc> -o jsonpath='{.spec.selector}'

# Find matching pods
k get pods -l <selector>

# Check if the pods are Ready
k get pods -l <selector> -o wide
```
Q25: Cross-Node Communication
Pods on the same node communicate, but cross-node fails. What’s likely broken?
Answer
CNI plugin’s cross-node networking is not working:
- CNI pods not running on all nodes
- Network connectivity between nodes blocked
- Overlay network (VXLAN/IPinIP) misconfigured
- MTU mismatch
```shell
k -n kube-system get pods -o wide | grep <cni-name>
```
Q26: NetworkPolicy Default
You create a NetworkPolicy selecting pods with only ingress rules. What happens to egress?
Answer
It depends on policyTypes:
- If `policyTypes: [Ingress]` only → egress is unrestricted
- If `policyTypes: [Ingress, Egress]` with no egress rules → all egress is denied
NetworkPolicies only affect the traffic types listed in policyTypes.
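A minimal sketch of the first case: only Ingress is listed, so egress stays unrestricted (names and labels are placeholders):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-only     # placeholder name
spec:
  podSelector:
    matchLabels:
      app: api                  # placeholder label
  policyTypes:
  - Ingress                     # Egress not listed, so egress is unaffected
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
```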
Q27: CNI Troubleshooting
Pods stuck in ContainerCreating with “network not ready”. What do you check?
Answer
```shell
# Check CNI pods
k -n kube-system get pods | grep -E "calico|flannel|weave|cilium"

# Check the CNI configuration on the node
ls /etc/cni/net.d/

# Check the CNI binaries
ls /opt/cni/bin/

# Check CNI pod logs
k -n kube-system logs <cni-pod>
```
Section 6: Service Troubleshooting (4 questions)
Q28: Port vs TargetPort
Service has `port: 80`, `targetPort: 8080`. Container listens on 80. Will it work?
Answer
No. Traffic arrives at service port 80 but is forwarded to pod port 8080, where nothing is listening.
Fix: Change `targetPort: 80` or make the container listen on 8080.
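A sketch of a matching pair, assuming the container listens on 80 (names and labels are placeholders):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web                # placeholder name
spec:
  selector:
    app: web               # placeholder label
  ports:
  - port: 80               # port clients connect to on the Service
    targetPort: 80         # must match the port the container listens on
```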
Q29: NodePort Not Working
NodePort works from inside the cluster but not externally. What’s wrong?
Answer
Firewall or security group blocking the port externally:
- Node iptables
- Cloud security groups
- Network ACLs
NodePort must be open on all nodes from external network.
Q30: LoadBalancer Pending
LoadBalancer service stays `<pending>` for EXTERNAL-IP. Why?
Answer
No cloud controller or MetalLB:
- Cloud controller manager not installed
- Wrong cloud credentials
- No LoadBalancer support (bare metal without MetalLB)
```shell
k -n kube-system get pods | grep cloud-controller
k get events --field-selector involvedObject.name=<svc>
```
Q31: kube-proxy
All services stop working on a node. What’s likely the issue?
Answer
kube-proxy not running or misconfigured on that node:
```shell
k -n kube-system get pods -l k8s-app=kube-proxy -o wide
k -n kube-system logs -l k8s-app=kube-proxy

# Check iptables rules
sudo iptables -t nat -L KUBE-SERVICES | head
```
Section 7: Logging & Monitoring (4 questions)
Q32: Previous Container Logs
When do you use the `--previous` flag with `kubectl logs`?
Answer
When container has crashed and restarted (CrashLoopBackOff). Shows logs from the previous instance before it died.
```shell
k logs <pod> --previous
```
Q33: Metrics Server
`kubectl top pods` returns “metrics not available”. How do you fix it?
Answer
Install Metrics Server:
```shell
# Check if installed
k -n kube-system get pods | grep metrics-server

# If not, install it
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
```
Q34: Log Locations
Where are container logs stored on a node?
Answer
`/var/log/containers/<pod>_<namespace>_<container>-<id>.log`

These are symlinks managed by the container runtime. kubelet handles log rotation.
Q35: kubelet Logs
How do you view kubelet logs on a node?
Answer
```shell
# SSH to the node
ssh <node>

# View logs
journalctl -u kubelet

# Follow logs
journalctl -u kubelet -f

# Recent errors
journalctl -u kubelet --since "10 minutes ago" | grep -i error
```
Scoring
| Score | Assessment |
|---|---|
| 32-35 (90%+) | Excellent - Ready for troubleshooting questions |
| 28-31 (80-89%) | Good - Review missed topics |
| 24-27 (70-79%) | Fair - Need more practice |
| <24 (<70%) | Review Part 5 modules thoroughly |
Your Score: ___/35 = ___%
Topic Review Guide
If you scored low on specific sections:
| Section | Review Module |
|---|---|
| Methodology | 5.1 |
| Application Failures | 5.2 |
| Control Plane | 5.3 |
| Worker Nodes | 5.4 |
| Network | 5.5 |
| Services | 5.6 |
| Logging | 5.7 |
Next Steps
With Part 5 complete, you’ve covered:
- Part 0: Environment (5 modules)
- Part 1: Cluster Architecture (7 modules) - 25% of exam
- Part 2: Workloads & Scheduling (7 modules) - 15% of exam
- Part 3: Services & Networking (7 modules) - 20% of exam
- Part 4: Storage (5 modules) - 10% of exam
- Part 5: Troubleshooting (7 modules) - 30% of exam
Total: 38 modules covering 100% of CKA exam domains
Continue to Part 6: Mock Exams for timed practice under exam conditions.