# Module 4.3: Resource Requirements and Limits

- **Complexity:** [MEDIUM] - Critical for production, affects scheduling
- **Time to Complete:** 35-45 minutes
- **Prerequisites:** Module 1.1 (Pods), understanding of CPU and memory concepts
## Learning Outcomes

After completing this module, you will be able to:
- Configure resource requests and limits for CPU and memory in pod specifications
- Diagnose OOMKilled and CPU throttling issues by correlating limits with observed behavior
- Design resource allocations that balance performance, cost, and scheduling reliability
- Explain how requests affect scheduling and limits affect runtime enforcement
## Why This Module Matters

Resource requests and limits control how much CPU and memory your containers can use. Without them, a single container could consume all node resources, starving other pods. Proper resource management is essential for cluster stability.
The CKAD exam tests:
- Setting requests and limits
- Understanding the difference between them
- What happens when limits are exceeded
- LimitRanges and ResourceQuotas
### The Apartment Lease Analogy
Resource requests are like a guaranteed parking spot—you’re assured that space. Limits are like the building’s max occupancy—you can use more space temporarily, but there’s a hard cap. If you exceed it (memory), you get evicted (OOMKilled). If the building is full (node), new tenants (pods) wait until space opens.
## Requests vs Limits

### Definitions
| Term | Meaning | When Enforced |
|---|---|---|
| Request | Guaranteed minimum resources | Scheduling time |
| Limit | Maximum allowed resources | Runtime |
### How They Work

```text
┌─────────────────────────────────────────────────────────────┐
│              Resource Request vs Limit                      │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  Memory:                                                    │
│  ├── Request: 256Mi (guaranteed, used for scheduling)       │
│  ├── Actual usage can vary between 0 and Limit              │
│  └── Limit: 512Mi (hard cap, exceeding = OOMKill)           │
│                                                             │
│  CPU:                                                       │
│  ├── Request: 100m (guaranteed, used for scheduling)        │
│  ├── Can burst above request if node has spare capacity     │
│  └── Limit: 500m (throttled if exceeded, NOT killed)        │
│                                                             │
│  ┌────────────────────────────────────────────────────┐     │
│  │  0        Request      Actual        Limit         │     │
│  │  |           |            |            |           │     │
│  │  ├───────────┼────────────┼────────────┤           │     │
│  │  │ guaranteed│  burstable │    max     │           │     │
│  │  └───────────┴────────────┴────────────┘           │     │
│  └────────────────────────────────────────────────────┘     │
└─────────────────────────────────────────────────────────────┘
```

## Setting Resources
### Basic Syntax
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: resource-demo
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:
        memory: "256Mi"
        cpu: "100m"
      limits:
        memory: "512Mi"
        cpu: "500m"
```

**CPU:**
| Value | Meaning |
|---|---|
| 1 | 1 CPU core |
| 1000m | 1000 millicores = 1 core |
| 500m | 0.5 cores |
| 100m | 0.1 cores (10%) |
**Memory:**
| Value | Meaning |
|---|---|
| 128Mi | 128 mebibytes (1024-based) |
| 1Gi | 1 gibibyte = 1024 Mi |
| 128M | 128 megabytes (1000-based) |
| 1G | 1 gigabyte = 1000 M |
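The binary vs decimal distinction is real money at scale: `128Mi` is roughly 4.9% larger than `128M`. A quick check with plain shell arithmetic (no cluster needed):

```shell
# Mi is 1024-based (mebibytes); M is 1000-based (megabytes)
mi_bytes=$((128 * 1024 * 1024))   # 128Mi
m_bytes=$((128 * 1000 * 1000))    # 128M
echo "128Mi = ${mi_bytes} bytes"  # 134217728
echo "128M  = ${m_bytes} bytes"   # 128000000
```

Use the `Mi`/`Gi` suffixes consistently in manifests; mixing the two families is a common source of "why is my limit slightly smaller than I expected" confusion.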
## What Happens at Limits

**Pause and predict:** A pod has `requests.cpu: 100m` and `limits.cpu: 500m`. The node has 1 CPU core available. Can this pod use 500m? What if three other pods on the same node also have `limits.cpu: 500m`?
### Memory Limit Exceeded

Container uses > limit → OOMKilled → Container restarts

```shell
# Check if pod was OOMKilled
k describe pod my-pod | grep -A5 "Last State"
k get pod my-pod -o jsonpath='{.status.containerStatuses[0].lastState}'
```

### CPU Limit Exceeded
Container uses > limit → Throttled (slowed down, NOT killed)

CPU throttling is invisible to the container—it just runs slower.
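Invisible to the application, but not to the kernel: on a cgroup v2 node, throttling shows up in the container's `cpu.stat` (readable via something like `k exec my-pod -- cat /sys/fs/cgroup/cpu.stat`). A sketch of pulling the throttle counter out of that output — the sample values below are made up for illustration:

```shell
# Sample cpu.stat contents (cgroup v2); real values come from reading
# /sys/fs/cgroup/cpu.stat inside the throttled container
cpu_stat='usage_usec 4821000
nr_periods 1200
nr_throttled 340
throttled_usec 912000'

# nr_throttled > 0 means the container hit its CPU limit in that many
# scheduler periods; throttled_usec is total time spent throttled
nr_throttled=$(echo "$cpu_stat" | awk '/^nr_throttled/ {print $2}')
echo "throttled periods: ${nr_throttled}"
```

If an app is mysteriously slow but never restarts, a climbing `nr_throttled` is the usual smoking gun.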
## QoS Classes

Kubernetes assigns Quality of Service classes based on resource settings:
| QoS Class | Condition | Eviction Priority |
|---|---|---|
| Guaranteed | Requests = Limits for all containers | Last (protected) |
| Burstable | At least one request or limit set, but Guaranteed criteria not met | Middle |
| BestEffort | No requests or limits set | First (evicted first) |
**Stop and think:** A pod has requests but no limits set. Which QoS class will it receive — Guaranteed, Burstable, or BestEffort? What about a pod with limits but no requests?
### Guaranteed Example

```yaml
resources:
  requests:
    memory: "256Mi"
    cpu: "100m"
  limits:
    memory: "256Mi"  # Same as request
    cpu: "100m"      # Same as request
```

### Burstable Example
```yaml
resources:
  requests:
    memory: "256Mi"
    cpu: "100m"
  limits:
    memory: "512Mi"  # Higher than request
    cpu: "500m"      # Higher than request
```

### BestEffort Example
```yaml
resources: {}  # No resources defined
```

## Scheduling Impact
### Pod Won’t Schedule
If no node has enough available resources (capacity - allocated requests):
```shell
# Check why pod is Pending
k describe pod my-pod

# Events will show:
# 0/3 nodes are available: 3 Insufficient cpu.
# or
# 0/3 nodes are available: 3 Insufficient memory.
```

### Check Node Capacity
```shell
# Node capacity and allocatable
k describe node NODE_NAME | grep -A5 Capacity
k describe node NODE_NAME | grep -A5 Allocatable

# Already allocated
k describe node NODE_NAME | grep -A10 "Allocated resources"
```
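The scheduler's fit test is plain arithmetic over requests (never actual usage): a pod fits only if its requests are at most the node's allocatable minus the sum of requests already placed there. A toy check with made-up numbers:

```shell
# Hypothetical node math, all values in millicores
allocatable=4000        # 4-core node
already_requested=3600  # sum of requests.cpu from pods already scheduled
pod_request=500         # incoming pod's requests.cpu

available=$((allocatable - already_requested))
if [ "$pod_request" -le "$available" ]; then
  echo "pod fits (${available}m free)"
else
  echo "Insufficient cpu (${available}m free, ${pod_request}m requested)"
fi
```

Here only 400m is schedulable, so the 500m pod stays Pending even if the node's real CPU usage is near zero.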
## LimitRange

Namespace-level defaults and constraints:
```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: cpu-memory-limits
spec:
  limits:
  - default:            # Default limits if not specified
      cpu: "500m"
      memory: "512Mi"
    defaultRequest:     # Default requests if not specified
      cpu: "100m"
      memory: "256Mi"
    max:                # Maximum allowed
      cpu: "2"
      memory: "2Gi"
    min:                # Minimum allowed
      cpu: "50m"
      memory: "64Mi"
    type: Container
```

```shell
# View LimitRange
k get limitrange
k describe limitrange cpu-memory-limits
```

## ResourceQuota
Namespace-level total resource limits:
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
spec:
  hard:
    requests.cpu: "4"
    requests.memory: "8Gi"
    limits.cpu: "8"
    limits.memory: "16Gi"
    pods: "10"
```

```shell
# View quota usage
k get resourcequota
k describe resourcequota compute-quota
```
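A ResourceQuota is enforced at admission time by summing across the namespace: a new pod is rejected if its requests would push the namespace total past the hard cap. A toy version of that check, with hypothetical usage numbers:

```shell
# Hypothetical namespace under a requests.cpu hard cap of "4" (= 4000m)
quota_cpu_m=4000
used_cpu_m=3700     # sum of requests.cpu across existing pods
new_pod_cpu_m=500   # incoming pod's requests.cpu

if [ $((used_cpu_m + new_pod_cpu_m)) -le "$quota_cpu_m" ]; then
  echo "admitted"
else
  # kubectl reports this as an "exceeded quota" admission error
  echo "rejected: would use $((used_cpu_m + new_pod_cpu_m))m of ${quota_cpu_m}m"
fi
```

Note the rejection happens at pod creation, not at runtime — `k describe resourcequota` shows the running used-vs-hard totals.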
## Quick Reference

```yaml
# Set resources in pod spec
resources:
  requests:
    cpu: "100m"
    memory: "256Mi"
  limits:
    cpu: "500m"
    memory: "512Mi"
```
```shell
# Check pod resources
k get pod POD -o jsonpath='{.spec.containers[*].resources}'

# Check node capacity
k describe node NODE | grep -A10 "Allocated"

# Check QoS class
k get pod POD -o jsonpath='{.status.qosClass}'
```

## Did You Know?
- **CPU is compressible, memory is not.** If you exceed CPU limits, you’re throttled. If you exceed memory limits, you’re killed.
- **Requests affect scheduling, limits affect runtime.** A pod with 1Gi memory request won’t schedule on a node with only 512Mi available, even if the container only uses 100Mi.
- **Kubernetes doesn’t prevent memory overcommit.** If all pods burst to their limits simultaneously, the node runs out of memory and starts killing pods.
- The `cpu: 0.1` syntax is equivalent to `cpu: 100m` (100 millicores).
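The overcommit point is easy to see with arithmetic. Hypothetical node: 8Gi allocatable, ten pods each requesting 512Mi with a 1Gi limit — scheduling only checks requests, so all ten land on the node even though their combined limits exceed it:

```shell
# Hypothetical overcommit scenario, all values in Mi
allocatable_mi=8192
pods=10
request_mi=512
limit_mi=1024

total_requests=$((pods * request_mi))  # 5120Mi <= 8192Mi: all pods schedule
total_limits=$((pods * limit_mi))      # 10240Mi > 8192Mi: overcommitted
echo "requested: ${total_requests}Mi, worst-case limits: ${total_limits}Mi"
```

If every pod bursts to its limit at once, the node hits memory pressure and the kubelet starts evicting (BestEffort first, per the QoS table above).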
## Common Mistakes

| Mistake | Why It Hurts | Solution |
|---|---|---|
| No resources set | BestEffort pods evicted first | Always set requests |
| Request > Limit | Invalid, rejected | Request must be ≤ Limit |
| Memory too low | OOMKilled constantly | Profile app, increase limits |
| CPU too low | App runs slowly | Monitor with `k top`, adjust |
| Same as node capacity | No room for system pods | Leave headroom |
## Scenario Questions

- A pod keeps getting OOMKilled — you see `Last State: Terminated, Reason: OOMKilled` in `kubectl describe`. The pod has `limits.memory: 128Mi`. The developer says “but the app only uses 80MB.” What is likely happening and how do you fix it?

  **Answer:** The application likely uses more memory than the developer thinks. Memory usage includes the runtime overhead (JVM heap, Go GC, Python interpreter), shared libraries, and any temporary allocations. The developer might be measuring only heap or RSS, not the full container memory. Use `kubectl top pod` to see actual usage approaching the limit. The fix is to increase the memory limit based on observed peak usage (with ~25% headroom), or profile the application to find memory leaks. Also check if the container runs multiple processes — each contributes to the container’s memory total. The 128Mi limit might be appropriate for the app code but not for the runtime + app combined.
- A pod is stuck in Pending state. `kubectl describe` shows: “0/3 nodes are available: 3 Insufficient cpu.” The pod requests 2 CPU cores. Each node has 4 cores but already runs several pods. What are your options to get this pod scheduled?

  **Answer:** The scheduler can’t find a node where allocatable CPU minus already-requested CPU is >= 2 cores. Options: (1) Reduce the pod’s CPU request if the application doesn’t truly need 2 cores — check actual usage with `kubectl top` on similar pods. (2) Scale down or delete other pods to free up capacity. (3) Add more nodes to the cluster. (4) Check if other pods are over-requesting — their requests might be higher than actual usage, wasting schedulable capacity. Run `kubectl describe node` on each node and look at “Allocated resources” to see where CPU is committed. Requests affect scheduling, not actual usage, so over-requesting is a common cause of scheduling failures.
- A deployment runs 5 replicas with no resource requests or limits set. During a node memory pressure event, all 5 pods are evicted before pods from other deployments. Why were these pods targeted first?

  **Answer:** Pods without resource requests or limits receive the BestEffort QoS class, which has the lowest priority during eviction. When a node runs low on memory, the kubelet evicts pods in QoS order: BestEffort first, then Burstable, then Guaranteed last. The other deployments likely had requests and/or limits set, giving them Burstable or Guaranteed QoS class. The fix is to always set at least resource requests on production pods. Setting requests equal to limits gives Guaranteed QoS (highest protection), while having requests lower than limits gives Burstable QoS (middle tier).
- A namespace has a LimitRange with `default.cpu: 200m` and `default.memory: 256Mi`. A developer creates a pod without specifying any resources. They later notice the pod has resource limits they didn’t set. What happened, and how does this interact with ResourceQuota?

  **Answer:** LimitRange automatically injects default resource requests and limits into containers that don’t specify them. The developer’s pod received `limits.cpu: 200m` and `limits.memory: 256Mi` from the LimitRange defaults. If a ResourceQuota also exists in the namespace, every pod must have resource requests (so the quota can track usage). The LimitRange defaults ensure pods aren’t rejected for missing resource specifications when a ResourceQuota is active. Check with `kubectl get pod -o jsonpath='{.spec.containers[0].resources}'` to see the injected values, and `kubectl describe limitrange` to see the namespace defaults.
## Hands-On Exercise

**Task:** Configure and observe resource behavior.
### Part 1: Basic Resources
```shell
cat << 'EOF' | k apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: resource-demo
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:
        memory: "64Mi"
        cpu: "50m"
      limits:
        memory: "128Mi"
        cpu: "100m"
EOF

# Check QoS class
k get pod resource-demo -o jsonpath='{.status.qosClass}'
echo

# Check resources
k get pod resource-demo -o jsonpath='{.spec.containers[0].resources}'
```

### Part 2: OOMKill Demo
```shell
cat << 'EOF' | k apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: memory-hog
spec:
  containers:
  - name: app
    image: polinux/stress
    command: ["stress"]
    args: ["--vm", "1", "--vm-bytes", "200M", "--vm-hang", "1"]
    resources:
      limits:
        memory: "100Mi"
EOF

# Watch it get OOMKilled
k get pod memory-hog -w

# Check reason
k describe pod memory-hog | grep -A3 "Last State"
```

**Cleanup:**
```shell
k delete pod resource-demo memory-hog
```

## Practice Drills
### Drill 1: Basic Resources (Target: 2 minutes)

```shell
cat << 'EOF' | k apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: drill1
spec:
  containers:
  - name: nginx
    image: nginx
    resources:
      requests:
        cpu: "100m"
        memory: "128Mi"
      limits:
        cpu: "200m"
        memory: "256Mi"
EOF
```
```shell
k get pod drill1 -o jsonpath='{.spec.containers[0].resources}'
echo
k delete pod drill1
```

### Drill 2: Check QoS Class (Target: 2 minutes)

```shell
# Guaranteed (requests = limits)
cat << 'EOF' | k apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: drill2
spec:
  containers:
  - name: nginx
    image: nginx
    resources:
      requests:
        cpu: "100m"
        memory: "128Mi"
      limits:
        cpu: "100m"
        memory: "128Mi"
EOF
```
```shell
k get pod drill2 -o jsonpath='{.status.qosClass}'
echo
k delete pod drill2
```

### Drill 3: Generate Pod with Resources (Target: 2 minutes)

```shell
# Use --dry-run to generate, then add resources
k run drill3 --image=nginx --dry-run=client -o yaml > /tmp/drill3.yaml

# Edit to add resources (in exam, use vim)
cat << 'EOF' | k apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: drill3
spec:
  containers:
  - name: drill3
    image: nginx
    resources:
      requests:
        cpu: 50m
        memory: 64Mi
      limits:
        cpu: 100m
        memory: 128Mi
EOF
```
```shell
k get pod drill3 -o yaml | grep -A8 resources
k delete pod drill3
```

### Drill 4: Deployment with Resources (Target: 3 minutes)

```shell
cat << 'EOF' | k apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: drill4
spec:
  replicas: 2
  selector:
    matchLabels:
      app: drill4
  template:
    metadata:
      labels:
        app: drill4
    spec:
      containers:
      - name: nginx
        image: nginx
        resources:
          requests:
            cpu: "50m"
            memory: "64Mi"
          limits:
            cpu: "100m"
            memory: "128Mi"
EOF
```
```shell
k get pods -l app=drill4
k delete deploy drill4
```

### Drill 5: Check Node Resources (Target: 2 minutes)

```shell
# Get node name
NODE=$(k get nodes -o jsonpath='{.items[0].metadata.name}')

# Check capacity
k describe node $NODE | grep -A5 "Capacity:"

# Check allocatable
k describe node $NODE | grep -A5 "Allocatable:"

# Check allocated
k describe node $NODE | grep -A10 "Allocated resources:"
```

### Drill 6: LimitRange (Target: 4 minutes)
```shell
# Create namespace with LimitRange
k create ns drill6
```
```shell
cat << 'EOF' | k apply -n drill6 -f -
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
spec:
  limits:
  - default:
      cpu: "200m"
      memory: "256Mi"
    defaultRequest:
      cpu: "100m"
      memory: "128Mi"
    type: Container
EOF

# Create pod without resources
k run drill6-pod --image=nginx -n drill6

# Check defaults were applied
k get pod drill6-pod -n drill6 -o jsonpath='{.spec.containers[0].resources}'
echo

# Cleanup
k delete ns drill6
```

## Next Module
Module 4.4: SecurityContexts - Configure pod and container security settings.