Module 2.4: Deployment Strategies
Complexity: MEDIUM (conceptual understanding with practical implementation)
Time to Complete: 40-50 minutes
Prerequisites: Module 2.1 (Deployments), understanding of Services
Learning Outcomes
After completing this module, you will be able to:
- Implement blue/green and canary deployments using Kubernetes-native resources
- Compare rolling update, blue/green, and canary strategies with their trade-offs
- Design a deployment strategy that meets availability and rollback requirements
- Evaluate deployment health during a rollout and decide when to proceed or rollback
Why This Module Matters
How you deploy new versions matters. A bad deployment strategy can cause downtime, data corruption, or user-facing errors. The CKAD expects you to understand different deployment strategies and when to use each.
You’ll face questions like:
- Implement a blue/green deployment
- Set up a canary release
- Configure rolling update parameters
- Choose the appropriate strategy for a scenario
The Restaurant Menu Analogy
Rolling updates are like gradually replacing menu items—customers ordering at different times might get slightly different menus. Blue/green is like having two complete kitchens—you switch all customers to the new kitchen at once. Canary releases are like giving the new dish to 10% of customers first—if they like it, everyone gets it.
Strategy Overview
Comparison
| Strategy | Downtime | Rollback | Resource Cost | Risk |
|---|---|---|---|---|
| Rolling Update | None | Slow (gradual) | Low | Medium |
| Recreate | Yes | Fast (redeploy old) | Low | High |
| Blue/Green | None | Instant | 2x resources | Low |
| Canary | None | Instant | Low-Medium | Very Low |
When to Use Each
| Strategy | Best For |
|---|---|
| Rolling Update | Most applications, default choice |
| Recreate | Apps that can’t run multiple versions |
| Blue/Green | Critical apps needing instant rollback |
| Canary | Risk-averse deployments, testing with real traffic |
Pause and predict: Before reading the details, consider this scenario: your application has 4 replicas and you need to update to a new version. Rank the four strategies (rolling update, recreate, blue/green, canary) by resource cost during the transition. Which one needs the most extra pods?
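One way to check your ranking afterwards: compute the peak pod count each strategy needs while moving 4 replicas to a new version. This is a rough model, and the numbers plugged in are assumptions matching examples later in this module (the 25% maxSurge default for rolling updates, a full parallel environment for blue/green, a single extra canary pod):

```python
import math

REPLICAS = 4

# Peak pods that exist at any moment during the transition, per strategy.
peak = {
    # Rolling update may exceed the desired count by maxSurge (default 25%, rounded up)
    "rolling": REPLICAS + math.ceil(REPLICAS * 0.25),
    # Recreate kills everything first, so it never exceeds the desired count
    "recreate": REPLICAS,
    # Blue/green runs two full environments side by side
    "blue_green": 2 * REPLICAS,
    # Canary adds a small number of new-version pods (1 here) next to stable
    "canary": REPLICAS + 1,
}

for name, pods in sorted(peak.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:>10}: {pods} pods at peak")
```

Blue/green comes out on top; rolling update and a one-pod canary tie in this model, and Recreate never needs extra pods at all.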
Rolling Update (Default)
Kubernetes gradually replaces old pods with new ones.
Configuration
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 4
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # Can exceed replicas by 1
      maxUnavailable: 1    # At most 1 unavailable
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: nginx
        image: nginx:1.20
```
Update Behavior
With replicas=4, maxSurge=1, maxUnavailable=1:
```
Start:  [v1] [v1] [v1] [v1]        (4 running)
Step 1: [v1] [v1] [v1] [--] [v2]   (3 old, 1 new starting)
Step 2: [v1] [v1] [--] [v2] [v2]   (2 old, 2 new)
Step 3: [v1] [--] [v2] [v2] [v2]   (1 old, 3 new)
Step 4: [v2] [v2] [v2] [v2]        (4 new, complete)
```
Trigger Rolling Update
```sh
# Update image
k set image deploy/web-app nginx=nginx:1.21

# Watch rollout
k rollout status deploy/web-app

# Check pods transitioning
k get pods -l app=web -w
```
Recreate Strategy
All existing pods are killed before new ones are created.
Configuration
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: database-app
spec:
  replicas: 1
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: database
  template:
    metadata:
      labels:
        app: database
    spec:
      containers:
      - name: postgres
        image: postgres:13
```
Update Behavior
```
Start:  [v1] [v1] [v1]
Step 1: [--] [--] [--]   (all old pods terminated)
Step 2: [v2] [v2] [v2]   (all new pods created)
```
When to Use
- Database applications with single-writer requirement
- Applications with filesystem locks
- Apps that can’t handle multiple versions accessing shared state
- Stateful applications without proper migration support
Blue/Green Deployment
Run two identical environments. Switch traffic instantly by updating the Service selector.
Implementation
Step 1: Deploy Blue (Current)
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-blue
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
      version: blue
  template:
    metadata:
      labels:
        app: myapp
        version: blue
    spec:
      containers:
      - name: app
        image: myapp:1.0
```
Step 2: Create Service (Points to Blue)
```yaml
apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  selector:
    app: myapp
    version: blue   # Points to blue
  ports:
  - port: 80
```
Step 3: Deploy Green (New Version)
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-green
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
      version: green
  template:
    metadata:
      labels:
        app: myapp
        version: green
    spec:
      containers:
      - name: app
        image: myapp:2.0
```
Step 4: Switch Traffic
```sh
# Switch service to green
k patch svc myapp -p '{"spec":{"selector":{"version":"green"}}}'

# Instant rollback if needed
k patch svc myapp -p '{"spec":{"selector":{"version":"blue"}}}'
```
Complete Blue/Green Script
```sh
# Deploy blue
k apply -f blue-deployment.yaml

# Create service pointing to blue
k apply -f service.yaml

# Test blue
k run test --image=busybox --rm -it --restart=Never -- wget -qO- http://myapp

# Deploy green (without traffic)
k apply -f green-deployment.yaml

# Test green directly (port-forward or separate service)
k port-forward deploy/app-green 8080:80 &
curl localhost:8080

# Switch traffic to green
k patch svc myapp -p '{"spec":{"selector":{"version":"green"}}}'

# If problems, instant rollback
k patch svc myapp -p '{"spec":{"selector":{"version":"blue"}}}'

# Once confirmed, remove blue
k delete deploy app-blue
```
Canary Deployment
Route a small percentage of traffic to the new version. Gradually increase it if successful.
Implementation with Multiple Deployments
Stable Deployment (90% traffic)
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-stable
spec:
  replicas: 9   # 90% of traffic
  selector:
    matchLabels:
      app: myapp
      track: stable
  template:
    metadata:
      labels:
        app: myapp
        track: stable
    spec:
      containers:
      - name: app
        image: myapp:1.0
```
Canary Deployment (10% traffic)
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-canary
spec:
  replicas: 1   # 10% of traffic
  selector:
    matchLabels:
      app: myapp
      track: canary
  template:
    metadata:
      labels:
        app: myapp
        track: canary
    spec:
      containers:
      - name: app
        image: myapp:2.0
```
Stop and think: In the canary setup below, the Service selector uses `app: myapp`, which matches BOTH the stable and canary pods. How does Kubernetes distribute traffic between them? Is it exactly 90/10, or approximately? What controls the ratio?
Service (Routes to Both)
```yaml
apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  selector:
    app: myapp   # Matches both stable and canary
  ports:
  - port: 80
```
Traffic Distribution
With 9 stable pods and 1 canary pod:
- ~90% traffic → stable (v1.0)
- ~10% traffic → canary (v2.0)
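The split is approximate, not exact: kube-proxy (in its default iptables mode) picks a backend more or less at random per connection, so the ratio is controlled purely by the relative pod counts and only converges over many requests. A toy simulation of that behavior, no cluster required:

```python
import random

random.seed(42)

# 9 stable pods, 1 canary pod behind the same Service
endpoints = ["stable"] * 9 + ["canary"] * 1
requests = 10_000

# Each request lands on a random endpoint, mimicking iptables-mode kube-proxy
hits = {"stable": 0, "canary": 0}
for _ in range(requests):
    hits[random.choice(endpoints)] += 1

canary_share = hits["canary"] / requests
print(f"canary received {canary_share:.1%} of traffic")  # close to, not exactly, 10%
```

Run it a few times with different seeds and the canary share hovers around 10% without ever being exactly 10% — which is why pod-count canaries are a rough instrument compared to a service mesh's percentage-based routing.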
Progressive Canary Rollout
```sh
# Start: 9 stable, 1 canary (10%)
k scale deploy app-canary --replicas=1
k scale deploy app-stable --replicas=9

# Increase canary to 25%
k scale deploy app-canary --replicas=3
k scale deploy app-stable --replicas=9

# Increase canary to 50%
k scale deploy app-canary --replicas=5
k scale deploy app-stable --replicas=5

# Full rollout (100% new version)
k scale deploy app-canary --replicas=10
k scale deploy app-stable --replicas=0

# Cleanup: remove the old stable deployment
# (Deployment names are immutable, so you cannot patch metadata.name;
# "renaming" canary to stable means recreating it under the stable name)
k delete deploy app-stable
```
Rolling Update Parameters Deep Dive
maxSurge
Maximum number of pods that can be created over the desired count. Percentage values are converted to a pod count by rounding up:
```yaml
rollingUpdate:
  maxSurge: 25%   # 25% extra pods (default)
  # or an absolute number: maxSurge: 2  (2 extra pods)
```
maxUnavailable
Maximum number of pods that can be unavailable during the update. Percentage values are converted to a pod count by rounding down:
```yaml
rollingUpdate:
  maxUnavailable: 25%   # 25% can be down (default)
  # or an absolute number: maxUnavailable: 0  (zero downtime)
```
Common Configurations
Section titled “Common Configurations”# Zero downtime (conservative)rollingUpdate: maxSurge: 1 maxUnavailable: 0
# Fast update (aggressive)rollingUpdate: maxSurge: 100% maxUnavailable: 50%
# Balanced (default)rollingUpdate: maxSurge: 25% maxUnavailable: 25%What would happen if: You deploy a new version using a rolling update, but you forgot to add a readiness probe. The new version takes 30 seconds to start accepting requests, but Kubernetes considers the pod “ready” immediately. What happens to user requests during those 30 seconds?
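To build intuition for the question above, here is a toy model (not real Kubernetes behavior, just an illustration of the failure window): a new pod needs 30 seconds to warm up, and we count how many requests fail depending on whether routing waits for readiness.

```python
# Toy model: a new pod warms up for WARMUP seconds; one request arrives per
# second. Without a readiness probe, traffic is routed immediately and every
# request during warmup fails; with a probe, traffic waits until the pod is ready.
WARMUP = 30
DURATION = 120  # seconds of traffic observed

def failed_requests(has_readiness_probe: bool) -> int:
    failures = 0
    for t in range(DURATION):
        routed = (t >= WARMUP) if has_readiness_probe else True
        if routed and t < WARMUP:  # pod received traffic before it could serve
            failures += 1
    return failures

print("without probe:", failed_requests(False))
print("with probe:   ", failed_requests(True))
```

Without the probe, every request in the 30-second warmup window errors out; with it, none do, because the Service only gets the pod's endpoint once the probe passes.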
Readiness Gates and Probes
Proper probes ensure smooth deployments.
Readiness Probe for Deployments
```yaml
spec:
  template:
    spec:
      containers:
      - name: app
        image: myapp
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5
```
Without readiness probes, Kubernetes considers pods ready immediately, so traffic might route to pods that aren’t fully initialized.
minReadySeconds
```yaml
spec:
  minReadySeconds: 10   # Pod must be ready 10s before considered available
```
This adds a buffer to catch early crashes.
Practical Exam Scenarios
Scenario 1: Configure Zero-Downtime Rolling Update
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp
spec:
  replicas: 4
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0   # Key: never reduce below desired
  selector:
    matchLabels:
      app: webapp
  template:
    metadata:
      labels:
        app: webapp
    spec:
      containers:
      - name: nginx
        image: nginx:1.20
        readinessProbe:   # Important for zero-downtime
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 2
          periodSeconds: 3
```
Scenario 2: Quick Blue/Green Switch
```sh
# Create blue deployment
k create deploy app-blue --image=nginx:1.20 --replicas=3
k label deploy app-blue version=blue

# Add version label to pod template
k patch deploy app-blue -p '{"spec":{"template":{"metadata":{"labels":{"version":"blue"}}}}}'

# Create service
k expose deploy app-blue --name=myapp --port=80 --selector=version=blue

# Deploy green
k create deploy app-green --image=nginx:1.21 --replicas=3
k patch deploy app-green -p '{"spec":{"template":{"metadata":{"labels":{"version":"green"}}}}}'

# Switch to green
k patch svc myapp -p '{"spec":{"selector":{"version":"green"}}}'
```
Did You Know?
- Kubernetes rolling updates are self-healing. If a new pod fails its readiness probe, the rollout pauses automatically, preventing a bad version from fully deploying.
- Blue/green deployments require 2x resources during the switch. This is their main downside but enables instant rollback.
- Canary deployments originated at Google. The term comes from “canary in a coal mine”: miners used canaries to detect toxic gases. If the canary died, miners knew to evacuate.
Common Mistakes
| Mistake | Why It Hurts | Solution |
|---|---|---|
| No readiness probe | Traffic to unready pods | Always add readiness probes |
| `maxUnavailable: 100%` | All pods killed at once | Keep at 25% or less |
| Wrong service selector for blue/green | Traffic doesn’t switch | Verify label matching |
| Not testing canary separately | Canary issues undetected | Test canary pods directly first |
| Forgetting to scale down old deployment | Resource waste | Scale down after successful switch |
- Your company’s payment processing application can’t handle two different versions running simultaneously because of database schema differences. You need to update it with minimal downtime. Which strategy do you choose, and what precaution should you take before starting?
Answer
Use `strategy: type: Recreate`. This terminates all old pods before creating new ones, ensuring only one version runs at a time. Before starting, warn stakeholders about the brief downtime window and schedule the update during low-traffic hours. Apply the database schema migration first (if needed), then update the Deployment. While `Recreate` causes downtime, it's the safest choice when two versions can't coexist. The alternative -- running a blue/green with database migrations -- is complex and risky for schema-dependent applications.
- You’ve deployed a blue/green setup: `app-blue` (v1) with 3 replicas is receiving all traffic via a Service. You deploy `app-green` (v2) with 3 replicas and switch the Service selector. Users immediately report errors. What’s the fastest way to recover, and what should you have done before switching?
Answer
Instant recovery: `kubectl patch svc myapp -p '{"spec":{"selector":{"version":"blue"}}}'` -- this switches traffic back to blue in seconds, which is the main advantage of blue/green. Before switching, you should have: (1) verified green pods are all Running and Ready; (2) tested green directly via port-forward or a temporary test Service; (3) run smoke tests against the green deployment. Blue/green's strength is instant rollback, but its weakness is that you don't catch issues until 100% of traffic hits the new version -- unlike canary, which exposes issues at a small percentage.
- Your SRE team wants to deploy a new recommendation engine version that might have performance issues under load. They want to expose it to only ~10% of real traffic first, monitor for 30 minutes, then gradually increase. Describe how you’d set this up using only Kubernetes-native resources (no Istio or service mesh).
Answer
Create two Deployments with a shared label: `app-stable` with 9 replicas (image v1) and `app-canary` with 1 replica (image v2). Both must have a common label like `app: recommender` in their pod templates. Create a Service with `selector: {app: recommender}` to route to both. Kubernetes distributes traffic roughly proportional to pod count -- ~90% stable, ~10% canary. After 30 minutes, if metrics look good, scale canary to 3 and stable to 7 (~30% canary). Continue until canary reaches 10 replicas and stable reaches 0. The limitation: traffic split is approximate and controlled by pod ratio, not precise percentage routing.
- A Deployment has `replicas: 6`, `maxSurge: 50%`, and `maxUnavailable: 0`. You trigger a rolling update, but after 3 new pods are created, the rollout stalls. All 3 new pods are in `Pending` state due to insufficient cluster resources. Meanwhile, the 6 old pods are still running. What’s the total pod count right now, and why can’t Kubernetes make progress?
Answer
There are 9 pods total: 6 old (running) + 3 new (pending). `maxSurge: 50%` of 6 = 3 extra pods allowed, so the surge limit is reached. `maxUnavailable: 0` means Kubernetes can't terminate any old pods until new ones are Ready. Since the new pods are Pending (not Ready), no old pods can be removed to free resources. This creates a deadlock: the cluster can't schedule new pods, and old pods can't be removed. Fix by either adding node capacity, setting `maxUnavailable: 1` to allow removing an old pod, or reducing the surge percentage. This demonstrates why `maxUnavailable: 0` is conservative but can deadlock on resource-constrained clusters.
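The arithmetic in the answer above follows Kubernetes' rounding rules: a percentage maxSurge rounds up when converted to a pod count, while a percentage maxUnavailable rounds down. A quick check of the numbers in this scenario:

```python
import math

replicas = 6
max_surge = math.ceil(replicas * 0.50)  # maxSurge: 50% rounds UP -> 3 extra pods
max_unavailable = 0                     # absolute value straight from the spec

pod_cap = replicas + max_surge                # up to 9 pods may exist at once
must_stay_ready = replicas - max_unavailable  # all 6 old pods must stay Ready

print(f"pod cap during rollout: {pod_cap}")
print(f"old pods that must remain available: {must_stay_ready}")
```

With the cap at 9 and all 6 old pods pinned as required-available, the 3 Pending pods exhaust the surge budget and nothing can be freed: the deadlock from the answer.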
Hands-On Exercise
Task: Implement all three deployment strategies.
Part 1: Rolling Update with Parameters
```sh
# Create deployment with custom rolling update settings
cat << 'EOF' | k apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rolling-demo
spec:
  replicas: 4
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: rolling
  template:
    metadata:
      labels:
        app: rolling
    spec:
      containers:
      - name: nginx
        image: nginx:1.20
EOF

# Update and watch (should see 5 pods max)
k set image deploy/rolling-demo nginx=nginx:1.21
k get pods -l app=rolling -w

# Cleanup
k delete deploy rolling-demo
```
Part 2: Blue/Green Deployment
```sh
# Blue deployment
cat << 'EOF' | k apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: blue
spec:
  replicas: 3
  selector:
    matchLabels:
      app: demo
      version: blue
  template:
    metadata:
      labels:
        app: demo
        version: blue
    spec:
      containers:
      - name: nginx
        image: nginx:1.20
EOF

# Service pointing to blue
cat << 'EOF' | k apply -f -
apiVersion: v1
kind: Service
metadata:
  name: demo-svc
spec:
  selector:
    app: demo
    version: blue
  ports:
  - port: 80
EOF

# Green deployment
cat << 'EOF' | k apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: green
spec:
  replicas: 3
  selector:
    matchLabels:
      app: demo
      version: green
  template:
    metadata:
      labels:
        app: demo
        version: green
    spec:
      containers:
      - name: nginx
        image: nginx:1.21
EOF

# Switch to green
k patch svc demo-svc -p '{"spec":{"selector":{"version":"green"}}}'

# Rollback to blue
k patch svc demo-svc -p '{"spec":{"selector":{"version":"blue"}}}'

# Cleanup
k delete deploy blue green
k delete svc demo-svc
```
Part 3: Canary Deployment
```sh
# Stable deployment (9 replicas)
cat << 'EOF' | k apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: stable
spec:
  replicas: 9
  selector:
    matchLabels:
      app: canary-demo
      track: stable
  template:
    metadata:
      labels:
        app: canary-demo
        track: stable
    spec:
      containers:
      - name: nginx
        image: nginx:1.20
EOF

# Canary deployment (1 replica = ~10%)
cat << 'EOF' | k apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: canary
spec:
  replicas: 1
  selector:
    matchLabels:
      app: canary-demo
      track: canary
  template:
    metadata:
      labels:
        app: canary-demo
        track: canary
    spec:
      containers:
      - name: nginx
        image: nginx:1.21
EOF

# Service routes to both
cat << 'EOF' | k apply -f -
apiVersion: v1
kind: Service
metadata:
  name: canary-svc
spec:
  selector:
    app: canary-demo
  ports:
  - port: 80
EOF

# Gradually increase canary
k scale deploy canary --replicas=3   # ~30% (3 of 10 pods)
k scale deploy stable --replicas=7

# Full rollout
k scale deploy canary --replicas=10
k scale deploy stable --replicas=0

# Cleanup
k delete deploy stable canary
k delete svc canary-svc
```
Practice Drills
Drill 1: Rolling Update Config (Target: 3 minutes)
```sh
# Create with specific rolling update settings
k create deploy drill1 --image=nginx:1.20 --replicas=4

# Patch strategy
k patch deploy drill1 -p '{"spec":{"strategy":{"type":"RollingUpdate","rollingUpdate":{"maxSurge":1,"maxUnavailable":0}}}}'

# Update and observe
k set image deploy/drill1 nginx=nginx:1.21
k rollout status deploy/drill1

# Cleanup
k delete deploy drill1
```
Drill 2: Recreate Strategy (Target: 2 minutes)
Section titled “Drill 2: Recreate Strategy (Target: 2 minutes)”# Create with recreate strategycat << 'EOF' | k apply -f -apiVersion: apps/v1kind: Deploymentmetadata: name: drill2spec: replicas: 3 strategy: type: Recreate selector: matchLabels: app: drill2 template: metadata: labels: app: drill2 spec: containers: - name: nginx image: nginx:1.20EOF
# Update (watch all pods terminate first)k set image deploy/drill2 nginx=nginx:1.21k get pods -l app=drill2 -w
# Cleanupk delete deploy drill2Drill 3: Blue/Green Switch (Target: 4 minutes)
Section titled “Drill 3: Blue/Green Switch (Target: 4 minutes)”# Create bluek create deploy blue --image=nginx:1.20 --replicas=2k patch deploy blue -p '{"spec":{"selector":{"matchLabels":{"version":"blue"}},"template":{"metadata":{"labels":{"version":"blue"}}}}}'
# Servicek expose deploy blue --name=app --port=80 --selector=version=blue
# Create greenk create deploy green --image=nginx:1.21 --replicas=2k patch deploy green -p '{"spec":{"selector":{"matchLabels":{"version":"green"}},"template":{"metadata":{"labels":{"version":"green"}}}}}'
# Switchk patch svc app -p '{"spec":{"selector":{"version":"green"}}}'
# Verifyk get ep app
# Cleanupk delete deploy blue greenk delete svc appDrill 4: Canary Percentage (Target: 3 minutes)
Section titled “Drill 4: Canary Percentage (Target: 3 minutes)”# 10% canaryk create deploy stable --image=nginx:1.20 --replicas=9k create deploy canary --image=nginx:1.21 --replicas=1
# Add common labelk patch deploy stable -p '{"spec":{"template":{"metadata":{"labels":{"app":"myapp"}}}}}'k patch deploy canary -p '{"spec":{"template":{"metadata":{"labels":{"app":"myapp"}}}}}'
# Service for bothk expose deploy stable --name=myapp --port=80 --selector=app=myapp
# Verify endpoints include bothk get ep myapp
# Cleanupk delete deploy stable canaryk delete svc myappDrill 5: Zero-Downtime Verification (Target: 3 minutes)
Section titled “Drill 5: Zero-Downtime Verification (Target: 3 minutes)”# Create deployment with readiness probecat << 'EOF' | k apply -f -apiVersion: apps/v1kind: Deploymentmetadata: name: drill5spec: replicas: 3 strategy: rollingUpdate: maxUnavailable: 0 selector: matchLabels: app: drill5 template: metadata: labels: app: drill5 spec: containers: - name: nginx image: nginx:1.20 readinessProbe: httpGet: path: / port: 80EOF
# Servicek expose deploy drill5 --port=80
# Update (zero downtime)k set image deploy/drill5 nginx=nginx:1.21k rollout status deploy/drill5
# Cleanupk delete deploy drill5k delete svc drill5Drill 6: Complete Deployment Strategy Scenario (Target: 6 minutes)
Scenario: Production deployment with canary testing.
```sh
# 1. Deploy stable version
k create deploy prod --image=nginx:1.20 --replicas=5

# 2. Expose service
k expose deploy prod --name=production --port=80

# 3. Create canary (1 of 6 pods, ~17% of traffic)
k create deploy canary --image=nginx:1.21 --replicas=1

# 4. Point service to both (set app to null in the Service patch;
#    a plain merge would keep app=prod and exclude the canary pods)
k patch deploy prod -p '{"spec":{"template":{"metadata":{"labels":{"release":"production"}}}}}'
k patch deploy canary -p '{"spec":{"template":{"metadata":{"labels":{"release":"production"}}}}}'
k patch svc production -p '{"spec":{"selector":{"app":null,"release":"production"}}}'

# 5. Test canary
k logs -l app=canary

# 6. Gradual rollout
k scale deploy canary --replicas=3
k scale deploy prod --replicas=3

# 7. Full rollout
k scale deploy canary --replicas=5
k scale deploy prod --replicas=0

# 8. Cleanup
k delete deploy prod canary
k delete svc production
```
Next Module
Part 2 Cumulative Quiz - Test your Application Deployment knowledge.