
Module 2.6: Scheduling

Hands-On Lab Available: K8s Cluster (advanced, 45 min), launched in Killercoda.

Complexity: [MEDIUM] - Critical exam topic

Time to Complete: 45-55 minutes

Prerequisites: Module 2.1 (Pods), Module 2.5 (Resource Management)


After this module, you will be able to:

  • Configure nodeSelector, node affinity, and pod affinity/anti-affinity rules
  • Use taints and tolerations to control which pods can run on specific nodes
  • Implement pod topology spread constraints for high availability across zones
  • Debug Pending pods by reading scheduler events and matching them to node constraints

By default, the scheduler places pods on any node with available resources. But in production, you need control:

  • Run database pods on nodes with SSDs
  • Keep certain pods apart for high availability
  • Spread workloads across availability zones
  • Reserve nodes for specific workloads

The CKA exam frequently tests scheduling constraints. You’ll need to use nodeSelector, affinity rules, and taints/tolerations.

The Event Planner Analogy

Think of scheduling like seating at a wedding. nodeSelector is “VIPs at Table 1.” Node affinity is “Prefer tables near the stage, but anywhere is fine.” Taints are reserved tables with “Staff Only” signs. Tolerations are staff badges that let you sit at reserved tables. Anti-affinity is “Don’t seat the exes at the same table.”



nodeSelector is the simplest way to constrain pods to specific nodes:

apiVersion: v1
kind: Pod
metadata:
  name: ssd-pod
spec:
  nodeSelector:
    disk: ssd # Only schedule on nodes with this label
  containers:
    - name: nginx
      image: nginx

# List node labels
kubectl get nodes --show-labels
# Label a node
kubectl label node worker-1 disk=ssd
# Remove a label
kubectl label node worker-1 disk-
# Overwrite a label
kubectl label node worker-1 disk=hdd --overwrite
Label                               Description
kubernetes.io/hostname              Node hostname
kubernetes.io/os                    Operating system (linux, windows)
kubernetes.io/arch                  Architecture (amd64, arm64)
topology.kubernetes.io/zone         Cloud availability zone
topology.kubernetes.io/region       Cloud region
node.kubernetes.io/instance-type    Instance type (cloud)

# Example: Schedule only on Linux nodes
spec:
  nodeSelector:
    kubernetes.io/os: linux

Did You Know?

You can combine multiple nodeSelector labels. The pod only schedules on nodes that match ALL labels (AND logic).
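
As a sketch of that AND logic (the disk and gpu label keys here are illustrative, not built-in labels), this pod only schedules on nodes carrying both labels:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: ssd-gpu-pod
spec:
  nodeSelector:
    disk: ssd   # node must have this label...
    gpu: "true" # ...AND this one (values are always strings)
  containers:
    - name: nginx
      image: nginx
```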


Pause and predict: You want a pod to run on SSD nodes but also accept NVMe nodes. With nodeSelector, you can only specify one value per key. How would you express “disk must be SSD OR NVMe” as a scheduling constraint?

Node affinity is more expressive than nodeSelector:

  • Soft preferences (“prefer but don’t require”)
  • Multiple match options (OR logic)
  • Operators (In, NotIn, Exists, DoesNotExist, Gt, Lt)
Type                                               Behavior
requiredDuringSchedulingIgnoredDuringExecution     Hard requirement (like nodeSelector)
preferredDuringSchedulingIgnoredDuringExecution    Soft preference


Key Point: “IgnoredDuringExecution” means if labels change after scheduling, the pod stays. There’s no rescheduling.

apiVersion: v1
kind: Pod
metadata:
  name: affinity-required
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: disk
                operator: In
                values:
                  - ssd
                  - nvme
  containers:
    - name: nginx
      image: nginx

apiVersion: v1
kind: Pod
metadata:
  name: affinity-preferred
spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 80 # Higher weight = stronger preference
          preference:
            matchExpressions:
              - key: disk
                operator: In
                values:
                  - ssd
        - weight: 20
          preference:
            matchExpressions:
              - key: zone
                operator: In
                values:
                  - us-west-1a
  containers:
    - name: nginx
      image: nginx

War Story: The Lopsided Cluster

A team used preferredDuringSchedulingIgnoredDuringExecution to attract all CI/CD builder pods to nodes with a builder=true label. Because it was only a soft preference, the cluster autoscaler didn’t provision new builder nodes when they got full; it just dumped the overflow pods onto general-purpose web nodes. The web applications were starved for CPU by the greedy builder pods. If a workload absolutely requires specific hardware isolation, use hard affinity or taints, not soft affinity.

Operator        Meaning
In              Label value is in set
NotIn           Label value not in set
Exists          Label exists (any value)
DoesNotExist    Label doesn't exist
Gt              Greater than (integer comparison)
Lt              Less than (integer comparison)

# Example: Node must have "gpu" label with any value
matchExpressions:
  - key: gpu
    operator: Exists

# Example: Node must NOT be in zone us-east-1c
matchExpressions:
  - key: topology.kubernetes.io/zone
    operator: NotIn
    values:
      - us-east-1c

Control pod placement relative to other pods:

  • Pod Affinity: “Schedule near pods with label X” (co-location)
  • Pod Anti-Affinity: “Don’t schedule near pods with label X” (spreading)

“Schedule this pod on the same node as pods with app=cache”:

apiVersion: v1
kind: Pod
metadata:
  name: web-pod
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              app: cache
          topologyKey: kubernetes.io/hostname # Same node
  containers:
    - name: web
      image: nginx

“Don’t schedule on nodes that already have app=web pods”:

apiVersion: v1
kind: Pod
metadata:
  name: web-pod
  labels:
    app: web
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              app: web
          topologyKey: kubernetes.io/hostname
  containers:
    - name: web
      image: nginx

War Story: The Scheduling Gridlock

In a large multi-tenant cluster, every team started adding required pod anti-affinity to ensure their microservices didn’t share nodes with each other. Eventually, to schedule a simple 5-replica deployment, the scheduler had to find 5 completely empty nodes because every node already contained a pod that repelled the new ones. The cluster was only 20% utilized on CPU and memory, but couldn’t schedule anything new. Over-constraining causes massive resource waste. Stick to soft constraints unless strictly necessary.

The topologyKey determines the “zone” for affinity:

topologyKey                      Meaning
kubernetes.io/hostname           Same node
topology.kubernetes.io/zone      Same availability zone
topology.kubernetes.io/region    Same region

┌────────────────────────────────────────────────────────────────┐
│ Anti-Affinity with Different topologyKeys │
│ │
│ topologyKey: kubernetes.io/hostname │
│ → Pods spread across nodes (one per node) │
│ │
│ Node1: [web-1] Node2: [web-2] Node3: [web-3] │
│ │
│ topologyKey: topology.kubernetes.io/zone │
│ → Pods spread across zones (one per zone) │
│ │
│ Zone-A Zone-B Zone-C │
│ [web-1] [web-2] [web-3] │
│ Node1,Node2 Node3,Node4 Node5,Node6 │
│ │
└────────────────────────────────────────────────────────────────┘

Exam Tip

For spreading replicas across nodes, use pod anti-affinity with topologyKey: kubernetes.io/hostname. For spreading across zones for HA, use topology.kubernetes.io/zone.
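
The zone variant differs from the earlier examples only in the topologyKey. A sketch of the anti-affinity stanza for zone-level spreading (the app=web label is illustrative):

```yaml
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: web
        topologyKey: topology.kubernetes.io/zone # at most one replica per zone
```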


Taints are applied to nodes and repel pods unless the pod has a matching toleration.

┌────────────────────────────────────────────────────────────────┐
│ Taints and Tolerations │
│ │
│ Node with taint: gpu=true:NoSchedule │
│ ┌─────────────────────────────────────────────┐ │
│ │ │ │
│ │ Regular Pod: ❌ Cannot schedule │ │
│ │ │ │
│ │ Pod with matching ✅ Can schedule │ │
│ │ toleration: │ │
│ │ │ │
│ └─────────────────────────────────────────────┘ │
│ │
└────────────────────────────────────────────────────────────────┘

Stop and think: An SRE needs to perform maintenance on a node. They want to prevent new pods from being scheduled there, but existing pods should keep running until they complete naturally. Which taint effect should they use — NoSchedule, PreferNoSchedule, or NoExecute? What if they need to evict existing pods too?

Effect              Behavior
NoSchedule          Pods won't be scheduled (existing pods stay)
PreferNoSchedule    Soft version - avoid but allow if necessary
NoExecute           Evict existing pods, prevent new scheduling

# Add taint to node
kubectl taint nodes worker-1 gpu=true:NoSchedule
# View taints
kubectl describe node worker-1 | grep Taints
# Remove taint (note the minus sign)
kubectl taint nodes worker-1 gpu=true:NoSchedule-
# Multiple taints
kubectl taint nodes worker-1 dedicated=ml:NoSchedule
kubectl taint nodes worker-1 gpu=nvidia:NoSchedule
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
spec:
  tolerations:
    - key: "gpu"
      operator: "Equal"
      value: "true"
      effect: "NoSchedule"
  containers:
    - name: cuda-app
      image: nvidia/cuda

Operator    Meaning
Equal       Key and value must match
Exists      Key exists (any value matches)

# Match specific value
tolerations:
  - key: "gpu"
    operator: "Equal"
    value: "nvidia"
    effect: "NoSchedule"

# Match any value for key
tolerations:
  - key: "gpu"
    operator: "Exists"
    effect: "NoSchedule"

# Tolerate all taints (wildcard)
tolerations:
  - operator: "Exists"
Use Case               Taint Example
GPU nodes              gpu=true:NoSchedule
Dedicated nodes        dedicated=team-a:NoSchedule
Control plane nodes    node-role.kubernetes.io/control-plane:NoSchedule
Draining nodes         node.kubernetes.io/unschedulable:NoSchedule

War Story: The Disappeared Pods

An SRE added a NoExecute taint for maintenance instead of NoSchedule. Existing pods were immediately evicted, causing a production outage. Know your taint effects: use NoSchedule to prevent new pods; use NoExecute only when you want to evict running pods.
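
A related safety valve: a NoExecute toleration can carry tolerationSeconds, which lets the pod stay on a newly tainted node for a grace period before eviction. A sketch (the maintenance key is illustrative):

```yaml
tolerations:
  - key: "maintenance"
    operator: "Equal"
    value: "true"
    effect: "NoExecute"
    tolerationSeconds: 300 # evicted 5 minutes after the taint appears
```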


Pause and predict: You have a 3-replica Deployment with pod anti-affinity using requiredDuringSchedulingIgnoredDuringExecution on kubernetes.io/hostname, but your cluster only has 2 nodes. What happens to the third replica?

Distribute pods evenly across failure domains:

apiVersion: v1
kind: Pod
metadata:
  name: spread-pod
  labels:
    app: web
spec:
  topologySpreadConstraints:
    - maxSkew: 1 # Max difference between zones
      topologyKey: topology.kubernetes.io/zone
      whenUnsatisfiable: DoNotSchedule # Hard requirement
      labelSelector:
        matchLabels:
          app: web
  containers:
    - name: nginx
      image: nginx

Parameter            Description
maxSkew              Maximum allowed difference in pod count across domains
topologyKey          Label key defining domains (zone, node, etc.)
whenUnsatisfiable    DoNotSchedule (hard) or ScheduleAnyway (soft)
labelSelector        Which pods to count for distribution
┌────────────────────────────────────────────────────────────────┐
│ Topology Spread (maxSkew: 1) │
│ │
│ Zone A Zone B Zone C │
│ [pod][pod] [pod] [pod] │
│ Count: 2 Count: 1 Count: 1 │
│ │
│ Max difference = 2-1 = 1 ≤ maxSkew ✓ │
│ │
│ New pod arrives - where can it go? │
│ Zone A: 3 pods → difference 3-1=2 > maxSkew ❌ │
│ Zone B: 2 pods → difference 2-1=1 ≤ maxSkew ✓ │
│ Zone C: 2 pods → difference 2-1=1 ≤ maxSkew ✓ │
│ │
└────────────────────────────────────────────────────────────────┘
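
The arithmetic in the diagram can be sketched in a few lines of Python. This is an illustrative toy, not the real scheduler code; the function names and the zone counts are invented for the example.

```python
def skew_after_placement(counts: dict, zone: str) -> int:
    """Skew (max - min pod count across domains) if one more pod lands in `zone`."""
    trial = dict(counts)
    trial[zone] = trial.get(zone, 0) + 1
    return max(trial.values()) - min(trial.values())

def allowed_zones(counts: dict, max_skew: int) -> list:
    """Zones where placing one more pod keeps skew <= maxSkew (DoNotSchedule logic)."""
    return [z for z in counts if skew_after_placement(counts, z) <= max_skew]

# The scenario from the diagram: Zone A has 2 pods, B and C have 1 each
counts = {"zone-a": 2, "zone-b": 1, "zone-c": 1}
print(allowed_zones(counts, max_skew=1))  # → ['zone-b', 'zone-c']; zone-a would push skew to 2
```

With whenUnsatisfiable: ScheduleAnyway, an empty result would not block the pod; the violation only lowers the node's score.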

War Story: The Un-Scalable Spread

An engineering team set a strict maxSkew: 1 with whenUnsatisfiable: DoNotSchedule across 3 zones, but their cloud provider ran out of spot instances in us-east-1c, so the cluster autoscaler couldn’t add nodes in zone C. Because of the strict maxSkew, the scheduler refused to place pods in zones A and B (which had plenty of capacity), since doing so would push the skew above 1. Their deployment stalled completely. They learned to use ScheduleAnyway for soft spreading, or to ensure the autoscaler and instance types are available in every zone.


┌────────────────────────────────────────────────────────────────┐
│ Scheduling Decision Flow │
│ │
│ Pod Created │
│ │ │
│ ▼ │
│ Filter Nodes │
│ ├── nodeSelector matches? │
│ ├── Node affinity required matches? │
│ ├── Taints tolerated? │
│ ├── Resources available? │
│ ├── Pod anti-affinity satisfied? │
│ └── Topology spread constraints ok? │
│ │ │
│ ▼ │
│ Score Remaining Nodes │
│ ├── Node affinity preferred │
│ ├── Pod affinity preferred │
│ └── Resource optimization │
│ │ │
│ ▼ │
│ Select Highest Scoring Node │
│ │ │
│ ▼ │
│ Bind Pod to Node │
│ │
└────────────────────────────────────────────────────────────────┘
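
The flow above can be sketched as a toy filter-then-score loop. This is illustrative only: the data shapes and the "most free CPU wins" scoring rule are invented for the sketch, while the real kube-scheduler runs pluggable Filter and Score plugins.

```python
def filter_nodes(nodes, pod):
    """Filter phase: drop nodes that violate any hard constraint."""
    feasible = []
    for node in nodes:
        # nodeSelector: every requested label must match (AND logic)
        if any(node["labels"].get(k) != v for k, v in pod.get("nodeSelector", {}).items()):
            continue
        # taints: every taint on the node must be tolerated by the pod
        if any(t not in pod.get("tolerations", []) for t in node.get("taints", [])):
            continue
        # resources: the node must have enough free CPU for the pod's request
        if node["freeCPU"] < pod.get("cpuRequest", 0):
            continue
        feasible.append(node)
    return feasible

def pick_node(nodes, pod):
    """Score phase (toy rule): among feasible nodes, prefer the most free CPU."""
    feasible = filter_nodes(nodes, pod)
    return max(feasible, key=lambda n: n["freeCPU"])["name"] if feasible else None

nodes = [
    {"name": "w1", "labels": {"disk": "ssd"}, "taints": [], "freeCPU": 2},
    {"name": "w2", "labels": {"disk": "ssd"}, "taints": [], "freeCPU": 4},
    {"name": "w3", "labels": {"disk": "hdd"}, "taints": [], "freeCPU": 8},
]
pod = {"nodeSelector": {"disk": "ssd"}, "cpuRequest": 1}
print(pick_node(nodes, pod))  # → w2: only the ssd nodes pass the filter, w2 has more free CPU
```

A pod for which `filter_nodes` returns an empty list corresponds to a Pending pod: no amount of scoring can place it.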

Production Trade-offs: The Cost of Control

  • Hard vs. Soft Affinity: Hard rules (required) guarantee placement but increase the risk of Pending pods and failed deployments if capacity is constrained. Soft rules (preferred) maximize scheduling success but can lead to localized hotspots or performance degradation.
  • Cross-AZ Network Costs: Using topology spread across availability zones provides excellent high availability. However, if those pods communicate heavily with each other, cloud providers will charge you for cross-AZ data transfer.
  • Taint and Toleration Overhead: At scale, managing dozens of custom taints creates administrative bloat. It becomes difficult to onboard new applications because developers must remember to add a huge list of tolerations just to get their pods to run.

Databases need fast I/O and must not run on the same physical hardware as their replicas.

  • Node Affinity: Required affinity for nodes labeled disk=ssd or instance-family=storage-optimized.
  • Pod Anti-Affinity: Required pod anti-affinity using topologyKey: kubernetes.io/hostname to ensure replicas never share a node (avoiding a single point of failure).
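
Putting the two rules together, a database pod spec might carry both stanzas (the disk=ssd and app=db labels are illustrative):

```yaml
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: disk
                operator: In
                values:
                  - ssd
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              app: db
          topologyKey: kubernetes.io/hostname # replicas never share a node
```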

Web servers need high availability and can run on almost any node.

  • Topology Spread: Soft or hard constraints across topology.kubernetes.io/zone to survive datacenter outages.
  • Node Affinity: Preferred affinity for newer, cost-effective instance types, falling back to older instances if necessary.

Background processing jobs are fault-tolerant and perfect for preemptible or spot instances.

  • Tolerations: Tolerate taints like node.kubernetes.io/lifecycle=spot:NoSchedule.
  • Node Affinity: Required affinity to strictly run on spot nodes, keeping regular nodes free for critical user-facing services.
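
A sketch of a batch pod pinned to spot nodes combines the toleration with required node affinity on the same key (the node.kubernetes.io/lifecycle label/taint naming follows the example above and varies by cloud provider):

```yaml
spec:
  tolerations:
    - key: "node.kubernetes.io/lifecycle"
      operator: "Equal"
      value: "spot"
      effect: "NoSchedule"
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: node.kubernetes.io/lifecycle
                operator: In
                values:
                  - spot
```

The toleration alone would let the pod land on regular nodes too; the required affinity is what keeps it off them.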

Symptom                   Likely Cause                     Debug Command
Pending (no events)       No nodes match constraints       kubectl describe pod
Pending (Insufficient)    Resource shortage                Check node resources
Pending (Taints)          No toleration for taint          Check node taints, pod tolerations
Pending (Affinity)        No nodes match affinity rules    Simplify/remove affinity

# Check pod events
kubectl describe pod <pod-name> | grep -A10 Events
# Check node labels
kubectl get nodes --show-labels
# Check node taints
kubectl describe node <node> | grep Taints
# Check node resources
kubectl describe node <node> | grep -A10 "Allocated resources"
# Simulate scheduling
kubectl get pods -o wide # See where pods landed

  • Control plane nodes are tainted by default with node-role.kubernetes.io/control-plane:NoSchedule. That’s why regular pods don’t run there.

  • Affinity can be combined. You can have nodeAffinity, podAffinity, and podAntiAffinity all on the same pod.

  • Multiple topologySpreadConstraints are ANDed. All constraints must be satisfied.

  • DaemonSets tolerate certain system taints by default (the controller adds tolerations such as node.kubernetes.io/not-ready automatically). That’s how they run on every node.
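
For example, to make a DaemonSet also run on control plane nodes, you typically add that toleration explicitly in its pod template (a sketch):

```yaml
spec:
  template:
    spec:
      tolerations:
        - key: node-role.kubernetes.io/control-plane
          operator: Exists
          effect: NoSchedule
```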


Mistake                            Problem                               Solution
nodeSelector typo                  Pod stays Pending                     Verify label exists on target node
Missing toleration                 Pod can't schedule on tainted node    Add matching toleration
Wrong topologyKey                  Affinity doesn't work as expected     Use correct label key
NoExecute instead of NoSchedule    Pods evicted unexpectedly             Use NoSchedule for new pods only
Anti-affinity too strict           Not enough nodes for all replicas     Use preferred or reduce replicas

  1. Your team needs pods to run on SSD nodes but also accept NVMe nodes. A junior engineer used nodeSelector: {disk: ssd} but that excludes NVMe nodes. You need both SSD and NVMe without creating two separate Deployments. How do you solve this, and what type of affinity rule do you write?

    Answer Use `requiredDuringSchedulingIgnoredDuringExecution` node affinity with the `In` operator, which supports multiple values (OR logic). Write a `matchExpressions` rule with `key: disk, operator: In, values: [ssd, nvme]`. This schedules the pod on any node where the `disk` label is either `ssd` or `nvme`. `nodeSelector` can't do this because it only supports exact single-value matching. Node affinity also supports soft preferences via `preferredDuringSchedulingIgnoredDuringExecution`, which nodeSelector cannot express at all.
  2. Your cluster has 3 nodes. Node-1 has taint gpu=nvidia:NoSchedule, node-2 has taint dedicated=ml-team:NoSchedule, and node-3 has no taints. You deploy a pod with only a toleration for gpu=nvidia:NoSchedule. On which node(s) can this pod be scheduled, and why?

    Answer The pod can schedule on node-1 and node-3. Node-1 has the `gpu=nvidia:NoSchedule` taint, which the pod tolerates, so it passes the taint filter. Node-3 has no taints, so any pod can schedule there (tolerations are only needed when taints exist). Node-2 has a taint `dedicated=ml-team:NoSchedule` that the pod does NOT tolerate, so it is excluded. A common misconception is that a toleration *requires* the taint to be present -- it doesn't. Tolerations are permissive: they allow scheduling on tainted nodes but don't prevent scheduling on untainted ones. To restrict a pod to only tainted nodes, combine tolerations with node affinity or nodeSelector.
  3. You’re deploying a critical web application across 3 availability zones for high availability. You have 6 replicas. Using pod anti-affinity with requiredDuringSchedulingIgnoredDuringExecution and topologyKey: topology.kubernetes.io/zone, you notice some pods stay Pending. Why? What would you use instead for a more flexible approach?

    Answer With `required` anti-affinity by zone and 6 replicas across 3 zones, the first 3 pods schedule fine (one per zone). But the 4th pod cannot find a zone without an existing pod, so it stays Pending -- the hard constraint means "never place two pods in the same zone." Switch to `preferredDuringSchedulingIgnoredDuringExecution` (soft preference) or use `topologySpreadConstraints` with `maxSkew: 1`, which distributes pods evenly (2 per zone for 6 replicas) rather than requiring strict uniqueness. Topology spread constraints are generally better for HA because they balance pods across domains instead of imposing a hard one-per-domain limit.
  4. During a CKA exam scenario, you see a pod stuck in Pending with the event: 0/3 nodes are available: 2 insufficient cpu, 1 node(s) had taint {node-role.kubernetes.io/control-plane: NoSchedule}. Walk through your diagnosis. What are the two separate issues, and what are your options to resolve each?

    Answer There are two distinct issues. First, 2 worker nodes don't have enough allocatable CPU for this pod's resource requests -- check with `kubectl describe node` and compare the pod's `requests.cpu` against the node's available capacity. Fix by reducing the pod's CPU request, scaling down other workloads on those nodes, or adding nodes with more capacity. Second, the third node is a control plane node with the standard `NoSchedule` taint. Fix by adding a toleration for `node-role.kubernetes.io/control-plane` (only appropriate for infrastructure pods, not application workloads) or by adding more worker nodes. In production, the control plane node should generally stay reserved for system components.

Task: Practice all scheduling techniques.

Steps:

  1. Label a node and use nodeSelector:
# Get a node name
NODE=$(kubectl get nodes -o jsonpath='{.items[0].metadata.name}')

# Label the node
kubectl label node $NODE disk=ssd

# Create pod with nodeSelector
cat << EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: ssd-pod
spec:
  nodeSelector:
    disk: ssd
  containers:
    - name: nginx
      image: nginx
EOF

# Verify placement
kubectl get pod ssd-pod -o wide

# Cleanup
kubectl delete pod ssd-pod
kubectl label node $NODE disk-
  2. Add taint and create pod with toleration:
# Taint the node
kubectl taint nodes $NODE dedicated=special:NoSchedule

# Try to create pod without toleration
kubectl run no-toleration --image=nginx

# Check - should be Pending or on different node
kubectl get pod no-toleration -o wide

# Create pod with toleration
cat << EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: with-toleration
spec:
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "special"
      effect: "NoSchedule"
  containers:
    - name: nginx
      image: nginx
EOF

# Verify placement
kubectl get pod with-toleration -o wide

# Cleanup
kubectl delete pod no-toleration with-toleration
kubectl taint nodes $NODE dedicated-
  3. Spread pods across nodes:
cat << EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spread-deploy
spec:
  replicas: 3
  selector:
    matchLabels:
      app: spread
  template:
    metadata:
      labels:
        app: spread
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app: spread
                topologyKey: kubernetes.io/hostname
      containers:
        - name: nginx
          image: nginx
EOF

# Check pod distribution
kubectl get pods -l app=spread -o wide

# Cleanup
kubectl delete deployment spread-deploy

Success Criteria:

  • Can use nodeSelector
  • Can add/remove node taints
  • Can add tolerations to pods
  • Understand affinity vs anti-affinity
  • Can troubleshoot scheduling issues

Drill 1: nodeSelector (Target: 5 minutes)

NODE=$(kubectl get nodes -o jsonpath='{.items[0].metadata.name}')

# Label node
kubectl label node $NODE env=production

# Create pod with nodeSelector using an inline override
kubectl run selector-test --image=nginx \
  --overrides='{"spec":{"nodeSelector":{"env":"production"}}}'

# Or simpler - just use YAML
cat << 'EOF' | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: selector-test
spec:
  nodeSelector:
    env: production
  containers:
    - name: nginx
      image: nginx
EOF

# Verify
kubectl get pod selector-test -o wide

# Cleanup
kubectl delete pod selector-test
kubectl label node $NODE env-
Drill 2: Taints and Tolerations (Target: 5 minutes)

NODE=$(kubectl get nodes -o jsonpath='{.items[0].metadata.name}')

# Add taint
kubectl taint nodes $NODE app=critical:NoSchedule

# View taint
kubectl describe node $NODE | grep Taints

# Pod without toleration - will be Pending or elsewhere
kubectl run no-tol --image=nginx
kubectl get pod no-tol -o wide

# Pod with toleration
cat << 'EOF' | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: with-tol
spec:
  tolerations:
    - key: "app"
      operator: "Equal"
      value: "critical"
      effect: "NoSchedule"
  containers:
    - name: nginx
      image: nginx
EOF

kubectl get pod with-tol -o wide

# Cleanup
kubectl delete pod no-tol with-tol
kubectl taint nodes $NODE app-

Drill 3: Node Affinity (Target: 5 minutes)

NODE=$(kubectl get nodes -o jsonpath='{.items[0].metadata.name}')
kubectl label node $NODE size=large

cat << 'EOF' | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: affinity-test
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: size
                operator: In
                values:
                  - large
                  - xlarge
  containers:
    - name: nginx
      image: nginx
EOF

kubectl get pod affinity-test -o wide

# Cleanup
kubectl delete pod affinity-test
kubectl label node $NODE size-

Drill 4: Pod Anti-Affinity (Target: 5 minutes)

cat << 'EOF' | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: anti-affinity
spec:
  replicas: 3
  selector:
    matchLabels:
      app: anti-test
  template:
    metadata:
      labels:
        app: anti-test
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: anti-test
              topologyKey: kubernetes.io/hostname
      containers:
        - name: nginx
          image: nginx
EOF

# Check distribution (each pod on different node)
kubectl get pods -l app=anti-test -o wide

# Cleanup
kubectl delete deployment anti-affinity

Drill 5: Troubleshooting - Pending Pod (Target: 5 minutes)

NODE=$(kubectl get nodes -o jsonpath='{.items[0].metadata.name}')

# Create impossible scenario
kubectl taint nodes $NODE impossible=true:NoSchedule

cat << 'EOF' | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: pending-pod
spec:
  nodeSelector:
    nonexistent: label
  containers:
    - name: nginx
      image: nginx
EOF

# Diagnose
kubectl get pod pending-pod
kubectl describe pod pending-pod | grep -A10 Events

# YOUR TASK: Why is it Pending? Fix it.

# Cleanup
kubectl delete pod pending-pod
kubectl taint nodes $NODE impossible-
Solution

The pod is pending for two reasons:

  1. nodeSelector requires label nonexistent=label which no node has
  2. The node was tainted with impossible=true:NoSchedule, which the pod doesn’t tolerate (in a single-node cluster this alone blocks scheduling)

Fix by either:

  • Adding the label to a node: kubectl label node $NODE nonexistent=label
  • Adding toleration and removing nodeSelector

Create a pod that:

  1. Must run on nodes with label tier=frontend
  2. Prefers nodes with label zone=us-east-1a
  3. Tolerates taint frontend=true:NoSchedule
# YOUR TASK: Create this pod
Solution
NODE=$(kubectl get nodes -o jsonpath='{.items[0].metadata.name}')
kubectl label node $NODE tier=frontend zone=us-east-1a
kubectl taint nodes $NODE frontend=true:NoSchedule

cat << 'EOF' | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: complex-schedule
spec:
  tolerations:
    - key: "frontend"
      operator: "Equal"
      value: "true"
      effect: "NoSchedule"
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: tier
                operator: In
                values:
                  - frontend
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          preference:
            matchExpressions:
              - key: zone
                operator: In
                values:
                  - us-east-1a
  containers:
    - name: nginx
      image: nginx
EOF

kubectl get pod complex-schedule -o wide

# Cleanup
kubectl delete pod complex-schedule
kubectl label node $NODE tier- zone-
kubectl taint nodes $NODE frontend-

Module 2.7: ConfigMaps & Secrets - Application configuration management.