
Module 1.2: Extension Interfaces - CNI, CSI, CRI

Complexity: [MEDIUM] - Conceptual with some hands-on

Time to Complete: 30-40 minutes

Prerequisites: Module 1.1 (Control Plane)


After this module, you will be able to:

  • Explain the plugin architecture (CNI, CSI, CRI) and why Kubernetes uses interfaces instead of built-in implementations
  • Identify which runtime, network, and storage plugins are installed on a cluster
  • Diagnose plugin failures by checking pod logs, socket files, and kubelet configuration
  • Compare common implementations (containerd vs CRI-O, Calico vs Cilium, local vs CSI drivers) and their trade-offs

Kubernetes doesn’t actually know how to run containers, set up networking, or provision storage. Surprised?

Kubernetes is designed to be pluggable. It defines interfaces: contracts that say “if you implement these functions, I’ll use you.” This is why you can run Kubernetes with containerd or CRI-O (or Docker Engine through the cri-dockerd adapter), with Calico, Cilium, or Flannel for networking, and with AWS EBS, GCP PD, or local storage.

The CKA 2025 curriculum specifically calls out understanding CNI, CSI, and CRI. You need to know what they are, why they exist, and how to troubleshoot when they fail.

The USB Port Analogy

Think of these interfaces like USB ports. Your computer doesn’t need to know the specifics of every mouse, keyboard, or drive ever made—it just needs USB compliance. Similarly, Kubernetes defines CNI/CSI/CRI interfaces, and any compliant plugin works. Want to swap your network plugin? Unplug one, plug in another. Same cluster, different implementation.


By the end of this module, you’ll understand:

  • What CNI, CSI, and CRI are
  • Why Kubernetes uses plugin architecture
  • Common implementations of each
  • How to check which plugins your cluster uses
  • Basic troubleshooting for each

┌────────────────────────────────────────────────────────────────┐
│ The Kubernetes Philosophy │
│ │
│ "Do one thing well, let others do the rest" │
│ │
│ Kubernetes Core: │
│ ✓ API and resource management │
│ ✓ Scheduling and orchestration │
│ ✓ Controller patterns │
│ ✓ Desired state reconciliation │
│ │
│ NOT Kubernetes Core (delegated to plugins): │
│ ✗ Container runtime details │
│ ✗ Network implementation │
│ ✗ Storage provisioning │
│ │
└────────────────────────────────────────────────────────────────┘

Benefits:

  • Choice: Use the best tool for your environment
  • Innovation: New plugins without changing Kubernetes core
  • Specialization: Network vendors focus on networking, storage vendors on storage
  • Portability: Same Kubernetes, different infrastructure

| Interface | Purpose | Example Implementations |
|---|---|---|
| CRI (Container Runtime Interface) | Run containers | containerd, CRI-O |
| CNI (Container Network Interface) | Pod networking | Calico, Cilium, Flannel |
| CSI (Container Storage Interface) | Persistent storage | AWS EBS, GCP PD, Ceph |

CRI defines how kubelet communicates with container runtimes. Without CRI, kubelet would need custom code for every runtime.

┌────────────────────────────────────────────────────────────────┐
│                            CRI Flow                            │
│                                                                │
│        kubelet                                                 │
│           │                                                    │
│           │  CRI (gRPC)                                        │
│           │  "Create container with image X"                   │
│           │  "Start container Y"                               │
│           │  "Stop container Z"                                │
│           ▼                                                    │
│  ┌─────────────────┐   or   ┌─────────────────┐                │
│  │   containerd    │        │      CRI-O      │                │
│  └────────┬────────┘        └────────┬────────┘                │
│           │                          │                         │
│           ▼                          ▼                         │
│      ┌─────────┐                ┌─────────┐                    │
│      │  runc   │                │  runc   │                    │
│      └────┬────┘                └────┬────┘                    │
│           │                          │                         │
│           ▼                          ▼                         │
│   Linux containers           Linux containers                  │
│                                                                │
└────────────────────────────────────────────────────────────────┘

Pause and predict: Why does Kubernetes use a gRPC interface (CRI) to talk to container runtimes instead of just calling containerd functions directly? What advantage does this indirection give you as a cluster operator?

| Runtime | Description | Used By |
|---|---|---|
| containerd | Industry standard, CNCF graduated | kubeadm default, GKE, EKS, AKS |
| CRI-O | Kubernetes-focused, lightweight | OpenShift, some enterprise |
| Docker | ⚠️ dockershim deprecated in K8s 1.20, removed in 1.24 | Legacy clusters |

Did You Know?

Docker deprecation caused panic in 2020, but it was overblown. Docker images still work everywhere—they’re OCI-compliant. What changed is kubelet no longer talks to Docker daemon directly. Most users saw zero impact because Docker builds images, containerd runs them.

```bash
# What runtime is kubelet using?
kubectl get nodes -o wide
# Look at the CONTAINER-RUNTIME column

# On a node, check containerd
systemctl status containerd
crictl info

# List running containers
crictl ps

# List images
crictl images

# Inspect a container
crictl inspect <container-id>
```
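
The CONTAINER-RUNTIME column reports the runtime as `<name>://<version>`. As a minimal sketch, the hypothetical helpers below split such a value and suggest which CLI to reach for on the node; the function names and the sample version strings are illustrative assumptions, not part of any Kubernetes tool.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for values like "containerd://1.7.13".

runtime_name() {
  # Everything before "://"
  printf '%s\n' "${1%%://*}"
}

runtime_version() {
  # Everything after "://"
  printf '%s\n' "${1#*://}"
}

# Suggest the inspection CLI for a given runtime value.
inspect_cli_for() {
  case "$(runtime_name "$1")" in
    containerd|cri-o) echo "crictl" ;;
    docker)           echo "docker" ;;
    *)                echo "unknown" ;;
  esac
}
```

One way to feed it real values is `kubectl get nodes -o jsonpath='{.items[*].status.nodeInfo.containerRuntimeVersion}'`, passing each word of the output to `inspect_cli_for`.
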

```bash
# Container runtime not responding
systemctl status containerd
journalctl -u containerd -f

# kubelet can't talk to runtime
journalctl -u kubelet | grep -i "runtime"

# Check CRI socket
ls -la /var/run/containerd/containerd.sock

# Verify kubelet configuration
grep -i container /var/lib/kubelet/config.yaml
```
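
The socket and kubelet-config checks can be combined into one step. This is a sketch under one assumption: that the endpoint is set through the `containerRuntimeEndpoint` field in the kubelet config file. Older clusters may instead pass it as the `--container-runtime-endpoint` kubelet flag, in which case this finds nothing. The function names are hypothetical.

```bash
#!/usr/bin/env bash
# Assumption: kubelet's config file carries a line such as
#   containerRuntimeEndpoint: unix:///run/containerd/containerd.sock

runtime_endpoint() {
  local config="${1:-/var/lib/kubelet/config.yaml}"
  # Print the value of containerRuntimeEndpoint, without quotes or padding.
  sed -nE 's/^[[:space:]]*containerRuntimeEndpoint:[[:space:]]*"?([^"[:space:]]+)"?[[:space:]]*$/\1/p' "$config" | head -n 1
}

# Turn "unix:///path/to.sock" into "/path/to.sock".
endpoint_socket_path() {
  printf '%s\n' "${1#unix://}"
}

check_runtime_socket() {
  local endpoint path
  endpoint=$(runtime_endpoint "$1")
  if [ -z "$endpoint" ]; then
    echo "no containerRuntimeEndpoint found in config"
    return 1
  fi
  path=$(endpoint_socket_path "$endpoint")
  if [ -S "$path" ]; then
    echo "ok: socket present at $path"
  else
    echo "missing: $path (is the runtime running?)"
    return 1
  fi
}
```

Run on a node as `check_runtime_socket /var/lib/kubelet/config.yaml`. A "missing" result points at the runtime service, while "no containerRuntimeEndpoint found" points at how kubelet was configured.
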

Gotcha: crictl vs docker

On modern clusters, docker commands usually don’t work on nodes because Docker Engine isn’t installed or running. Use crictl instead: it’s the CRI-compatible CLI that talks to the runtime (containerd or CRI-O) over its CRI socket.


CNI handles pod networking:

  • Assigns IP addresses to pods
  • Sets up routes between pods
  • Configures the pod’s network namespace (the container runtime creates it)
  • Implements network policies (some plugins)
┌────────────────────────────────────────────────────────────────┐
│ CNI Flow │
│ │
│ 1. kubelet asks the runtime (via CRI) to create the pod sandbox (pause container) │
│ 2. The runtime calls the CNI plugin: "Configure network for pod X" │
│ 3. CNI plugin: │
│ - Creates veth pair (virtual ethernet) │
│ - Assigns IP from pool │
│ - Sets up routes │
│ - Connects pod to cluster network │
│ 4. Pod can now communicate with other pods │
│ │
└────────────────────────────────────────────────────────────────┘
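
The "calls CNI plugin" step is a plain process execution: the plugin is an executable that receives its parameters in `CNI_*` environment variables, reads the network configuration as JSON on stdin, and prints a JSON result on stdout. The sketch below imitates that calling convention with a toy plugin. The plugin, the network name `demo-net`, the IP address, and the trimmed-down result are all made up for illustration, and no real networking is configured.

```bash
#!/usr/bin/env bash
# Toy illustration of the CNI calling convention. NOT a real plugin.

# Write a fake plugin executable to the path given as $1.
make_toy_plugin() {
  cat > "$1" <<'EOF'
#!/usr/bin/env bash
config=$(cat)   # network config JSON arrives on stdin (unused in this toy)
case "${CNI_COMMAND:-}" in
  ADD) printf '{"cniVersion":"1.0.0","ips":[{"address":"10.244.1.5/24"}]}\n' ;;
  DEL) : ;;     # teardown: no output on success
  *)   echo "unsupported CNI_COMMAND" >&2; exit 1 ;;
esac
EOF
  chmod +x "$1"
}

# Invoke a plugin the way a runtime would: parameters in CNI_* environment
# variables, config on stdin, JSON result on stdout.
# Arguments: <plugin-path> <ADD|DEL> <container-id> <netns-path>
cni_invoke() {
  local plugin="$1" command="$2" container_id="$3" netns="$4"
  CNI_COMMAND="$command" \
  CNI_CONTAINERID="$container_id" \
  CNI_NETNS="$netns" \
  CNI_IFNAME="eth0" \
  CNI_PATH="$(dirname "$plugin")" \
    "$plugin" <<< '{"cniVersion":"1.0.0","name":"demo-net","type":"toy-plugin"}'
}
```

For example, after `make_toy_plugin /tmp/toy-plugin`, running `cni_invoke /tmp/toy-plugin ADD abc123 /var/run/netns/demo` prints the fake result JSON. Real plugins in /opt/cni/bin/ follow the same convention, which is why they can be swapped without changing Kubernetes itself.
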

Stop and think: You just ran kubeadm init, but the node shows as NotReady and the CoreDNS pods are stuck in Pending. What’s missing, and why does this specific component fail before others?

| Plugin | Key Features | Complexity |
|---|---|---|
| Calico | NetworkPolicy, BGP routing, high performance | Medium |
| Cilium | eBPF-based, observability, security | Higher |
| Flannel | Simple overlay network | Low |
| Weave | Encryption, easy setup | Low |
| AWS VPC CNI | Native VPC networking | AWS-specific |

CNI plugins are typically:

  1. Binary in /opt/cni/bin/
  2. Configuration in /etc/cni/net.d/
  3. DaemonSet running on every node (for plugin-specific agents)

```bash
# List CNI binaries
ls /opt/cni/bin/

# List CNI configurations (first file wins!)
ls /etc/cni/net.d/

# Check the active CNI config
cat /etc/cni/net.d/10-calico.conflist   # Example for Calico
```
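
Because the first file wins, it helps to ask directly which file that is instead of eyeballing the listing. A minimal sketch, assuming config files end in `.conf`, `.conflist`, or `.json` and are picked in byte-wise lexical order; the function name is hypothetical:

```bash
#!/usr/bin/env bash
# Print the CNI config file that sorts first in the given directory.
active_cni_config() {
  local dir="${1:-/etc/cni/net.d}"
  find "$dir" -maxdepth 1 -type f \
       \( -name '*.conf' -o -name '*.conflist' -o -name '*.json' \) \
    | LC_ALL=C sort \
    | head -n 1
}
```

On a node, `active_cni_config` with no argument inspects /etc/cni/net.d/. If it prints a file belonging to a plugin you no longer run, a stale config is shadowing the one you expect.
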

```bash
# What CNI is running?
kubectl get pods -n kube-system | grep -E "calico|cilium|flannel|weave"

# Check CNI pod logs
kubectl logs -n kube-system -l k8s-app=calico-node --tail=50

# Verify pod IP allocation
kubectl get pods -o wide   # Check IP column

# Test pod-to-pod connectivity (-c 3 stops after three pings)
kubectl exec pod-a -- ping -c 3 <pod-b-ip>
```
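
The grep above can be wrapped so that it gives a clear answer either way. This sketch reads pod names on stdin and reports which of the four plugin names from the pattern appear. It is a naming heuristic only: agents that use other names are not detected (the AWS VPC CNI agent, for instance, runs as `aws-node`). The function name and sample pod names are illustrative.

```bash
#!/usr/bin/env bash
# Read pod names (one per line) on stdin; print each known CNI name once.
detect_cni() {
  local found
  found=$(grep -oE 'calico|cilium|flannel|weave' | sort -u)
  if [ -n "$found" ]; then
    printf '%s\n' "$found"
  else
    echo "none detected"
    return 1
  fi
}
```

Typical use: `kubectl get pods -n kube-system -o name | detect_cni`. More than one name in the output suggests leftovers from a previous CNI installation.
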

```bash
# Pods stuck in ContainerCreating
kubectl describe pod <pod-name>
# Look for: "failed to set up sandbox container network"

# Check CNI configuration exists
ls -la /etc/cni/net.d/

# Check CNI binary exists
ls -la /opt/cni/bin/

# Check CNI agent logs
kubectl logs -n kube-system -l k8s-app=calico-node -c calico-node

# Node network issues
ip addr show   # Check interfaces
ip route       # Check routes
```
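
When many pods are affected, a quick list of everything stuck in ContainerCreating shows whether the problem is confined to one namespace or is cluster-wide. A small sketch that filters `kubectl get pods -A --no-headers` output, assuming the default column order NAMESPACE NAME READY STATUS RESTARTS AGE; the function name and sample rows are made up:

```bash
#!/usr/bin/env bash
# Read `kubectl get pods -A --no-headers` output on stdin and print
# "namespace/name" for every pod whose STATUS column is ContainerCreating.
stuck_in_creating() {
  awk '$4 == "ContainerCreating" { print $1 "/" $2 }'
}
```

Usage: `kubectl get pods -A --no-headers | stuck_in_creating`. Adding `-o wide` keeps the first four columns in place, so the filter still works if you also want to see the node names.
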

War Story: The Missing CNI

A junior engineer built a kubeadm cluster but skipped the CNI installation step. Pods were stuck in ContainerCreating forever with cryptic “network not ready” errors. The fix was one command: kubectl apply -f calico.yaml. Lesson: kubeadm gives you a cluster, CNI gives you networking. Both are required.


CSI handles persistent storage:

  • Provisions volumes (creates actual disks)
  • Attaches volumes to nodes
  • Mounts volumes in pods
  • Snapshots and clones
┌────────────────────────────────────────────────────────────────┐
│ CSI Flow │
│ │
│ PersistentVolumeClaim created │
│ │ │
│ ▼ │
│ StorageClass defines CSI driver │
│ │ │
│ ▼ │
│ CSI Controller (Deployment in kube-system) │
│ "Provision 10Gi volume from AWS EBS" │
│ │ │
│ ▼ │
│ Cloud Provider: Creates actual disk │
│ │ │
│ ▼ │
│ PersistentVolume created and bound │
│ │ │
│ ▼ │
│ Pod scheduled to node │
│ │ │
│ ▼ │
│ CSI Controller attaches disk to node; CSI Node plugin mounts it │
│ │
└────────────────────────────────────────────────────────────────┘

What would happen if: The CSI controller pod crashes but the CSI node plugins on each worker are still running. Can existing pods still read and write to their mounted volumes? Can new PVCs be provisioned?

┌────────────────────────────────────────────────────────────────┐
│ CSI Architecture │
│ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ CSI Controller (Deployment) │ │
│ │ - Runs as pods in kube-system │ │
│ │ - Handles provisioning, attach/detach │ │
│ │ - Usually 1-3 replicas │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ CSI Node Plugin (DaemonSet) │ │
│ │ - Runs on every node │ │
│ │ - Handles mount/unmount │ │
│ - Registers the CSI driver with kubelet on that node │
│ └─────────────────────────────────────────────────────────┘ │
│ │
└────────────────────────────────────────────────────────────────┘

| Driver | Storage | Environment |
|---|---|---|
| ebs.csi.aws.com | AWS EBS | AWS EKS |
| pd.csi.storage.gke.io | GCP Persistent Disk | GKE |
| disk.csi.azure.com | Azure Disk | AKS |
| csi.vsphere.vmware.com | vSphere | VMware |
| rook-ceph.rbd.csi.ceph.com | Ceph RBD | On-premises |
| hostpath.csi.k8s.io | Local path | Development |

```bash
# List CSI drivers in cluster
kubectl get csidrivers

# Check CSI pods
kubectl get pods -n kube-system | grep csi

# View StorageClasses (use CSI drivers)
kubectl get storageclasses
kubectl describe storageclass <name>

# Check PV provisioner
kubectl get pv -o jsonpath='{.items[*].spec.csi.driver}'
```
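
A StorageClass whose provisioner does not match any installed CSI driver is a common reason for PVCs that never bind. The sketch below cross-checks the two lists. It is a hint, not a verdict: classes that use non-CSI provisioners (for example `kubernetes.io/no-provisioner` for static local volumes) will also be listed. The function name and the intermediate file names are hypothetical.

```bash
#!/usr/bin/env bash
# $1: file with one installed CSI driver name per line
# $2: file with "<storageclass> <provisioner>" per line
# Prints "<storageclass> -> <provisioner>" for classes with no matching driver.
#
# The input files could be produced with:
#   kubectl get csidrivers -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}' > drivers.txt
#   kubectl get sc -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.provisioner}{"\n"}{end}' > sc.txt
storageclasses_without_driver() {
  awk 'FILENAME == ARGV[1] { drivers[$1] = 1; next }
       !($2 in drivers)    { print $1 " -> " $2 }' "$1" "$2"
}
```

Usage: `storageclasses_without_driver drivers.txt sc.txt`. Empty output means every StorageClass points at a driver that is registered in the cluster.
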

```bash
# PVC stuck in Pending
kubectl describe pvc <pvc-name>
# Look for: Events showing provisioning errors

# Check CSI controller logs
kubectl logs -n kube-system -l app=ebs-csi-controller -c ebs-plugin

# Check CSI node logs
kubectl logs -n kube-system -l app=ebs-csi-node -c ebs-plugin

# Verify CSI driver registered
kubectl get csinodes
kubectl describe csinode <node-name>
```

Did You Know?

Before CSI, Kubernetes had “in-tree” storage plugins—code baked into Kubernetes itself for each storage provider. This was unsustainable. CSI allows storage vendors to develop drivers independently, released on their schedule, not Kubernetes’ release cycle.


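| Interface | Kubernetes Talks To | Plugin Provides | Config Location |
|---|---|---|---|
| CRI | Container runtime | Container lifecycle | /etc/containerd/config.toml (socket: /var/run/containerd/containerd.sock) |
| CNI | Network plugin | Pod IPs, routing, policies | /etc/cni/net.d/ |
| CSI | Storage driver | Volume provisioning, mount | CSIDriver and StorageClass objects |
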
```bash
# ===== CRI Issues =====
# Symptom: Pods won't start, "container runtime not running"
systemctl status containerd
journalctl -u containerd
crictl info

# ===== CNI Issues =====
# Symptom: Pods stuck in ContainerCreating, "network not ready"
ls /etc/cni/net.d/
kubectl get pods -n kube-system | grep -E "calico|cilium|flannel"
kubectl logs -n kube-system <cni-pod>

# ===== CSI Issues =====
# Symptom: PVC stuck in Pending, "provisioning failed"
kubectl get csidrivers
kubectl describe pvc <name>
kubectl logs -n kube-system <csi-controller-pod>
```
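
The symptom-to-interface mapping above can be written as a tiny lookup, which is a handy way to drill it before the exam. This is only a sketch: the matched phrases are the ones quoted in this module, real error messages vary by version and plugin, and the function name is made up.

```bash
#!/usr/bin/env bash
# Map an error message to the interface to investigate first.
triage() {
  local msg
  msg=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')   # case-insensitive match
  case "$msg" in
    *"container runtime"*)
      echo "CRI: systemctl status containerd; crictl info" ;;
    *"network not ready"*|*"sandbox container network"*)
      echo "CNI: ls /etc/cni/net.d/; check CNI pods in kube-system" ;;
    *"provisioning failed"*)
      echo "CSI: kubectl describe pvc; check CSI controller logs" ;;
    *)
      echo "unknown: start with kubectl describe and kubelet logs"
      return 1 ;;
  esac
}
```

For example, `triage "failed to set up sandbox container network"` points at CNI, while an unrecognized message returns a non-zero status so a script can fall back to manual investigation.
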

  • CNI configuration file order matters. Kubernetes uses the first file alphabetically in /etc/cni/net.d/. That’s why you see files named 10-calico.conflist—the number ensures ordering.

  • CSI replaced Flexvolume. Flexvolume was the predecessor, but it required plugins on every node and had security issues. CSI runs as containerized workloads.

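  • Cilium uses eBPF instead of iptables for networking. This makes it faster and more observable, but requires a reasonably recent Linux kernel (4.9+ for older releases; newer Cilium versions need later kernels, so check the requirements for your version).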
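
Before installing an eBPF-based plugin, it is worth checking the node kernel against the minimum you need. A minimal sketch using GNU `sort -V` for version ordering; the helper names are hypothetical, and the threshold you pass in should come from the documentation of the Cilium release you plan to run (the 4.9 in the usage note is just the figure quoted above).

```bash
#!/usr/bin/env bash
# True (exit 0) when version $1 >= version $2, using GNU sort -V ordering.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# True when the running kernel (or the release string passed as $2)
# is at least the minimum version given as $1.
kernel_meets() {
  local min="$1" running="${2:-$(uname -r)}"
  running="${running%%-*}"   # drop distro suffix such as "-91-generic"
  version_ge "$running" "$min"
}
```

Usage on a node: `kernel_meets 4.9 && echo "kernel OK" || echo "kernel too old"`.
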


| Mistake | Problem | Solution |
|---|---|---|
| Forgetting to install CNI | Pods stuck in ContainerCreating | Install CNI plugin after kubeadm init |
| Multiple CNI configs | Wrong plugin used | Remove old configs, keep one |
| Wrong CSI driver | PVC won’t bind | Match StorageClass to your infrastructure |
| Ignoring CSI controller logs | Can’t diagnose provisioning | Always check controller and node plugin logs |
| Using docker on new clusters | Command not found | Use crictl instead |

  1. Your team migrated from an on-prem cluster using Flannel to a new cluster using Cilium. A developer asks: “Do I need to change my Deployment manifests?” What’s your answer, and what does this tell you about how CNI works?

    Answer: No manifest changes are needed. CNI is an interface: Kubernetes defines a contract for pod networking (assign IPs, set up routes, enable pod-to-pod communication), and any compliant plugin fulfills that contract transparently. From the application's perspective, pods still get IPs, DNS works, and Services resolve the same way regardless of whether Flannel or Cilium handles the underlying networking. This is the core benefit of the plugin architecture: you can swap implementations without changing workloads. The only differences are operational: Cilium adds eBPF-based observability and network policy enforcement that Flannel lacks.
  2. After a node reboot, pods on that node are stuck in ContainerCreating. You SSH into the node and find that /etc/cni/net.d/ contains two files: 10-calico.conflist and 05-flannel.conflist. What’s the problem and how do you fix it?

    Answer: Kubernetes uses the first CNI configuration file alphabetically in `/etc/cni/net.d/`. Since `05-flannel.conflist` comes before `10-calico.conflist`, the node is trying to use Flannel for networking. If your cluster runs Calico, the Flannel binaries and agents aren't present, which leaves pods in ContainerCreating. The fix is to remove the stale Flannel config: `sudo rm /etc/cni/net.d/05-flannel.conflist`. This commonly happens when a previous CNI installation wasn't cleaned up properly. After removing it, the pods should start using Calico correctly.
  3. A PersistentVolumeClaim has been in Pending state for 5 minutes. You run kubectl get csidrivers and see your EBS CSI driver listed. The CSI controller pods are running. Where should you look next, and what are two likely causes?

    Answer: Run `kubectl describe pvc <pvc-name>` and check the Events section for provisioning error messages. Then check the CSI controller logs: `kubectl logs -n kube-system -l app=ebs-csi-controller -c ebs-plugin`. Two likely causes: (1) The StorageClass specified in the PVC doesn't match any available StorageClass, or the StorageClass references a different CSI driver than what's installed. (2) The CSI controller lacks IAM permissions to provision EBS volumes in AWS; the controller pod needs the right ServiceAccount and IAM role to call the AWS API. You can also check `kubectl get sc` to verify the StorageClass exists and its provisioner matches the CSI driver name.
  4. A colleague upgrades a cluster from Kubernetes 1.23 to 1.25 and reports that all docker ps commands on nodes return “command not found,” even though containers are still running. They’re panicking about data loss. What do you tell them?

    Answer: There's no data loss. Kubernetes 1.24 removed dockershim, the built-in adapter that let kubelet use Docker Engine as a container runtime. The cluster now uses containerd directly, which was already running the containers underneath Docker. The containers are still running; your colleague just needs `crictl ps` instead of `docker ps` to inspect them. All existing Docker images continue to work because they're OCI-compliant, so containerd can pull and run them without modification. The key distinction is that Docker was one way for kubelet to reach the runtime, not the image format, and containerd implements CRI natively, which is what kubelet actually needs.

Task: Explore your cluster’s CNI and CRI configuration.

Steps:

  1. Identify your container runtime:

```bash
kubectl get nodes -o wide
# Note the CONTAINER-RUNTIME column
```

  2. Explore CRI:

```bash
# On a node (SSH or kubectl debug node)
crictl info
crictl ps
crictl images | head -10
```

  3. Identify your CNI plugin:

```bash
kubectl get pods -n kube-system | grep -E "calico|cilium|flannel|weave"
```

  4. Check CNI configuration:

```bash
# On a node
ls -la /etc/cni/net.d/
cat /etc/cni/net.d/*.conflist | head -30
```

  5. Check pod networking:

```bash
# Create two pods
kubectl run test1 --image=nginx
kubectl run test2 --image=nginx

# Get their IPs
kubectl get pods -o wide

# Test connectivity
kubectl exec test1 -- curl -s <test2-ip>:80
```

  6. Check CSI drivers (if available):

```bash
kubectl get csidrivers
kubectl get storageclasses
```

Success Criteria:

  • Can identify container runtime in use
  • Can use crictl to inspect containers
  • Can identify CNI plugin and its configuration
  • Understand pod-to-pod networking path
  • Know where to look for each interface’s logs

Cleanup:

```bash
kubectl delete pod test1 test2
```

Drill 1: Interface Identification (Target: 2 minutes)

Match each tool/plugin to its interface:

| Tool/Plugin | Interface (CRI/CNI/CSI) |
|---|---|
| containerd | ___ |
| Calico | ___ |
| AWS EBS driver | ___ |
| CRI-O | ___ |
| Cilium | ___ |
| Rook-Ceph | ___ |

Answers
  1. CRI - Container Runtime Interface
  2. CNI - Container Network Interface
  3. CSI - Container Storage Interface
  4. CRI - Container Runtime Interface
  5. CNI - Container Network Interface
  6. CSI - Container Storage Interface

Drill 2: CRI Troubleshooting - Container Runtime Down (Target: 5 minutes)

```bash
# Setup: Stop containerd (WARNING: breaks cluster temporarily!)
# Only do on practice nodes you can restart!
sudo systemctl stop containerd

# Observe the damage
kubectl get nodes   # Node becomes NotReady
kubectl describe node <your-node> | grep -A5 Conditions

# YOUR TASK: Restore containerd and verify recovery
```

Solution

```bash
sudo systemctl start containerd
sudo systemctl status containerd

# Wait for node to recover
kubectl get nodes -w   # Watch until Ready

# Verify containers running
sudo crictl ps
```

Drill 3: CNI Troubleshooting - Pods Stuck in ContainerCreating (Target: 5 minutes)

```bash
# Setup: Temporarily break CNI config
sudo mv /etc/cni/net.d/10-calico.conflist /tmp/

# Create a test pod
kubectl run cni-broken --image=nginx

# Observe
kubectl get pods   # ContainerCreating forever
kubectl describe pod cni-broken | grep -A10 Events

# YOUR TASK: Diagnose and fix
```

Solution

```bash
# Check CNI config directory
ls /etc/cni/net.d/   # Empty!

# Restore CNI config
sudo mv /tmp/10-calico.conflist /etc/cni/net.d/

# Delete stuck pod and recreate
kubectl delete pod cni-broken --force --grace-period=0
kubectl run cni-fixed --image=nginx
kubectl get pods   # Running!

# Cleanup
kubectl delete pod cni-fixed
```

Drill 4: crictl Mastery (Target: 3 minutes)

Practice using crictl commands:

```bash
# 1. List all running containers
sudo crictl ps

# 2. List all pods (sandbox containers)
sudo crictl pods

# 3. Get container logs
sudo crictl logs <container-id>

# 4. Inspect a container
sudo crictl inspect <container-id> | head -50

# 5. List images
sudo crictl images

# 6. Get runtime info
sudo crictl info
```

Drill 5: CSI Driver Investigation (Target: 3 minutes)

Explore CSI drivers in your cluster:

```bash
# List all CSI drivers
kubectl get csidrivers

# Get details on a driver
kubectl describe csidriver <driver-name>

# Check CSI nodes
kubectl get csinodes
kubectl describe csinode <node-name>

# View StorageClasses using CSI
kubectl get sc -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.provisioner}{"\n"}{end}'
```

Drill 6: CSI Provisioning - Create PVC with StorageClass (Target: 5 minutes)

Practice the full CSI workflow from PVC to mounted volume:

```bash
# Check available StorageClasses
kubectl get sc

# Create a PVC using the default StorageClass
cat << 'EOF' | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: csi-test-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  # storageClassName: standard  # Uncomment to use specific class
EOF

# Watch the PVC get bound (CSI provisioner creates PV)
# If the StorageClass uses WaitForFirstConsumer, the PVC stays Pending
# until a pod uses it; press Ctrl+C and continue.
kubectl get pvc csi-test-pvc -w

# Check the dynamically provisioned PV
kubectl get pv

# Create a pod that uses the PVC
cat << 'EOF' | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: csi-test-pod
spec:
  containers:
    - name: app
      image: nginx
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: csi-test-pvc
EOF

# Wait for pod to be ready
kubectl wait --for=condition=ready pod/csi-test-pod --timeout=60s

# Verify the volume is mounted
kubectl exec csi-test-pod -- df -h /data

# Write test data
kubectl exec csi-test-pod -- sh -c "echo 'CSI works!' > /data/test.txt"
kubectl exec csi-test-pod -- cat /data/test.txt

# Cleanup
kubectl delete pod csi-test-pod
kubectl delete pvc csi-test-pvc
```

Drill 7: Network Connectivity Test (Target: 5 minutes)

Verify CNI is working correctly:

```bash
# Create pods on different nodes
# (replace worker-01 / worker-02 with node names from `kubectl get nodes`)
kubectl run net-test-1 --image=nginx --overrides='{"spec":{"nodeName":"worker-01"}}'
kubectl run net-test-2 --image=nginx --overrides='{"spec":{"nodeName":"worker-02"}}'

# Wait for running
kubectl wait --for=condition=ready pod/net-test-1 pod/net-test-2 --timeout=60s

# Get IPs
POD1_IP=$(kubectl get pod net-test-1 -o jsonpath='{.status.podIP}')
POD2_IP=$(kubectl get pod net-test-2 -o jsonpath='{.status.podIP}')

# Test cross-node connectivity
kubectl exec net-test-1 -- curl -s --connect-timeout 5 "$POD2_IP:80"
kubectl exec net-test-2 -- curl -s --connect-timeout 5 "$POD1_IP:80"

# Cleanup
kubectl delete pod net-test-1 net-test-2
```

Without documentation, identify all plugins in your cluster:

```bash
# 1. Find container runtime
kubectl get nodes -o wide | awk '{print $NF}'

# 2. Find CNI plugin
ls /etc/cni/net.d/
kubectl get pods -n kube-system | grep -E "calico|cilium|flannel|weave"

# 3. Find CSI drivers
kubectl get csidrivers
kubectl get sc

# Write down what you found - this is exam knowledge!
```

Module 1.3: Helm - Package management for Kubernetes, deploying applications with charts.