Skip to content

Module 3.3: Linux Kernel and OS Hardening

Hands-On Lab Available
K8s Cluster advanced 35 min
Launch Lab ↗

Opens in Killercoda in a new tab

Complexity: [MEDIUM] - System administration with security focus

Time to Complete: 40-45 minutes

Prerequisites: Basic Linux administration, container runtime knowledge


After completing this module, you will be able to:

  1. Configure sysctl parameters to harden kernel settings on Kubernetes nodes
  2. Implement host-level protections including minimal OS images and automatic updates
  3. Audit node configurations for dangerous kernel parameters and missing hardening
  4. Evaluate container runtime isolation options to reduce kernel attack surface

Containers share the host kernel. A vulnerability in the kernel can compromise all containers on that host. Hardening the host OS and kernel settings reduces the attack surface that containers can exploit.

CKS tests your understanding of OS-level security measures.


┌─────────────────────────────────────────────────────────────┐
│ CONTAINER KERNEL SHARING │
├─────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ Container A │ │ Container B │ │ Container C │ │
│ │ (nginx) │ │ (redis) │ │ (attacker?) │ │
│ └─────┬───────┘ └─────┬───────┘ └─────┬───────┘ │
│ │ │ │ │
│ └────────────────┼────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ HOST KERNEL │ │
│ │ │ │
│ │ All containers use the SAME kernel │ │
│ │ Kernel exploit = compromise ALL containers │ │
│ │ Privilege escalation = host access │ │
│ │ │ │
│ └──────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ HARDWARE │ │
│ └──────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘

Stop and think: All containers on a node share the same Linux kernel. If a container exploits a kernel vulnerability and gains root on the host, what happens to every other container on that node? How is this fundamentally different from virtual machines?

Terminal window
# List installed packages
dpkg -l | wc -l # Debian/Ubuntu
rpm -qa | wc -l # RHEL/CentOS
# Remove unnecessary packages
sudo apt remove --purge $(dpkg -l | grep -E 'games|office' | awk '{print $2}')
# Kubernetes nodes should have minimal software:
# - Container runtime (containerd, CRI-O)
# - kubelet
# - kube-proxy (if not running as pod)
# - Networking tools
# - Monitoring agents
Terminal window
# List running services
systemctl list-units --type=service --state=running
# Disable unnecessary services
sudo systemctl disable --now cups # Printing
sudo systemctl disable --now avahi-daemon # mDNS
sudo systemctl disable --now bluetooth # Bluetooth
# Essential services to keep:
# - containerd or docker
# - kubelet
# - SSH (for management)
# - NTP/chrony (time sync)
Terminal window
# Check for security updates (Ubuntu)
sudo apt update
sudo apt list --upgradable | grep -i security
# Apply security updates only
sudo unattended-upgrades
# Check kernel version
uname -r
# Check for known kernel CVEs
# https://www.kernel.org/

Terminal window
# View current settings
sysctl -a | grep -E "net.ipv4|kernel" | head -20
# Recommended security settings
# Add to /etc/sysctl.d/99-kubernetes-security.conf
# Disable IP forwarding if not needed (kubelets need it!)
# net.ipv4.ip_forward = 0 # Don't disable on K8s nodes!
# Ignore ICMP redirects
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.default.accept_redirects = 0
net.ipv4.conf.all.send_redirects = 0
# Enable SYN flood protection
net.ipv4.tcp_syncookies = 1
# Log suspicious packets
net.ipv4.conf.all.log_martians = 1
# Ignore broadcast ping
net.ipv4.icmp_echo_ignore_broadcasts = 1
# Disable source routing
net.ipv4.conf.all.accept_source_route = 0
# Enable ASLR
kernel.randomize_va_space = 2
# Restrict dmesg access
kernel.dmesg_restrict = 1
# Restrict kernel pointers
kernel.kptr_restrict = 1
# Apply settings
sudo sysctl -p /etc/sysctl.d/99-kubernetes-security.conf
/etc/sysctl.d/99-kubernetes.conf
# Required for Kubernetes networking
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
# For pod networking
net.ipv4.conf.all.forwarding = 1
# Connection tracking for services
net.netfilter.nf_conntrack_max = 131072

Pause and predict: You set net.ipv4.ip_forward = 0 on a Kubernetes worker node to harden it. What breaks immediately, and why is this sysctl setting required for Kubernetes to function?

Terminal window
# Restrict access to process information
# In /etc/fstab or mount options:
proc /proc proc defaults,hidepid=2 0 0
# hidepid options:
# 0 = default (all users can see all processes)
# 1 = users can see their own processes only
# 2 = users can't see other users' processes
# For containers, Kubernetes manages these mounts
# But host should restrict access

/etc/ssh/sshd_config
PermitRootLogin no
PasswordAuthentication no
AllowUsers admin
# Restart SSH
sudo systemctl restart sshd
Terminal window
# List users that can login
grep -v '/nologin\|/false' /etc/passwd
# Lock unnecessary accounts
sudo usermod -L olduser
# Remove user
sudo userdel -r unnecessaryuser
Terminal window
# Secure kubelet files
sudo chmod 600 /var/lib/kubelet/config.yaml
sudo chmod 600 /etc/kubernetes/kubelet.conf
sudo chown root:root /var/lib/kubelet/config.yaml
# Secure certificates
sudo chmod 600 /etc/kubernetes/pki/*.key
sudo chmod 644 /etc/kubernetes/pki/*.crt

Terminal window
# /etc/fstab entries with security options
# Separate partitions for:
# /var/lib/containerd - Container data
# /var/log - Logs
# /tmp - Temporary files
# Example secure mount options:
/dev/sda3 /var/lib/containerd ext4 defaults,nodev,nosuid 0 2
/dev/sda4 /tmp ext4 defaults,nodev,nosuid,noexec 0 2
/dev/sda5 /var/log ext4 defaults,nodev,nosuid,noexec 0 2
# Options:
# nodev - No device files
# nosuid - No setuid executables
# noexec - No execution
Terminal window
# For immutable infrastructure:
# Mount root as read-only, use overlay for writes
# This is often handled by the OS distribution
# (CoreOS, Flatcar, Talos, etc.)

/etc/containerd/config.toml
[plugins."io.containerd.grpc.v1.cri"]
# Disable privileged containers (if possible)
# Note: Some system pods need privileges
[plugins."io.containerd.grpc.v1.cri".containerd]
# Use native (not kata) for performance
default_runtime_name = "runc"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
runtime_type = "io.containerd.runc.v2"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
# Enable seccomp
SystemdCgroup = true
/etc/docker/daemon.json
{
"userns-remap": "default",
"no-new-privileges": true,
"seccomp-profile": "/etc/docker/seccomp.json",
"icc": false,
"live-restore": true
}

Pause and predict: You SSH to a Kubernetes worker node and run dpkg -l | wc -l — it shows 847 packages installed. A container-optimized OS like Talos would have fewer than 50. How many of those 847 packages are potential attack surface for an attacker who escapes a container?

Terminal window
# Install auditd
sudo apt install auditd
# Configure audit rules for container security
# /etc/audit/rules.d/docker.rules
# Monitor Docker daemon
-w /usr/bin/dockerd -k docker
-w /usr/bin/containerd -k containerd
# Monitor Docker config
-w /etc/docker -k docker-config
# Monitor container directories
-w /var/lib/docker -k docker-storage
-w /var/lib/containerd -k containerd-storage
# Monitor Kubernetes files
-w /etc/kubernetes -k kubernetes
-w /var/lib/kubelet -k kubelet
# Apply rules
sudo augenrules --load

Terminal window
# Check sysctl settings
sysctl net.ipv4.ip_forward
sysctl kernel.randomize_va_space
# Check for unnecessary services
systemctl list-units --type=service --state=running | wc -l
# Check SSH configuration
grep -E "PermitRootLogin|PasswordAuthentication" /etc/ssh/sshd_config
Terminal window
# Create hardening config
sudo tee /etc/sysctl.d/99-cks-hardening.conf << 'EOF'
kernel.dmesg_restrict = 1
kernel.kptr_restrict = 2
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.default.accept_redirects = 0
EOF
# Apply
sudo sysctl -p /etc/sysctl.d/99-cks-hardening.conf
# Verify
sysctl kernel.dmesg_restrict
Terminal window
# Fix kubelet config permissions
sudo chmod 600 /var/lib/kubelet/config.yaml
sudo chown root:root /var/lib/kubelet/config.yaml
# Verify
ls -la /var/lib/kubelet/config.yaml

AreaRecommendationCommand/Config
PackagesMinimal installRemove unused packages
ServicesDisable unusedsystemctl disable <service>
UpdatesRegular patchingapt update && apt upgrade
SSHNo root, key only/etc/ssh/sshd_config
sysctlRestrictive settings/etc/sysctl.d/*.conf
FilesProper permissionschmod 600 for sensitive files
AuditEnable auditd/etc/audit/rules.d/

  • Container-optimized OSes like Flatcar, Talos, and Bottlerocket are purpose-built for running containers with minimal attack surface.

  • ASLR (Address Space Layout Randomization) makes buffer overflow attacks much harder. It’s enabled by default on modern Linux.

  • User namespaces can provide additional isolation by mapping container root to unprivileged host user. Enable with userns-remap in Docker.

  • Kernel live patching (Ubuntu Livepatch, RHEL kpatch) allows applying security patches without rebooting.


MistakeWhy It HurtsSolution
Not patching regularlyKnown CVEs remainAutomated updates
Default SSH configRoot login, passwordsHarden sshd_config
Too many servicesIncreased attack surfaceMinimize services
Wrong file permissionsUnauthorized accessAudit permissions
No audit loggingCan’t investigate incidentsEnable auditd

  1. An attacker escapes a container and lands on the host node. They run dmesg to understand the kernel version and find exploit targets. Then they check /proc/kallsyms for kernel function addresses. What two sysctl parameters would have blocked this reconnaissance, and what values should they be set to?

    Answer `kernel.dmesg_restrict = 1` prevents unprivileged users from reading kernel ring buffer messages (dmesg), which reveal kernel version, loaded modules, and hardware information useful for exploit selection. `kernel.kptr_restrict = 1` (or 2) hides kernel pointer addresses in `/proc/kallsyms`, preventing attackers from finding the memory addresses they need for kernel exploits like return-oriented programming. Without these, a container escape gives the attacker a roadmap to further exploitation. Set both in `/etc/sysctl.d/99-security.conf` and apply with `sysctl -p`.
  2. During a security audit, you find that a Kubernetes worker node has 847 packages installed, including gcc, make, python3, and netcat. The node’s sole purpose is running container workloads. The admin says “we might need debugging tools someday.” What’s the security argument against this, and what’s the minimum set of packages needed?

    Answer Every installed package is attack surface: `gcc` and `make` let an attacker compile exploits on the node, `python3` enables script-based attacks, and `netcat` enables reverse shells and data exfiltration. If an attacker escapes a container, these tools turn a difficult privilege escalation into trivial exploitation. A Kubernetes worker node needs only: container runtime (containerd), kubelet, kube-proxy (if not a pod), networking tools for the CNI, and basic utilities. Container-optimized OSes like Talos, Flatcar, and Bottlerocket have fewer than 50 packages by design. For debugging, use ephemeral debug containers (`kubectl debug`) or temporary privileged pods rather than leaving tools on the host permanently.
  3. Your organization’s compliance team requires that all Kubernetes nodes disable IP forwarding (net.ipv4.ip_forward = 0) for security hardening. Your cluster stops working immediately. Explain why, and how you satisfy both the compliance requirement and Kubernetes’s networking needs.

    Answer Kubernetes requires `net.ipv4.ip_forward = 1` because pod-to-pod and pod-to-service networking depends on the node forwarding IP packets between network interfaces (pod veth interfaces, bridge interfaces, and physical interfaces). Disabling it breaks all cross-node pod communication and service routing. To satisfy compliance: explain that this specific sysctl is a functional requirement for Kubernetes networking, not a security weakness. The "no IP forwarding" rule exists for general-purpose servers where forwarding could create unintended routes. In Kubernetes, forwarding is controlled by CNI and iptables/nftables rules. Compensating controls include NetworkPolicies, firewalling the node's external interface, and restricting `net.ipv4.conf.all.accept_redirects = 0` and `net.ipv4.conf.all.accept_source_route = 0`.
  4. You run systemctl list-units --type=service --state=running on a worker node and find 27 running services. Beyond kubelet and containerd, you see avahi-daemon, cups, bluetooth, and rpcbind. A security scanner flags these as unnecessary. What’s the risk, and what’s the safe order for disabling them?

    Answer Each unnecessary service is a potential attack vector: avahi-daemon exposes mDNS (network discovery), cups enables network printing, bluetooth opens wireless attack surface, and rpcbind enables remote procedure calls -- none are needed for Kubernetes. The risk is that vulnerabilities in these services (like the 2024 CUPS remote code execution CVE) can be exploited even without container escapes if the services listen on network interfaces. Safe removal order: (1) Check for dependencies first with `systemctl list-dependencies `. (2) Disable in non-critical order: bluetooth first, then avahi-daemon, cups, rpcbind. (3) Verify kubelet and containerd still run after each disable. (4) Test pod scheduling. Use `systemctl disable --now ` to stop and prevent restart.

Task: Demonstrate kernel hardening concepts using Kubernetes security contexts.

Terminal window
# Since we can't modify the host kernel from Kubernetes, we'll demonstrate
# how kernel hardening concepts translate to container security.
# Step 1: Create namespace for testing
kubectl create namespace kernel-test
# Step 2: Deploy pod WITHOUT security hardening (insecure baseline)
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
name: insecure-pod
namespace: kernel-test
spec:
containers:
- name: app
image: busybox
command: ["sleep", "3600"]
# No security context = dangerous!
EOF
# Step 3: Deploy pod WITH security hardening (secure)
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
name: secure-pod
namespace: kernel-test
spec:
securityContext:
runAsNonRoot: true
runAsUser: 1000
runAsGroup: 1000
fsGroup: 1000
seccompProfile:
type: RuntimeDefault
containers:
- name: app
image: busybox
command: ["sleep", "3600"]
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
capabilities:
drop:
- ALL
EOF
# Wait for pods to be ready
kubectl wait --for=condition=Ready pod/insecure-pod -n kernel-test --timeout=60s
kubectl wait --for=condition=Ready pod/secure-pod -n kernel-test --timeout=60s
# Step 4: Compare what each pod can do
echo "=== Insecure Pod: Who am I? ==="
kubectl exec -n kernel-test insecure-pod -- id
echo "=== Secure Pod: Who am I? ==="
kubectl exec -n kernel-test secure-pod -- id
# Step 5: Test filesystem access
echo "=== Insecure Pod: Can write to /tmp? ==="
kubectl exec -n kernel-test insecure-pod -- sh -c "echo 'test' > /tmp/test.txt && echo 'Write succeeded' || echo 'Write failed'"
echo "=== Secure Pod: Can write to /tmp? (should fail - readOnlyRootFilesystem) ==="
kubectl exec -n kernel-test secure-pod -- sh -c "echo 'test' > /tmp/test.txt && echo 'Write succeeded' || echo 'Write blocked!'"
# Step 6: Test process visibility (demonstrates hidepid concept)
echo "=== Insecure Pod: Process list ==="
kubectl exec -n kernel-test insecure-pod -- ps aux | head -5
echo "=== Secure Pod: Process list (limited view) ==="
kubectl exec -n kernel-test secure-pod -- ps aux | head -5
# Step 7: Check proc access
echo "=== Checking /proc access in secure pod ==="
kubectl exec -n kernel-test secure-pod -- cat /proc/1/cmdline 2>&1 | tr '\0' ' ' && echo ""
# Step 8: Check security context applied
echo "=== Security Context Comparison ==="
echo "Insecure pod security context:"
kubectl get pod insecure-pod -n kernel-test -o jsonpath='{.spec.securityContext}' && echo ""
echo "Secure pod security context:"
kubectl get pod secure-pod -n kernel-test -o jsonpath='{.spec.securityContext}' && echo ""
# Step 9: Test privilege escalation (critical kernel hardening)
echo "=== Testing Privilege Escalation Prevention ==="
# Secure pod should block this
kubectl exec -n kernel-test secure-pod -- sh -c "cat /etc/shadow 2>&1" || echo "Access denied (expected)"
# Step 10: Check host sysctl (if running on actual node)
echo ""
echo "=== Host Kernel Checks (run on actual node) ==="
echo "To check on your actual cluster nodes:"
echo " sysctl kernel.randomize_va_space # Should be 2"
echo " sysctl kernel.dmesg_restrict # Should be 1"
echo " sysctl kernel.kptr_restrict # Should be 1 or 2"
echo " sysctl net.ipv4.conf.all.accept_redirects # Should be 0"
# Cleanup
echo ""
echo "=== Cleanup ==="
kubectl delete namespace kernel-test
echo ""
echo "=== Exercise Complete ==="
echo "Key learnings demonstrated:"
echo "1. ✓ runAsNonRoot prevents root execution"
echo "2. ✓ readOnlyRootFilesystem blocks writes"
echo "3. ✓ Dropping capabilities limits syscalls"
echo "4. ✓ allowPrivilegeEscalation=false prevents escalation"
echo "5. ✓ seccompProfile applies syscall filtering"

Success criteria: Deploy insecure and secure pods, observe the security differences, understand how pod security contexts implement kernel hardening concepts.


OS Hardening Principles:

  • Minimize packages and services
  • Keep system updated
  • Restrict SSH access
  • Proper file permissions

Kernel Hardening:

  • Enable ASLR (kernel.randomize_va_space = 2)
  • Restrict dmesg (kernel.dmesg_restrict = 1)
  • Disable dangerous network features

Container Runtime:

  • Enable security features
  • Use seccomp profiles
  • Consider user namespaces

Exam Tips:

  • Know key sysctl parameters
  • Understand file permission requirements
  • Be able to check current configuration

Module 3.4: Network Security - Host-level network hardening.