Module 2.3: Immutable OS for Kubernetes
Complexity: [MEDIUM] | Time: 45 minutes
Prerequisites: Module 2.2: OS Provisioning & PXE Boot, K8s Distributions
What You’ll Be Able to Do
After completing this module, you will be able to:
- Evaluate immutable OS options (Talos, Flatcar, Bottlerocket) against traditional Linux distributions for Kubernetes node hosting
- Implement an immutable OS deployment pipeline that produces identical, reproducible node images
- Design an image update strategy with atomic rollouts and rollback capabilities across bare-metal fleets
- Diagnose configuration drift issues in mutable environments and implement immutable alternatives that prevent recurrence
Why This Module Matters
In May 2023, a manufacturing company running Kubernetes on Ubuntu 22.04 across 80 bare metal nodes discovered that 23 of their nodes had diverged from the expected state. Over 14 months, engineers had SSH’d into nodes to debug issues and made “temporary” changes: installing tcpdump here, modifying a sysctl there, adding a cron job on another. Some nodes had different kernel versions because an engineer had manually run apt upgrade on a subset. Three nodes had leftover debugging containers that consumed 4GB of RAM each. Two nodes had modified iptables rules that broke pod networking for certain CIDR ranges.
The platform team spent 3 weeks auditing all 80 nodes, found 47 undocumented changes, and rebuilt 23 nodes from scratch. The postmortem identified the root cause: mutable infrastructure. When any engineer can SSH into a node and change anything, configuration drift is inevitable. It is not a question of discipline — it is a property of the system.
Immutable operating systems solve this by making the root filesystem read-only. You cannot SSH in and apt install something. You cannot edit /etc/sysctl.conf. You cannot add cron jobs. The entire OS is a single image that is deployed atomically and replaced atomically. If you need a change, you build a new image and roll it out.
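To make the drift problem concrete, here is a minimal sketch of the kind of comparison an audit like the one above has to perform: a node snapshot checked against a golden baseline. The two file arguments and the one-entry-per-line `name=version` format are assumptions for illustration, not details from the incident.

```shell
#!/bin/sh
# Sketch: report entries that differ between a golden baseline and a node snapshot.
# Both files are assumed to hold one "name=version" entry per line (hypothetical format).
set -eu

drift_report() {
    baseline=$1
    snapshot=$2
    sorted_base=$(mktemp)
    sorted_snap=$(mktemp)
    sort "$baseline" > "$sorted_base"
    sort "$snapshot" > "$sorted_snap"
    # comm -13 keeps lines found only in the snapshot (added or changed on the node).
    comm -13 "$sorted_base" "$sorted_snap" | sed 's/^/unexpected: /'
    # comm -23 keeps lines found only in the baseline (removed or changed on the node).
    comm -23 "$sorted_base" "$sorted_snap" | sed 's/^/missing: /'
    rm -f "$sorted_base" "$sorted_snap"
}
```

On a mutable fleet this check has to run on every node, repeatedly. With an immutable image the snapshot is the image itself, so the comparison happens once at build time.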
The Printer Cartridge Analogy
A mutable OS is like a refillable ink cartridge — you can add more ink, change the color, clean the nozzle, but eventually it gets messy and inconsistent. An immutable OS is like a sealed cartridge — when it runs out, you replace the entire unit. It is always in a known state. You never debug “why is this cartridge printing blue instead of black” because every cartridge from the factory is identical.
What You’ll Learn
- Why immutable OS matters more on bare metal than in cloud
- Talos Linux: Kubernetes-native, API-driven, no SSH
- Flatcar Container Linux: CoreOS successor, systemd-based
- Red Hat CoreOS (RHCOS): OpenShift’s immutable foundation
- How to choose between them
- Upgrade strategies for immutable OS on bare metal
Why Immutable Matters More on Bare Metal
In the cloud, you can terminate and recreate instances. Configuration drift is less dangerous because you can destroy the drifted instance and launch a fresh one in minutes. Furthermore, the cloud hypervisor provides a strict security boundary. On bare metal, the OS is the host with no hypervisor isolation, and hardware lifetime spans 3-5 years, allowing much more time for configuration drift to accumulate. Reprovisioning bare metal means PXE booting, waiting for OS installation, and rejoining the cluster, a 15-30 minute process that requires network infrastructure (DHCP, TFTP).
Immutable OS on bare metal gives you:
    ┌─────────────────────────────────────────────────────────────┐
    │             MUTABLE vs IMMUTABLE ON BARE METAL              │
    │                                                             │
    │  MUTABLE (Ubuntu, RHEL)      IMMUTABLE (Talos, Flatcar)     │
    │  ────────────────────        ──────────────────────────     │
    │  ✗ SSH access = drift risk   ✓ No SSH (Talos) or limited    │
    │  ✗ apt/yum = untracked pkgs  ✓ Read-only rootfs             │
    │  ✗ Config files editable     ✓ Config via API/ignition      │
    │  ✗ Manual kernel updates     ✓ Atomic OS upgrades           │
    │  ✗ Node identity unclear     ✓ Node = disposable image      │
    │  ✗ Security: large attack    ✓ Minimal surface (no pkg      │
    │    surface (systemd, cron,     manager, no shell, no        │
    │    sshd, package manager)      cron, minimal userspace)     │
    │                                                             │
    │  On cloud: you can destroy and recreate easily              │
    │  On bare metal: reprovisioning is slow and expensive        │
    │  → Immutable prevents the need to reprovision               │
    │                                                             │
    └─────────────────────────────────────────────────────────────┘

Talos Linux
Talos Linux is purpose-built for Kubernetes. There is no SSH, no shell, no package manager. The entire OS is managed via a gRPC API.
Architecture
    ┌─────────────────────────────────────────────────────────────┐
    │                         TALOS LINUX                         │
    │                                                             │
    │  ┌──────────────────────────────────────────────────────┐   │
    │  │ Kernel (Linux 6.x, minimal config)                   │   │
    │  │ └── Only modules needed for K8s + hardware drivers   │   │
    │  └──────────────────────────────────────────────────────┘   │
    │  ┌──────────────────────────────────────────────────────┐   │
    │  │ machined (PID 1, replaces systemd)                   │   │
    │  │ ├── apid (gRPC API for management)                   │   │
    │  │ ├── trustd (certificate management)                  │   │
    │  │ ├── networkd (network configuration)                 │   │
    │  │ └── containerd (container runtime)                   │   │
    │  │     ├── kubelet                                      │   │
    │  │     ├── etcd (control plane nodes)                   │   │
    │  │     └── kube-apiserver, etc.                         │   │
    │  └──────────────────────────────────────────────────────┘   │
    │                                                             │
    │  No SSH. No shell. No package manager. No cron.             │
    │  Everything is managed via talosctl (CLI) or API.           │
    │                                                             │
    │  Root filesystem: SquashFS (read-only, compressed)          │
    │  Ephemeral data: /var (writable, reset on upgrade)          │
    │  Persistent data: /system/state (survives upgrades)         │
    │                                                             │
    └─────────────────────────────────────────────────────────────┘

Stop and think: An engineer on your team says “I need SSH access to debug networking issues on the nodes.” With Talos Linux, SSH does not exist. Before reading the operations below, think about how you would debug a networking problem on a node with no shell access. What tools or approaches would you use?
Talos Key Operations
Every operation below is performed through the Talos gRPC API via talosctl, not SSH. This is a fundamental paradigm shift: instead of connecting to a node and running commands, you send API requests from your workstation. The API enforces what operations are allowed, creating an auditable, reproducible management model. Because Talos lacks SSH, shell access, and Python, it cannot be managed by SSH-based configuration management tools like Ansible. Instead, for deep debugging, you must use API-native tools (like talosctl pcap for packet capture) or deploy ephemeral debugging containers via kubectl debug node:
    # Install talosctl
    curl -sL https://talos.dev/install | sh

    # Generate cluster configuration
    talosctl gen config my-cluster https://10.0.1.10:6443 \
      --output _out

    # Apply config to a node (via API, not SSH)
    # The --insecure flag is required on first boot because the node
    # hasn't generated its mTLS certificates yet.
    talosctl apply-config --insecure \
      --nodes 10.0.1.10 \
      --file _out/controlplane.yaml

    # Bootstrap the cluster (first control plane node)
    talosctl bootstrap --nodes 10.0.1.10

    # Get kubeconfig
    talosctl kubeconfig --nodes 10.0.1.10

    # Check node health
    talosctl health --nodes 10.0.1.10

    # View system logs (no SSH needed)
    talosctl logs kubelet --nodes 10.0.1.10

    # Capture network packets (since tcpdump cannot be installed)
    talosctl pcap --nodes 10.0.1.10 --interface eth0 --output capture.pcap

    # Upgrade Talos OS (atomic, in-place)
    talosctl upgrade --nodes 10.0.1.10 \
      --image ghcr.io/siderolabs/installer:v1.9.0

    # Upgrade Kubernetes version
    talosctl upgrade-k8s --nodes 10.0.1.10 \
      --to 1.35.0

Talos Machine Configuration
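The controlplane.yaml in this section is a full machine configuration. Later changes are usually made as small patches sent through the API, not by editing a file on the node. The following is a minimal sketch, assuming a hypothetical node address (10.0.1.21) and an arbitrary patch file name. It also assumes `talosctl patch machineconfig` accepts `--patch @file`, which is worth checking against your talosctl version.

```shell
#!/bin/sh
set -eu

# Write a strategic-merge patch that sets one sysctl (the file name is illustrative).
write_sysctl_patch() {
    cat > "$1" <<'EOF'
machine:
  sysctls:
    net.core.somaxconn: "65535"
EOF
}

# Send the patch through the Talos API; nothing is edited by hand on the node.
apply_patch() {
    node=$1
    patch_file=$2
    talosctl patch machineconfig --nodes "$node" --patch "@$patch_file"
}

# Example (not executed here because it needs a live node):
#   write_sysctl_patch sysctl-patch.yaml
#   apply_patch 10.0.1.21 sysctl-patch.yaml
```

Keeping the patch file in Git gives every sysctl change a reviewable history, which is the opposite of the undocumented edits that cause drift.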
    # controlplane.yaml (abbreviated)
    version: v1alpha1
    machine:
      type: controlplane
      token: <generated>
      ca:
        crt: <generated>
        key: <generated>
      network:
        hostname: cp-01
        interfaces:
          - interface: eth0
            addresses:
              - 10.0.1.10/24
            routes:
              - network: 0.0.0.0/0
                gateway: 10.0.1.1
      install:
        disk: /dev/sda
        image: ghcr.io/siderolabs/installer:v1.9.0
      kubelet:
        extraArgs:
          rotate-server-certificates: "true"
      sysctls:
        net.core.somaxconn: "65535"
        net.ipv4.ip_forward: "1"
    cluster:
      controlPlane:
        endpoint: https://10.0.1.10:6443
      clusterName: my-cluster
      network:
        podSubnets:
          - 10.244.0.0/16
        serviceSubnets:
          - 10.96.0.0/12
        cni:
          # Talos accepts flannel, custom, or none here.
          # To run Cilium, set none and install Cilium separately.
          name: none

Flatcar Container Linux
Flatcar is the community successor to CoreOS Container Linux. It uses systemd and Ignition for configuration, and provides a familiar Linux environment with an immutable root filesystem.
Pause and predict: Your team is split: half want Talos (maximum security, no SSH) and half want Flatcar (familiar Linux, SSH available). The team has 2 engineers with deep Linux experience and 3 with mostly Kubernetes experience. Which would you recommend as a starting point, and what would your migration path look like?
Flatcar vs Talos
| Feature | Talos Linux | Flatcar Container Linux |
|---|---|---|
| Shell access | None (no SSH, no shell) | SSH available (optional) |
| Init system | machined (custom) | systemd |
| Configuration | Machine config (YAML) | Ignition (JSON) |
| Package manager | None | None (but can run toolbox) |
| Config Management | Incompatible with Ansible (no SSH/Python) | Partially compatible with Ansible (SSH + toolbox Python) |
| Root filesystem | SquashFS (read-only) | dm-verity (read-only) |
| Update mechanism | talosctl upgrade | Nebraska/Omaha (auto-update) |
| K8s integration | Built-in (machined runs kubelet) | External (kubeadm, etc.) |
| Debugging | talosctl logs, talosctl dashboard | SSH + journalctl |
| Learning curve | Higher (new paradigm) | Lower (familiar Linux) |
| Best for | Maximum security, zero-touch ops | Teams that need occasional SSH |
Flatcar Ignition Configuration
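Two details of the Ignition document in this section are easy to misread. The `mode` value 420 is the decimal form of octal 0644, because JSON has no octal literals. File contents are passed as percent-encoded `data:` URLs. The helper names below are hypothetical, and the sketch only shows how such values can be derived.

```shell
#!/bin/sh
set -eu

# Ignition stores file modes as decimal integers, so octal 0644 is written as 420.
mode_to_decimal() {
    printf '%d\n' "$1"
}

# Percent-encode every byte of a string so it can follow "data:," in a source field.
# Encoding all bytes is more than strictly required, but the result is always valid.
to_data_url() {
    encoded=$(printf '%s' "$1" | od -An -tx1 | tr -d ' \n' | sed 's/../%&/g')
    printf 'data:,%s\n' "$encoded"
}
```

For example, `mode_to_decimal 0644` prints 420. In practice many teams write Butane YAML and let it generate the Ignition JSON, which avoids hand-encoding these fields.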
    {
      "ignition": { "version": "3.3.0" },
      "storage": {
        "files": [
          {
            "path": "/etc/hostname",
            "contents": { "source": "data:,k8s-worker-01" },
            "mode": 420
          },
          {
            "path": "/etc/sysctl.d/k8s.conf",
            "contents": { "source": "data:,net.bridge.bridge-nf-call-iptables%3D1%0Anet.ipv4.ip_forward%3D1" },
            "mode": 420
          }
        ]
      },
      "systemd": {
        "units": [
          { "name": "containerd.service", "enabled": true },
          {
            "name": "kubelet.service",
            "enabled": true,
            "contents": "[Unit]\nDescription=kubelet\nAfter=containerd.service\n[Service]\nExecStart=/opt/bin/kubelet ...\nRestart=always\n[Install]\nWantedBy=multi-user.target"
          }
        ]
      },
      "passwd": {
        "users": [
          {
            "name": "core",
            "sshAuthorizedKeys": ["ssh-ed25519 AAAA... admin@kubedojo"]
          }
        ]
      }
    }

Red Hat CoreOS (RHCOS)
RHCOS is the immutable OS for OpenShift. It is not used standalone; it is tightly coupled with the OpenShift Machine Config Operator (MCO).
| Aspect | Details |
|---|---|
| Base | RHEL kernel + rpm-ostree |
| Management | OpenShift MCO (MachineConfig CRDs) |
| Updates | Coordinated with OpenShift cluster upgrades |
| Standalone use | Not supported (OpenShift only) |
| SSH | Available but discouraged (use oc debug node) |
If you are running OpenShift, RHCOS is your OS. If you are running vanilla Kubernetes, use Talos or Flatcar.
Pause and predict: A defense contractor needs an air-gapped Kubernetes cluster with Secure Boot, no SSH access, and FIPS-compliant cryptography. Which immutable OS would you recommend and why? What if they also need to run legacy applications that require custom kernel modules?
AWS Bottlerocket
Bottlerocket is an open-source, immutable Linux distribution built by Amazon Web Services, specifically designed for running containers. While heavily used in EKS, it can be deployed on bare metal and other environments.
Bottlerocket Architecture
- Root Filesystem: Read-only dm-verity device. Any block-level changes trigger a kernel panic and restart, ensuring strict immutability.
- Language: Most first-party OS components are written in Rust for memory safety.
- Updates: A/B partition scheme. The inactive partition receives the new image, and the system swaps partitions on reboot.
- Security: Enforces an always-on SELinux policy on its mutable filesystem.
- Configuration: API-driven via an agent. No interactive package manager exists; settings migrate automatically during atomic OS updates.
Choosing the Right Immutable OS
    ┌─────────────────────────────────────────────────────────────┐
    │                        DECISION TREE                        │
    │                                                             │
    │  Running OpenShift?                                         │
    │    └── Yes → RHCOS (no choice, it's built in)               │
    │    └── No → Continue                                        │
    │                                                             │
    │  Need maximum security (no SSH, minimal surface)?           │
    │    └── Yes → Talos Linux                                    │
    │    └── No → Continue                                        │
    │                                                             │
    │  Team comfortable with no shell access?                     │
    │    └── Yes → Talos Linux or Bottlerocket                    │
    │    └── No → Flatcar Container Linux                         │
    │                                                             │
    │  Need auto-update mechanism without Cluster API?            │
    │    └── Yes → Flatcar (Nebraska/Omaha updates)               │
    │    └── No → Either works                                    │
    │                                                             │
    │  Using Sidero/Metal3 for declarative provisioning?          │
    │    └── Yes → Talos (native Sidero integration)              │
    │    └── No → Either works                                    │
    │                                                             │
    └─────────────────────────────────────────────────────────────┘

Implementing an OS Deployment and Update Pipeline
To achieve true immutability at scale, you must implement a deployment pipeline that generates customized, reproducible images and an automated rollout strategy.
The Image Factory Pattern
Using a tool like the Talos Image Factory or a CI/CD pipeline building Flatcar Ignition configs, you can bake specific configurations into a reproducible image artifact:
- Version Control: Store OS requirements (trusted registries, extra kernel modules) in Git.
- Build Stage: A CI runner submits the schema to the image builder API (like Talos Image Factory or Packer for Bottlerocket/Flatcar).
- Artifact: The pipeline produces a stamped, immutable ISO or raw disk image with a unique SHA-256 hash.
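The SHA-256 stamp on the artifact is only useful if something checks it before the image reaches a node. The following is a minimal sketch of such a gate. It assumes GNU coreutils `sha256sum` on the CI runner, and the function name is hypothetical.

```shell
#!/bin/sh
set -eu

# Succeed only when the file's SHA-256 digest matches the expected value.
# Assumes GNU coreutils sha256sum is available on the build runner.
verify_image() {
    image=$1
    expected=$2
    actual=$(sha256sum "$image" | cut -d ' ' -f 1)
    if [ "$actual" = "$expected" ]; then
        echo "OK $image"
    else
        echo "MISMATCH $image: expected $expected, got $actual" >&2
        return 1
    fi
}
```

A pipeline would record the digest at build time and call `verify_image` again on the PXE or HTTP server that serves the image, so a corrupted or swapped artifact fails before any node boots it.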
Fleet Update Strategy & Rollbacks
Once images are built, upgrading bare-metal fleets requires a controlled strategy to prevent cluster-wide outages:
- A/B Partitioning: All major immutable OSs (Talos, Flatcar, Bottlerocket) use an A/B partition scheme. Updates are written to the inactive partition, and the node swaps partitions on reboot. If the new OS fails to boot, the bootloader automatically rolls back to the known-good A partition.
- Canary Rollouts: Tag a subset of worker nodes as the canary channel. Update these nodes first and monitor application health metrics before rolling out to the rest of the fleet.
- Coordinated Reboots: Use controllers like Flatcar’s Nebraska or Talos’s native API to sequence reboots. The controller cordons and drains a node, reboots it into the new image, and waits for it to become Ready before proceeding to the next node in the rack.
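The canary step above is essentially an ordering problem: canary nodes go first, everything else waits. The following is a minimal sketch of that ordering. The two-column `node channel` inventory format and the channel names are assumptions for illustration.

```shell
#!/bin/sh
set -eu

# Read "node channel" pairs from stdin and print node names in rollout order:
# every canary node first, then all remaining nodes in their original order.
rollout_order() {
    inventory=$(cat)
    printf '%s\n' "$inventory" | awk '$2 == "canary" { print $1 }'
    printf '%s\n' "$inventory" | awk 'NF >= 1 && $2 != "canary" { print $1 }'
}
```

The output can feed whatever performs the actual upgrade. The pause between the canary group and the rest, during which you watch application health metrics, is deliberately left out of the sketch.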
Did You Know?
- Talos Linux has no /bin/sh. There is literally no shell binary on the system. Even if an attacker gains code execution inside a container and escapes to the host, there is no shell to drop into. This is the most extreme form of attack surface reduction in any Linux distribution.
- Flatcar Container Linux was forked from CoreOS in 2018 when Red Hat acquired CoreOS and discontinued the community edition. Kinvolk (later acquired by Microsoft) maintained the fork. Today it is managed by the Flatcar community and Microsoft contributes regularly.
- Google’s GKE nodes run Container-Optimized OS (COS), Google’s own immutable Linux distribution. Amazon EKS uses Amazon Linux 2 or Bottlerocket (Amazon’s immutable OS). The hyperscalers all converged on immutable OS independently because the benefits are overwhelming at scale.
- The immutable OS concept dates back to ChromeOS (2011) and even earlier to Plan 9 from Bell Labs (1992). Kubernetes made it mainstream for servers because containers already provide the application isolation that a full OS traditionally provided.
Common Mistakes
| Mistake | Problem | Solution |
|---|---|---|
| Choosing Talos without team buy-in | Engineers panic when they cannot SSH | Start with Flatcar, graduate to Talos |
| Not testing upgrades | Atomic upgrade fails and node is bricked | Always test upgrades in staging first |
| Ignoring BIOS/firmware | Immutable OS does not update BIOS | Separate firmware management process |
| Custom kernel modules | Immutable OS has fixed kernel | Use Talos extensions or Flatcar’s custom image builder |
| Mixing mutable and immutable | Two operational models = double complexity | Commit to one approach per environment |
| No rollback plan | Bad OS image = all nodes broken | A/B partition scheme (both Talos and Flatcar support this) |
| Hardcoded IPs in config | Cannot scale or replace nodes | Use DHCP or dynamic config generation |
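The "dynamic config generation" fix in the last row can be as simple as generating one small patch per node from an inventory, so that the shared base config carries no node-specific values. The following is a minimal sketch. The `hostname address/prefix` inventory format, the output directory, and the `eth0` interface name are assumptions for illustration.

```shell
#!/bin/sh
set -eu

# Read "hostname address/prefix" pairs from stdin and write one patch file per node.
generate_node_patches() {
    out_dir=$1
    mkdir -p "$out_dir"
    while read -r host addr; do
        # Skip blank lines and comment lines in the inventory.
        case "$host" in ''|'#'*) continue ;; esac
        cat > "$out_dir/$host.yaml" <<EOF
machine:
  network:
    hostname: $host
    interfaces:
      - interface: eth0
        addresses:
          - $addr
EOF
    done
}
```

Each generated file can then be layered on top of the shared worker or control plane config when the node is configured, for example with a config-patch option if your talosctl version provides one.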
Question 1
An engineer needs to install tcpdump on a Talos Linux node to debug a networking issue. How do they do it?
Answer
They don’t install anything on the node. Talos has no package manager, no shell, and no SSH. Instead:
- Use talosctl to capture packets:

      talosctl pcap --nodes 10.0.1.50 --interface eth0 \
        --output capture.pcap

- Use a debug pod on the node:

      kubectl debug node/worker-01 -it --image=nicolaka/netshoot
      # Inside the debug pod:
      tcpdump -i eth0 -nn port 6443

- Use talosctl logs to examine network-related logs:

      talosctl logs networkd --nodes 10.0.1.50

The key insight: you never modify the node OS. You bring your debugging tools to the node via containers, or you use the management API.
Question 2
Your Talos cluster upgrade from v1.8.0 to v1.9.0 fails on the third control plane node. The node is stuck in a boot loop. What do you do?
Answer
Talos uses an A/B partition scheme for atomic upgrades:
- The old OS image is still on partition A. The upgrade wrote the new image to partition B. Since the boot failed, Talos’s bootloader will automatically fall back to partition A after a configurable number of boot failures.
- If automatic rollback does not trigger, use the BMC to power cycle the node. The A/B fallback should activate.
- If the node is truly bricked, use talosctl reset via the API (if the API is still reachable) to wipe and reprovision.
- If the API is unreachable, PXE boot the node with a fresh Talos image and apply the control plane config.

Prevention:
- Always upgrade one node at a time
- Verify each node is healthy before upgrading the next
- Test the upgrade in a non-production cluster first
- Ensure BMC access is working (your last resort for bricked nodes)
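The first two prevention rules can be encoded as a loop that upgrades strictly one node at a time and stops at the first failure. The following is a minimal sketch. The `UPGRADE_CMD` and `HEALTH_CMD` variables are hypothetical hooks added so the loop can be rehearsed without a cluster; their defaults reuse the `talosctl upgrade` and `talosctl health` commands shown earlier in this module.

```shell
#!/bin/sh
set -eu

# Commands are overridable so the loop can be dry-run without touching a cluster.
: "${UPGRADE_CMD:=talosctl upgrade --image ghcr.io/siderolabs/installer:v1.9.0 --nodes}"
: "${HEALTH_CMD:=talosctl health --nodes}"

# Upgrade nodes one at a time; stop as soon as an upgrade or health check fails.
upgrade_sequentially() {
    for node in "$@"; do
        echo "upgrading $node"
        if ! $UPGRADE_CMD "$node"; then
            echo "upgrade failed on $node, stopping" >&2
            return 1
        fi
        if ! $HEALTH_CMD "$node"; then
            echo "$node unhealthy after upgrade, stopping" >&2
            return 1
        fi
    done
    echo "all nodes upgraded"
}
```

Setting `UPGRADE_CMD=true HEALTH_CMD=true` turns the function into a dry run. Stopping on the first failure means at most one control plane node is ever in a bad state, which keeps etcd quorum intact in a three-node control plane.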
Question 3
A platform team is migrating from AWS EKS to an on-premises bare-metal cluster. A senior engineer argues: “We used mutable Ubuntu instances in AWS without major drift issues, so we can safely use mutable Ubuntu on bare metal too.” Based on the characteristics of bare-metal infrastructure, why is this reasoning flawed?
Answer
The reasoning ignores three fundamental differences between cloud VMs and bare-metal servers:
- Reprovisioning cost: In AWS, replacing a drifted Ubuntu VM takes seconds. On bare metal, reprovisioning requires a slow PXE boot, OS installation, and cluster rejoin (15-30 minutes), making drift much more expensive to fix.
- Hardware lifetime: A bare metal server typically runs for 3-5 years without replacement, allowing much more time for configuration drift to accumulate compared to ephemeral cloud VMs.
- No hypervisor isolation: In the cloud, the hypervisor isolates the VM. On bare metal, the OS is the host. A compromised mutable OS presents a significantly larger attack surface for the entire infrastructure.
Question 4
Your organization uses Ansible for server management. Is an immutable OS compatible with Ansible?
Answer
Partially, depending on the OS:
- Talos Linux: Not compatible with Ansible. Talos has no SSH, no Python interpreter, and no shell, so Ansible cannot connect to or execute on Talos nodes. All management is done via talosctl or the Talos gRPC API. You would use Ansible only to run talosctl commands from a management node.
- Flatcar Container Linux: Partially compatible. Flatcar supports SSH and has Python available via the toolbox container. Ansible can connect and gather facts, but the read-only root filesystem means you cannot use Ansible to install packages or modify system files in the traditional way. You can manage containerized services and configuration files in writable areas (/etc, /opt).
- Transition strategy: Use Ansible to manage your PXE/provisioning infrastructure and to orchestrate talosctl commands. Treat nodes as cattle (replace, don’t repair) rather than pets (configure in place).
Hands-On Exercise: Deploy Talos Linux in Docker
Task: Create a minimal Talos Linux cluster using Docker (no bare metal needed).
    # Install talosctl
    curl -sL https://talos.dev/install | sh

    # Create a local Talos cluster (3 CP + 1 worker in Docker)
    talosctl cluster create \
      --name demo \
      --controlplanes 3 \
      --workers 1

    # The cluster create command automatically configures ~/.kube/config.
    # Verify the cluster endpoints and capture the control plane IP:
    talosctl cluster show --name demo
    export CP_IP=$(docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' demo-controlplane-1)

    # Verify the cluster
    kubectl get nodes
    # NAME                  STATUS   ROLES           AGE   VERSION
    # demo-controlplane-1   Ready    control-plane   2m    v1.35.0
    # demo-controlplane-2   Ready    control-plane   2m    v1.35.0
    # demo-controlplane-3   Ready    control-plane   2m    v1.35.0
    # demo-worker-1         Ready    <none>          2m    v1.35.0

    # Try to SSH into a node (this will fail: no SSH on Talos)
    ssh -o BatchMode=yes -o ConnectTimeout=5 root@$CP_IP
    # Connection refused

    # Instead, use talosctl for management
    talosctl dashboard --nodes $CP_IP
    # Shows real-time node metrics, logs, services

    # View system services
    talosctl services --nodes $CP_IP

    # Cleanup
    talosctl cluster destroy --name demo

Success Criteria
- Talos cluster created with 3 CP + 1 worker
- kubeconfig retrieved and kubectl works
- SSH connection attempt fails (as expected)
- talosctl dashboard shows node status
- talosctl services lists the node’s system services
- Cluster destroyed cleanly
Next Module
Continue to Module 2.4: Declarative Bare Metal to learn how Sidero and Metal3 bring Cluster API to bare metal, enabling GitOps-driven hardware lifecycle management.