Module 1.1: Infrastructure as Code
Complexity: [MEDIUM] - Foundational concept
Time to Complete: 30-35 minutes
Prerequisites: Basic command line skills
What You’ll Be Able to Do
After this module, you will be able to:
- Explain what Infrastructure as Code means and why it replaced manual server configuration
- Compare IaC tools (Terraform, Ansible, Pulumi) and explain when to use each
- Write a simple declarative configuration and explain how it differs from a script
- Identify IaC anti-patterns (ClickOps, imperative scripts for declarative problems, configuration drift)
Why This Module Matters
In 2012, Knight Capital Group lost roughly $460 million in about 45 minutes. Why? A technician manually deployed new software to 7 of the firm’s 8 servers and missed the 8th. The mismatch caused the system to aggressively buy high and sell low. A single manual deployment error crippled the firm, which needed an emergency rescue and was later acquired.
Before Infrastructure as Code (IaC), setting up servers was manual, error-prone, and hard to reproduce. “It works on my machine” was everyone’s excuse. IaC changed that: infrastructure became versionable, testable, and repeatable. Understanding IaC is essential because Kubernetes itself is an IaC system.
Stop and think: How does your current organization track infrastructure changes? If your primary data center vanished today, could you rebuild it from a repository, or would you rely on someone’s memory?
The Old Way: ClickOps
Picture this: It’s 2005. You need to set up a web server.

Manual Process:
1. Order physical server (2-4 weeks)
2. Wait for data center to rack it (1 week)
3. SSH in and install packages
4. Configure by editing files
5. Hope you remember what you did
6. Pray nothing breaks

Documentation: "Ask Dave, he set it up"

Problems:
- No record of what was done
- Can’t reproduce the setup
- Different “identical” servers behave differently
- Dave goes on vacation; everything breaks
Infrastructure as Code
IaC means describing infrastructure in files that can be versioned, shared, and executed.

    ┌─────────────────────────────────────────────────────────────┐
    │                  INFRASTRUCTURE AS CODE                     │
    ├─────────────────────────────────────────────────────────────┤
    │                                                             │
    │  Traditional:                                               │
    │  ┌─────────┐      ┌─────────┐      ┌─────────┐              │
    │  │  Human  │ ───► │ Console │ ───► │ Server  │              │
    │  │         │      │  (GUI)  │      │         │              │
    │  └─────────┘      └─────────┘      └─────────┘              │
    │                                                             │
    │  With IaC:                                                  │
    │  ┌─────────┐      ┌───────────┐      ┌─────────┐            │
    │  │  Code   │ ───► │   Tool    │ ───► │ Server  │            │
    │  │ (files) │      │(Terraform)│      │         │            │
    │  └─────────┘      └───────────┘      └─────────┘            │
    │       │                                                     │
    │       ▼                                                     │
    │  ┌─────────┐                                                │
    │  │   Git   │  Version controlled, reviewable, repeatable    │
    │  └─────────┘                                                │
    │                                                             │
    └─────────────────────────────────────────────────────────────┘

Key Principles
1. Declarative vs Imperative

Imperative (How):
"Install nginx, then edit /etc/nginx/nginx.conf,
then restart nginx"

Declarative (What):
"I want nginx running with this configuration"

Declarative is preferred: you describe the desired state, and the tool figures out how to get there.
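
To make “the tool figures out how to get there” concrete, the sketch below shows the core move a declarative tool makes: compare the desired state with the actual state and derive the steps. It is a toy in plain POSIX shell, not how Terraform or Ansible are implemented; `plan_changes`, `normalize`, and the package names are hypothetical.

```shell
#!/bin/sh
# Toy planner: derive the steps needed to move from the actual state to the
# desired state. plan_changes and normalize are hypothetical helper names.

# Turn a whitespace-separated list into sorted, unique, one-per-line form.
normalize() {
  tr -s '[:space:]' '\n' | sed '/^$/d' | LC_ALL=C sort -u
}

# Usage: plan_changes "<desired items>" "<actual items>"
# Runs in a subshell (note the parentheses) so its variables stay private.
plan_changes() (
  tmp=$(mktemp -d) || exit 1
  printf '%s' "$1" | normalize > "$tmp/desired"
  printf '%s' "$2" | normalize > "$tmp/actual"

  # Lines only in the desired list must be added.
  LC_ALL=C comm -23 "$tmp/desired" "$tmp/actual" | sed 's/^/+ install /'
  # Lines only in the actual list must be removed.
  LC_ALL=C comm -13 "$tmp/desired" "$tmp/actual" | sed 's/^/- remove /'

  rm -rf "$tmp"
)

# Desired: nginx and curl. Actual: curl and telnet.
plan_changes "nginx curl" "curl telnet"
# prints:
#   + install nginx
#   - remove telnet
```

The caller never says “install nginx”. It only states which packages should exist, and the plan falls out of the comparison. `terraform plan` presents the same kind of difference for real cloud resources.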
2. Idempotency
Running the same code multiple times produces the same result:

    # Running this 10 times creates 10 servers (BAD)
    create_server web-1

    # Running this 10 times ensures 1 server exists (GOOD)
    ensure_server_exists web-1

Pause and predict: If you run an imperative bash script that creates a user twice, it will likely throw a fatal error the second time because the user already exists. What will an idempotent declarative system do?
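
One way to see the difference is with a file instead of a server. The sketch below is a minimal illustration in plain shell; `ensure_line` is a hypothetical helper, and the nginx directive is only sample data. The first loop appends blindly, while the second checks the current state before acting.

```shell
#!/bin/sh
# ensure_line FILE LINE: append LINE to FILE only if it is not already present.
# (Hypothetical helper for illustration.)
ensure_line() {
  # -q: quiet, -x: match the whole line, -F: treat LINE as a fixed string
  grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

conf=$(mktemp)

# Not idempotent: every run appends another copy of the line.
for run in 1 2 3; do
  printf '%s\n' 'worker_processes 4;' >> "$conf"
done
echo "after 3 blind appends: $(grep -c . "$conf") line(s)"       # 3 line(s)

: > "$conf"   # empty the file and start over

# Idempotent: the end state is the same after one run or three.
for run in 1 2 3; do
  ensure_line "$conf" 'worker_processes 4;'
done
echo "after 3 ensure_line runs: $(grep -c . "$conf") line(s)"    # 1 line(s)

rm -f "$conf"
```

An idempotent system treats the duplicate user the same way: it sees that the user already exists, changes nothing, and reports success.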
3. Version Control
    $ git log --oneline infrastructure/
    abc123 Add production database replica
    def456 Increase web server count to 5
    ghi789 Initial infrastructure setup

    # "Who changed production?" - Just check git blame

IaC Tools Landscape

    ┌─────────────────────────────────────────────────────────────┐
    │                     IaC TOOL CATEGORIES                     │
    ├─────────────────────────────────────────────────────────────┤
    │                                                             │
    │  PROVISIONING (Create infrastructure)                       │
    │  ├── Terraform (cloud-agnostic, most popular)               │
    │  ├── Pulumi (real programming languages)                    │
    │  ├── CloudFormation (AWS only)                              │
    │  └── ARM Templates (Azure only)                             │
    │                                                             │
    │  CONFIGURATION (Configure existing machines)                │
    │  ├── Ansible (agentless, SSH-based)                         │
    │  ├── Chef (Ruby DSL, agent-based)                           │
    │  ├── Puppet (agent-based, enterprise)                       │
    │  └── Salt (Python-based)                                    │
    │                                                             │
    │  KUBERNETES-NATIVE (Both provisions and configures K8s)     │
    │  ├── Helm (package manager for K8s)                         │
    │  ├── Kustomize (patch-based customization)                  │
    │  └── kubectl apply (direct YAML application)                │
    │                                                             │
    └─────────────────────────────────────────────────────────────┘

Terraform: The Industry Standard
Terraform by HashiCorp is the most widely used IaC tool:

    # main.tf - Terraform configuration

    # Define provider (where to create resources)
    provider "aws" {
      region = "us-west-2"
    }

    # Define a resource
    resource "aws_instance" "web" {
      ami           = "ami-0c55b159cbfafe1f0"
      instance_type = "t2.micro"

      tags = {
        Name        = "web-server"
        Environment = "production"
      }
    }

    # Define output
    output "public_ip" {
      value = aws_instance.web.public_ip
    }

    # Terraform workflow
    terraform init      # Download providers
    terraform plan      # Preview changes
    terraform apply     # Create infrastructure
    terraform destroy   # Tear it all down

Why Terraform Wins
| Feature | Terraform | CloudFormation |
|---|---|---|
| Cloud support | Any cloud | AWS only |
| State management | Built-in | Managed by AWS |
| Syntax | HCL (readable) | JSON/YAML (verbose) |
| Learning curve | Moderate | AWS-specific |
| Community | Huge | AWS-limited |
Ansible: Configuration Made Simple
Ansible uses YAML “playbooks” to configure machines:

    # playbook.yml - Ansible playbook
    ---
    - name: Configure web server
      hosts: webservers
      become: yes  # Run as root

      tasks:
        - name: Install nginx
          apt:
            name: nginx
            state: present
            update_cache: yes

        - name: Copy configuration
          template:
            src: nginx.conf.j2
            dest: /etc/nginx/nginx.conf
          notify: Restart nginx

        - name: Ensure nginx is running
          service:
            name: nginx
            state: started
            enabled: yes

      handlers:
        - name: Restart nginx
          service:
            name: nginx
            state: restarted

    # Run the playbook
    ansible-playbook -i inventory.ini playbook.yml

Key advantage: Agentless. Just needs SSH access.
IaC for Kubernetes
Kubernetes IS Infrastructure as Code:

    # deployment.yaml - Desired state
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: web
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: web
      template:
        metadata:
          labels:
            app: web
        spec:
          containers:
          - name: nginx
            image: nginx:1.25

    # Apply desired state
    kubectl apply -f deployment.yaml

    # Kubernetes reconciles actual state to match desired state
    # This is IaC in action!

The connection: Kubernetes uses the same declarative, idempotent principles as Terraform and Ansible.
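
The reconciliation idea can be sketched without a cluster. In the toy below, each file in a directory stands in for one running replica, and `reconcile` adds or removes files until the count matches the desired number. `reconcile` and `count_replicas` are hypothetical names, and the logic is far simpler than a real Kubernetes controller.

```shell
#!/bin/sh
# Toy reconciler: each file in DIR stands in for one running replica.
# reconcile and count_replicas are hypothetical names, not Kubernetes APIs.

count_replicas() {
  ls "$1" | wc -l | tr -d ' '
}

# Usage: reconcile DIR DESIRED_COUNT
# Runs in a subshell (parentheses) so its variables stay private.
reconcile() (
  dir=$1
  desired=$2
  mkdir -p "$dir"
  actual=$(count_replicas "$dir")

  # Too few replicas: start more.
  while [ "$actual" -lt "$desired" ]; do
    mktemp "$dir/replica-XXXXXX" > /dev/null || exit 1
    actual=$((actual + 1))
  done

  # Too many replicas: stop the extras.
  while [ "$actual" -gt "$desired" ]; do
    rm -f "$dir/$(ls "$dir" | head -n 1)"
    actual=$((actual - 1))
  done

  echo "desired=$desired actual=$actual"
)

cluster=$(mktemp -d)
reconcile "$cluster" 3                          # first apply: creates 3
rm -f "$cluster/$(ls "$cluster" | head -n 1)"   # simulate a crashed replica
reconcile "$cluster" 3                          # converges back to 3
reconcile "$cluster" 3                          # nothing to do: idempotent
rm -rf "$cluster"
```

The caller only ever states the target count. Whether the function has to create three replicas, replace one, or do nothing depends entirely on what it finds, which is the behavior the Deployment above relies on.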
Trade-Offs: The Cost of IaC
While IaC is essential for modern engineering, it comes with specific trade-offs:
- Speed vs. Structure: Clicking through a cloud console (ClickOps) is much faster for a quick, one-off experiment. IaC requires writing code, planning, and applying, which introduces overhead for simple tasks.
- Learning Curve: Teams cannot simply provision servers; they must learn domain-specific languages (like HCL for Terraform) and understand state management principles.
- State Management Complexity: Tools like Terraform store the environment’s state in a file. Managing this state file securely (locking it to prevent concurrent runs, encrypting it to hide secrets) becomes a new operational burden.
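
State locking is easier to reason about with a small model. The sketch below uses `mkdir`, which either creates a directory or fails atomically, as a lock so that two runs cannot modify the same state at once. `with_lock` and the lock path are hypothetical; Terraform's real locking is provided by its state backends and works differently.

```shell
#!/bin/sh
# with_lock LOCKDIR COMMAND...: run COMMAND only while holding the lock.
# (Hypothetical helper; runs in a subshell so its variables stay private.)
with_lock() (
  lock=$1
  shift
  # mkdir is atomic: if two runs race, exactly one creates the directory.
  if ! mkdir "$lock" 2>/dev/null; then
    echo "state is locked by another run; refusing to continue" >&2
    exit 1
  fi
  "$@"
  status=$?
  rmdir "$lock"   # release the lock
  exit "$status"
)

state_lock="${TMPDIR:-/tmp}/iac-demo-state.$$.lock"

# Normal case: the lock is free, so the command runs.
with_lock "$state_lock" echo "applying changes"

# Concurrent case, simulated by nesting: the inner run finds the lock taken.
with_lock "$state_lock" with_lock "$state_lock" echo "never printed" \
  || echo "second run was blocked"
```

If a run is killed before it reaches `rmdir`, the lock stays behind. That is the toy version of a stale lock, which is part of the operational burden described above.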
IaC Best Practices
1. Everything in Git

    infrastructure/
    ├── terraform/
    │   ├── main.tf
    │   ├── variables.tf
    │   └── outputs.tf
    ├── kubernetes/
    │   ├── deployments/
    │   └── services/
    └── ansible/
        └── playbooks/

2. Use Modules/Reusable Components

    # Don't repeat yourself
    module "web_server" {
      source = "./modules/ec2-instance"

      name          = "web-1"
      instance_type = "t2.micro"
    }

    module "api_server" {
      source = "./modules/ec2-instance"

      name          = "api-1"
      instance_type = "t2.small"
    }

3. Separate Environments

    environments/
    ├── dev/
    │   └── main.tf      # Small instances, single replica
    ├── staging/
    │   └── main.tf      # Medium instances, testing
    └── prod/
        └── main.tf      # Large instances, high availability

4. Never Edit Manually
Golden Rule: If it's not in code, it doesn't exist.

Manual changes = configuration drift = bugs at 3 AM

The IaC Workflow

    ┌─────────────────────────────────────────────────────────────┐
    │                        IaC WORKFLOW                         │
    ├─────────────────────────────────────────────────────────────┤
    │                                                             │
    │   1. Write  ───►  2. Review  ───►  3. Test                  │
    │      Code            (PR/MR)          (Plan)                │
    │       │                                 │                   │
    │       │                                 ▼                   │
    │   6. Monitor ◄─── 5. Apply  ◄───   4. Approve               │
    │      State           Changes          (Merge)               │
    │                                                             │
    │   All changes go through code review                        │
    │   All changes are auditable                                 │
    │   All changes are reversible                                │
    │                                                             │
    └─────────────────────────────────────────────────────────────┘

Did You Know?
- NASA uses Terraform to manage their cloud infrastructure. If it’s good enough for space, it’s good enough for your startup.
- Ansible’s name comes from Ursula K. Le Guin’s sci-fi novels, where an “ansible” is a device for instantaneous communication across space.
- “Cattle, not pets” is an IaC principle. Treat servers like cattle (replaceable, numbered), not pets (named, irreplaceable). You should be able to destroy and recreate any server without worry.
- “Configuration Drift” was originally a systems administration term describing the phenomenon where servers in a cluster become increasingly different over time due to ad-hoc, undocumented manual updates.
Common Mistakes
| Mistake | Why It Hurts | Solution |
|---|---|---|
| Manual changes after IaC deploy | Configuration drift | Redeploy from code |
| Not using version control | No audit trail, no rollback | Git everything |
| Hardcoding secrets | Security breach | Use secret managers |
| Monolithic configs | Hard to maintain | Use modules |
| No state backup | Lost infrastructure state | Remote state storage |
| Not testing IaC in CI before apply | Broken syntax takes down production | Lint and run plan in CI/CD |
| Ignoring plan output | Accidentally deleting resources | Always read the diff before approving |
| Environment-specific hardcoding | Code can’t be reused for staging/prod | Use variables for environment differences |
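
Several rows in this table come down to noticing when reality has diverged from the repository. The sketch below is a minimal, hypothetical drift check in plain shell: `detect_drift` compares the file defined in code with the file found on the server and fails if they differ. The file names and the timeout setting are sample data.

```shell
#!/bin/sh
# detect_drift DESIRED ACTUAL: succeed if the files match, fail if they differ.
# (Hypothetical helper for illustration.)
detect_drift() {
  if cmp -s "$1" "$2"; then
    echo "in sync: $2"
  else
    echo "DRIFT: $2 differs from $1"
    diff -u "$1" "$2"   # show exactly what changed
    return 1
  fi
}

work=$(mktemp -d)
printf 'timeout = 30\n' > "$work/repo.conf"    # what version control says
cp "$work/repo.conf" "$work/server.conf"       # what was deployed
detect_drift "$work/repo.conf" "$work/server.conf"

# Someone edits the server by hand during an incident.
printf 'timeout = 120\n' > "$work/server.conf"
detect_drift "$work/repo.conf" "$work/server.conf" \
  || echo "fix: commit the change to the repository or redeploy from code"

rm -rf "$work"
```

Real tools offer the same comparison against live infrastructure: `terraform plan` shows pending differences, and `kubectl diff -f deployment.yaml` shows what an apply would change.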
Mini-Workshop: IaC with kubectl
Before you practice, let’s walk through a worked example of Kubernetes IaC.
The Goal: Create a declarative configuration for a simple pod.
Step 1: The Code (Desired State)

Save the following as `pod.yaml`:

    apiVersion: v1
    kind: Pod
    metadata:
      name: my-web-pod
    spec:
      containers:
      - name: nginx
        image: nginx:alpine

Step 2: The Action (Apply)

Instead of running `kubectl run my-web-pod --image=nginx:alpine` (imperative), we apply the file (declarative):

    kubectl apply -f pod.yaml

Step 3: The Reconciliation (Idempotency)

If we run `kubectl apply -f pod.yaml` again, Kubernetes compares the desired state (our file) with the actual state running in the cluster. Since they match, it changes nothing and reports the Pod as unchanged.
Hands-On Exercise
Task: Experience IaC principles with Kubernetes resources.
Step 1. Create a deployment declaratively

    cat << 'EOF' > deployment.yaml
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: iac-demo
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: iac-demo
      template:
        metadata:
          labels:
            app: iac-demo
        spec:
          containers:
          - name: nginx
            image: nginx:1.25
    EOF

    kubectl apply -f deployment.yaml

Step 2. Test idempotency and modification

    # 1. Apply again (idempotency)
    kubectl apply -f deployment.yaml
    # Notice the output says "deployment.apps/iac-demo unchanged"

    # 2. Modify the code (-i.bak works with both GNU and BSD/macOS sed)
    sed -i.bak 's/replicas: 2/replicas: 4/' deployment.yaml && rm deployment.yaml.bak

    # 3. Apply change
    kubectl apply -f deployment.yaml

    # 4. Verify change
    kubectl get deployment iac-demo
    # Now shows 4 replicas

Step 3. Write from scratch
Now, without copying from above, write a new file called `config.yaml` that creates a Kubernetes ConfigMap named `app-settings` with a key `theme` set to "dark". Then apply it declaratively.
Solution for Step 3

- Create the declarative file:

      cat << 'EOF' > config.yaml
      apiVersion: v1
      kind: ConfigMap
      metadata:
        name: app-settings
      data:
        theme: "dark"
      EOF

- Apply it using IaC principles:

      kubectl apply -f config.yaml

- Clean up the exercise resources:

      kubectl delete -f deployment.yaml
      kubectl delete -f config.yaml
      rm deployment.yaml config.yaml
Quiz

- You are running a deployment script for a critical database. The pipeline crashes halfway through. You trigger the pipeline again. Instead of creating a duplicate database, the tool recognizes the first one and simply finishes the configuration. What principle is at work here?

  Answer

  This demonstrates **idempotency**. Running an idempotent operation multiple times has the same effect as running it once. The tool checks the current state against the desired state and only makes necessary changes, rather than blindly executing commands. This prevents errors like duplicate resources.

- Your team needs to spin up 50 AWS EC2 instances, configure a VPC, and set up load balancers. Once the VMs are running, they need complex OS-level user configurations and specific application binaries installed. Which combination of tools is most appropriate?

  Answer

  Using **Terraform** for the infrastructure provisioning and **Ansible** for the configuration is the most appropriate approach. Terraform excels at creating and managing cloud resources (VPCs, EC2 instances) declaratively. Ansible excels at configuring the operating systems and software on those instances after they are created. Combining them leverages the strengths of both tools.

- A junior engineer writes a bash script with 15 `if/else` statements to check if Nginx is installed, installing it if missing, then starting the service if stopped. You suggest replacing it with a 5-line Kubernetes YAML file. Why is the YAML approach fundamentally different and safer?

  Answer

  The bash script is **imperative**: it dictates the step-by-step instructions (the "how"). The Kubernetes YAML is **declarative**: it describes the desired end state (the "what"). Declarative approaches are safer because they rely on a controller (like Kubernetes) to continuously reconcile the actual state with the desired state. This eliminates the need for brittle `if/else` logic and handles unexpected starting conditions automatically.

- During an incident, an engineer SSHs into a production server and manually edits a configuration file to increase a timeout value. The issue is resolved. Two weeks later, the team deploys a new version of the app via their IaC pipeline, and the timeout issue immediately returns. What happened?

  Answer

  This is a textbook case of **configuration drift**. The manual change made during the incident was never recorded in the IaC repository. When the IaC pipeline ran two weeks later, it enforced the configuration defined in version control. This effectively overwrote the manual fix and brought back the timeout issue, proving why all changes must go through code.

- A critical production bug occurs at 3 AM. The on-call engineer discovers the database connection string was changed on the application server. Nobody knows who changed it or when. How does Infrastructure as Code solve this exact problem?

  Answer

  IaC relies on **version control** (like Git) as the single source of truth. If all changes are made through IaC, manual edits on the server are either blocked or overwritten the next time the pipeline runs. The engineer could simply look at the Git history (e.g., `git log` or `git blame`) to see exactly who changed the connection string. They could also see when the change was made and review the pull request that approved it, which provides a complete audit trail.

- Your organization mandates that all infrastructure changes must be auditable, reversible, and reviewed by a peer before applying. A developer complains that Kubernetes makes this impossible because they have to use `kubectl run` commands all day. How do you correct this misunderstanding?

  Answer

  The developer is using Kubernetes imperatively via the CLI, which circumvents IaC principles. Kubernetes is fundamentally an IaC system when used correctly. By defining resources in YAML files and committing those files to Git, the organization can enforce reviews. Applying them via a CI/CD pipeline ensures Kubernetes fully supports auditable, reversible, and peer-reviewed infrastructure changes.

- You apply a Kubernetes Deployment YAML file to a cluster, creating 3 replicas of a web app. Ten minutes later, you accidentally hit “Up” and “Enter” in your terminal, running the exact same `kubectl apply -f deployment.yaml` command again. What will the cluster do?

  Answer

  The cluster will do **nothing**. Because the `apply` command is idempotent and declarative, Kubernetes compares the desired state in the YAML file with the current state in the cluster. Seeing that 3 replicas of the web app are already running with the exact correct configuration, it makes no changes. It simply reports that the resource is unchanged.
Summary
Infrastructure as Code transforms infrastructure management:
Core principles:
- Declarative over imperative
- Idempotent operations
- Version controlled
- Reviewable changes
Key tools:
- Terraform: Provision cloud resources
- Ansible: Configure machines
- Kubernetes: Container orchestration (IaC built-in)
Why it matters:
- Reproducible environments
- Audit trail for all changes
- Disaster recovery (rebuild from code)
- Collaboration through code review
Kubernetes connection: Everything you do in Kubernetes follows IaC principles. YAML files are your infrastructure code.
Next Module
Module 1.2: GitOps - Using Git as the source of truth for infrastructure.