Module 1.1: Infrastructure as Code

Complexity: [MEDIUM] - Foundational concept

Time to Complete: 30-35 minutes

Prerequisites: Basic command line skills


After this module, you will be able to:

  • Explain what Infrastructure as Code means and why it replaced manual server configuration
  • Compare IaC tools (Terraform, Ansible, Pulumi) and explain when to use each
  • Write a simple declarative configuration and explain how it differs from a script
  • Identify IaC anti-patterns (ClickOps, imperative scripts for declarative problems, configuration drift)

In 2012, Knight Capital Group lost $460 million in just 45 minutes. Why? A technician manually deployed new software to 7 of the firm’s 8 servers and missed the 8th. The mismatch caused the system to aggressively buy high and sell low. A single manual deployment error crippled one of the largest trading firms in the U.S. stock market.

Before Infrastructure as Code (IaC), setting up servers was manual, error-prone, and hard to reproduce. “It works on my machine” was everyone’s excuse. IaC changed that: infrastructure became versionable, testable, and repeatable. Understanding IaC is essential because Kubernetes itself works on IaC principles.

Stop and think: How does your current organization track infrastructure changes? If your primary data center vanished today, could you rebuild it from a repository, or would you rely on someone’s memory?


Picture this: It’s 2005. You need to set up a web server.

Manual Process:
1. Order physical server (2-4 weeks)
2. Wait for data center to rack it (1 week)
3. SSH in and install packages
4. Configure by editing files
5. Hope you remember what you did
6. Pray nothing breaks
Documentation: "Ask Dave, he set it up"

Problems:

  • No record of what was done
  • Can’t reproduce the setup
  • Different “identical” servers behave differently
  • Dave goes on vacation; everything breaks

IaC means describing infrastructure in files that can be versioned, shared, and executed.

┌─────────────────────────────────────────────────────────────┐
│                   INFRASTRUCTURE AS CODE                    │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  Traditional:                                               │
│  ┌───────────┐      ┌───────────┐      ┌───────────┐        │
│  │   Human   │ ───► │  Console  │ ───► │  Server   │        │
│  │           │      │   (GUI)   │      │           │        │
│  └───────────┘      └───────────┘      └───────────┘        │
│                                                             │
│  With IaC:                                                  │
│  ┌───────────┐      ┌───────────┐      ┌───────────┐        │
│  │   Code    │ ───► │   Tool    │ ───► │  Server   │        │
│  │  (files)  │      │(Terraform)│      │           │        │
│  └───────────┘      └───────────┘      └───────────┘        │
│        │                                                    │
│        ▼                                                    │
│  ┌───────────┐                                              │
│  │    Git    │  Version controlled, reviewable, repeatable  │
│  └───────────┘                                              │
│                                                             │
└─────────────────────────────────────────────────────────────┘

Imperative (How):
"Install nginx, then edit /etc/nginx/nginx.conf,
then restart nginx"
Declarative (What):
"I want nginx running with this configuration"

Declarative is usually preferred: you describe the desired state, and the tool figures out how to get there.
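
To make the contrast concrete, here is a minimal Python sketch. The `NginxState` model and the `steps_to_reach` function are hypothetical names invented for illustration and do not come from any real tool. It shows how a declarative tool might derive different steps from the same desired state, depending on where the machine starts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NginxState:
    """Hypothetical model of one machine: is nginx installed, configured, running?"""
    installed: bool
    config: str
    running: bool


def steps_to_reach(desired: NginxState, actual: NginxState) -> list[str]:
    """Derive the 'how' from the 'what' by comparing desired and actual state."""
    steps = []
    if desired.installed and not actual.installed:
        steps.append("install nginx")
    if desired.config != actual.config:
        steps.append("write /etc/nginx/nginx.conf")
        if actual.running:
            steps.append("restart nginx")
    if desired.running and not actual.running:
        steps.append("start nginx")
    return steps


desired = NginxState(installed=True, config="worker_processes 4;", running=True)

fresh = NginxState(installed=False, config="", running=False)
stale = NginxState(installed=True, config="worker_processes 1;", running=True)

print(steps_to_reach(desired, fresh))    # full setup
print(steps_to_reach(desired, stale))    # only the config change and a restart
print(steps_to_reach(desired, desired))  # nothing to do
```

The author of the declaration never writes those step lists by hand. An imperative script would have to anticipate each starting condition itself.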

Idempotency means that running the same code multiple times produces the same result:

# Running this 10 times creates 10 servers (BAD)
create_server web-1
# Running this 10 times ensures 1 server exists (GOOD)
ensure_server_exists web-1

Pause and predict: If you run an imperative bash script that creates a user twice, it will likely throw a fatal error the second time because the user already exists. What will an idempotent declarative system do?
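
One way to reason about the answer is the short Python sketch below. The functions `create_user` and `ensure_user` and the in-memory `users` set are assumptions made up for this example; a real tool would talk to the operating system or a cloud API instead.

```python
class UserExistsError(Exception):
    """Raised when an imperative create hits an existing user."""


def create_user(users: set[str], name: str) -> None:
    """Imperative: always attempts the action, so a second run fails."""
    if name in users:
        raise UserExistsError(f"user {name!r} already exists")
    users.add(name)


def ensure_user(users: set[str], name: str) -> bool:
    """Idempotent: converge on 'user exists' and report whether anything changed."""
    if name in users:
        return False  # already in the desired state, nothing to do
    users.add(name)
    return True


users: set[str] = set()
print(ensure_user(users, "deploy"))  # True: the user was created
print(ensure_user(users, "deploy"))  # False: no change and no error
try:
    create_user(users, "deploy")
except UserExistsError as err:
    print("imperative rerun failed:", err)
```

Returning a changed/unchanged flag is a common design choice in idempotent tools, because it lets later steps react only when something was actually modified.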

git log --oneline infrastructure/
abc123 Add production database replica
def456 Increase web server count to 5
ghi789 Initial infrastructure setup
# "Who changed production?" - Just check git blame

┌─────────────────────────────────────────────────────────────┐
│                     IaC TOOL CATEGORIES                     │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  PROVISIONING (Create infrastructure)                       │
│  ├── Terraform (cloud-agnostic, most popular)               │
│  ├── Pulumi (real programming languages)                    │
│  ├── CloudFormation (AWS only)                              │
│  └── ARM Templates (Azure only)                             │
│                                                             │
│  CONFIGURATION (Configure existing machines)                │
│  ├── Ansible (agentless, SSH-based)                         │
│  ├── Chef (Ruby DSL, agent-based)                           │
│  ├── Puppet (agent-based, enterprise)                       │
│  └── Salt (Python-based)                                    │
│                                                             │
│  KUBERNETES-NATIVE (Both provisions and configures K8s)     │
│  ├── Helm (package manager for K8s)                         │
│  ├── Kustomize (patch-based customization)                  │
│  └── kubectl apply (direct YAML application)                │
│                                                             │
└─────────────────────────────────────────────────────────────┘

Terraform by HashiCorp is one of the most widely used IaC tools:

# main.tf - Terraform configuration
# Define provider (where to create resources)
provider "aws" {
  region = "us-west-2"
}

# Define a resource
resource "aws_instance" "web" {
  ami           = "ami-0c55b159cbfafe1f0"
  instance_type = "t2.micro"

  tags = {
    Name        = "web-server"
    Environment = "production"
  }
}

# Define output
output "public_ip" {
  value = aws_instance.web.public_ip
}

# Terraform workflow
terraform init # Download providers
terraform plan # Preview changes
terraform apply # Create infrastructure
terraform destroy # Tear it all down
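
The plan-then-apply idea in the workflow above can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the `plan` and `apply` functions and the dictionary-based state are hypothetical, and real Terraform also refreshes its state against the provider before planning.

```python
def plan(desired: dict[str, dict], state: dict[str, dict]) -> list[tuple[str, str]]:
    """Preview changes without touching anything, loosely like `terraform plan`."""
    actions = []
    for name in sorted(desired.keys() | state.keys()):
        if name not in state:
            actions.append(("create", name))
        elif name not in desired:
            actions.append(("destroy", name))
        elif desired[name] != state[name]:
            actions.append(("update", name))
    return actions


def apply(desired: dict[str, dict], state: dict[str, dict]) -> dict[str, dict]:
    """Carry out the plan and return the new recorded state."""
    new_state = dict(state)
    for action, name in plan(desired, state):
        if action == "destroy":
            del new_state[name]
        else:  # create or update
            new_state[name] = dict(desired[name])
    return new_state


desired = {"aws_instance.web": {"instance_type": "t2.micro"}}
state = {
    "aws_instance.web": {"instance_type": "t2.nano"},
    "aws_instance.old": {"instance_type": "t2.micro"},
}

for action, name in plan(desired, state):
    print(f"{action:8} {name}")

state = apply(desired, state)
print(plan(desired, state))  # empty plan once the state matches the code
```

Separating the preview from the mutation is what makes review possible: a person or a CI job can read the planned actions before anything is created or destroyed.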
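
| Feature          | Terraform      | CloudFormation      |
|------------------|----------------|---------------------|
| Cloud support    | Any cloud      | AWS only            |
| State management | Built-in       | Managed by AWS      |
| Syntax           | HCL (readable) | JSON/YAML (verbose) |
| Learning curve   | Moderate       | AWS-specific        |
| Community        | Huge           | AWS-limited         |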

Ansible uses YAML “playbooks” to configure machines:

# playbook.yml - Ansible playbook
---
- name: Configure web server
  hosts: webservers
  become: yes  # Run as root
  tasks:
    - name: Install nginx
      apt:
        name: nginx
        state: present
        update_cache: yes
    - name: Copy configuration
      template:
        src: nginx.conf.j2
        dest: /etc/nginx/nginx.conf
      notify: Restart nginx
    - name: Ensure nginx is running
      service:
        name: nginx
        state: started
        enabled: yes
  handlers:
    - name: Restart nginx
      service:
        name: nginx
        state: restarted

# Run the playbook
ansible-playbook -i inventory.ini playbook.yml

Key advantage: Agentless. Just needs SSH access.
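
The `notify` and `handlers` pair in the playbook above is worth a closer look: a handler runs only when a task that notifies it reports a change. The Python sketch below models that behavior. The function names and the dictionary standing in for a host are assumptions for illustration, not Ansible's internals.

```python
from typing import Callable, Optional

# A task returns True when it changed something, False when it was a no-op.
Task = Callable[[dict], bool]


def install_nginx(host: dict) -> bool:
    if "nginx" in host["packages"]:
        return False
    host["packages"].add("nginx")
    return True


def copy_config(host: dict) -> bool:
    wanted = "worker_processes 4;"
    if host["config"] == wanted:
        return False
    host["config"] = wanted
    return True


def run_play(host: dict, tasks: list[tuple[Task, Optional[str]]]) -> list[str]:
    """Run tasks in order and collect each notified handler once."""
    notified: list[str] = []
    for task, handler in tasks:
        changed = task(host)
        if changed and handler and handler not in notified:
            notified.append(handler)
    return notified  # handlers to run after all tasks have finished


host = {"packages": set(), "config": ""}
play = [(install_nginx, None), (copy_config, "Restart nginx")]

print(run_play(host, play))  # first run: the config changed, so the handler fires
print(run_play(host, play))  # second run: nothing changed, so no restart
```

This is why rerunning a playbook against an already configured server does not bounce the service.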


Kubernetes IS Infrastructure as Code:

# deployment.yaml - Desired state
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: nginx
          image: nginx:1.25

# Apply desired state
kubectl apply -f deployment.yaml
# Kubernetes reconciles actual state to match desired state
# This is IaC in action!

The connection: Kubernetes uses the same declarative, idempotent principles as Terraform and Ansible.
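
The reconciliation idea can be sketched as a tiny control loop. This is a toy model, not how the Kubernetes controllers are implemented: the `reconcile` function, the list of pod names, and the naming scheme are all hypothetical.

```python
import itertools

_ids = itertools.count(1)  # source of unique pod name suffixes


def reconcile(desired_replicas: int, pods: list[str]) -> list[str]:
    """One pass of a control loop: add or remove pods until the count matches."""
    pods = list(pods)
    while len(pods) < desired_replicas:
        pods.append(f"web-{next(_ids)}")
    while len(pods) > desired_replicas:
        pods.pop()
    return pods


pods = reconcile(3, [])
print(pods)                # three pods created

pods.remove(pods[0])       # simulate a pod crashing
pods = reconcile(3, pods)
print(len(pods))           # back to 3 without anyone re-running a script

print(reconcile(3, pods) == pods)  # already converged, so nothing changes
```

The loop never needs to know why the actual state differs from the desired state. It only measures the gap and closes it, which is the same principle behind declarative tools in general.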


While IaC is essential for modern engineering, it comes with specific trade-offs:

  • Speed vs. Structure: Clicking through a cloud console (ClickOps) is much faster for a quick, one-off experiment. IaC requires writing code, planning, and applying, which introduces overhead for simple tasks.
  • Learning Curve: Teams cannot simply provision servers; they must learn domain-specific languages (like HCL for Terraform) and understand state management principles.
  • State Management Complexity: Tools like Terraform store the environment’s state in a file. Managing this state file securely (locking it to prevent concurrent runs, encrypting it to hide secrets) becomes a new operational burden.
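
To illustrate why state locking matters, the sketch below uses an exclusive lock file so that two runs cannot modify the same state at once. It is an assumption-laden toy: the `state_lock` helper and the `demo.tfstate` path are hypothetical, and real Terraform backends implement their own locking.

```python
import os
import tempfile
from contextlib import contextmanager


class StateLockedError(Exception):
    """Raised when another run already holds the lock."""


@contextmanager
def state_lock(state_path: str):
    """Hold an exclusive lock file next to the state file for the duration of a run."""
    lock_path = state_path + ".lock"
    try:
        # O_EXCL makes creation fail if the lock file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise StateLockedError(f"{state_path} is locked by another run") from None
    try:
        os.close(fd)
        yield
    finally:
        os.remove(lock_path)  # release the lock even if the run fails


state_file = os.path.join(tempfile.mkdtemp(), "demo.tfstate")  # hypothetical path

with state_lock(state_file):
    try:
        with state_lock(state_file):  # a second, concurrent run
            pass
    except StateLockedError as err:
        print("second run refused:", err)

with state_lock(state_file):  # the lock was released, so this succeeds
    print("lock acquired again")
```

Refusing the second run outright is safer than letting both proceed, because two writers working from the same stale state could each overwrite the other's changes.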

infrastructure/
├── terraform/
│   ├── main.tf
│   ├── variables.tf
│   └── outputs.tf
├── kubernetes/
│   ├── deployments/
│   └── services/
└── ansible/
    └── playbooks/

# Don't repeat yourself
module "web_server" {
  source        = "./modules/ec2-instance"
  name          = "web-1"
  instance_type = "t2.micro"
}

module "api_server" {
  source        = "./modules/ec2-instance"
  name          = "api-1"
  instance_type = "t2.small"
}

environments/
├── dev/
│   └── main.tf      # Small instances, single replica
├── staging/
│   └── main.tf      # Medium instances, testing
└── prod/
    └── main.tf      # Large instances, high availability

Golden Rule: If it's not in code, it doesn't exist.
Manual changes = configuration drift = bugs at 3 AM
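
Drift can be detected by comparing what the code declares with what is actually running. The following is a minimal sketch: the `detect_drift` function and the sample settings are hypothetical and stand in for whatever a real tool reads from its repository and from the live system.

```python
def detect_drift(in_code: dict, on_server: dict) -> dict[str, tuple]:
    """Return {key: (value_in_code, value_on_server)} for every setting that differs."""
    missing = object()  # sentinel so an absent key differs from an explicit None
    drift = {}
    for key in sorted(in_code.keys() | on_server.keys()):
        want = in_code.get(key, missing)
        have = on_server.get(key, missing)
        if want != have:
            drift[key] = (
                None if want is missing else want,
                None if have is missing else have,
            )
    return drift


in_code = {"timeout_seconds": 30, "workers": 4}
on_server = {"timeout_seconds": 120, "workers": 4, "debug": True}  # hand-edited

for key, (want, have) in detect_drift(in_code, on_server).items():
    print(f"{key}: code={want!r} server={have!r}")
```

Running a check like this on a schedule turns silent drift into a visible report, so the team can either revert the manual change or record it in code.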

┌─────────────────────────────────────────────────────────────┐
│                        IaC WORKFLOW                         │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  1. Write   ───►  2. Review  ───►  3. Test                  │
│     Code             (PR/MR)          (Plan)                │
│                                       │                     │
│                                       ▼                     │
│  6. Monitor ◄───  5. Apply   ◄───  4. Approve               │
│     State            Changes          (Merge)               │
│                                                             │
│  All changes go through code review                         │
│  All changes are auditable                                  │
│  All changes are reversible                                 │
│                                                             │
└─────────────────────────────────────────────────────────────┘

  • NASA uses Terraform to manage their cloud infrastructure. If it’s good enough for space, it’s good enough for your startup.
  • Ansible’s name comes from Ursula K. Le Guin’s sci-fi novels, where an “ansible” is a device for instantaneous communication across space.
  • “Cattle, not pets” is an IaC principle. Treat servers like cattle (replaceable, numbered), not pets (named, irreplaceable). You should be able to destroy and recreate any server without worry.
  • “Configuration Drift” was originally a systems administration term describing the phenomenon where servers in a cluster become increasingly different over time due to ad-hoc, undocumented manual updates.

| Mistake | Why It Hurts | Solution |
|---|---|---|
| Manual changes after IaC deploy | Configuration drift | Redeploy from code |
| Not using version control | No audit trail, no rollback | Git everything |
| Hardcoding secrets | Security breach | Use secret managers |
| Monolithic configs | Hard to maintain | Use modules |
| No state backup | Lost infrastructure state | Remote state storage |
| Not testing IaC in CI before apply | Broken syntax takes down production | Lint and run plan in CI/CD |
| Ignoring plan output | Accidentally deleting resources | Always read the diff before approving |
| Environment-specific hardcoding | Code can’t be reused for staging/prod | Use variables for environment differences |
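
The last row, using variables for environment differences, can be sketched as shared defaults plus per-environment overrides. The names `BASE`, `OVERRIDES`, and `settings_for`, and the specific values, are illustrative assumptions rather than a recommended sizing.

```python
BASE = {"instance_type": "t2.micro", "replicas": 1, "multi_az": False}

# Only the differences live per environment (values are illustrative).
OVERRIDES = {
    "dev": {},
    "staging": {"instance_type": "t2.medium", "replicas": 2},
    "prod": {"instance_type": "t2.large", "replicas": 3, "multi_az": True},
}


def settings_for(env: str) -> dict:
    """Merge shared defaults with one environment's overrides."""
    if env not in OVERRIDES:
        raise ValueError(f"unknown environment: {env}")
    return {**BASE, **OVERRIDES[env]}


for env in ("dev", "staging", "prod"):
    print(env, settings_for(env))
```

Because every environment is built from the same base, a fix made once applies everywhere, and the overrides document exactly how the environments differ.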

Before you practice, let’s walk through a worked example of Kubernetes IaC.

The Goal: Create a declarative configuration for a simple pod.

Step 1: The Code (Desired State)

apiVersion: v1
kind: Pod
metadata:
  name: my-web-pod
spec:
  containers:
    - name: nginx
      image: nginx:alpine

Step 2: The Action (Apply)

Instead of running kubectl run my-web-pod --image=nginx:alpine (imperative), we apply the file (declarative):

kubectl apply -f pod.yaml

Step 3: The Reconciliation (Idempotency)

If we run kubectl apply -f pod.yaml again, Kubernetes compares the desired state (our file) with the actual state running in the cluster. Since they exactly match, it does nothing.


Task: Experience IaC principles with Kubernetes resources.

Step 1. Create a deployment declaratively

cat << 'EOF' > deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: iac-demo
spec:
  replicas: 2
  selector:
    matchLabels:
      app: iac-demo
  template:
    metadata:
      labels:
        app: iac-demo
    spec:
      containers:
        - name: nginx
          image: nginx:1.25
EOF
kubectl apply -f deployment.yaml

Step 2. Test idempotency and modification

# 1. Apply again (idempotency)
kubectl apply -f deployment.yaml
# Notice the output says "deployment.apps/iac-demo unchanged"
# 2. Modify the code
sed -i.bak 's/replicas: 2/replicas: 4/' deployment.yaml && rm deployment.yaml.bak  # -i.bak works with both GNU and BSD/macOS sed
# 3. Apply change
kubectl apply -f deployment.yaml
# 4. Verify change
kubectl get deployment iac-demo
# Now shows 4 replicas

Step 3. Write from scratch

Now, without copying from above, write a new file called config.yaml that creates a Kubernetes ConfigMap named app-settings with a key theme set to "dark". Then apply it declaratively.

Solution for Step 3
  1. Create the declarative file:
cat << 'EOF' > config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-settings
data:
  theme: "dark"
EOF
  2. Apply it using IaC principles:
kubectl apply -f config.yaml
  3. Clean up the exercise resources:
kubectl delete -f deployment.yaml
kubectl delete -f config.yaml
rm deployment.yaml config.yaml

  1. You are running a deployment script for a critical database. The pipeline crashes halfway through. You trigger the pipeline again. Instead of creating a duplicate database, the tool recognizes the first one and simply finishes the configuration. What principle is at work here?

    Answer: This demonstrates idempotency. Running an idempotent operation multiple times has the same effect as running it once. The tool checks the current state against the desired state and only makes necessary changes, rather than blindly executing commands. This prevents errors like duplicate resources.
  2. Your team needs to spin up 50 AWS EC2 instances, configure a VPC, and set up load balancers. Once the VMs are running, they need complex OS-level user configurations and specific application binaries installed. Which combination of tools is most appropriate?

    Answer: Using Terraform for the infrastructure provisioning and Ansible for the configuration is the most appropriate approach. Terraform excels at creating and managing cloud resources (VPCs, EC2 instances) declaratively. Ansible excels at configuring the operating systems and software on those instances after they are created. Combining them leverages the strengths of both tools.
  3. A junior engineer writes a bash script with 15 if/else statements to check if Nginx is installed, installing it if missing, then starting the service if stopped. You suggest replacing it with a 5-line Kubernetes YAML file. Why is the YAML approach fundamentally different and safer?

    Answer: The bash script is imperative: it dictates the step-by-step instructions (the "how"). The Kubernetes YAML is declarative: it describes the desired end state (the "what"). Declarative approaches are safer because they rely on a controller (like Kubernetes) to continuously reconcile the actual state with the desired state. This eliminates the need for brittle if/else logic and handles unexpected starting conditions automatically.
  4. During an incident, an engineer SSHs into a production server and manually edits a configuration file to increase a timeout value. The issue is resolved. Two weeks later, the team deploys a new version of the app via their IaC pipeline, and the timeout issue immediately returns. What happened?

    Answer: This is a textbook case of configuration drift. The manual change made during the incident was never recorded in the IaC repository. When the IaC pipeline ran two weeks later, it enforced the configuration defined in version control. This effectively overwrote the manual fix and brought back the timeout issue, proving why all changes must go through code.
  5. A critical production bug occurs at 3 AM. The on-call engineer discovers the database connection string was changed on the application server. Nobody knows who changed it or when. How does Infrastructure as Code solve this exact problem?

    Answer: IaC relies on version control (like Git) as the single source of truth. If all changes are made through IaC, manual edits on the server are either impossible or automatically reverted. The engineer could simply look at the Git history (e.g., git log or git blame) to see exactly who changed the connection string. Furthermore, they could see when they did it and review the pull request that approved the change, providing a complete audit trail.
  6. Your organization mandates that all infrastructure changes must be auditable, reversible, and reviewed by a peer before applying. A developer complains that Kubernetes makes this impossible because they have to use kubectl run commands all day. How do you correct this misunderstanding?

    Answer: The developer is using Kubernetes imperatively via the CLI, which circumvents IaC principles. Kubernetes is fundamentally an IaC system when used correctly. By defining resources in YAML files and committing those files to Git, the organization can enforce reviews. Applying them via a CI/CD pipeline ensures Kubernetes fully supports auditable, reversible, and peer-reviewed infrastructure changes.
  7. You apply a Kubernetes Deployment YAML file to a cluster, creating 3 replicas of a web app. Ten minutes later, you accidentally hit “Up” and “Enter” in your terminal, running the exact same kubectl apply -f deployment.yaml command again. What will the cluster do?

    Answer: The cluster will do nothing. Because the apply command is idempotent and declarative, Kubernetes compares the desired state in the YAML file with the current state in the cluster. Seeing that 3 replicas of the web app are already running with the exact correct configuration, it makes no changes. It simply reports that the resource is unchanged.

Infrastructure as Code transforms infrastructure management:

Core principles:

  • Declarative over imperative
  • Idempotent operations
  • Version controlled
  • Reviewable changes

Key tools:

  • Terraform: Provision cloud resources
  • Ansible: Configure machines
  • Kubernetes: Container orchestration (IaC built-in)

Why it matters:

  • Reproducible environments
  • Audit trail for all changes
  • Disaster recovery (rebuild from code)
  • Collaboration through code review

Kubernetes connection: Everything you do in Kubernetes follows IaC principles. YAML files are your infrastructure code.


Module 1.2: GitOps - Using Git as the source of truth for infrastructure.