
Module 1.12: Infrastructure as Code on AWS

Complexity: [MEDIUM] | Time to Complete: 1.5 hours | Track: AWS DevOps Essentials

Before starting this module, ensure you have:

  • Familiarity with Infrastructure as Code concepts (what IaC solves, declarative vs. imperative)
  • Experience creating AWS resources via CLI (from previous modules)
  • AWS CLI v2 installed and configured
  • An AWS account with permissions to create VPCs, subnets, security groups, and CloudFormation stacks
  • Comfortable reading YAML (JSON also works, but YAML is the standard for CloudFormation templates)

After completing this module, you will be able to:

  • Deploy multi-resource CloudFormation stacks with parameters, mappings, and conditional logic
  • Implement CloudFormation change sets and stack policies to prevent accidental deletion of stateful resources
  • Design nested stacks and cross-stack references to modularize infrastructure templates at scale
  • Diagnose CloudFormation rollback failures and resolve dependency conflicts in stack updates

On February 28, 2017, an AWS engineer was debugging an issue in the S3 billing system. The intended fix was to remove a small number of servers from one S3 subsystem. Because one input to a manual command was entered incorrectly, a much larger set of servers was removed than intended. The cascading failure disrupted S3 in us-east-1 for roughly four hours and took significant portions of the internet down with it, affecting companies from Slack to Trello to the SEC. AWS later published a detailed post-mortem and committed to adding safeguards. One of those safeguards? Better tooling around infrastructure changes so that a single mistyped command cannot cause region-wide impact.

This is what Infrastructure as Code solves at its core. When your infrastructure is defined in a template file, changes go through version control, code review, and automated validation before they touch production. Many typos in a CloudFormation template, such as malformed syntax or a parameter value outside its allowed set, are rejected before any resource is touched. A bad change is caught in a pull request diff, not in a 4-hour outage post-mortem. And rollback is automatic: CloudFormation undoes changes if a stack update fails, returning to the last known good state.

In this module, you will learn AWS CloudFormation — the native IaC service that has been part of AWS since 2011. You will understand template structure, parameters, outputs, intrinsic functions, and how stacks manage the lifecycle of your resources. You will also learn when CloudFormation is the right choice versus Terraform, and where the AWS Cloud Development Kit (CDK) fits in.


  • CloudFormation manages over 750 AWS resource types as of 2026, covering virtually every AWS service. When AWS launches a new service, CloudFormation support typically follows within weeks, often on launch day. The full resource specification is published as a JSON schema that is over 80 MB uncompressed.

  • A single CloudFormation stack can contain up to 500 resources. For larger architectures, you use nested stacks or stack sets. The 500-resource limit has caught many teams who started with a monolithic template — planning your stack boundaries early saves painful refactoring later.

  • CloudFormation drift detection, launched in 2018, can tell you when someone has manually changed a resource that CloudFormation manages. This solves the “who touched production?” problem — if a security group rule was added via the console, drift detection flags the discrepancy so you can decide whether to update the template or revert the manual change.

  • The AWS CDK (Cloud Development Kit) generates CloudFormation templates under the hood. When you write CDK in TypeScript, Python, or Go, the cdk synth command produces a standard CloudFormation YAML template. This means CDK is not a replacement for CloudFormation — it is a higher-level authoring tool that compiles down to it.


A CloudFormation template is a YAML (or JSON) file that declares the desired state of your infrastructure. Here is the full structure:

AWSTemplateFormatVersion: "2010-09-09"
Description: "What this template creates and why"
# Parameters: User-provided values at deploy time
Parameters:
  EnvironmentName:
    Type: String
    Default: production
    AllowedValues: [development, staging, production]
# Mappings: Static lookup tables
Mappings:
  RegionAMI:
    us-east-1:
      HVM64: ami-0abc123def456789
    eu-west-1:
      HVM64: ami-0def456abc789012
# Conditions: Conditional resource creation
Conditions:
  IsProduction: !Equals [!Ref EnvironmentName, production]
# Resources: The actual AWS resources (REQUIRED - only mandatory section)
Resources:
  MyVPC:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: "10.0.0.0/16"
# Outputs: Values to export or display
Outputs:
  VPCId:
    Value: !Ref MyVPC
    Export:
      Name: !Sub "${EnvironmentName}-VPCId"

Only the Resources section is required. Everything else is optional but strongly recommended for production templates.
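
Because JSON also works, a short Python sketch can make the "only Resources is required" rule concrete. The helper name `missing_optional_sections` and the section list are illustrative assumptions, not part of any AWS tooling, and the list covers only the sections shown in the structure above.

```python
import json

# Sections from the structure above that a template may omit.
OPTIONAL_SECTIONS = [
    "AWSTemplateFormatVersion",
    "Description",
    "Parameters",
    "Mappings",
    "Conditions",
    "Outputs",
]

# Hypothetical minimal template: a single VPC and nothing else.
minimal_template = {
    "Resources": {
        "MyVPC": {
            "Type": "AWS::EC2::VPC",
            "Properties": {"CidrBlock": "10.0.0.0/16"},
        }
    }
}


def missing_optional_sections(template: dict) -> list[str]:
    """List the optional sections that this template leaves out."""
    return [name for name in OPTIONAL_SECTIONS if name not in template]


# The JSON text is what you would save to a file and pass to --template-body.
template_body = json.dumps(minimal_template, indent=2)
print(template_body)
print("Omitted:", missing_optional_sections(minimal_template))
```

A template like this deploys, but it gives reviewers no description, no parameters, and no outputs, which is why the optional sections matter in production.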

Each resource has a logical name (your label), a type (the AWS resource), and properties (configuration):

Resources:
  # Logical name: WebServerSecurityGroup
  WebServerSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: "Allow HTTP and SSH"
      VpcId: !Ref MyVPC
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 80
          ToPort: 80
          CidrIp: 0.0.0.0/0
        - IpProtocol: tcp
          FromPort: 22
          ToPort: 22
          CidrIp: 10.0.0.0/8

The logical name (WebServerSecurityGroup) is how you reference this resource elsewhere in the template. The physical name (the actual AWS resource ID) is generated by CloudFormation unless you explicitly set it — and you usually should not, because explicit names prevent replacement updates.

Stop and think: If CloudFormation automatically appends random alphanumeric suffixes to your physical resource names, how can you efficiently locate a specific DynamoDB table or S3 bucket in the AWS Console without manually searching through dozens of similarly named resources?

Parameters let you customize a template at deploy time without editing the file:

Parameters:
  VPCCidr:
    Type: String
    Default: "10.0.0.0/16"
    Description: "CIDR block for the VPC"
    AllowedPattern: "^(\\d{1,3}\\.){3}\\d{1,3}/\\d{1,2}$"
    ConstraintDescription: "Must be a valid CIDR (e.g., 10.0.0.0/16)"
  InstanceType:
    Type: String
    Default: t3.micro
    AllowedValues:
      - t3.micro
      - t3.small
      - t3.medium
    Description: "EC2 instance type"
  KeyPairName:
    Type: AWS::EC2::KeyPair::KeyName
    Description: "Name of an existing EC2 key pair"
  EnableNATGateway:
    Type: String
    Default: "false"
    AllowedValues: ["true", "false"]
    Description: "Whether to create a NAT Gateway (adds cost)"

AWS-specific parameter types like AWS::EC2::KeyPair::KeyName provide dropdown validation in the console and catch errors before deployment starts.
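
An AllowedPattern only checks the shape of a value. The sketch below applies the VPCCidr pattern with Python's `re` module and compares it with a stricter check from the standard `ipaddress` module. Python's regex engine stands in for CloudFormation's here, which is an approximation, and both function names are hypothetical.

```python
import ipaddress
import re

# Same expression as the VPCCidr AllowedPattern, without YAML escaping.
ALLOWED_PATTERN = r"^(\d{1,3}\.){3}\d{1,3}/\d{1,2}$"


def matches_allowed_pattern(value: str) -> bool:
    """Shape check only: four dotted numbers, a slash, and a prefix length."""
    return re.fullmatch(ALLOWED_PATTERN, value) is not None


def is_real_cidr(value: str) -> bool:
    """Stricter check: octets in range, valid prefix, no host bits set."""
    try:
        ipaddress.ip_network(value, strict=True)
        return True
    except ValueError:
        return False


for candidate in ["10.0.0.0/16", "not-a-cidr", "999.0.0.0/99"]:
    print(candidate, matches_allowed_pattern(candidate), is_real_cidr(candidate))
```

The last candidate passes the pattern but is not a usable CIDR block, so a value like that would only be rejected later, when the VPC itself is created.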

Outputs: Sharing Information Between Stacks


Outputs expose values from your stack — either for human consumption or for cross-stack references:

Outputs:
  VPCId:
    Description: "The VPC ID"
    Value: !Ref VPC
    Export:
      Name: !Sub "${AWS::StackName}-VPCId"
  PublicSubnet1Id:
    Description: "Public subnet in AZ1"
    Value: !Ref PublicSubnet1
    Export:
      Name: !Sub "${AWS::StackName}-PublicSubnet1Id"
  ALBDNSName:
    Description: "Application Load Balancer DNS name"
    Value: !GetAtt ApplicationLoadBalancer.DNSName

The Export field makes the value available to other stacks via Fn::ImportValue. This is how you share a VPC ID from a network stack with an application stack.

Pause and predict: If Stack B uses !ImportValue to consume a VPC ID explicitly exported by Stack A, what exactly happens at the API level if an administrator mistakenly attempts to delete Stack A?


Intrinsic Functions: The Template Programming Language


CloudFormation templates are declarative, but intrinsic functions add dynamic behavior. These are the functions you will use daily.

# !Ref returns the resource's primary identifier
# For an EC2 instance: the instance ID
# For a parameter: the parameter value
SecurityGroupId: !Ref WebServerSecurityGroup

# !GetAtt returns a specific attribute of a resource
# Different from !Ref -- GetAtt accesses secondary attributes
LoadBalancerDNS: !GetAtt ApplicationLoadBalancer.DNSName
SecurityGroupId: !GetAtt WebServerSecurityGroup.GroupId

# Variable substitution in strings
# ${AWS::StackName} and ${AWS::Region} are pseudo-parameters
BucketName: !Sub "${AWS::StackName}-artifacts-${AWS::Region}"
# Reference resource attributes
UserData:
  Fn::Base64: !Sub |
    #!/bin/bash
    echo "VPC ID is ${VPC}" >> /var/log/setup.log
    echo "Region is ${AWS::Region}" >> /var/log/setup.log
    aws s3 cp s3://${ArtifactBucket}/config.yml /opt/app/config.yml

# Pick an item from a list
AZ: !Select [0, !GetAZs ""] # First AZ in the region
# Split a string into a list
# If "10.0.0.0/16" --> ["10.0.0.0", "16"]
CidrParts: !Split ["/", !Ref VPCCidr]
# Join list items into a string
SubnetIds: !Join [",", [!Ref Subnet1, !Ref Subnet2, !Ref Subnet3]]

Conditions:
  IsProduction: !Equals [!Ref EnvironmentName, production]
  CreateNAT: !Equals [!Ref EnableNATGateway, "true"]
  ProdWithNAT: !And [!Condition IsProduction, !Condition CreateNAT]
Resources:
  NATGateway:
    Type: AWS::EC2::NatGateway
    Condition: CreateNAT # Only created if condition is true
    Properties:
      SubnetId: !Ref PublicSubnet1
      AllocationId: !GetAtt NATElasticIP.AllocationId
  NATElasticIP:
    Type: AWS::EC2::EIP
    Condition: CreateNAT
    Properties:
      Domain: vpc
  # Use If to set property values conditionally
  WebServer:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: !If [IsProduction, t3.medium, t3.micro]
      Monitoring: !If [IsProduction, true, false]
| Function | Purpose | Example |
| --- | --- | --- |
| !Ref | Resource ID or parameter value | !Ref MyVPC |
| !GetAtt | Resource attribute | !GetAtt ALB.DNSName |
| !Sub | String interpolation | !Sub "${AWS::StackName}-bucket" |
| !Select | Pick from list | !Select [0, !GetAZs ""] |
| !Split | String to list | !Split [",", "a,b,c"] |
| !Join | List to string | !Join ["-", ["my", "stack"]] |
| !If | Conditional value | !If [IsProd, t3.large, t3.micro] |
| !Equals | Compare values | !Equals [!Ref Env, prod] |
| !FindInMap | Lookup in Mappings | !FindInMap [RegionAMI, !Ref "AWS::Region", HVM64] |
| !ImportValue | Cross-stack reference | !ImportValue "network-stack-VPCId" |
| !GetAZs | List AZs in region | !GetAZs "" (current region) |
| !Cidr | Generate CIDR blocks | !Cidr [!Ref VPCCidr, 6, 8] |
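
To build intuition for what a few of these functions return, here is a rough local emulation in Python. It is a sketch, not how CloudFormation evaluates templates: the `fn_*` names are hypothetical, and `fn_cidr` assumes IPv4, where the third argument counts the bits left for addresses inside each block (8 bits gives a /24).

```python
import ipaddress
from itertools import islice


def fn_cidr(ip_block: str, count: int, cidr_bits: int) -> list[str]:
    """Approximate !Cidr: the first `count` blocks carved out of ip_block."""
    new_prefix = 32 - cidr_bits  # 8 address bits per block means /24
    blocks = ipaddress.ip_network(ip_block).subnets(new_prefix=new_prefix)
    return [str(block) for block in islice(blocks, count)]


def fn_select(index: int, items: list[str]) -> str:
    """Approximate !Select: pick one item by zero-based index."""
    return items[index]


def fn_split(delimiter: str, text: str) -> list[str]:
    """Approximate !Split: turn a string into a list."""
    return text.split(delimiter)


def fn_join(delimiter: str, items: list[str]) -> str:
    """Approximate !Join: turn a list into a string."""
    return delimiter.join(items)


vpc_cidr = "10.0.0.0/16"
subnet_cidrs = fn_cidr(vpc_cidr, 6, 8)
print(subnet_cidrs)
print(fn_select(0, subnet_cidrs))
print(fn_split("/", vpc_cidr))
print(fn_join(",", ["subnet-a", "subnet-b"]))
```

Combining !Select with !Cidr in this way lets a template derive subnet ranges from a single VPCCidr parameter rather than taking one parameter per subnet.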

A stack is an instance of a template. When you create a stack, CloudFormation provisions all the resources defined in the template. When you update a stack, it calculates the diff and applies only the changes. When you delete a stack, it tears down all resources in reverse dependency order.

# Create a stack from a local template
aws cloudformation create-stack \
  --stack-name my-network \
  --template-body file://network.yaml \
  --parameters \
    ParameterKey=EnvironmentName,ParameterValue=production \
    ParameterKey=VPCCidr,ParameterValue=10.0.0.0/16

# Create a stack that creates IAM resources (requires explicit capability)
aws cloudformation create-stack \
  --stack-name my-app \
  --template-body file://app.yaml \
  --capabilities CAPABILITY_NAMED_IAM

# Wait for completion
aws cloudformation wait stack-create-complete --stack-name my-network

# Check stack status
aws cloudformation describe-stacks \
  --stack-name my-network \
  --query 'Stacks[0].[StackName,StackStatus]' \
  --output text
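
The same create, wait, and check sequence can be driven from Python. This is a sketch that assumes boto3 is installed and credentials are configured. `to_cfn_parameters` and `create_and_wait` are hypothetical helper names, and the client is passed in so the helper can be exercised without touching AWS.

```python
def to_cfn_parameters(values: dict[str, str]) -> list[dict[str, str]]:
    """Convert {"Key": "value"} into the list-of-dicts shape the API expects."""
    return [
        {"ParameterKey": key, "ParameterValue": value}
        for key, value in values.items()
    ]


def create_and_wait(cfn, stack_name: str, template_body: str,
                    parameters: dict[str, str]) -> str:
    """Create a stack, block until creation finishes, and return its status.

    `cfn` is expected to behave like boto3.client("cloudformation").
    """
    cfn.create_stack(
        StackName=stack_name,
        TemplateBody=template_body,
        Parameters=to_cfn_parameters(parameters),
    )
    # The waiter polls the stack and raises if creation fails or rolls back.
    cfn.get_waiter("stack_create_complete").wait(StackName=stack_name)
    stack = cfn.describe_stacks(StackName=stack_name)["Stacks"][0]
    return stack["StackStatus"]


# Example usage (assumes boto3 and configured credentials):
#   import boto3
#   with open("network.yaml") as handle:
#       status = create_and_wait(
#           boto3.client("cloudformation"), "my-network", handle.read(),
#           {"EnvironmentName": "production", "VPCCidr": "10.0.0.0/16"},
#       )
```

Passing the client in, rather than creating it inside the function, also makes it easy to target a different region or profile per call.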

Update Behavior: The Three Types of Resource Changes


When you update a stack, each resource change falls into one of three categories:

+-------------------------------------------------------------------+
| Update with No Interruption                                       |
|   - Resource stays running, updated in-place                      |
|   - Example: Adding an ingress rule to a security group           |
|   - Example: Adding a tag to an instance                          |
+-------------------------------------------------------------------+
| Update with Some Interruption                                     |
|   - Resource may restart or briefly disconnect                    |
|   - Example: Changing an EC2 instance type (requires stop/start)  |
|   - Example: Modifying RDS parameter group                        |
+-------------------------------------------------------------------+
| Replacement                                                       |
|   - Old resource deleted, new one created                         |
|   - Example: Changing a VPC CIDR block                            |
|   - Example: Changing an RDS engine type                          |
|   - WARNING: Data loss if not handled carefully!                  |
+-------------------------------------------------------------------+

Always check the AWS documentation for a resource type to understand which property changes trigger replacement. The CloudFormation docs mark each property with “Update requires: No interruption,” “Some interruption,” or “Replacement.”

Stop and think: During a stack update, CloudFormation determines that an EC2 instance must be replaced. By default, it attempts to create the new instance before deleting the old one. If your template also provisions an Elastic IP address and attaches it directly to this instance, why might this “create-before-delete” replacement update immediately fail?

Never update a production stack blindly. Create a change set first:

# Create a change set (does NOT apply changes)
aws cloudformation create-change-set \
--stack-name my-network \
--change-set-name update-subnets \
--template-body file://network-v2.yaml \
--parameters \
ParameterKey=EnvironmentName,ParameterValue=production
# Review what will change
aws cloudformation describe-change-set \
--stack-name my-network \
--change-set-name update-subnets \
--query 'Changes[*].ResourceChange.{Action:Action,Resource:LogicalResourceId,Type:ResourceType,Replacement:Replacement}' \
--output table
# If changes look safe, execute
aws cloudformation execute-change-set \
--stack-name my-network \
--change-set-name update-subnets
# If changes are wrong, delete the change set (no effect on stack)
aws cloudformation delete-change-set \
--stack-name my-network \
--change-set-name update-subnets

The change set output reports an Action for each resource (Add, Modify, or Remove) and, for modifications, a Replacement value of True, False, or Conditional. This is your safety net.
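
Reviewing a change set by eye works for a handful of resources. For larger updates, a small filter can surface only the entries that deserve attention. The sketch below works on a dict shaped like the describe-change-set JSON output. The function name `risky_changes` and the sample data are hypothetical, and treating removals and possible replacements as "risky" is a choice made for illustration.

```python
def risky_changes(change_set: dict) -> list[str]:
    """Return a summary line for each removal or (possible) replacement."""
    findings = []
    for change in change_set.get("Changes", []):
        detail = change.get("ResourceChange", {})
        action = detail.get("Action")
        replacement = detail.get("Replacement", "False")
        is_replacement = action == "Modify" and replacement in ("True", "Conditional")
        if action == "Remove" or is_replacement:
            findings.append(
                f"{action} {detail.get('LogicalResourceId')} "
                f"({detail.get('ResourceType')}), replacement={replacement}"
            )
    return findings


# Hypothetical change set, shaped like the describe-change-set response.
sample_change_set = {
    "Changes": [
        {"ResourceChange": {"Action": "Add", "LogicalResourceId": "PublicSubnet3",
                            "ResourceType": "AWS::EC2::Subnet"}},
        {"ResourceChange": {"Action": "Modify", "LogicalResourceId": "VPC",
                            "ResourceType": "AWS::EC2::VPC", "Replacement": "True"}},
        {"ResourceChange": {"Action": "Modify", "LogicalResourceId": "WebServer",
                            "ResourceType": "AWS::EC2::Instance", "Replacement": "False"}},
        {"ResourceChange": {"Action": "Remove", "LogicalResourceId": "OldBucket",
                            "ResourceType": "AWS::S3::Bucket"}},
    ]
}

for line in risky_changes(sample_change_set):
    print(line)
```

A check like this could run in a pipeline and require manual approval whenever the list is not empty.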

If a stack creation or update fails, CloudFormation rolls back automatically:

  • Create failure: All resources created so far are deleted
  • Update failure: All changes are reverted to the previous state
  • Delete failure: Stack enters DELETE_FAILED state (usually due to resources that cannot be deleted, like non-empty S3 buckets)

You can disable rollback for debugging (--disable-rollback), but never do this in production.


When your infrastructure grows beyond 200-300 resources, a single template becomes unwieldy. Nested stacks let you compose multiple templates:

# main.yaml
Resources:
  NetworkStack:
    Type: AWS::CloudFormation::Stack
    Properties:
      TemplateURL: https://s3.amazonaws.com/my-templates/network.yaml
      Parameters:
        EnvironmentName: !Ref EnvironmentName
        VPCCidr: !Ref VPCCidr
  DatabaseStack:
    Type: AWS::CloudFormation::Stack
    DependsOn: NetworkStack
    Properties:
      TemplateURL: https://s3.amazonaws.com/my-templates/database.yaml
      Parameters:
        VPCId: !GetAtt NetworkStack.Outputs.VPCId
        PrivateSubnetIds: !GetAtt NetworkStack.Outputs.PrivateSubnetIds
  ApplicationStack:
    Type: AWS::CloudFormation::Stack
    DependsOn: [NetworkStack, DatabaseStack]
    Properties:
      TemplateURL: https://s3.amazonaws.com/my-templates/application.yaml
      Parameters:
        VPCId: !GetAtt NetworkStack.Outputs.VPCId
        DatabaseEndpoint: !GetAtt DatabaseStack.Outputs.DatabaseEndpoint
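
The DependsOn entries above form a dependency graph, and the order in which the child stacks are created and deleted follows from it. As a rough illustration, Python's standard `graphlib` module can derive that order. The dictionary below is a hand-written copy of the DependsOn relationships, not something read from the template.

```python
from graphlib import TopologicalSorter

# Each stack maps to the set of stacks it depends on (from DependsOn above).
depends_on = {
    "NetworkStack": set(),
    "DatabaseStack": {"NetworkStack"},
    "ApplicationStack": {"NetworkStack", "DatabaseStack"},
}

# Dependencies come first on create and last on delete.
creation_order = list(TopologicalSorter(depends_on).static_order())
deletion_order = list(reversed(creation_order))

print("create:", creation_order)
print("delete:", deletion_order)
```

The !GetAtt references to NetworkStack.Outputs already create these dependencies implicitly, so the explicit DependsOn lines mainly serve as documentation for readers of the template.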

Common patterns for stack boundaries:

Option A: By Layer            Option B: By Service
+-------------------+         +----------+----------+
|    Application    |         | Service A| Service B|
+-------------------+         | (App+DB+ | (App+DB+ |
|     Database      |         |  Network)| Network) |
+-------------------+         +----------+----------+
|      Network      |         |   Shared Network    |
+-------------------+         +---------------------+

Option A (layer-based) works well for monolithic applications. Option B (service-based) works better for microservices where each team owns their full stack.


CloudFormation vs Terraform: When to Use What


This is one of the most debated topics in DevOps. Here is an honest comparison:

| Factor | CloudFormation | Terraform |
| --- | --- | --- |
| AWS-only | Native, first-class | Excellent support via AWS provider |
| Multi-cloud | AWS only | Multi-cloud, multi-provider |
| State management | Managed by AWS (no state file) | State file (local or remote, you manage) |
| Drift detection | Built-in | terraform plan shows drift |
| Rollback | Automatic on failure | Manual (apply previous state) |
| Language | YAML/JSON (declarative) | HCL (declarative with loops, modules) |
| Modularity | Nested stacks, StackSets | Modules (more flexible) |
| Learning curve | Moderate (verbose but predictable) | Moderate (more features to learn) |
| Cost | Free | Free (OSS), paid for Terraform Cloud |
| Community modules | Limited (AWS Samples) | Vast (Terraform Registry) |
| Speed | Often slower (independent resources run in parallel, but each waits for stabilization) | Often faster (parallel by default, 10 concurrent operations) |

Use CloudFormation when:

  • Your organization is AWS-only and will stay that way
  • You want zero state management overhead
  • You need automatic rollback guarantees
  • You are using AWS services that require CloudFormation (Service Catalog, Control Tower)

Use Terraform when:

  • You use multiple cloud providers or SaaS services
  • You need HCL’s programming features (for-each loops, dynamic blocks)
  • Your team already knows Terraform
  • You want a rich ecosystem of community modules

Many teams use both: CloudFormation for AWS-native constructs (especially in landing zones and governance) and Terraform for application infrastructure.


The AWS Cloud Development Kit lets you define CloudFormation resources using real programming languages (TypeScript, Python, Java, C#, Go). Under the hood, CDK generates a CloudFormation template:

# CDK Python example -- this generates a CloudFormation template
from aws_cdk import Stack, aws_ec2 as ec2
from constructs import Construct

class NetworkStack(Stack):
    def __init__(self, scope: Construct, id: str, **kwargs):
        super().__init__(scope, id, **kwargs)
        self.vpc = ec2.Vpc(self, "MainVPC",
            max_azs=3,
            nat_gateways=1,
            subnet_configuration=[
                ec2.SubnetConfiguration(
                    name="Public",
                    subnet_type=ec2.SubnetType.PUBLIC,
                    cidr_mask=24
                ),
                ec2.SubnetConfiguration(
                    name="Private",
                    subnet_type=ec2.SubnetType.PRIVATE_WITH_EGRESS,
                    cidr_mask=24
                )
            ]
        )

This 20-line Python class generates a CloudFormation template with a VPC, 6 subnets (public + private in 3 AZs), route tables, a NAT Gateway, and an Internet Gateway — roughly 200 lines of YAML. CDK is worth learning if you find yourself writing repetitive CloudFormation templates, but understanding raw CloudFormation first is essential because CDK debugging requires reading the generated template.


| Mistake | Why It Happens | How to Fix It |
| --- | --- | --- |
| Hardcoding resource names | Wanting predictable names | Let CloudFormation generate names; hardcoded names prevent replacement updates and cause conflicts across environments |
| Not using change sets for production updates | “I know what changed” confidence | Always create and review a change set; the 30 seconds it takes has prevented countless outages |
| Monolithic templates with 400+ resources | Starting small and never splitting | Plan stack boundaries early; split by layer (network/app/data) or by service boundary |
| Forgetting --capabilities CAPABILITY_NAMED_IAM | Template creates IAM roles but deploy command omits the flag | Add CAPABILITY_NAMED_IAM (or CAPABILITY_IAM) whenever your template creates IAM resources |
| Not setting DeletionPolicy: Retain on databases | Assuming delete protection is enough | Set DeletionPolicy: Retain on RDS instances, S3 buckets with data, and DynamoDB tables so accidental stack deletion does not destroy data |
| Using !Ref when !GetAtt is needed | Confusion about which function returns which value | !Ref returns the primary identifier (e.g., instance ID); !GetAtt returns other attributes (e.g., DNS name, ARN); check the docs for each resource type |
| Manual console changes to CloudFormation-managed resources | “Just this one quick fix” | Run drift detection regularly; treat manual changes as tech debt that must be reconciled with the template |
| Not exporting outputs from shared stacks | Copy-pasting resource IDs between templates | Use Export on outputs and Fn::ImportValue in consuming stacks; this creates explicit dependencies and prevents accidental deletion |
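
Some of these mistakes can be caught mechanically before a template is deployed. As one example, the sketch below scans a template that has already been parsed into a dict (a JSON template loaded with `json.load`, for instance) and reports stateful resources that lack DeletionPolicy: Retain. The set of types and the function name are assumptions chosen for illustration. A real check might also accept DeletionPolicy: Snapshot for databases.

```python
# Resource types this sketch treats as stateful (an assumption; extend as needed).
STATEFUL_TYPES = {
    "AWS::RDS::DBInstance",
    "AWS::S3::Bucket",
    "AWS::DynamoDB::Table",
}


def unprotected_stateful_resources(template: dict) -> list[str]:
    """Return logical names of stateful resources without DeletionPolicy: Retain."""
    return sorted(
        name
        for name, resource in template.get("Resources", {}).items()
        if resource.get("Type") in STATEFUL_TYPES
        and resource.get("DeletionPolicy") != "Retain"
    )


# Hypothetical template fragment, already parsed into a dict.
sample_template = {
    "Resources": {
        "OrdersTable": {"Type": "AWS::DynamoDB::Table", "DeletionPolicy": "Retain"},
        "UploadsBucket": {"Type": "AWS::S3::Bucket"},
        "AppDatabase": {"Type": "AWS::RDS::DBInstance", "DeletionPolicy": "Delete"},
        "WebSecurityGroup": {"Type": "AWS::EC2::SecurityGroup"},
    }
}

print(unprotected_stateful_resources(sample_template))
```

Note that DeletionPolicy is a resource-level attribute that sits next to Type and Properties, not inside Properties.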

1. You are deploying a massive infrastructure update involving 50 new resources. During the deployment, the 45th resource fails to create due to an insufficient permissions error. What state will the first 44 resources be in after the deployment process fully concludes?

CloudFormation will automatically roll back the entire deployment, meaning the first 44 resources will be completely deleted if this was a new stack, or reverted to their previous state if this was an update. This “all-or-nothing” transaction model ensures your infrastructure never gets stuck in an inconsistent, partially deployed state. Once the rollback finishes, the stack will reach the UPDATE_ROLLBACK_COMPLETE or ROLLBACK_COMPLETE state, representing the last known good configuration. This automatic safety mechanism is a key differentiator from tools like Terraform, where a failed apply often leaves resources in a partial state that requires manual cleanup.

2. Your company is expanding its network and you need to increase the size of an existing production VPC. You update the `CidrBlock` property in your CloudFormation template from `10.0.0.0/20` to `10.0.0.0/16` and execute the update. What is the immediate impact on the resources currently running inside this VPC?

Updating the CidrBlock property of an existing VPC requires replacement, meaning CloudFormation will attempt to create a brand new VPC and delete the old one. Because a VPC cannot be deleted while it still contains active subnets, instances, and network interfaces, the update will almost certainly fail and roll back unless you have orchestrated a complex migration strategy. This destructive behavior occurs because the fundamental networking boundary of the resources is changing, preventing an in-place modification. To grow an existing VPC without replacing it, associate a secondary CIDR block (the AWS::EC2::VPCCidrBlock resource) instead of editing the primary one. You should always use change sets to catch Replacement: True actions on foundational resources before they cause widespread outages or failed updates.

3. You are writing a CloudFormation template that deploys an EC2 instance and a configuration script. You need to pass the instance's private IP address to the script as an environment variable, but using `!Ref MyInstance` is causing the script to fail. Why is this happening, and how do you resolve it?

The script is failing because !Ref applied to an EC2 instance returns the instance’s primary identifier, which is its Instance ID (e.g., i-0abc123def456789), not its IP address. To retrieve secondary attributes like IP addresses or DNS names, you must use the !GetAtt intrinsic function instead. By changing your template to use !GetAtt MyInstance.PrivateIp, CloudFormation will correctly resolve and inject the private IP address into your configuration script. Always consult the CloudFormation resource reference documentation, as each resource type defines exactly what !Ref returns and which specific attributes are exposed via !GetAtt.

4. You have explicitly named a production S3 bucket `my-app-data-bucket` in your CloudFormation template. Months later, you modify a bucket property whose update requires replacement and execute the update. Why does the update immediately fail and roll back?

The update fails because explicitly hardcoded names prevent CloudFormation from performing its standard “create-before-delete” replacement process. When CloudFormation attempts to create the new replacement bucket, it tries to use the exact same name (my-app-data-bucket), which collides with the existing bucket that has not been deleted yet. Because S3 bucket names must be globally unique, AWS rejects the creation request, causing the entire stack update to abort and roll back. To avoid this lifecycle deadlock, you should allow CloudFormation to auto-generate physical names or use dynamic names incorporating the stack name, ensuring replacement resources get a unique identifier before the old resource is destroyed.

5. Your platform team manages the core VPC network, while three independent product teams manage their own application stacks that need to deploy resources into that VPC. Should you use nested stacks to connect the applications to the network, or cross-stack references (Exports/ImportValue)? Why?

You should use cross-stack references (Outputs with Export and !ImportValue) because the network and the applications have completely independent lifecycles and are managed by different teams. Nested stacks are designed for tightly coupled resources that share a single lifecycle and are deployed together by a single owner as a monolithic unit. By using cross-stack references, you establish a hard dependency graph at the AWS level, ensuring the platform team cannot accidentally delete the core VPC while the product teams’ applications are still actively relying on its exported subnets. This loosely coupled approach perfectly aligns with the organizational boundary between the platform and product teams.

6. Your team is adopting AWS CDK to replace raw YAML templates. A developer argues that since CDK uses TypeScript, they no longer need to understand CloudFormation concepts like logical IDs, stack rollbacks, or change sets. How would you correct this architectural misunderstanding?

You must correct this misunderstanding by explaining that CDK is not an alternative infrastructure engine, but rather a higher-level abstraction layer that compiles directly down into standard CloudFormation templates. When you run cdk deploy, AWS is still executing a CloudFormation stack under the hood, meaning all the fundamental rules of CloudFormation—including resource replacement behaviors, stack state machines, and drift detection—still entirely govern your deployment. Furthermore, when deployments fail, AWS returns errors referencing the generated CloudFormation logical IDs and property structures, making it impossible to effectively debug CDK applications without a solid understanding of the underlying CloudFormation engine.

7. A junior engineer accidentally deletes the CloudFormation stack that manages your production RDS database. After the stack deletion successfully completes, you find that the database instance is still running normally and the data is completely intact. What specific template configuration prevented a catastrophic data loss, and how does it alter the standard stack lifecycle?

The template utilized the DeletionPolicy: Retain attribute on the RDS database resource, which explicitly overrides CloudFormation’s default behavior of destroying managed resources during stack deletion. When the stack was deleted, CloudFormation simply removed the database from its internal tracking state, leaving the physical AWS resource abandoned but completely operational. This safeguard is critical for any stateful resource containing persistent data, as it decouples the lifecycle of the data from the lifecycle of the infrastructure automation code. To resume managing the database with IaC, you would need to import the retained resource back into a new CloudFormation stack.


Hands-On Exercise: Deploy a VPC Architecture from CloudFormation


Create a production-ready VPC with public and private subnets across two availability zones, an Internet Gateway, and a NAT Gateway — all defined in a single CloudFormation template. Then update the stack using a change set.

Create a template that defines a VPC with public and private subnets.

Solution

Save this as vpc-stack.yaml:

AWSTemplateFormatVersion: "2010-09-09"
Description: "Production VPC with public and private subnets in 2 AZs"
Parameters:
  EnvironmentName:
    Type: String
    Default: cfn-lab
    Description: "Environment name prefixed to resources"
  VPCCidr:
    Type: String
    Default: "10.100.0.0/16"
    Description: "CIDR block for the VPC"
  PublicSubnet1Cidr:
    Type: String
    Default: "10.100.1.0/24"
  PublicSubnet2Cidr:
    Type: String
    Default: "10.100.2.0/24"
  PrivateSubnet1Cidr:
    Type: String
    Default: "10.100.10.0/24"
  PrivateSubnet2Cidr:
    Type: String
    Default: "10.100.20.0/24"
  EnableNATGateway:
    Type: String
    Default: "true"
    AllowedValues: ["true", "false"]
    Description: "Create a NAT Gateway for private subnet internet access"
Conditions:
  CreateNAT: !Equals [!Ref EnableNATGateway, "true"]
Resources:
  # ============ VPC ============
  VPC:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: !Ref VPCCidr
      EnableDnsSupport: true
      EnableDnsHostnames: true
      Tags:
        - Key: Name
          Value: !Sub "${EnvironmentName}-vpc"
  # ============ Internet Gateway ============
  InternetGateway:
    Type: AWS::EC2::InternetGateway
    Properties:
      Tags:
        - Key: Name
          Value: !Sub "${EnvironmentName}-igw"
  InternetGatewayAttachment:
    Type: AWS::EC2::VPCGatewayAttachment
    Properties:
      InternetGatewayId: !Ref InternetGateway
      VpcId: !Ref VPC
  # ============ Public Subnets ============
  PublicSubnet1:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VPC
      AvailabilityZone: !Select [0, !GetAZs ""]
      CidrBlock: !Ref PublicSubnet1Cidr
      MapPublicIpOnLaunch: true
      Tags:
        - Key: Name
          Value: !Sub "${EnvironmentName}-public-1"
  PublicSubnet2:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VPC
      AvailabilityZone: !Select [1, !GetAZs ""]
      CidrBlock: !Ref PublicSubnet2Cidr
      MapPublicIpOnLaunch: true
      Tags:
        - Key: Name
          Value: !Sub "${EnvironmentName}-public-2"
  # ============ Private Subnets ============
  PrivateSubnet1:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VPC
      AvailabilityZone: !Select [0, !GetAZs ""]
      CidrBlock: !Ref PrivateSubnet1Cidr
      Tags:
        - Key: Name
          Value: !Sub "${EnvironmentName}-private-1"
  PrivateSubnet2:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VPC
      AvailabilityZone: !Select [1, !GetAZs ""]
      CidrBlock: !Ref PrivateSubnet2Cidr
      Tags:
        - Key: Name
          Value: !Sub "${EnvironmentName}-private-2"
  # ============ Public Route Table ============
  PublicRouteTable:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref VPC
      Tags:
        - Key: Name
          Value: !Sub "${EnvironmentName}-public-rt"
  DefaultPublicRoute:
    Type: AWS::EC2::Route
    DependsOn: InternetGatewayAttachment
    Properties:
      RouteTableId: !Ref PublicRouteTable
      DestinationCidrBlock: 0.0.0.0/0
      GatewayId: !Ref InternetGateway
  PublicSubnet1RouteTableAssoc:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      SubnetId: !Ref PublicSubnet1
      RouteTableId: !Ref PublicRouteTable
  PublicSubnet2RouteTableAssoc:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      SubnetId: !Ref PublicSubnet2
      RouteTableId: !Ref PublicRouteTable
  # ============ NAT Gateway (Conditional) ============
  NATElasticIP:
    Type: AWS::EC2::EIP
    Condition: CreateNAT
    DependsOn: InternetGatewayAttachment
    Properties:
      Domain: vpc
      Tags:
        - Key: Name
          Value: !Sub "${EnvironmentName}-nat-eip"
  NATGateway:
    Type: AWS::EC2::NatGateway
    Condition: CreateNAT
    Properties:
      AllocationId: !GetAtt NATElasticIP.AllocationId
      SubnetId: !Ref PublicSubnet1
      Tags:
        - Key: Name
          Value: !Sub "${EnvironmentName}-nat"
  # ============ Private Route Table ============
  PrivateRouteTable:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref VPC
      Tags:
        - Key: Name
          Value: !Sub "${EnvironmentName}-private-rt"
  DefaultPrivateRoute:
    Type: AWS::EC2::Route
    Condition: CreateNAT
    Properties:
      RouteTableId: !Ref PrivateRouteTable
      DestinationCidrBlock: 0.0.0.0/0
      NatGatewayId: !Ref NATGateway
  PrivateSubnet1RouteTableAssoc:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      SubnetId: !Ref PrivateSubnet1
      RouteTableId: !Ref PrivateRouteTable
  PrivateSubnet2RouteTableAssoc:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      SubnetId: !Ref PrivateSubnet2
      RouteTableId: !Ref PrivateRouteTable
Outputs:
  VPCId:
    Description: "VPC ID"
    Value: !Ref VPC
    Export:
      Name: !Sub "${EnvironmentName}-VPCId"
  PublicSubnet1Id:
    Description: "Public Subnet 1 ID"
    Value: !Ref PublicSubnet1
    Export:
      Name: !Sub "${EnvironmentName}-PublicSubnet1Id"
  PublicSubnet2Id:
    Description: "Public Subnet 2 ID"
    Value: !Ref PublicSubnet2
    Export:
      Name: !Sub "${EnvironmentName}-PublicSubnet2Id"
  PrivateSubnet1Id:
    Description: "Private Subnet 1 ID"
    Value: !Ref PrivateSubnet1
    Export:
      Name: !Sub "${EnvironmentName}-PrivateSubnet1Id"
  PrivateSubnet2Id:
    Description: "Private Subnet 2 ID"
    Value: !Ref PrivateSubnet2
    Export:
      Name: !Sub "${EnvironmentName}-PrivateSubnet2Id"
  PublicSubnetIds:
    Description: "Comma-separated public subnet IDs"
    Value: !Join [",", [!Ref PublicSubnet1, !Ref PublicSubnet2]]
  PrivateSubnetIds:
    Description: "Comma-separated private subnet IDs"
    Value: !Join [",", [!Ref PrivateSubnet1, !Ref PrivateSubnet2]]

Validate the template syntax, then create the stack.

Solution
# Validate the template (catches syntax errors)
aws cloudformation validate-template \
  --template-body file://vpc-stack.yaml

# Create the stack (without NAT Gateway to save cost)
aws cloudformation create-stack \
  --stack-name cfn-lab-network \
  --template-body file://vpc-stack.yaml \
  --parameters \
    ParameterKey=EnvironmentName,ParameterValue=cfn-lab \
    ParameterKey=EnableNATGateway,ParameterValue=false

# Wait for creation to complete
aws cloudformation wait stack-create-complete --stack-name cfn-lab-network

# Check status
aws cloudformation describe-stacks \
  --stack-name cfn-lab-network \
  --query 'Stacks[0].[StackName,StackStatus,CreationTime]' \
  --output table

# View the outputs
aws cloudformation describe-stacks \
  --stack-name cfn-lab-network \
  --query 'Stacks[0].Outputs[*].[OutputKey,OutputValue]' \
  --output table
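The PublicSubnetIds and PrivateSubnetIds outputs are comma-joined strings, so a script that consumes them needs to split the value before using individual subnet IDs. The following is a minimal sketch of one way to do that. The helper names `get_stack_output` and `split_csv` are hypothetical and not part of the lab.

```shell
# Hypothetical helper: read one output value from a stack.
# $1 = stack name, $2 = output key
get_stack_output() {
  aws cloudformation describe-stacks \
    --stack-name "$1" \
    --query "Stacks[0].Outputs[?OutputKey=='$2'].OutputValue" \
    --output text
}

# Hypothetical helper: print each item of a comma-joined value on its own line.
split_csv() {
  printf '%s\n' "$1" | tr ',' '\n'
}
```

For example, `split_csv "$(get_stack_output cfn-lab-network PublicSubnetIds)"` would print the two public subnet IDs, one per line, once the stack from this task exists.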

Task 3: Update the Stack Using a Change Set

Enable the NAT Gateway by updating the EnableNATGateway parameter.

Solution
# Create a change set to preview the update
aws cloudformation create-change-set \
  --stack-name cfn-lab-network \
  --change-set-name enable-nat-gateway \
  --template-body file://vpc-stack.yaml \
  --parameters \
    ParameterKey=EnvironmentName,ParameterValue=cfn-lab \
    ParameterKey=EnableNATGateway,ParameterValue=true

# Wait for change set to be created
aws cloudformation wait change-set-create-complete \
  --stack-name cfn-lab-network \
  --change-set-name enable-nat-gateway

# Review what will change
aws cloudformation describe-change-set \
  --stack-name cfn-lab-network \
  --change-set-name enable-nat-gateway \
  --query 'Changes[*].ResourceChange.{Action:Action,LogicalId:LogicalResourceId,Type:ResourceType}' \
  --output table
# You should see: Add NATElasticIP, Add NATGateway, Add DefaultPrivateRoute

# Execute the change set
aws cloudformation execute-change-set \
  --stack-name cfn-lab-network \
  --change-set-name enable-nat-gateway

# Wait for update to complete
aws cloudformation wait stack-update-complete --stack-name cfn-lab-network

# Verify NAT Gateway was created
aws ec2 describe-nat-gateways \
  --filter "Name=tag:Name,Values=cfn-lab-nat" \
  --query 'NatGateways[*].[NatGatewayId,State,SubnetId]' \
  --output table
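The review step above relies on reading the table by eye. In a script, the same check could be automated by refusing to execute a change set that does anything other than add resources. This is a sketch under assumptions: the function names are hypothetical, and it treats an empty result (or the literal text `None`) from `--output text` as "nothing risky found".

```shell
# Hypothetical helper: list logical IDs of changes whose action is not "Add"
# (for example Modify or Remove).
# $1 = stack name, $2 = change set name
non_additive_changes() {
  aws cloudformation describe-change-set \
    --stack-name "$1" \
    --change-set-name "$2" \
    --query "Changes[?ResourceChange.Action!='Add'].ResourceChange.LogicalResourceId" \
    --output text
}

# Hypothetical helper: execute the change set only if every change is an Add.
execute_if_additive() {
  risky="$(non_additive_changes "$1" "$2")"
  if [ -n "$risky" ] && [ "$risky" != "None" ]; then
    echo "Refusing to execute; review these resources first: $risky" >&2
    return 1
  fi
  aws cloudformation execute-change-set \
    --stack-name "$1" \
    --change-set-name "$2"
}
```

For this lab, `execute_if_additive cfn-lab-network enable-nat-gateway` should proceed, since the expected change set contains only the three Add actions listed above.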

Simulate a manual change, then detect the drift.

Solution
# Get the VPC ID from the stack outputs
VPC_ID=$(aws cloudformation describe-stacks \
  --stack-name cfn-lab-network \
  --query 'Stacks[0].Outputs[?OutputKey==`VPCId`].OutputValue' --output text)

# Make a manual change (add a tag via console or CLI)
aws ec2 create-tags \
  --resources "$VPC_ID" \
  --tags Key=ManualChange,Value=SomeoneUsedTheConsole

# Detect drift
DRIFT_ID=$(aws cloudformation detect-stack-drift \
  --stack-name cfn-lab-network \
  --query 'StackDriftDetectionId' --output text)

# Wait a moment for detection to complete
sleep 15

# Check drift status
aws cloudformation describe-stack-drift-detection-status \
  --stack-drift-detection-id "$DRIFT_ID" \
  --query '[StackDriftStatus,DetectionStatus]' --output text

# See which resources drifted
aws cloudformation describe-stack-resource-drifts \
  --stack-name cfn-lab-network \
  --stack-resource-drift-status-filters MODIFIED \
  --query 'StackResourceDrifts[*].[LogicalResourceId,StackResourceDriftStatus]' \
  --output table
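The fixed 15-second sleep is usually enough for a stack this small, but drift detection runs asynchronously and can take longer. A polling loop is one alternative. The sketch below assumes a hypothetical function name, `wait_for_drift_detection`, and arbitrary default values for the attempt count and delay.

```shell
# Hypothetical helper: poll until drift detection finishes, or give up.
# $1 = drift detection ID
# $2 = max attempts (assumed default: 20)
# $3 = delay between attempts in seconds (assumed default: 5)
wait_for_drift_detection() {
  attempts="${2:-20}"
  delay="${3:-5}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    status="$(aws cloudformation describe-stack-drift-detection-status \
      --stack-drift-detection-id "$1" \
      --query 'DetectionStatus' --output text)"
    case "$status" in
      DETECTION_COMPLETE) return 0 ;;
      DETECTION_FAILED)   echo "Drift detection failed" >&2; return 1 ;;
    esac
    # Still DETECTION_IN_PROGRESS: wait and try again
    i=$((i + 1))
    sleep "$delay"
  done
  echo "Timed out waiting for drift detection" >&2
  return 1
}
```

Calling `wait_for_drift_detection "$DRIFT_ID"` in place of `sleep 15` would return as soon as the detection status leaves `DETECTION_IN_PROGRESS`.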

Delete the stack to remove all resources created in this exercise.

Solution
# Delete the stack (this removes all resources)
aws cloudformation delete-stack --stack-name cfn-lab-network

# Wait for deletion
aws cloudformation wait stack-delete-complete --stack-name cfn-lab-network

# Verify the stack is gone
aws cloudformation list-stacks \
  --stack-status-filter DELETE_COMPLETE \
  --query 'StackSummaries[?StackName==`cfn-lab-network`].[StackName,StackStatus,DeletionTime]' \
  --output table

  • Template validates without errors (validate-template passes)
  • Stack creates successfully with VPC, 4 subnets, Internet Gateway, and route tables
  • Stack outputs show correct VPC ID and subnet IDs
  • Change set correctly previews the NAT Gateway addition (3 new resources)
  • Stack update adds NAT Gateway and private route successfully
  • Drift detection identifies the manual tag change on the VPC
  • Stack deletes cleanly with all resources removed

You have completed the AWS DevOps Essentials infrastructure modules. Return to the AWS Essentials README to review your progress and explore advanced topics. From here, consider diving into the Platform Engineering Track to learn how these AWS building blocks fit into a broader platform strategy.