Skip to content

Module 1.6: FinOps Culture & Automation

Discipline Module | Complexity: [MEDIUM] | Time: 2h

Before starting this module:

  • Required: Module 1.5: Storage & Network Cost Management — Completing the technical FinOps modules
  • Required: Understanding of CI/CD pipelines (GitHub Actions, GitLab CI, or similar)
  • Required: Basic Terraform familiarity (for Infracost exercise)
  • Recommended: Experience with Infrastructure as Code workflows
  • Recommended: Involvement in team budgeting or capacity planning

After completing this module, you will be able to:

  • Lead organizational change that embeds cost awareness into engineering culture and daily practices
  • Design gamification and incentive programs that motivate teams to optimize cloud spending
  • Build cost review processes that integrate with sprint planning and architecture decision records
  • Implement cost anomaly detection that alerts teams to unexpected spending changes before they compound

You’ve learned the technical side of FinOps: cost allocation, rightsizing, compute optimization, storage and network management. Impressive. But here’s the uncomfortable truth:

Technical optimization without cultural change is temporary.

You can rightsize every workload, migrate every volume to gp3, and add VPC endpoints everywhere — and six months later, costs will be right back where they started. Why? Because the behaviors that created the waste haven’t changed. Developers will still over-provision. Teams will still spin up resources without tags. Nobody will look at the dashboards you built.

FinOps is 30% technology and 70% culture. This module covers the 70%.

The FinOps Sustainability Problem:
┌─────────────────────────────────────────────────┐
│ │
│ Month 1: Big optimization push ↓ -35% cost │
│ Month 3: Teams drift back ↑ +12% │
│ Month 6: Nobody watching ↑ +22% │
│ Month 12: Back to original ↑ +35% │
│ │
│ Without culture change, optimization decays. │
│ │
│ With culture change: │
│ Month 1: Optimization push ↓ -35% cost │
│ Month 3: Automated guardrails → -2% more │
│ Month 6: Teams self-optimize → -5% more │
│ Month 12: Continuous improvement → -8% more │
│ │
└─────────────────────────────────────────────────┘

  • The FinOps Foundation’s 2024 survey found that the #1 challenge for FinOps practitioners is “getting engineers to take action” — not tooling, not data quality, not executive support. 73% of respondents cited cultural adoption as harder than technical implementation.

  • Spotify runs a “Cost Insights” plugin in their internal developer portal (Backstage) that shows every team their cloud costs alongside their service catalog. Engineers see cost data every time they open their service page — no separate dashboard required. This passive visibility has been credited with a 19% reduction in organic cost growth.

  • Etsy gamified cloud cost optimization by creating a “Cloud Leaderboard” showing which teams had the best cost-per-request metrics. Teams competed to optimize, and the program saved over $2 million in its first year. The leaderboard cost nothing to build — it was just a Grafana dashboard.


A FinOps guild (or community of practice) is a cross-functional group that drives cost awareness across the organization. Unlike a centralized FinOps team that does the work, a guild enables others to do the work.

FinOps Guild Structure:
┌─────────────────────────────────────────────────────┐
│ │
│ FinOps Lead (1 person, dedicated) │
│ ├── Finance Partner (CFO/FP&A representative) │
│ ├── Engineering Ambassadors (1 per team) │
│ │ ├── Payments team ambassador │
│ │ ├── Search team ambassador │
│ │ ├── ML team ambassador │
│ │ ├── Platform team ambassador │
│ │ └── Data team ambassador │
│ ├── Infrastructure/SRE representative │
│ └── Product/Business representative │
│ │
│ Cadence: │
│ • Weekly: 15-min async cost update (Slack) │
│ • Monthly: 1-hour cost review meeting │
│ • Quarterly: Business-aligned cost planning │
│ │
└─────────────────────────────────────────────────────┘
RoleResponsibilityTime Commitment
FinOps LeadOwn the practice, tooling, dashboards, reportingFull-time or 50%
Finance PartnerTranslate cloud costs to business metrics, budgeting4-6 hours/month
Engineering AmbassadorsChampion cost awareness in their team, review reports2-3 hours/month
Infra/SRE RepTechnical optimization, tooling, automation8-10 hours/month
Product RepConnect cost to product value, prioritize optimization2-3 hours/month

Month 1: Foundation

  • Identify FinOps lead (can be part-time initially)
  • Recruit one ambassador per engineering team
  • Deploy basic cost visibility (OpenCost or Kubecost free tier)
  • Send first monthly cost report (showback)

Month 2-3: Awareness

  • Run first cost optimization sprint (pick top 5 waste items)
  • Create team-level dashboards
  • Start weekly Slack cost updates
  • Celebrate wins publicly (people love seeing “Team X saved $800/month”)

Month 4-6: Habits

  • Integrate cost into sprint planning (“this feature will cost ~$X/month”)
  • Launch cost anomaly alerts
  • Begin chargeback conversations
  • Train ambassadors on cost optimization techniques

The cheapest time to catch a cost problem is before it’s deployed. Infracost integrates into your Terraform/OpenTofu CI pipeline to estimate the cost of infrastructure changes before they’re applied.

Developer Workflow with Infracost:
┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
│ Developer│────▶│ PR with │────▶│ CI runs │────▶│ Cost │
│ changes │ │ Terraform│ │ Infracost│ │ estimate │
│ infra │ │ files │ │ diff │ │ in PR │
└──────────┘ └──────────┘ └──────────┘ └──────────┘
┌─────────▼──────────┐
│ "This PR increases │
│ monthly cost by │
│ $342 (+18%)" │
│ │
│ ✅ Under threshold │
│ ❌ Blocks if >10% │
└────────────────────┘
## 💰 Infracost Cost Estimate
| Project | Previous | New | Diff |
|---------|----------|-----|------|
| production/eks | $1,847/mo | $2,189/mo | +$342 (+18.5%) |
### Cost Breakdown
| Resource | Monthly Cost | Change |
|----------|-------------|--------|
| aws_eks_node_group.workers | $1,420 → $1,762 | +$342 |
| (added 2× m6i.xlarge nodes) | | |
### Details
- Node group scaled from 5 to 7 instances
- Instance type: m6i.xlarge ($140.16/mo each)
- Consider: Could Karpenter autoscaling handle this instead?
⚠️ **This PR exceeds the 10% cost increase threshold.**
Approval from @finops-team required.

GitHub Actions: Infracost with Cost Threshold

Section titled “GitHub Actions: Infracost with Cost Threshold”
.github/workflows/infracost.yml
name: Infracost Cost Check
on:
pull_request:
paths:
- 'terraform/**'
- '*.tf'
permissions:
contents: read
pull-requests: write
jobs:
infracost:
runs-on: ubuntu-latest
name: Infracost Cost Estimate
steps:
- name: Checkout base branch
uses: actions/checkout@v4
with:
ref: ${{ github.event.pull_request.base.sha }}
- name: Setup Infracost
uses: infracost/actions/setup@v3
with:
api-key: ${{ secrets.INFRACOST_API_KEY }}
- name: Generate base cost
run: |
infracost breakdown \
--path=terraform/ \
--format=json \
--out-file=/tmp/infracost-base.json
- name: Checkout PR branch
uses: actions/checkout@v4
- name: Generate PR cost diff
run: |
infracost diff \
--path=terraform/ \
--compare-to=/tmp/infracost-base.json \
--format=json \
--out-file=/tmp/infracost-diff.json
- name: Post PR comment
uses: infracost/actions/comment@v3
with:
path: /tmp/infracost-diff.json
behavior: update
- name: Check cost threshold
run: |
# Extract the cost difference (positive = increase, negative = decrease)
DIFF=$(cat /tmp/infracost-diff.json | \
jq -r '.diffTotalMonthlyCost // "0"')
BASE=$(cat /tmp/infracost-diff.json | \
jq -r '.pastTotalMonthlyCost // "1"')
# Only check increases (skip if cost decreased)
if [ "$(echo "$DIFF <= 0" | bc)" -eq 1 ]; then
echo "Cost change: \$${DIFF}/mo (decrease or zero). No threshold check needed."
exit 0
fi
# Calculate percentage increase
if [ "$(echo "$BASE > 0" | bc)" -eq 1 ]; then
PCT=$(echo "scale=1; ($DIFF / $BASE) * 100" | bc)
else
PCT="0"
fi
echo "Cost increase: ${PCT}%"
# Fail if cost increase exceeds 10%
THRESHOLD=10
if [ "$(echo "$PCT > $THRESHOLD" | bc)" -eq 1 ]; then
echo "::error::Cost increase of ${PCT}% exceeds ${THRESHOLD}% threshold"
echo "This PR requires FinOps team approval."
exit 1
fi
echo "Cost change within threshold."
.gitlab-ci.yml
infracost:
stage: validate
image: infracost/infracost:ci-latest
script:
- git checkout $CI_MERGE_REQUEST_TARGET_BRANCH_NAME
- infracost breakdown --path=terraform/ --format=json --out-file=base.json
- git checkout $CI_COMMIT_SHA
- infracost diff --path=terraform/ --compare-to=base.json --format=json --out-file=diff.json
- infracost comment gitlab --path=diff.json --gitlab-token=$GITLAB_TOKEN
--repo=$CI_PROJECT_PATH --merge-request=$CI_MERGE_REQUEST_IID
--behavior=update
rules:
- if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
changes:
- terraform/**/*

Anomaly detection catches unexpected cost spikes before they become budget-busting surprises.

Anomaly TypeExampleDetection Method
SpikeCost jumps 40% day-over-dayThreshold alerting
TrendGradual 5% weekly increase for 6 weeksTrend analysis
New resourceUnfamiliar service appears on billService allowlist
Rate changeRI expires, On-Demand kicks inCommitment monitoring
Data transferCross-region traffic surgeNetwork flow analysis
# Terraform: AWS Cost Anomaly Detection
resource "aws_ce_anomaly_monitor" "service_monitor" {
name = "service-cost-monitor"
monitor_type = "DIMENSIONAL"
monitor_dimension = "SERVICE"
}
resource "aws_ce_anomaly_monitor" "team_monitor" {
name = "team-cost-monitor"
monitor_type = "CUSTOM"
monitor_specification = jsonencode({
And = null
Or = null
Not = null
Tags = {
Key = "team"
Values = []
MatchOptions = ["ABSENT"]
}
})
}
resource "aws_ce_anomaly_subscription" "alerts" {
name = "cost-anomaly-alerts"
frequency = "DAILY"
monitor_arn_list = [
aws_ce_anomaly_monitor.service_monitor.arn,
aws_ce_anomaly_monitor.team_monitor.arn,
]
subscriber {
type = "EMAIL"
address = "finops-team@company.com"
}
subscriber {
type = "SNS"
address = aws_sns_topic.cost_alerts.arn
}
threshold_expression {
dimension {
key = "ANOMALY_TOTAL_IMPACT_ABSOLUTE"
values = ["100"] # Alert if impact > $100
match_options = ["GREATER_THAN_OR_EQUAL"]
}
}
}

Prometheus-Based Anomaly Alerting (Kubernetes)

Section titled “Prometheus-Based Anomaly Alerting (Kubernetes)”
# PrometheusRule for cost anomaly alerts
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
name: finops-anomaly-alerts
namespace: monitoring
spec:
groups:
- name: finops.anomalies
rules:
# Alert when namespace cost increases >30% vs 7-day average
- alert: CostAnomalySpike
expr: |
(
sum by (namespace) (
kubecost_allocation_cpu_cost + kubecost_allocation_memory_cost
)
/
avg_over_time(
(sum by (namespace) (
kubecost_allocation_cpu_cost + kubecost_allocation_memory_cost
))[7d:1h]
)
) > 1.3
for: 2h
labels:
severity: warning
team: finops
annotations:
summary: "Cost anomaly: {{ $labels.namespace }} cost increased >30%"
description: "Namespace {{ $labels.namespace }} cost is {{ $value | humanize }}x its 7-day average."
# Alert when new untagged resources appear
- alert: UntaggedResourcesGrowing
expr: |
count(kube_pod_labels{label_team=""}) > 10
for: 1h
labels:
severity: warning
annotations:
summary: "{{ $value }} pods running without team label"

Making cost optimization fun and competitive is surprisingly effective.

Monthly Cloud Efficiency Leaderboard:
┌──────────────────────────────────────────────────────────┐
│ │
│ 🏆 Team Rankings (Cost per 1K Requests) │
│ │
│ 1. Search Team $0.031/1K req ↓ 22% vs last mo │
│ 2. Payments Team $0.044/1K req ↓ 15% vs last mo │
│ 3. Platform Team $0.052/1K req ↓ 8% vs last mo │
│ 4. ML Pipeline $0.189/1K req ↓ 3% vs last mo │
│ 5. Data Team $0.210/1K req ↑ 5% vs last mo │
│ │
│ Biggest Improver: Search Team (-22%!) │
│ Most Improved Resource: search-indexer rightsizing │
│ │
│ Total org savings this month: $4,280 │
└──────────────────────────────────────────────────────────┘
IdeaHow It WorksCost to Implement
Efficiency LeaderboardRank teams by cost-per-unit metric monthlyGrafana dashboard (free)
Savings SpotlightFeature one team’s optimization in company Slack5 minutes/week
Cost Champion BadgeBadge on internal profile for completing FinOps trainingInternal tool update
Optimization Bounty$50 gift card for each $500+/month savings identifiedMinimal vs savings
Waste Hunter WeekDedicated week where teams compete to find/fix wasteTeam time only
CFO ShoutoutCFO mentions top saver in all-handsFree, high impact
Anti-patternWhy It Fails
Publicly shame high-spending teamsCreates resentment, hides real costs
Reward absolute spend reductionPenalizes growing teams, incentivizes under-investing
Individual bonuses for savingsCreates perverse incentives (cut needed resources)
Mandatory cost reports without contextInformation without action = noise
Punishing cost overrunsTeams stop experimenting, innovation dies

Budget Structure:
┌──────────────────────────────────────────────────────────┐
│ │
│ Organization Budget: $180,000/month │
│ │ │
│ ├── Production: $130,000/month (72%) │
│ │ ├── Payments: $32,000 │
│ │ ├── Search: $41,000 │
│ │ ├── Data: $28,000 │
│ │ ├── ML: $18,000 │
│ │ └── Platform: $11,000 │
│ │ │
│ ├── Non-Production: $35,000/month (19%) │
│ │ ├── Staging: $15,000 │
│ │ ├── Dev: $12,000 │
│ │ └── Sandbox: $8,000 │
│ │ │
│ └── Shared/Platform: $15,000/month (9%) │
│ ├── Networking: $6,000 │
│ ├── Monitoring: $5,000 │
│ └── Security: $4,000 │
│ │
│ Buffer: 10% ($18,000) for unexpected growth │
└──────────────────────────────────────────────────────────┘
# Terraform: AWS Budget with alerts at multiple thresholds
resource "aws_budgets_budget" "team_payments" {
name = "payments-team-monthly"
budget_type = "COST"
limit_amount = "32000"
limit_unit = "USD"
time_unit = "MONTHLY"
cost_filter {
name = "TagKeyValue"
values = ["user:team$payments"]
}
notification {
comparison_operator = "GREATER_THAN"
threshold = 75
threshold_type = "PERCENTAGE"
notification_type = "FORECASTED"
subscriber_email_addresses = ["payments-lead@company.com"]
}
notification {
comparison_operator = "GREATER_THAN"
threshold = 90
threshold_type = "PERCENTAGE"
notification_type = "ACTUAL"
subscriber_email_addresses = [
"payments-lead@company.com",
"finops@company.com"
]
}
notification {
comparison_operator = "GREATER_THAN"
threshold = 100
threshold_type = "PERCENTAGE"
notification_type = "ACTUAL"
subscriber_email_addresses = [
"payments-lead@company.com",
"finops@company.com",
"vp-engineering@company.com"
]
}
}
TechniqueAccuracyComplexityBest For
Linear extrapolationLow-mediumSimpleStable, predictable workloads
Moving averageMediumSimpleGradually changing workloads
Seasonal decompositionHighMediumWorkloads with weekly/monthly patterns
ML-based (Prophet, etc.)HighComplexLarge datasets with multiple signals
Commitment-adjustedHighMediumIncorporating RI/SP changes

Engineers think in CPU cores and gigabytes. Finance thinks in dollars, margins, and ROI. You need to translate.

Engineering MetricFinance Translation
”We need 20 more CPU cores""We need $1,400/month more compute to handle 30% growth"
"Our p99 latency improved 40ms""Response time improvement retains ~2% more users ($50K ARR)"
"We rightsized 30 pods""We reduced infrastructure cost by $2,100/month without service impact"
"We migrated to Spot""We reduced compute costs 65% with <0.1% availability impact"
"Our cluster utilization is 18%""We’re paying for 5.5x more capacity than we use — $110K annual waste”
Monthly FinOps Review — March 2026
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Executive Summary:
• Total cloud spend: $172,400 (budget: $180,000) ✅ Under budget
• Month-over-month change: +3.2% ($5,400)
• Cost per customer: $0.68 (target: <$0.75) ✅
• Savings realized this month: $8,300
Key Wins:
• Payments team rightsizing: -$2,800/month (ongoing)
• gp2 → gp3 migration: -$1,400/month (one-time, permanent)
• Spot adoption for CI/CD: -$4,100/month (ongoing)
Risks:
• ML training costs growing 12% month-over-month (investigating)
• Untagged resources: $7,200/month (down from $9,800 — improving)
Next Month Focus:
• Complete cross-AZ traffic optimization (est. -$3,000/month)
• Begin Reserved Instance planning for Q3 commitments
• Launch team cost dashboards in Backstage
┌──────────────────────────────────────────────────────────┐
│ KPI │ This Month │ Last Month │ Trend│
├─────────────────────────┼────────────┼────────────┼──────┤
│ Total cloud spend │ $172,400 │ $167,000 │ ↑ │
│ Budget variance │ -$7,600 │ -$11,200 │ ↑ │
│ Cost per customer │ $0.68 │ $0.71 │ ↓ │
│ Revenue per customer │ $4.20 │ $4.15 │ ↑ │
│ Infra as % of revenue │ 16.2% │ 17.1% │ ↓ │
│ Savings this month │ $8,300 │ $5,100 │ ↑ │
│ RI/SP utilization │ 87% │ 82% │ ↑ │
│ Tagging compliance │ 91% │ 88% │ ↑ │
│ Waste (est.) │ $22,100 │ $29,400 │ ↓ │
└──────────────────────────────────────────────────────────┘

MistakeWhy It HappensHow to Fix It
Making FinOps = cost cuttingPressure from leadershipFrame as “cost efficiency” and unit economics, not austerity
No engineering buy-inFinOps imposed top-downStart with showback, empower teams with data
Overcomplicating cost modelsPerfectionism in allocation80% accuracy is fine to start; iterate
Ignoring non-production costs”It’s just dev”Dev/staging can be 20-30% of total spend
No automation in CI/CDManual reviews don’t scaleDeploy Infracost and automate cost checks
Annual budgets for cloudCapEx thinking applied to OpExMonthly budgets with quarterly re-forecasting
FinOps team owns all optimizationBottleneck, single point of failureDecentralize with ambassadors, centralize tooling
No celebration of winsOptimization feels like a chorePublic recognition, leaderboards, CFO shoutouts

Your company’s FinOps practice has been running for 6 months but engineers still don’t look at cost dashboards. What would you change?

Show Answer

The dashboards are passive — engineers won’t seek them out. Instead, bring costs to where engineers already work:

  1. Embed cost in existing tools — Add cost data to the service catalog (Backstage), deployment pipeline, or PR comments
  2. Make it personal — Show each team their cost in their Slack channel weekly (automated)
  3. Make it actionable — Don’t just show cost; show specific optimization recommendations with estimated savings
  4. Make it competitive — Create a leaderboard ranking teams by cost-per-unit metrics
  5. Make it consequential — Start lightweight chargeback so costs affect team budgets
  6. Recruit ambassadors — One person per team who champions cost awareness from within

The goal is to make cost unavoidable, not optional.

A developer opens a PR that adds 3 new EC2 instances to a Terraform module. Infracost estimates a $420/month increase (+18%). Your policy blocks PRs with >10% increase. How should you handle this?

Show Answer

The block is working as designed — it creates a conversation, not a hard stop. The right process:

  1. Developer explains the need in the PR description: “Adding 3 nodes for the new search indexing service launching next sprint”
  2. FinOps review checks: Is the instance type optimal? Could Karpenter autoscaling handle this? Are Spot instances an option?
  3. Approve with conditions if justified: “Approved, but use c6g.large (ARM) instead of m6i.large for 20% savings, and enable Spot for 2 of 3 nodes”
  4. Track the exception in the FinOps review log for the monthly business review

The threshold isn’t meant to prevent all cost increases — growth requires spending. It’s meant to ensure every significant increase is intentional and optimized.

How do you translate “we migrated 40 pods to Spot instances” into language a CFO understands?

Show Answer

CFO translation: “We reduced compute costs by $6,200 per month ($74,400 annually) by using AWS’s spare capacity pricing for workloads that can tolerate brief interruptions. Service availability remained at 99.97%. This optimization requires no ongoing effort — it’s automatic and permanent. The annualized savings exceed one mid-level engineer’s fully loaded cost.”

Key elements for finance communication:

  • Dollar amounts (not CPU cores or pod counts)
  • Annualized impact (finance thinks in years, not months)
  • Risk context (availability wasn’t impacted)
  • Effort required (one-time change, no ongoing cost)
  • Relatable comparison (equivalent to a headcount)

What’s the difference between a FinOps team and a FinOps guild?

Show Answer

A FinOps team is a centralized group that owns and executes cost optimization. They analyze bills, make recommendations, implement changes, and report to leadership. Risk: becomes a bottleneck, engineering teams don’t develop cost awareness.

A FinOps guild (community of practice) is a cross-functional group that enables cost optimization across the organization. A small core team provides tooling, data, and frameworks, while ambassadors embedded in each engineering team drive local optimization. Risk: requires more coordination, depends on volunteer engagement.

Best practice: Start with a small FinOps team (1-2 people) for tooling and data, then build a guild structure around them. The team provides the “centralized tooling,” and the guild provides the “decentralized ownership.”

Why is gamification effective for FinOps, and what metrics should you use for a leaderboard?

Show Answer

Gamification works because it makes cost optimization visible, competitive, and social — turning an abstract duty into a tangible game. Engineers are naturally competitive, and public leaderboards create peer pressure to optimize.

Good leaderboard metrics (efficiency, not raw spend):

  • Cost per 1,000 requests (or cost per transaction)
  • Cost per customer served
  • Percentage improvement month-over-month
  • Resource utilization percentage (higher = better)

Bad leaderboard metrics (penalize growth):

  • Total spend (punishes high-growth teams)
  • Absolute cost reduction (penalizes already-optimized teams)
  • Number of resources (some workloads legitimately need more)

Always rank by efficiency metrics, never by absolute metrics. A team spending $50K/month serving 10M users is more efficient than a team spending $5K/month serving 100K users.


Hands-On Exercise: Infracost in a Terraform Pipeline

Section titled “Hands-On Exercise: Infracost in a Terraform Pipeline”

Set up Infracost to estimate costs of Terraform changes and enforce a cost threshold.

Terminal window
# macOS
brew install infracost
# Linux
curl -fsSL https://raw.githubusercontent.com/infracost/infracost/master/scripts/install.sh | sh
# Verify installation
infracost --version
Terminal window
# Register for a free API key (required for pricing data)
infracost auth login
# Or set manually if you already have one
# export INFRACOST_API_KEY="ico-your-key-here"
Terminal window
mkdir -p ~/finops-infracost-lab && cd ~/finops-infracost-lab
cat > main.tf << 'EOF'
terraform {
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0"
}
}
}
provider "aws" {
region = "us-east-1"
}
# Base infrastructure — current state
resource "aws_instance" "api_server" {
count = 3
ami = "ami-0c55b159cbfafe1f0"
instance_type = "m6i.large"
tags = {
Name = "api-server-${count.index}"
team = "payments"
environment = "production"
}
}
resource "aws_instance" "worker" {
count = 2
ami = "ami-0c55b159cbfafe1f0"
instance_type = "c6i.xlarge"
tags = {
Name = "worker-${count.index}"
team = "payments"
environment = "production"
}
}
resource "aws_db_instance" "database" {
allocated_storage = 100
engine = "postgres"
engine_version = "15.4"
instance_class = "db.r6g.large"
db_name = "payments"
skip_final_snapshot = true
tags = {
team = "payments"
environment = "production"
}
}
resource "aws_ebs_volume" "data" {
count = 3
availability_zone = "us-east-1a"
size = 200
type = "gp3"
tags = {
team = "payments"
environment = "production"
}
}
EOF
echo "Base Terraform configuration created."
Terminal window
cd ~/finops-infracost-lab
# Generate baseline cost estimate
infracost breakdown --path=. --format=table
# Save JSON for later comparison
infracost breakdown --path=. --format=json --out-file=/tmp/base-cost.json
echo ""
echo "Base cost estimate generated."
Terminal window
cd ~/finops-infracost-lab
# Simulate a PR that adds more infrastructure
cat > scaling_change.tf << 'EOF'
# New resources added in this "PR"
resource "aws_instance" "search_server" {
count = 4
ami = "ami-0c55b159cbfafe1f0"
instance_type = "c6i.2xlarge"
tags = {
Name = "search-server-${count.index}"
team = "search"
environment = "production"
}
}
resource "aws_elasticache_cluster" "search_cache" {
cluster_id = "search-cache"
engine = "redis"
node_type = "cache.r6g.large"
num_cache_nodes = 3
parameter_group_name = "default.redis7"
}
resource "aws_nat_gateway" "main" {
allocation_id = "eipalloc-placeholder"
subnet_id = "subnet-placeholder"
tags = {
Name = "main-nat"
}
}
EOF
echo "Scaling change created (simulating a PR)."
Terminal window
cd ~/finops-infracost-lab
# Generate diff against baseline
infracost diff --path=. --compare-to=/tmp/base-cost.json --format=table
# Save JSON diff
infracost diff --path=. --compare-to=/tmp/base-cost.json --format=json --out-file=/tmp/cost-diff.json
cat > ~/finops-infracost-lab/check_threshold.sh << 'SCRIPT'
#!/bin/bash
THRESHOLD_PCT=10
DIFF_FILE="/tmp/cost-diff.json"
if [ ! -f "$DIFF_FILE" ]; then
echo "ERROR: Diff file not found. Run infracost diff first."
exit 1
fi
# Extract costs
PAST_COST=$(cat "$DIFF_FILE" | python3 -c "
import json, sys
data = json.load(sys.stdin)
past = data.get('pastTotalMonthlyCost', '0')
print(past if past else '0')
" 2>/dev/null)
NEW_COST=$(cat "$DIFF_FILE" | python3 -c "
import json, sys
data = json.load(sys.stdin)
new = data.get('totalMonthlyCost', '0')
print(new if new else '0')
" 2>/dev/null)
DIFF_COST=$(cat "$DIFF_FILE" | python3 -c "
import json, sys
data = json.load(sys.stdin)
diff = data.get('diffTotalMonthlyCost', '0')
print(diff if diff else '0')
" 2>/dev/null)
echo "============================================"
echo " Cost Threshold Check"
echo "============================================"
echo ""
echo "Previous monthly cost: \$$PAST_COST"
echo "New monthly cost: \$$NEW_COST"
echo "Difference: \$$DIFF_COST"
if [ "$(echo "$PAST_COST > 0" | bc)" -eq 1 ]; then
PCT=$(echo "scale=1; ($DIFF_COST / $PAST_COST) * 100" | bc 2>/dev/null)
echo "Percentage change: ${PCT}%"
echo "Threshold: ${THRESHOLD_PCT}%"
echo ""
if [ "$(echo "$PCT > $THRESHOLD_PCT" | bc)" -eq 1 ]; then
echo "❌ FAILED: Cost increase of ${PCT}% exceeds ${THRESHOLD_PCT}% threshold."
echo ""
echo "Actions required:"
echo " 1. Review if all new resources are necessary"
echo " 2. Check for cheaper alternatives (Spot, ARM, smaller instances)"
echo " 3. Get FinOps team approval before merging"
exit 1
else
echo "✅ PASSED: Cost change within acceptable threshold."
fi
else
echo "No base cost for comparison (new infrastructure)."
echo "✅ PASSED: New infrastructure, no threshold to compare."
fi
SCRIPT
chmod +x ~/finops-infracost-lab/check_threshold.sh
bash ~/finops-infracost-lab/check_threshold.sh
Terminal window
rm -rf ~/finops-infracost-lab

You’ve completed this exercise when you:

  • Installed Infracost and obtained an API key
  • Created a base Terraform configuration with cost estimate
  • Added new resources simulating a cost-increasing PR
  • Generated a cost diff showing the increase
  • Ran the threshold check script and saw it fail (>10% increase)
  • Understood how this integrates into GitHub Actions or GitLab CI

  1. FinOps is 70% culture, 30% technology — tools without behavior change produce temporary savings
  2. Build a guild, not a team — decentralize ownership, centralize tooling
  3. Shift cost left into CI/CD — Infracost catches expensive changes before deployment
  4. Gamification works — leaderboards and recognition drive optimization naturally
  5. Speak finance’s language — dollars, margins, and ROI, not CPU cores and gigabytes
  6. Automate everything — anomaly detection, budget alerts, and cost thresholds prevent surprises

Books:

  • “Cloud FinOps” — J.R. Storment & Mike Fuller (O’Reilly, 2nd edition)
  • “Team Topologies” — Matthew Skelton & Manuel Pais (for org design that supports FinOps)

Tools:

  • Infracost — infracost.io (cost estimation for Terraform in CI/CD)
  • Backstage Cost Insights — backstage.io/docs/features/cost-insights (developer portal cost plugin)
  • AWS Cost Anomaly Detection — docs.aws.amazon.com/cost-management

Talks:

  • “Building a FinOps Culture” — FinOps X Conference (YouTube)
  • “How We Saved $5M with Cloud Cost Engineering” — Spotify, QCon (YouTube)
  • “Infracost: Cloud Cost Estimates for Terraform” — HashiConf (YouTube)

Technical FinOps without cultural FinOps is a project. Technical FinOps with cultural FinOps is a practice. By building a FinOps guild with engineering ambassadors, embedding cost into CI/CD pipelines via Infracost, automating anomaly detection and budget alerts, gamifying optimization through leaderboards, and communicating in finance’s language, organizations transform cloud cost management from a periodic cleanup into a continuous, self-sustaining discipline. The result: costs stay optimized not because someone is watching, but because everyone is.


You’ve completed the FinOps discipline track. To continue your learning:

  • Review the FinOps modules for a summary of all topics and additional resources
  • Apply what you’ve learned — start with Module 1.1’s exercise on your own cloud bill
  • Join the FinOps Foundation at finops.org for community, certifications, and frameworks
  • Explore related disciplines: SRE, Platform Engineering, GitOps

“Culture eats cost optimization for breakfast.” — Adapted from Peter Drucker