Platform Engineering

Principles, practices, and disciplines for running production systems on Kubernetes.

Kubernetes certifications teach you how to use Kubernetes. This track teaches you how to run production systems — the theory, disciplines, and leadership that separate operators from practitioners.

This is for people who:

  • Have Kubernetes fundamentals (or certifications)
  • Want to understand theory, not just tools
  • Need to make technology decisions at work
  • Want to implement best practices, not just pass exams

Looking for tool-specific guides? See Cloud Native Tools.


platform/
├── foundations/                    # Theory that doesn't change (32 modules)
│   ├── systems-thinking/           # Mental models for complex systems
│   ├── reliability-engineering/    # Failure theory, redundancy, risk
│   ├── observability-theory/       # What to measure and why
│   ├── security-principles/        # Zero trust, threat modeling
│   ├── distributed-systems/        # CAP, consensus, consistency
│   ├── advanced-networking/        # Network theory, protocols, design
│   └── engineering-leadership/     # Technical leadership, org design
└── disciplines/                    # Applied practices (81 modules)
    ├── core-platform/
    │   ├── sre/                    # Operations, reliability, on-call
    │   ├── platform-engineering/   # Developer experience, self-service
    │   └── platform-leadership/    # Strategy, adoption, evangelism
    ├── delivery-automation/
    │   ├── release-engineering/    # Build, release, deploy lifecycle
    │   ├── gitops/                 # Deployment, reconciliation
    │   └── iac/                    # IaC patterns, testing, drift
    ├── reliability-security/
    │   ├── networking/             # Network architecture, policy
    │   ├── chaos-engineering/      # Failure injection, resilience
    │   └── devsecops/              # Security integration, compliance
    ├── data-ai/
    │   ├── data-engineering/       # Pipelines, streaming, storage
    │   ├── mlops/                  # ML lifecycle, model serving
    │   ├── aiops/                  # AI-driven operations
    │   └── ai-infrastructure/      # GPU scheduling, model hosting
    └── business-value/
        └── finops/                 # Cloud cost optimization

Theory that applies everywhere. Read these first — they don’t change.

| Track | Why Start Here |
| --- | --- |
| Systems Thinking | Mental models for complex systems |
| Reliability Engineering | Failure theory, redundancy, risk |
| Distributed Systems | CAP, consensus, consistency |
| Observability Theory | What to measure and why |
| Security Principles | Zero trust, threat modeling |
| Advanced Networking | Network theory, protocols, design |
| Engineering Leadership | Technical leadership, org design |

Applied practices — how to do the work.

| Group | Discipline | Modules | Best For |
| --- | --- | --- | --- |
| Core Platform | SRE | 7 | Operations, reliability, on-call |
| Core Platform | Platform Engineering | 6 | Developer experience, self-service |
| Core Platform | Platform Leadership | 5 | Strategy, adoption, evangelism |
| Delivery & Automation | Release Engineering | 5 | Build, release, deploy lifecycle |
| Delivery & Automation | GitOps | 6 | Deployment, reconciliation |
| Delivery & Automation | Infrastructure as Code | 6 | IaC patterns, testing, drift management |
| Reliability & Security | Networking | 5 | Network architecture, policy, design |
| Reliability & Security | Chaos Engineering | 5 | Failure injection, resilience |
| Reliability & Security | DevSecOps | 7 | Security integration, compliance |
| Data & AI | Data Engineering | 6 | Pipelines, streaming, storage |
| Data & AI | MLOps | 6 | ML lifecycle, model serving |
| Data & AI | AIOps | 6 | AI-driven operations, automation |
| Data & AI | AI Infrastructure | 6 | GPU scheduling, model hosting |
| Business Value | FinOps | 6 | Cloud cost optimization |

Every module includes:

  • Why This Matters — Real-world motivation
  • Theory — Principles and mental models
  • Current Landscape — Tools that implement this
  • Hands-On — Practical implementation
  • Best Practices — What good looks like
  • Common Mistakes — Anti-patterns to avoid
  • Further Reading — Books, talks, papers

| Section | Modules | Description |
| --- | --- | --- |
| Foundations | 32 | 7 sections: Systems Thinking, Reliability Engineering, Observability Theory, Security Principles, Distributed Systems, Advanced Networking, Engineering Leadership |
| Disciplines | 81 | 14 disciplines across Core Platform, Delivery & Automation, Reliability & Security, Data & AI, and Business Value |
| Total | 113 | |

Tool-specific implementation guides (96 modules) are in Cloud Native Tools.


Before starting this track, you should have:

  • Kubernetes basics (or completed Prerequisites)
  • Some production experience (helpful but not required)
  • Curiosity about the “why”, not just the “how”

“Tools change. Principles don’t.”