
On-Premises Kubernetes

12 modules are currently being reworked. Watch this section over the next few days.

Run Kubernetes where the cloud can’t go.

Not every workload belongs in the cloud. Data sovereignty, latency requirements, regulatory constraints, and economics drive enterprises to run Kubernetes on their own hardware. This track covers everything from datacenter planning to day-2 operations. Most free resources skip this material because it is not glamorous, yet a large share of production Kubernetes runs on-premises.


Planning & Economics (5 modules)
Bare Metal Provisioning (4 modules)
├── Networking (6 modules)
├── Storage (5 modules)
└── Multi-Cluster & Platform (5 modules)
Security & Compliance (8 modules)
Day-2 Operations (9 modules)
Resilience & Migration (3 modules)
AI/ML Infrastructure (6 modules)
| Section | Modules | Focus |
|---|---|---|
| Planning & Economics | 5 | Server sizing, cluster topology, TCO, cloud vs on-prem, FinOps & chargeback |
| Bare Metal Provisioning | 4 | PXE, MAAS, Talos, Sidero/Metal3 |
| Networking | 6 | Spine-leaf, BGP, MetalLB, DNS/certs, cross-cluster, service mesh |
| Storage | 5 | Ceph/Rook, local storage, object storage (MinIO), database operators |
| Multi-Cluster & Platform | 5 | vSphere/OpenStack, vCluster/Kamaji, Cluster API, fleet management, active-active |
| Security & Compliance | 8 | Air-gapped, HSM/TPM, AD/LDAP, SPIFFE, Vault, policy-as-code, zero-trust |
| Day-2 Operations | 9 | Upgrades, firmware, observability, capacity, self-hosted CI/CD & registry, serverless |
| Resilience & Migration | 3 | Multi-site DR, hybrid connectivity, cloud repatriation |
| AI/ML Infrastructure | 6 | GPU nodes, private training, LLM serving, MLOps, AIOps, HPC storage |

51 modules total (30 existing + 21 new from #197). From “should we go on-prem?” to “how do we train LLMs on our own GPUs.”


This is an advanced track. You should already be comfortable with:

  • core Kubernetes concepts and troubleshooting
  • Linux networking, storage, and security basics
  • day-2 operational thinking such as upgrades, observability, and failure handling

If you are not there yet, strengthen those areas first through Prerequisites, Linux, Kubernetes Certifications, and optionally Platform Engineering.

Prerequisites
|
Linux
|
CKA-level Kubernetes understanding
|
Platform / SRE thinking
|
On-Premises

You do not need to finish every module in those tracks first, but you do need the operational maturity they represent.

You are probably not ready for this track yet if:

  • you still struggle with Linux networking, storage, or service management basics
  • you have not yet operated Kubernetes under failure, upgrade, or capacity pressure
  • you mainly want a simpler first cluster rather than private-platform design

If those are true, stay in Prerequisites, Linux, or Kubernetes Certifications a bit longer.

| Coming from | Safest bridge | What to prove before going deeper |
|---|---|---|
| Kubernetes Certifications | CKA -> Linux depth -> On-Prem Planning/Provisioning | kubeadm, networking, storage, and troubleshooting are not fragile for you |
| Cloud | managed Kubernetes -> Architecture Patterns -> On-Prem Planning | you can reason about tradeoffs without assuming the cloud will always provide the control plane around you |
| Platform Engineering | SRE / GitOps / Networking -> On-Prem Operations | you already think in day-2 systems, not only cluster setup |
| AI/ML Engineering | local-first AI -> AI Infrastructure -> On-Prem AI/ML Infrastructure | you understand the jump from a workstation or notebook workflow to private GPU fleet operations |

This track is for:

  • Infrastructure engineers building private Kubernetes platforms
  • Platform teams evaluating on-prem vs cloud for their organization
  • SREs operating bare metal or private cloud Kubernetes clusters
  • Architects designing multi-site, air-gapped, or hybrid environments
  • Budget owners calculating TCO and making build-vs-buy decisions

Related tracks:

  • Go to Platform Engineering if you want deeper reliability, delivery, and platform-discipline context around private infrastructure
  • Go to AI/ML Engineering if your on-prem goal includes private model training or local-first AI systems
  • Go back to Cloud if you need a cleaner contrast between managed-cloud assumptions and private-infrastructure tradeoffs before committing to this path