AI/ML Engineering

AI/ML Engineering Track | 120+ Modules | 14 Phases | ~230-310 hours

Overview

A complete curriculum for engineers building AI and ML systems in production. It covers everything from AI-native development with Claude Code, through generative AI and RAG, to deep learning, machine learning, reinforcement learning, MLOps, and AI infrastructure on Kubernetes.

This track is for engineers who need to understand AI/ML deeply enough to build, deploy, and operate it — not just call APIs.

If your goal is AI literacy, safer AI use, practical AI work habits, or a bridge from AI use into AI product thinking rather than system building, start with the top-level AI track first.

What This Track Does Not Repeat

This track assumes you are now moving past AI literacy and bridge content.

It does not reteach what the top-level AI track already covers:

beginner AI literacy
basic prompting habits
general-use trust and verification habits
introductory workflow discipline
lightweight practitioner bridge material

Those live in the top-level AI track.

This track starts where the work becomes engineering:

reproducible environments
local and remote runtimes as systems
framework implementation
model behavior in depth
deployment and operations

Do Not Start Here First If

you still need basic terminal, Git, or software-installation confidence
your Python environments and local tooling still drift out of control
you want platform or infrastructure depth before you understand AI/ML workflows themselves

In those cases, strengthen Prerequisites or the AI/ML Prerequisites phase (Phase 0) first.

If you need a non-engineering front door first, use AI Foundations, AI-Native Work, AI Building, and Open Models & Local Inference before returning here.

Start Here

If you are unsure where to begin, use one of these entry routes:

Goal	Start With	Then Go To
Build AI apps with strong engineering habits	Prerequisites	AI-Native Development -> Generative AI -> Vector Search & RAG
Learn local-first AI from a laptop or workstation	Prerequisites	AI-Native Development -> AI Infrastructure -> Advanced GenAI & Safety
Move into MLOps / AI platform work	Prerequisites	MLOps & LLMOps -> AI Infrastructure -> Platform Engineering: Data & AI
Build LLM-native apps on Kubernetes	Vector Search & RAG + AI Infrastructure	Synthesis Apps -> Advanced GenAI & Safety
Understand model training and tuning deeply	Generative AI	Deep Learning Foundations -> Advanced GenAI & Safety

What Makes This Track Different

it treats local-first and home-scale AI as valid starting points
it teaches the bridge from notebooks to reproducible systems explicitly
it links application engineering, model work, and infrastructure instead of treating them as separate worlds
it cross-links into platform and on-prem sections instead of duplicating advanced ops material

Phases

The track is organized as one main spine with several valid learner routes.

#	Phase	Focus
0	Prerequisites	Environment setup, Python, dev tools
1	AI-Native Development	Claude Code, Cursor, prompt engineering, AI coding agents
2	Generative AI	LLMs, tokenization, embeddings, text generation, reasoning models
3	Vector Search & RAG	Vector spaces, vector databases, RAG patterns, long-context
4	Frameworks & Agents	LangChain, LangGraph, LlamaIndex, agentic AI, MCP
5	MLOps & LLMOps	Kubernetes for ML, experiment tracking, pipelines, deployment
6	AI Infrastructure	Cloud management, AIOps, vLLM, GPU scheduling
7	Synthesis Apps	Building LLM-native applications from inference, vector memory, orchestration, failure handling, and production gates
8	Advanced GenAI & Safety	Fine-tuning, RLHF, diffusion, alignment, red teaming, evaluation
9	Multimodal AI	Speech, vision, video, native multimodal models
10	Deep Learning Foundations	PyTorch, neural networks, CNNs, transformers, backprop
11	Machine Learning	Tabular ML practitioner essentials: sklearn API, regression, evaluation, feature engineering, trees, boosting, clustering, anomaly detection, dimensionality reduction, HPO, time series. Tier-2 topics: imbalance, interpretability, recommenders, conformal prediction, fairness, causal inference
12	Reinforcement Learning	RL practitioner foundations (PPO/DQN/SAC, SB3, Gymnasium) and offline RL / imitation learning
A	History of AI/ML	Historical context (appendix); for the deep version, read the 73-chapter AI History book

Recommended Default Route

For most learners, this is the safest progression:

Prerequisites
   |
AI-Native Development
   |
Generative AI
   |
Vector Search & RAG
   |
Frameworks & Agents
   |
MLOps & LLMOps
   |
AI Infrastructure
   |
Synthesis Apps

After that, branch based on interest:

go to Advanced GenAI & Safety if you want tuning, evaluation, and alignment depth
go to Deep Learning Foundations if you want stronger model-building fundamentals
go to Platform Engineering: Data & AI if your focus is operating AI systems at scale

Common Persona Routes

AI application builder: Prerequisites -> AI-Native Development -> Generative AI -> Vector Search & RAG
Local-first builder: Prerequisites -> AI-Native Development -> AI Infrastructure -> Advanced GenAI & Safety
MLOps / AI platform: Prerequisites -> MLOps & LLMOps -> AI Infrastructure -> Platform Engineering: Data & AI
Model-focused learner: Generative AI -> Deep Learning Foundations -> Advanced GenAI & Safety

When Another Track Becomes Necessary

If your blocker is…	Go to…	Why
weak cluster, YAML, or workload fundamentals	Kubernetes Certifications	the MLOps and infrastructure phases assume real Kubernetes comfort
reproducibility, deployment workflow, service ownership, or team operations	Platform Engineering	that is no longer just app-building; it is platform and operations work
private GPUs, air-gapped environments, datacenter economics, or bare-metal serving	On-Premises	local-first intuition does not automatically transfer to private infrastructure
Linux process, package, and service management pain	Linux	a surprising amount of AI/ML failure is really systems failure

Who This Is For

AI/ML Engineers building production ML systems
Platform Engineers supporting ML workloads on Kubernetes
Backend Engineers integrating LLMs and generative AI into products
MLOps Specialists operating model pipelines at scale
DevOps Engineers moving into AI infrastructure roles

Prerequisites

Programming: Python proficiency required (Phase 0 covers setup)
Kubernetes basics: assumed by the MLOps and AI Infrastructure phases; see the CKA track if needed
Linux fundamentals: see Linux track if needed
Math intuition: linear algebra and statistics helpful for deep learning phases

Related Tracks

Platform Engineering: Data & AI — production deployment of ML systems
CKA / CKAD — Kubernetes fundamentals for the MLOps phases
On-Premises — running ML infrastructure on bare metal

When To Leave This Track For Another One

go to Platform Engineering when your main problem becomes operating production systems and teams, not just building models or apps
go to On-Premises when local-first or private AI work grows into real private infrastructure concerns
go to Kubernetes Certifications if your MLOps path is blocked by weak cluster fundamentals

Common Failure Modes In This Track

starting in advanced infrastructure before local environments and workflows are reproducible
treating notebooks as a permanent workflow when the real problem is packaging, deployment, or operations
jumping to private AI infrastructure because the hardware sounds interesting before the application path is solid

Good First Clicks

Home AI Workstation Fundamentals if you want a realistic local setup path
AI Coding Tools Landscape if you want to orient yourself in the modern agent/tooling ecosystem
Building RAG Systems if your goal is practical LLM applications
Notebooks to Production for ML/LLMs if you already experiment in notebooks and need a more serious workflow
Local Inference Stack for Learners if you want to run models yourself without jumping straight into datacenter thinking

“The best AI engineers understand both the model and the infrastructure it runs on.”