AI/ML Engineering

AI/ML Engineering Track | 60+ Modules | 14 Phases | ~230-310 hours

A complete curriculum for engineers building AI and ML systems in production. Covers everything from AI-native development with Claude Code, through generative AI and RAG, to deep learning, machine learning, reinforcement learning, MLOps, and AI infrastructure on Kubernetes.

This track is for engineers who need to understand AI/ML deeply enough to build, deploy, and operate it — not just call APIs.

If your goal is AI literacy, safer AI use, practical AI work habits, or a bridge from AI use into AI product thinking rather than system building, start with the top-level AI track first.

This track assumes you are now moving past AI literacy and bridge content.

It does not try to reteach the top-level AI track’s main job:

  • beginner AI literacy
  • basic prompting habits
  • general-use trust and verification habits
  • introductory workflow discipline
  • lightweight practitioner bridge material

Those live in the top-level AI track.

This track starts where the work becomes engineering:

  • reproducible environments
  • local and remote runtimes as systems
  • framework implementation
  • model behavior in depth
  • deployment and operations

This track is probably not the right place to start yet if:

  • you still need basic terminal, Git, or software-installation confidence
  • Python environments and local tooling still drift out of control for you
  • you want platform or infrastructure depth before you understand AI/ML workflows themselves

In those cases, strengthen Prerequisites or the AI/ML Prerequisites phase first.

If you need a non-engineering front door first, use AI Foundations, AI-Native Work, AI Building, and Open Models & Local Inference before returning here.

If you are unsure where to begin, use one of these entry routes:

| Goal | Start With | Then Go To |
| --- | --- | --- |
| Build AI apps with strong engineering habits | Prerequisites | AI-Native Development -> Generative AI -> Vector Search & RAG |
| Learn local-first AI from a laptop or workstation | Prerequisites | AI-Native Development -> AI Infrastructure -> Advanced GenAI & Safety |
| Move into MLOps / AI platform work | Prerequisites | MLOps & LLMOps -> AI Infrastructure -> Platform Engineering: Data & AI |
| Build LLM-native apps on Kubernetes | Vector Search & RAG + AI Infrastructure | Synthesis Apps -> Advanced GenAI & Safety |
| Understand model training and tuning deeply | Generative AI | Deep Learning Foundations -> Advanced GenAI & Safety |

What this track does differently:

  • it treats local-first and home-scale AI as valid starting points
  • it explicitly teaches the bridge from notebooks to reproducible systems
  • it links application engineering, model work, and infrastructure instead of treating them as separate worlds
  • it cross-links into platform and on-prem sections instead of duplicating advanced ops material

The track is organized as one main spine with several valid learner routes.

| # | Phase | Focus |
| --- | --- | --- |
| 0 | Prerequisites | Environment setup, Python, dev tools |
| 1 | AI-Native Development | Claude Code, Cursor, prompt engineering, AI coding agents |
| 2 | Generative AI | LLMs, tokenization, embeddings, text generation, reasoning models |
| 3 | Vector Search & RAG | Vector spaces, vector databases, RAG patterns, long-context |
| 4 | Frameworks & Agents | LangChain, LangGraph, LlamaIndex, agentic AI, MCP |
| 5 | MLOps & LLMOps | Kubernetes for ML, experiment tracking, pipelines, deployment |
| 6 | AI Infrastructure | Cloud management, AIOps, vLLM, GPU scheduling |
| 7 | Synthesis Apps | Building LLM-native applications from inference, vector memory, orchestration, failure handling, and production gates |
| 8 | Advanced GenAI & Safety | Fine-tuning, RLHF, diffusion, alignment, red teaming, evaluation |
| 9 | Multimodal AI | Speech, vision, video, native multimodal models |
| 10 | Deep Learning Foundations | PyTorch, neural networks, CNNs, transformers, backprop |
| 11 | Machine Learning | Tabular ML practitioner essentials: sklearn API, regression, evaluation, feature engineering, trees, boosting, clustering, anomaly detection, dimensionality reduction, HPO, time series, plus Tier-2 imbalance, interpretability, recommenders, conformal prediction, fairness, causal inference |
| 12 | Reinforcement Learning | RL practitioner foundations (PPO/DQN/SAC, SB3, Gymnasium) and offline RL / imitation learning |
| A | History of AI/ML | Historical context (appendix) |

For most learners, this is the safest progression:

Prerequisites
|
AI-Native Development
|
Generative AI
|
Vector Search & RAG
|
Frameworks & Agents
|
MLOps & LLMOps
|
AI Infrastructure
|
Synthesis Apps

After that, branch based on interest:

  • AI application builder: Prerequisites -> AI-Native Development -> Generative AI -> Vector Search & RAG
  • Local-first builder: Prerequisites -> AI-Native Development -> AI Infrastructure -> Advanced GenAI & Safety
  • MLOps / AI platform: Prerequisites -> MLOps & LLMOps -> AI Infrastructure -> Platform Engineering: Data & AI
  • Model-focused learner: Generative AI -> Deep Learning Foundations -> Advanced GenAI & Safety

| If your blocker is… | Go to… | Why |
| --- | --- | --- |
| weak cluster, YAML, or workload fundamentals | Kubernetes Certifications | the MLOps and infrastructure phases assume real Kubernetes comfort |
| reproducibility, deployment workflow, service ownership, or team operations | Platform Engineering | that is no longer just app-building; it is platform and operations work |
| private GPUs, air-gapped environments, datacenter economics, or bare-metal serving | On-Premises | local-first intuition does not automatically transfer to private infrastructure |
| Linux process, package, and service management pain | Linux | a surprising amount of AI/ML failure is really systems failure |

Who this track is for:

  • AI/ML Engineers building production ML systems
  • Platform Engineers supporting ML workloads on Kubernetes
  • Backend Engineers integrating LLMs and generative AI into products
  • MLOps Specialists operating model pipelines at scale
  • DevOps Engineers moving into AI infrastructure roles

Prerequisites:

  • Programming: Python proficiency required (Phase 0 covers setup)
  • Kubernetes basics: helpful for the MLOps phases; see the CKA track if needed
  • Linux fundamentals: see the Linux track if needed
  • Math intuition: linear algebra and statistics are helpful for the deep learning phases

When to move to another track:

  • go to Platform Engineering when your main problem becomes operating systems and teams, not just building models or apps
  • go to On-Premises when local-first or private AI work grows into real private infrastructure concerns
  • go to Kubernetes Certifications if your MLOps path is blocked by weak cluster fundamentals

Common mistakes to avoid:

  • starting in advanced infrastructure before local environments and workflows are reproducible
  • treating notebooks as a permanent workflow when the real problem is packaging, deployment, or operations
  • jumping to private AI infrastructure because the hardware sounds interesting before the application path is solid

“The best AI engineers understand both the model and the infrastructure it runs on.”