Module 1.1: Container Images
Complexity: [MEDIUM] - Requires understanding of Dockerfile and image registries
Time to Complete: 45-60 minutes
Prerequisites: Module 0.2 (Developer Workflow), basic Docker knowledge
Learning Outcomes
After completing this module, you will be able to:
- Analyze and optimize a Dockerfile that follows best practices for size, security, and layer caching
- Configure image pull policies and registry credentials for pod specifications
- Debug image pull errors including `ImagePullBackOff` and authentication failures
- Explain image tagging strategies and why `:latest` is dangerous in production
Why This Module Matters
Kubernetes doesn’t run source code; it runs container images. Before any application reaches a cluster, it must be packaged into an image. The CKAD expects you to understand how images are built, tagged, pushed, and referenced.
While you won’t build complex images during the exam (no time), you need to:
- Understand Dockerfile basics
- Know image naming conventions
- Fix common image-related issues
- Modify existing images when needed
The Shipping Container Analogy
Before containerization, shipping goods was chaos. Each port handled cargo differently. Then came the standardized shipping container—same dimensions everywhere, stackable, works on any ship. Container images are the same idea for software. Your application, its dependencies, its config—all packaged into a standard format that runs identically everywhere.
Image Naming Convention
Understanding image names is critical. Every Kubernetes Pod spec references images:

```
[registry/][namespace/]image[:tag][@digest]
```

| Component | Required | Example | Default |
|---|---|---|---|
| Registry | No | docker.io, gcr.io, quay.io | docker.io |
| Namespace | No | library, mycompany | library (official images on Docker Hub) |
| Image | Yes | nginx, myapp | - |
| Tag | No | latest, 1.19.0, alpine | latest |
| Digest | No | sha256:abc123... | - |
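
The defaults in the table can be made concrete with a small parser. This is a minimal sketch in POSIX shell, not the algorithm any container runtime ships. The function name `parse_image_ref` is hypothetical, and the sketch assumes a well-formed reference and the common convention that a first path component containing `.` or `:` (or equal to `localhost`) is a registry host.

```shell
# parse_image_ref is a hypothetical helper, not part of kubectl or docker.
parse_image_ref() (
  ref="$1"
  digest=""
  tag=""

  # A digest, if present, is everything after "@".
  case "$ref" in
    *@*) digest="${ref#*@}"; ref="${ref%%@*}" ;;
  esac

  # A tag is introduced by ":" in the last path component
  # (a ":" in an earlier component is a registry port).
  last="${ref##*/}"
  case "$last" in
    *:*) tag="${last##*:}"; ref="${ref%:*}" ;;
  esac

  # No tag and no digest means the tag defaults to "latest".
  if [ -z "$tag" ] && [ -z "$digest" ]; then
    tag="latest"
  fi

  # Assumption: the first component is a registry only if it looks like a host.
  registry="docker.io"
  case "$ref" in
    */*)
      first="${ref%%/*}"
      case "$first" in
        *.*|*:*|localhost) registry="$first"; ref="${ref#*/}" ;;
      esac
      ;;
  esac

  # Official Docker Hub images live under the "library" namespace.
  case "$ref" in
    */*) namespace="${ref%/*}"; image="${ref##*/}" ;;
    *)
      image="$ref"
      if [ "$registry" = "docker.io" ]; then namespace="library"; else namespace=""; fi
      ;;
  esac

  echo "registry=$registry namespace=$namespace image=$image tag=$tag digest=$digest"
)

parse_image_ref nginx
parse_image_ref gcr.io/google-containers/pause:3.2
```

Running it on a few references by hand is a quick way to check your reading of an image name before pasting it into a Pod spec.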
Examples
```yaml
# Full specification
image: docker.io/library/nginx:1.21.0

# Equivalent short form (docker.io/library implied)
image: nginx:1.21.0

# Different registry
image: gcr.io/google-containers/nginx:1.21.0

# Custom namespace
image: myregistry.com/myteam/myapp:v2.0.0

# With digest (immutable reference)
image: nginx@sha256:abc123def456...

# Latest tag (avoid in production)
image: nginx:latest
image: nginx  # same as above
```
Why Tags Matter

```yaml
# BAD: latest can change unexpectedly
image: nginx:latest

# GOOD: specific version, reproducible
image: nginx:1.21.0

# BETTER: specific version with Alpine base (smaller)
image: nginx:1.21.0-alpine
```
Dockerfile Basics
A Dockerfile defines how to build an image. CKAD may ask you to understand or modify simple Dockerfiles.
Minimal Dockerfile
```dockerfile
# Base image
FROM python:3.9-slim

# Set working directory
WORKDIR /app

# Copy requirements first (layer caching)
COPY requirements.txt .
RUN pip install -r requirements.txt

# Copy application code
COPY . .

# Expose port (documentation)
EXPOSE 8080

# Command to run
CMD ["python", "app.py"]
```
Common Instructions

| Instruction | Purpose | Example |
|---|---|---|
| `FROM` | Base image | `FROM nginx:alpine` |
| `WORKDIR` | Set working directory | `WORKDIR /app` |
| `COPY` | Copy files from build context | `COPY src/ /app/` |
| `RUN` | Execute command during build | `RUN apt-get update` |
| `ENV` | Set environment variable | `ENV PORT=8080` |
| `EXPOSE` | Document port (doesn’t publish) | `EXPOSE 8080` |
| `CMD` | Default command to run | `CMD ["nginx", "-g", "daemon off;"]` |
| `ENTRYPOINT` | Main executable | `ENTRYPOINT ["python"]` |
Pause and predict: In a Kubernetes Pod spec, `command` overrides one Dockerfile instruction and `args` overrides another. Which is which? Many developers get this backwards. Think about it before reading the mapping below.
CMD vs ENTRYPOINT
```dockerfile
# CMD: Easily overridden
FROM nginx
CMD ["nginx", "-g", "daemon off;"]
# Can run: docker run myimage sleep 10 (replaces CMD)

# ENTRYPOINT: Hard to override
FROM python
ENTRYPOINT ["python"]
CMD ["app.py"]
# Runs: python app.py
# Can run: docker run myimage script.py (only replaces CMD)
```

In Kubernetes Pod specs:

- `ENTRYPOINT` maps to `command:`
- `CMD` maps to `args:`

```yaml
spec:
  containers:
  - name: app
    image: python:3.9
    command: ["python"]    # Overrides ENTRYPOINT
    args: ["myapp.py"]     # Overrides CMD
```
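
To see how the two fields combine with what the image already defines, here is a minimal sketch of the resolution rules in POSIX shell. The function `effective_argv` is hypothetical and purely illustrative. It takes each value as a space-separated string, treats an empty string as "not set", and relies on plain word splitting, so it is not suitable for arguments that contain spaces or glob characters.

```shell
# effective_argv is a hypothetical helper for reasoning about overrides.
# Arguments: image ENTRYPOINT, image CMD, pod command, pod args.
effective_argv() (
  img_entrypoint="$1"
  img_cmd="$2"
  pod_command="$3"
  pod_args="$4"

  if [ -n "$pod_command" ]; then
    # command replaces ENTRYPOINT, and the image CMD is dropped as well;
    # only the pod's own args (if any) are appended.
    set -- $pod_command $pod_args
  elif [ -n "$pod_args" ]; then
    # args alone replaces CMD and keeps the image ENTRYPOINT.
    set -- $img_entrypoint $pod_args
  else
    # Nothing overridden: the image defaults apply.
    set -- $img_entrypoint $img_cmd
  fi
  echo "$*"
)

effective_argv "python" "app.py" "" "test.py"
```

The case that most often surprises people is setting only `command`: the image's `CMD` is not appended, so any default arguments the image relied on are lost.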
Building Images
While you won’t build images in the exam environment (no Docker daemon), understanding the process helps debug issues.
Basic Build
```shell
# Build in current directory
docker build -t myapp:v1.0.0 .

# Build with specific Dockerfile
docker build -t myapp:v1.0.0 -f Dockerfile.prod .

# Build with build arguments
docker build --build-arg VERSION=1.0.0 -t myapp:v1.0.0 .
```
Tagging and Pushing

```shell
# Tag an existing image
docker tag myapp:v1.0.0 myregistry.com/team/myapp:v1.0.0

# Push to registry
docker push myregistry.com/team/myapp:v1.0.0

# Push all tags
docker push myregistry.com/team/myapp --all-tags
```
Image Pull Policy
Kubernetes decides when to pull images based on `imagePullPolicy`:

```yaml
spec:
  containers:
  - name: app
    image: nginx:1.21.0
    imagePullPolicy: Always  # IfNotPresent | Never | Always
```

| Policy | Behavior | Use When |
|---|---|---|
| `Always` | Pull every time | Using `latest` tag, need freshest image |
| `IfNotPresent` | Pull only if not cached | Specific tags, save bandwidth |
| `Never` | Never pull, use cached | Local development, air-gapped |
Stop and think: If you specify `image: nginx` (no tag) in a pod spec, what `imagePullPolicy` does Kubernetes use by default? What about `image: nginx:1.21.0`? The defaults are different. Why does that make sense?
Default Behavior
| Image Tag | Default Policy |
|---|---|
| No tag (implies `:latest`) | `Always` |
| `:latest` | `Always` |
| Specific tag (`:v1.0.0`) | `IfNotPresent` |
| Digest (`@sha256:...`) | `IfNotPresent` |
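
The table reduces to a short decision rule. The sketch below is one way to express it in POSIX shell; `default_pull_policy` is a hypothetical name, and the function only mirrors the table above rather than reproducing Kubernetes source code.

```shell
# default_pull_policy is a hypothetical helper that mirrors the table above.
default_pull_policy() (
  ref="$1"

  # A digest pins exact content, so a cached copy is always valid.
  case "$ref" in
    *@*) echo "IfNotPresent"; exit 0 ;;
  esac

  # Look for a tag only in the last path component,
  # so a registry port such as localhost:5000 is not mistaken for one.
  last="${ref##*/}"
  case "$last" in
    *:latest) echo "Always" ;;
    *:*)      echo "IfNotPresent" ;;
    *)        echo "Always" ;;   # no tag implies :latest
  esac
)

default_pull_policy nginx
default_pull_policy nginx:1.21.0
```

Keep in mind that Kubernetes applies this default when the object is first created; changing the tag later does not change an `imagePullPolicy` that was already set.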
Private Registries
To pull from private registries, you need authentication:
Step 1: Create a Secret
```shell
# Create docker-registry secret
k create secret docker-registry regcred \
  --docker-server=myregistry.com \
  --docker-username=user \
  --docker-password=password \
  --docker-email=user@example.com
```
Step 2: Reference in Pod

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: private-app
spec:
  containers:
  - name: app
    image: myregistry.com/team/myapp:v1.0.0
  imagePullSecrets:
  - name: regcred
```
Alternative: ServiceAccount Default

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: myapp-sa
imagePullSecrets:
- name: regcred
---
apiVersion: v1
kind: Pod
metadata:
  name: private-app
spec:
  serviceAccountName: myapp-sa
  containers:
  - name: app
    image: myregistry.com/team/myapp:v1.0.0
```
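
When authentication still fails, it can help to know roughly what the Secret from Step 1 contains. A `docker-registry` Secret has type `kubernetes.io/dockerconfigjson` and stores a Docker config document under the `.dockerconfigjson` key. The sketch below rebuilds the general shape of that document in POSIX shell. The function `docker_config_json` is hypothetical, it omits the optional email field, and it does no JSON escaping, so treat it as an illustration rather than a replacement for `k create secret`.

```shell
# docker_config_json is a hypothetical helper showing the shape of the
# document stored in a docker-registry Secret. No JSON escaping is done.
docker_config_json() (
  server="$1"
  user="$2"
  password="$3"

  # The "auth" field is base64("username:password").
  auth=$(printf '%s' "$user:$password" | base64 | tr -d '\n')

  printf '{"auths":{"%s":{"username":"%s","password":"%s","auth":"%s"}}}\n' \
    "$server" "$user" "$password" "$auth"
)

docker_config_json myregistry.com user password
```

To compare against a real Secret, decode it with `k get secret regcred -o jsonpath='{.data.\.dockerconfigjson}' | base64 --decode` and check that the server name matches the registry part of the image reference exactly.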
Troubleshooting Image Issues
Common Errors

| Error | Cause | Solution |
|---|---|---|
| `ImagePullBackOff` | Can’t pull image | Check image name, registry access |
| `ErrImagePull` | Pull failed | Verify image exists, check credentials |
| `InvalidImageName` | Malformed image reference | Fix image name format |
| `ImageInspectError` | Image inspection failed | Check image manifest |
Debugging Steps
```shell
# Check pod events
k describe pod myapp | grep -A10 Events

# Check image name
k get pod myapp -o jsonpath='{.spec.containers[0].image}'

# Verify secret exists
k get secret regcred

# Test pull manually (if docker available)
docker pull myregistry.com/team/myapp:v1.0.0
```

What would happen if: A pod references a private registry image but has no `imagePullSecrets`. The image exists and is correctly tagged. What error would you see, and how would you distinguish it from a simple typo in the image name?
Example: Fixing ImagePullBackOff
```shell
# Pod stuck in ImagePullBackOff
k get pods
# NAME    READY   STATUS             RESTARTS   AGE
# myapp   0/1     ImagePullBackOff   0          5m

# Check events
k describe pod myapp
# Events:
#   Failed to pull image "nginx:latst": rpc error: ...not found

# Found it: typo in tag (latst instead of latest)

# Fix: Edit the pod or delete and recreate
k delete pod myapp
k run myapp --image=nginx:latest
```
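
The wording of the failure event usually narrows the cause down quickly. As a rough sketch, the triage can be written as a pattern match in POSIX shell. The function `classify_pull_error` is hypothetical, and the substrings it matches are only typical examples: exact messages vary by container runtime and registry, so always read the full event.

```shell
# classify_pull_error is a hypothetical helper. The substrings are typical
# examples only; real messages differ between runtimes and registries.
classify_pull_error() (
  # Lowercase the message so matching is case-insensitive.
  msg=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')

  case "$msg" in
    *unauthorized*|*"authentication required"*)
      echo "auth: check imagePullSecrets and registry credentials" ;;
    *"manifest unknown"*)
      echo "missing-tag: the tag does not exist in the registry" ;;
    *"not found"*)
      echo "not-found: check the image name and tag for typos" ;;
    *)
      echo "unknown: read the full event message" ;;
  esac
)

classify_pull_error 'Failed to pull image "nginx:latst": rpc error: ...not found'
```

An authentication message on an image you know exists points at missing or wrong credentials, while a "not found" style message points at the name or tag itself.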
Image Security Best Practices
While not always tested, understanding these makes you a better developer:
1. Use Specific Tags
```yaml
# BAD
image: nginx:latest

# GOOD
image: nginx:1.21.0-alpine
```
2. Use Minimal Base Images

```dockerfile
# 133MB
FROM python:3.9

# 45MB - much smaller
FROM python:3.9-slim

# 17MB - even smaller
FROM python:3.9-alpine
```
3. Run as Non-Root

```dockerfile
FROM python:3.9-slim
RUN useradd -m appuser
USER appuser
COPY --chown=appuser:appuser . /app
```

In Kubernetes:

```yaml
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
  containers:
  - name: app
    image: myapp:v1.0.0
```
4. Use Read-Only Filesystem

```yaml
spec:
  containers:
  - name: app
    image: myapp:v1.0.0
    securityContext:
      readOnlyRootFilesystem: true
    volumeMounts:
    - name: tmp
      mountPath: /tmp
  volumes:
  - name: tmp
    emptyDir: {}
```
Did You Know?

- Container images are layered. Each filesystem-changing Dockerfile instruction (`RUN`, `COPY`, `ADD`) creates a layer. Layers are cached and shared between images, saving disk space and build time. That’s why you put frequently changing content (like `COPY . .`) at the end.
- The `latest` tag is just a convention. It’s not actually “latest” by time; it’s whatever was last pushed without a specific tag. Many projects push `latest` with each build, but some never update it.
- Image digests (sha256:…) are immutable. Tags can be moved to point to different images, but a digest always refers to the exact same image content. Use digests for maximum reproducibility in production.
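
The caching point in the first item is easier to see with a toy model. The sketch below, in POSIX shell, assumes a simplified rule: once one build step's input changes, that step and every step after it must be rebuilt. The function `steps_to_rebuild` and the step labels are hypothetical; real builders decide cache hits from the instruction text and checksums of copied files.

```shell
# steps_to_rebuild is a hypothetical toy model of build-cache invalidation.
# Usage: steps_to_rebuild CHANGED_STEP STEP...
steps_to_rebuild() (
  changed="$1"
  shift

  rebuilding=0
  out=""
  for step in "$@"; do
    # The first changed step invalidates the cache from here on.
    if [ "$step" = "$changed" ]; then
      rebuilding=1
    fi
    if [ "$rebuilding" -eq 1 ]; then
      out="$out${out:+ }$step"
    fi
  done
  echo "$out"
)

# Dependencies installed before the source is copied: the install stays cached.
steps_to_rebuild COPY_SRC FROM WORKDIR COPY_REQS RUN_PIP COPY_SRC CMD

# Source copied first: a code change forces the install to run again.
steps_to_rebuild COPY_SRC FROM WORKDIR COPY_SRC RUN_PIP CMD
```

In the first ordering a source change leaves the dependency install untouched; in the second, the same change drags the install step into every rebuild.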
Common Mistakes
| Mistake | Why It Hurts | Solution |
|---|---|---|
| Using `latest` in production | Unpredictable updates | Always use specific tags |
| Typos in image names | `ImagePullBackOff` | Double-check spelling |
| Forgetting `imagePullSecrets` | Can’t pull private images | Add secret reference to pod |
| Wrong `imagePullPolicy` | Cache issues or unnecessary pulls | Set explicitly based on needs |
| Large base images | Slow pulls, security surface | Use `-slim` or `-alpine` variants |
- A developer pushes a fix to their app and deploys it using `image: myapp` (no tag). The pod restarts, but the old version is still running. They swear they pushed the new image. What’s going on?

  Answer: Without a tag, Kubernetes defaults to `:latest` and sets `imagePullPolicy: Always`. However, the developer likely pushed without tagging as `latest`, or the node has a cached version. The real problem is using `latest` in the first place -- it's ambiguous and unreproducible. The fix is to use specific version tags (e.g., `myapp:v1.2.3`) so each deployment references an exact image. This also makes rollbacks predictable since you know exactly which version each revision used.

- Your colleague deployed a pod that’s stuck in `ImagePullBackOff`. They say the image name is correct because they can `docker pull` it on their laptop. What are the three most likely causes, and how do you systematically diagnose which one?

  Answer: Run `kubectl describe pod` and check the Events section. The three most likely causes are: (1) the image name has a typo (e.g., `ngix` instead of `nginx`) -- the Events will say "not found"; (2) it's a private registry and the pod is missing `imagePullSecrets` -- the Events will show "unauthorized" or "authentication required"; (3) the tag doesn't exist in the registry -- Events will say "manifest unknown". Their laptop works because Docker is logged into the registry locally. The cluster nodes need separate authentication via `imagePullSecrets` or a ServiceAccount with registry credentials.

- You have a Dockerfile with `ENTRYPOINT ["python"]` and `CMD ["app.py"]`. In your Kubernetes pod spec, you want to run `python test.py` instead. Should you override `command`, `args`, or both?

  Answer: Override only `args: ["test.py"]`. In Kubernetes, `command` maps to Docker's `ENTRYPOINT` and `args` maps to `CMD`. Since you still want `python` as the entrypoint, leave `command` alone and just change `args`. If you set `command: ["python"]` AND `args: ["test.py"]`, it works but is redundant. If you only set `command: ["test.py"]`, it would try to execute `test.py` directly without the Python interpreter, which would fail.

- Your production cluster pulls images slowly because every pod restart re-downloads from the registry. All your images use specific version tags like `v2.1.0`. A teammate suggests setting `imagePullPolicy: Never` to fix it. Why is that dangerous, and what’s the correct solution?

  Answer: `Never` means pods will fail to start on any node that doesn't already have the image cached -- this breaks scaling to new nodes and disaster recovery. The correct solution is `imagePullPolicy: IfNotPresent`, which is actually the default for specific version tags. If pods are still re-pulling, check whether someone has overridden the policy to `Always` in the pod spec. With `IfNotPresent`, the image is pulled once per node and cached, giving you fast restarts without the risk of `Never`.

- A developer shows you a Dockerfile that builds successfully, but the resulting image is 800MB and takes 5 minutes to build every time they change a single line of application code. The Dockerfile starts with `FROM ubuntu:latest`, runs a `COPY . .`, and then uses `RUN` to install heavily dependent packages. Why is this Dockerfile inefficient, and what are the two most impactful changes you can make to fix it?

  Answer: This Dockerfile suffers from poor layer caching and an overly large base image. Because `COPY . .` copies all application code before installing dependencies, any change to the source code invalidates the cache for the subsequent `RUN` commands, forcing a full dependency reinstallation on every build. Furthermore, `ubuntu:latest` is massive and contains tools unnecessary for most runtimes. The two most impactful changes are: 1) Switch to a minimal base image like an `-alpine` or `-slim` variant to drastically reduce the initial footprint. 2) Move the copying of dependency files (like `requirements.txt` or `package.json`) and the associated `RUN` install command above the `COPY . .` instruction so that dependencies remain cached unless the dependency manifest itself changes.
Hands-On Exercise
Task: Fix a broken deployment with image issues.

Setup:

```shell
# Create a deployment with intentional image problems
k create deploy broken-app --image=nginx:nonexistent
```

Your Tasks:
- Check why the pods aren’t running
- Find the correct image tag
- Fix the deployment
Solution:
```shell
# Check pod status
k get pods
# Shows ImagePullBackOff

# Get details
k describe pod -l app=broken-app | grep -A5 Events
# Shows: nginx:nonexistent not found

# Fix by patching the deployment
k set image deploy/broken-app nginx=nginx:1.21.0

# Verify
k get pods
# Should show Running

# Cleanup
k delete deploy broken-app
```

Success Criteria:
- Identified the image issue
- Fixed the image reference
- Pod is now running
Practice Drills
Drill 1: Image Name Parsing (Target: 2 minutes)
Identify the components of these image references:

```
1. nginx
   Registry: docker.io (default)
   Namespace: library (default)
   Image: nginx
   Tag: latest (default)

2. gcr.io/google-containers/pause:3.2
   Registry: gcr.io
   Namespace: google-containers
   Image: pause
   Tag: 3.2

3. mycompany.com/team/app:v2.0.0-alpine
   Registry: mycompany.com
   Namespace: team
   Image: app
   Tag: v2.0.0-alpine
```
Drill 2: Fix ImagePullBackOff (Target: 3 minutes)

```shell
# Create broken pod
k run broken --image=nginx:1.999.0

# Diagnose
k describe pod broken | grep -A5 Events

# Fix
k delete pod broken
k run broken --image=nginx:1.21.0

# Verify
k get pod broken

# Cleanup
k delete pod broken
```
Drill 3: Private Registry Secret (Target: 4 minutes)

```shell
# Create registry secret
k create secret docker-registry myregistry \
  --docker-server=private.registry.io \
  --docker-username=testuser \
  --docker-password=testpass

# Create pod with secret reference
cat << EOF | k apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: private-pod
spec:
  containers:
  - name: app
    image: private.registry.io/app:latest
  imagePullSecrets:
  - name: myregistry
EOF

# Check if secret is referenced
k get pod private-pod -o jsonpath='{.spec.imagePullSecrets}'

# Cleanup
k delete pod private-pod
k delete secret myregistry
```
Drill 4: Override Command and Args (Target: 3 minutes)

```shell
# Create pod that overrides CMD
cat << EOF | k apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: custom-cmd
spec:
  containers:
  - name: busybox
    image: busybox
    command: ["sh", "-c"]
    args: ["echo 'Custom command' && sleep 10"]
EOF

# Check logs
k logs custom-cmd

# Verify the command
k get pod custom-cmd -o jsonpath='{.spec.containers[0].command}'
k get pod custom-cmd -o jsonpath='{.spec.containers[0].args}'

# Cleanup
k delete pod custom-cmd
```
Drill 5: imagePullPolicy Testing (Target: 3 minutes)

```shell
# Create pods with different policies
cat << EOF | k apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: pull-always
spec:
  containers:
  - name: nginx
    image: nginx:1.21.0
    imagePullPolicy: Always
---
apiVersion: v1
kind: Pod
metadata:
  name: pull-ifnotpresent
spec:
  containers:
  - name: nginx
    image: nginx:1.21.0
    imagePullPolicy: IfNotPresent
EOF

# Check policies
k get pod pull-always -o jsonpath='{.spec.containers[0].imagePullPolicy}'
k get pod pull-ifnotpresent -o jsonpath='{.spec.containers[0].imagePullPolicy}'

# Cleanup
k delete pod pull-always pull-ifnotpresent
```
Drill 6: Complete Image Troubleshooting (Target: 5 minutes)
Scenario: A colleague pushed a deployment but pods won’t start.

```shell
# Setup (simulating the problem)
k create deploy webapp --image=nginx:alpine-wrong-tag

# YOUR TASK: Find and fix the issue

# Step 1: Check deployment status
k get deploy webapp
k get pods -l app=webapp

# Step 2: Investigate the error
k describe pods -l app=webapp | grep -A10 Events

# Step 3: Find correct image tag
# (In real scenario, check registry or documentation)
# The correct tag is nginx:alpine

# Step 4: Fix
k set image deploy/webapp nginx=nginx:alpine

# Step 5: Verify
k rollout status deploy/webapp
k get pods -l app=webapp

# Cleanup
k delete deploy webapp
```
Drill 7: Optimize a Dockerfile (Target: 5 minutes)
Scenario: A colleague hands you this Dockerfile. It works, but it takes forever to build and results in an unnecessarily large image.

```dockerfile
FROM node:18
WORKDIR /usr/src/app
COPY . .
RUN npm install
CMD ["node", "index.js"]
```

Your Tasks:
- Identify the layer caching issue causing slow rebuilds.
- Identify the base image size issue.
- Rewrite the Dockerfile to optimize it.
Solution:
```dockerfile
# 1. Switch to a smaller base image (alpine)
FROM node:18-alpine
WORKDIR /usr/src/app

# 2. Copy ONLY package files first for layer caching
COPY package*.json ./
RUN npm install

# 3. Copy the rest of the application code AFTER dependencies
COPY . .
CMD ["node", "index.js"]
```
Next Module
Module 1.2: Jobs and CronJobs - Run one-time and scheduled batch workloads.