
Module 3.3: DNS & CoreDNS

Hands-On Lab Available: K8s Cluster (intermediate, 35 min)

Complexity: [MEDIUM] - Critical infrastructure component

Time to Complete: 40-50 minutes

Prerequisites: Module 3.1 (Services), Module 3.2 (Endpoints)


After this module, you will be able to:

  • Resolve service names to IPs using Kubernetes DNS conventions (service.namespace.svc.cluster.local)
  • Debug DNS failures by checking CoreDNS pods, configmap, and testing resolution from pods
  • Configure custom DNS entries and upstream DNS forwarding in CoreDNS
  • Explain how DNS-based service discovery enables microservice communication

DNS is how pods find services. Every time a pod makes a request to my-service, DNS resolves that name to an IP address. If DNS breaks, your entire cluster’s service discovery breaks. Understanding CoreDNS is essential for troubleshooting connectivity issues.

The CKA exam tests DNS debugging, CoreDNS configuration, and understanding how Kubernetes names resolve. You’ll need to troubleshoot DNS issues and understand the resolution hierarchy.

The Phone Book Analogy

DNS is your cluster’s phone book. Instead of remembering that the “web-service” lives at IP 10.96.45.123, you just dial “web-service” and DNS looks up the number for you. CoreDNS is the phone operator who maintains this phone book and answers lookups.


By the end of this module, you’ll be able to:

  • Understand how Kubernetes DNS works
  • Troubleshoot DNS resolution issues
  • Configure CoreDNS
  • Use different DNS name formats
  • Debug pods with DNS problems

  • CoreDNS replaced kube-dns: kube-dns was the original cluster DNS server; CoreDNS reached GA in Kubernetes 1.11 and became the default in 1.13. CoreDNS is faster, more flexible, and uses plugins for extensibility.

  • DNS is the #1 troubleshooting target: Most “network issues” are actually DNS issues. When in doubt, check DNS first!

  • Pods get DNS configured automatically: The kubelet injects /etc/resolv.conf into every pod, pointing to the cluster DNS service.


┌────────────────────────────────────────────────────────────────┐
│ Kubernetes DNS Architecture │
│ │
│ ┌────────────────┐ │
│ │ Pod │ │
│ │ │ │
│ │ curl web-svc │ │
│ │ │ │ │
│ │ ▼ │ │
│ │ /etc/resolv.conf │
│ │ nameserver 10.96.0.10 ──────────────────────┐ │
│ │ search default.svc... │ │
│ └────────────────┘ │ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────┐│
│ │ CoreDNS Service (10.96.0.10) ││
│ │ ││
│ │ ┌─────────┐ ┌─────────┐ ││
│ │ │CoreDNS │ │CoreDNS │ (2 replicas by default) ││
│ │ │ Pod │ │ Pod │ ││
│ │ └────┬────┘ └────┬────┘ ││
│ │ │ │ ││
│ │ └─────┬─────┘ ││
│ │ ▼ ││
│ │ Query: web-svc.default.svc.cluster.local ││
│ │ │ ││
│ │ ▼ ││
│ │ Response: 10.96.45.123 (Service ClusterIP) ││
│ └──────────────────────────────────────────────────────────┘│
│ │
└────────────────────────────────────────────────────────────────┘
| Component | Location | Purpose |
|---|---|---|
| CoreDNS Deployment | kube-system namespace | Runs CoreDNS pods |
| CoreDNS Service | kube-system namespace | Stable IP for DNS queries (usually 10.96.0.10) |
| Corefile ConfigMap | kube-system namespace | CoreDNS configuration |
| Pod /etc/resolv.conf | Every pod | Points to CoreDNS service |

Every pod gets this automatically:

Terminal window
# Inside any pod
cat /etc/resolv.conf
# Output:
nameserver 10.96.0.10 # CoreDNS service IP
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
| Field | Purpose |
|---|---|
| nameserver | IP of the CoreDNS service |
| search | Domains to append when resolving short names |
| options ndots:5 | If a name has fewer than 5 dots, try search domains first |

┌────────────────────────────────────────────────────────────────┐
│ Service DNS Naming │
│ │
│ Full format (FQDN): │
│ <service>.<namespace>.svc.<cluster-domain> │
│ │
│ Example: web-svc.production.svc.cluster.local │
│ ─────── ────────── ─── ───────────── │
│ │ │ │ │ │
│ service namespace fixed cluster domain │
│ suffix (default) │
│ │
└────────────────────────────────────────────────────────────────┘
Terminal window
# From pod in "default" namespace, reaching "web-svc" in "default":
curl web-svc # ✓ Works (same namespace)
curl web-svc.default # ✓ Works
curl web-svc.default.svc # ✓ Works
curl web-svc.default.svc.cluster.local # ✓ Works (FQDN)
# From pod in "default" namespace, reaching "api" in "production":
curl api # ✗ Fails (wrong namespace)
curl api.production # ✓ Works (cross-namespace)
curl api.production.svc.cluster.local # ✓ Works (FQDN)

Pause and predict: A pod in namespace staging runs curl api-service. The cluster has an api-service in both staging and production namespaces. Which one does the pod reach, and why?

┌────────────────────────────────────────────────────────────────┐
│ Search Domain Resolution │
│ │
│ Pod in namespace "default" resolves "web-svc": │
│ │
│ search default.svc.cluster.local svc.cluster.local ... │
│ │
│ Step 1: Try web-svc.default.svc.cluster.local │
│ └── Found! Returns IP │
│ │
│ If not found: │
│ Step 2: Try web-svc.svc.cluster.local │
│ Step 3: Try web-svc.cluster.local │
│ Step 4: Try web-svc (external DNS) │
│ │
└────────────────────────────────────────────────────────────────┘
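The search-domain walk above can be sketched in a few lines of Python. This is a simplified model of glibc resolver behavior (it ignores timeouts and negative caching), and `candidate_queries` is an illustrative helper, not any real library API:

```python
# Sketch: the order of FQDNs a pod's resolver tries for a given name,
# based on the search list and ndots from /etc/resolv.conf.

def candidate_queries(name: str, search: list[str], ndots: int = 5) -> list[str]:
    """Return the fully-qualified names tried, in order."""
    if name.endswith("."):
        # Trailing dot = absolute name; the search list is skipped entirely.
        return [name.rstrip(".")]
    candidates = [f"{name}.{domain}" for domain in search]
    if name.count(".") >= ndots:
        # Enough dots: try the literal name first, search domains after.
        return [name] + candidates
    # Fewer than ndots dots: search domains are tried BEFORE the literal name.
    return candidates + [name]

search = ["default.svc.cluster.local", "svc.cluster.local", "cluster.local"]

# Short service name: resolves on the very first try.
print(candidate_queries("web-svc", search))

# External name with only 2 dots still walks the whole search list first --
# this is the ndots:5 latency trap described later in this module.
print(candidate_queries("api.example.com", search))
```

Note how a trailing dot (`api.example.com.`) bypasses the search list, which is one of the fixes for slow external lookups.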

Pods also get DNS names:

┌────────────────────────────────────────────────────────────────┐
│ Pod DNS Names │
│ │
│ Pod IP: 10.244.1.5 │
│ DNS: 10-244-1-5.default.pod.cluster.local │
│ ────────── ─────── ─── ───────────── │
│ IP with namespace pod cluster domain │
│ dashes │
│ │
│ For StatefulSet pods with headless service: │
│ DNS: web-0.web-svc.default.svc.cluster.local │
│ ───── ─────── ─────── ─── │
│ pod headless namespace │
│ name service │
│ │
└────────────────────────────────────────────────────────────────┘
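The two pod DNS name forms in the diagram are purely mechanical to construct. A minimal Python sketch (the helper functions here are illustrative, not part of any Kubernetes client library):

```python
# Sketch: building the pod DNS names shown in the diagram above.

def pod_dns_name(pod_ip: str, namespace: str,
                 cluster_domain: str = "cluster.local") -> str:
    # Dots in the IP become dashes: 10.244.1.5 -> 10-244-1-5
    return f"{pod_ip.replace('.', '-')}.{namespace}.pod.{cluster_domain}"

def statefulset_pod_dns_name(pod_name: str, headless_svc: str, namespace: str,
                             cluster_domain: str = "cluster.local") -> str:
    # Stable per-pod name via the headless service.
    return f"{pod_name}.{headless_svc}.{namespace}.svc.{cluster_domain}"

print(pod_dns_name("10.244.1.5", "default"))
# 10-244-1-5.default.pod.cluster.local

print(statefulset_pod_dns_name("web-0", "web-svc", "default"))
# web-0.web-svc.default.svc.cluster.local
```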

Terminal window
# Check CoreDNS pods
k get pods -n kube-system -l k8s-app=kube-dns
# Check CoreDNS deployment
k get deployment coredns -n kube-system
# Check CoreDNS service
k get svc kube-dns -n kube-system
# Note: Service is named "kube-dns" for compatibility
# View CoreDNS configuration
k get configmap coredns -n kube-system -o yaml
# CoreDNS ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors                # Log errors
        health {              # Health check endpoint
           lameduck 5s
        }
        ready                 # Readiness endpoint
        kubernetes cluster.local in-addr.arpa ip6.arpa {   # K8s plugin
           pods insecure      # Pod DNS resolution
           fallthrough in-addr.arpa ip6.arpa
           ttl 30             # Cache TTL
        }
        prometheus :9153      # Metrics
        forward . /etc/resolv.conf {   # External DNS forwarding
           max_concurrent 1000
        }
        cache 30              # Response caching
        loop                  # Detect loops
        reload                # Auto-reload config
        loadbalance           # Round-robin DNS
    }
| Plugin | Purpose |
|---|---|
| kubernetes | Resolves Kubernetes service/pod names |
| forward | Forwards external queries to upstream DNS |
| cache | Caches responses to reduce load |
| errors | Logs DNS errors |
| health | Provides health check endpoint |
| prometheus | Exposes metrics |
| loop | Detects and breaks DNS loops |
# Add custom DNS entries
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        # ... existing config ...
        # Add custom hosts
        hosts {
           10.0.0.1 custom.example.com
           fallthrough
        }
    }
    # Forward a specific domain to a custom DNS server.
    # This needs its own server block -- only one forward plugin
    # is allowed per block, and .:53 already forwards to /etc/resolv.conf.
    example.com:53 {
        forward . 10.0.0.53
    }
Terminal window
# After editing, restart CoreDNS
k rollout restart deployment coredns -n kube-system

Stop and think: A pod reports “connection timed out” when calling another service by name. Is this necessarily a DNS problem? What steps would you take to determine whether DNS or the network is at fault?

DNS Issue?
├── Step 1: Test from inside a pod
│ k run test --rm -it --image=busybox:1.36 -- nslookup <service>
│ │
│ ├── Works? → DNS is fine, issue is elsewhere
│ │
│ └── Fails? → Continue debugging
├── Step 2: Check CoreDNS is running
│ k get pods -n kube-system -l k8s-app=kube-dns
│ │
│ └── Not running? → Fix CoreDNS deployment
├── Step 3: Check CoreDNS logs
│ k logs -n kube-system -l k8s-app=kube-dns
│ │
│ └── Errors? → Check Corefile config
├── Step 4: Check pod resolv.conf
│ k exec <pod> -- cat /etc/resolv.conf
│ │
│ └── Wrong nameserver? → Check kubelet config
└── Step 5: Test external DNS
k run test --rm -it --image=busybox:1.36 -- nslookup google.com
└── Fails? → Check forward config in Corefile
Terminal window
# Test DNS from inside cluster
k run dns-test --rm -it --image=busybox:1.36 --restart=Never -- \
nslookup kubernetes
# Test specific service
k run dns-test --rm -it --image=busybox:1.36 --restart=Never -- \
nslookup web-svc.default.svc.cluster.local
# Test with specific DNS server
k run dns-test --rm -it --image=busybox:1.36 --restart=Never -- \
nslookup web-svc 10.96.0.10
# Check resolv.conf
k exec <pod> -- cat /etc/resolv.conf
# Check CoreDNS logs
k logs -n kube-system -l k8s-app=kube-dns --tail=50
# Verify CoreDNS is responding
k run dns-test --rm -it --image=busybox:1.36 --restart=Never -- \
nslookup kubernetes.default.svc.cluster.local

Use a debug pod with more tools:

Terminal window
# Create a debug pod
k run dns-debug --image=nicolaka/netshoot --restart=Never -- sleep 3600
# Use it for debugging
k exec -it dns-debug -- dig web-svc.default.svc.cluster.local
k exec -it dns-debug -- host web-svc
k exec -it dns-debug -- nslookup web-svc
# Cleanup
k delete pod dns-debug
| Symptom | Cause | Solution |
|---|---|---|
| NXDOMAIN | Service doesn’t exist | Check service name/namespace |
| Server failure | CoreDNS down | Check CoreDNS pods |
| Timeout | Network issue to CoreDNS | Check pod network, CNI |
| Wrong IP returned | Stale cache | Restart CoreDNS, check cache TTL |
| External domains fail | Forward config wrong | Check Corefile forward directive |
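The symptom-to-cause mapping above is mechanical enough to sketch in code. This is a hypothetical helper (not part of kubectl or any Kubernetes tooling) that classifies typical nslookup/dig failure text, assuming the common error strings these tools emit:

```python
# Sketch: map common nslookup/dig failure output to a likely cause.

def classify_dns_failure(output: str) -> str:
    output = output.lower()
    if "nxdomain" in output or "can't find" in output:
        return "service name/namespace wrong, or service does not exist"
    if "servfail" in output or "server failure" in output:
        return "CoreDNS unhealthy -- check its pods and logs"
    if "timed out" in output or "no servers could be reached" in output:
        return "network path to CoreDNS broken -- check CNI / kube-proxy"
    return "unknown -- inspect CoreDNS logs"

print(classify_dns_failure("** server can't find web-svc: NXDOMAIN"))
print(classify_dns_failure(";; connection timed out; no servers could be reached"))
```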

What would happen if: You set dnsPolicy: Default on a pod running in your cluster. The pod tries to resolve my-service.default.svc.cluster.local. Does it succeed? Why or why not?

apiVersion: v1
kind: Pod
metadata:
  name: dns-policy-demo
spec:
  dnsPolicy: ClusterFirst   # Default
  containers:
  - name: app
    image: nginx
| Policy | Behavior |
|---|---|
| ClusterFirst (default) | Use cluster DNS, fall back to node DNS |
| Default | Use node’s DNS settings (inherit from host) |
| ClusterFirstWithHostNet | Use cluster DNS even with hostNetwork: true |
| None | No DNS config; must specify dnsConfig |
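The policy table can be condensed into a small decision function. This is a simplified sketch of kubelet's behavior (the real kubelet also merges dnsConfig overrides and handles more edge cases); `resolv_conf_source` is an illustrative name, not a real API:

```python
# Sketch: which resolv.conf a pod ends up with, given dnsPolicy + hostNetwork.

def resolv_conf_source(dns_policy: str, host_network: bool = False) -> str:
    if dns_policy == "None":
        return "only what the pod's dnsConfig specifies"
    if dns_policy == "Default":
        return "node's /etc/resolv.conf"
    if dns_policy == "ClusterFirstWithHostNet":
        return "cluster DNS (CoreDNS)"
    if dns_policy == "ClusterFirst":
        # The common gotcha: hostNetwork silently downgrades to node DNS.
        return "node's /etc/resolv.conf" if host_network else "cluster DNS (CoreDNS)"
    raise ValueError(f"unknown dnsPolicy: {dns_policy}")

print(resolv_conf_source("ClusterFirst"))                     # cluster DNS (CoreDNS)
print(resolv_conf_source("ClusterFirst", host_network=True))  # node's /etc/resolv.conf
```

The `ClusterFirst` + `host_network=True` case is exactly why `ClusterFirstWithHostNet` exists.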
apiVersion: v1
kind: Pod
metadata:
  name: custom-dns
spec:
  dnsPolicy: "None"          # Required for custom config
  dnsConfig:
    nameservers:
    - 1.1.1.1                # Custom DNS server
    - 8.8.8.8
    searches:
    - custom.local           # Custom search domain
    - svc.cluster.local
    options:
    - name: ndots
      value: "2"             # Custom ndots
  containers:
  - name: app
    image: nginx
---
# Pod using host network but still using cluster DNS
apiVersion: v1
kind: Pod
metadata:
  name: host-network-pod
spec:
  hostNetwork: true
  dnsPolicy: ClusterFirstWithHostNet   # Important!
  containers:
  - name: app
    image: nginx

SRV records include port information along with IP:

Terminal window
# Query the SRV record for the service's named port
dig SRV _http._tcp.web-svc.default.svc.cluster.local
# Returns:
# _http._tcp.web-svc.default.svc.cluster.local. 30 IN SRV 0 100 80 web-svc.default.svc.cluster.local.
# Service with named port
apiVersion: v1
kind: Service
metadata:
name: web-svc
spec:
selector:
app: web
ports:
- name: http # Named port
port: 80
targetPort: 8080
Terminal window
# SRV record format: _<port-name>._<protocol>.<service>.<namespace>.svc.cluster.local
# Query:
dig SRV _http._tcp.web-svc.default.svc.cluster.local
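Building the SRV query name from the format above is a simple string assembly. A minimal sketch (`srv_name` is an illustrative helper, not a Kubernetes or dnspython API):

```python
# Sketch: construct the SRV record name for a named service port.
# Format: _<port-name>._<protocol>.<service>.<namespace>.svc.<cluster-domain>

def srv_name(port_name: str, protocol: str, service: str, namespace: str,
             cluster_domain: str = "cluster.local") -> str:
    return f"_{port_name}._{protocol.lower()}.{service}.{namespace}.svc.{cluster_domain}"

# The named port "http" on the web-svc Service defined above:
print(srv_name("http", "TCP", "web-svc", "default"))
# _http._tcp.web-svc.default.svc.cluster.local
```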

| Mistake | Problem | Solution |
|---|---|---|
| Using wrong namespace | NXDOMAIN error | Use FQDN or check namespace |
| Forgetting .svc | Resolution fails | Use service.namespace or FQDN |
| CoreDNS not running | All DNS fails | Check kube-system pods |
| Wrong dnsPolicy | Pod can’t resolve | Use ClusterFirst for cluster services |
| Editing wrong ConfigMap | Config not applied | Edit coredns ConfigMap in kube-system |

  1. After a cluster upgrade, all pods start failing with “could not resolve host” errors. You check and CoreDNS pods are running. What would you investigate next, and what commands would you use?

    Answer Running does not mean healthy. First, verify CoreDNS is actually responding: `k run test --rm -it --image=busybox:1.36 --restart=Never -- nslookup kubernetes.default`. If that fails, check CoreDNS logs for errors: `k logs -n kube-system -l k8s-app=kube-dns --tail=50`. Then verify the CoreDNS Service has endpoints: `k get endpoints kube-dns -n kube-system`. Also check if a pod's `/etc/resolv.conf` still points to the correct nameserver IP. The upgrade might have changed the CoreDNS ClusterIP or corrupted the Corefile ConfigMap.
  2. A pod in namespace team-a calls curl db and accidentally reaches a database in its own namespace instead of the one in namespace shared. The developer expected to reach the shared database. Explain what happened and how to prevent this.

    Answer The search domain in `/etc/resolv.conf` appends the pod's own namespace first, so `db` resolves to `db.team-a.svc.cluster.local`. Since a service named `db` exists in `team-a`, it matches before ever trying other namespaces. To reach the shared database, the developer must use `db.shared` or the full FQDN `db.shared.svc.cluster.local`. To prevent this, establish a naming convention where team-local services have prefixed names (e.g., `team-a-db`) and shared services use explicit cross-namespace references in application config.
  3. You need to add a custom DNS entry so that legacy-api.internal resolves to 10.0.5.100 for all pods in the cluster. Where do you make this change and what is the risk?

    Answer Edit the `coredns` ConfigMap in the `kube-system` namespace. Add a `hosts` block inside the Corefile: `hosts { 10.0.5.100 legacy-api.internal \n fallthrough }`. Then restart CoreDNS with `k rollout restart deployment coredns -n kube-system`. The risk is that editing the CoreDNS ConfigMap affects all DNS resolution cluster-wide. A syntax error in the Corefile will break ALL DNS, taking down service discovery for every pod. Always validate the config and have a rollback plan. Also note that `fallthrough` is essential -- without it, the hosts plugin will stop processing and other DNS queries will fail.
  4. A developer complains that API calls to api.external-partner.com from their pod take 2 seconds, but only 50ms from their laptop. Both are on the same network. What is happening and how do you fix it?

    Answer The `ndots:5` default in Kubernetes resolv.conf means `api.external-partner.com` (only 2 dots) is treated as a relative name. Before the actual query succeeds, the resolver tries: `api.external-partner.com.default.svc.cluster.local`, then `.svc.cluster.local`, then `.cluster.local` -- each returning NXDOMAIN after a timeout. This adds ~1.5 seconds of wasted DNS lookups. Fix options: set `dnsConfig.options.ndots: 2` in the pod spec, use a trailing dot in the URL (`api.external-partner.com.`), or configure the app to use the FQDN with trailing dot.
  5. You have a pod with hostNetwork: true that cannot resolve cluster service names. It can resolve external domains like google.com fine. What is the cause and fix?

    Answer When `hostNetwork: true` is set, the pod uses the node's network namespace, including its `/etc/resolv.conf`. The node's resolv.conf points to the node's DNS server (not CoreDNS), which knows nothing about cluster service names like `my-svc.default.svc.cluster.local`. External domains work because the node's DNS can resolve them. The fix is to set `dnsPolicy: ClusterFirstWithHostNet`, which tells the kubelet to inject the CoreDNS address into the pod's resolv.conf even though it uses the host network.

Task: Debug and understand DNS in Kubernetes.

Steps:

  1. Check CoreDNS is running:
Terminal window
k get pods -n kube-system -l k8s-app=kube-dns
k get svc -n kube-system kube-dns
  2. View CoreDNS configuration:
Terminal window
k get configmap coredns -n kube-system -o yaml
  3. Create test service:
Terminal window
k create deployment web --image=nginx
k expose deployment web --port=80
  4. Test DNS resolution:
Terminal window
# Short name
k run test --rm -it --image=busybox:1.36 --restart=Never -- \
nslookup web
# With namespace
k run test --rm -it --image=busybox:1.36 --restart=Never -- \
nslookup web.default
# FQDN
k run test --rm -it --image=busybox:1.36 --restart=Never -- \
nslookup web.default.svc.cluster.local
  5. Check pod resolv.conf:
Terminal window
k run test --rm -it --image=busybox:1.36 --restart=Never -- \
cat /etc/resolv.conf
  6. Test cross-namespace DNS:
Terminal window
# Create service in another namespace
k create namespace other
k create deployment db -n other --image=nginx
k expose deployment db -n other --port=80
# Resolve from default namespace
k run test --rm -it --image=busybox:1.36 --restart=Never -- \
nslookup db.other
  7. Test external DNS:
Terminal window
k run test --rm -it --image=busybox:1.36 --restart=Never -- \
nslookup google.com
  8. Check CoreDNS logs:
Terminal window
k logs -n kube-system -l k8s-app=kube-dns --tail=20
  9. Cleanup:
Terminal window
k delete deployment web
k delete svc web
k delete namespace other

Success Criteria:

  • Can verify CoreDNS is running
  • Understand DNS name formats
  • Can resolve services by short name and FQDN
  • Can resolve cross-namespace services
  • Can troubleshoot DNS issues

Drill 1: Test All DNS Name Formats

Terminal window
# Create a service
k create deployment dns-test --image=nginx
k expose deployment dns-test --port=80
# Test all name formats
k run test --rm -it --image=busybox:1.36 --restart=Never -- \
sh -c 'nslookup dns-test && nslookup dns-test.default && nslookup dns-test.default.svc.cluster.local'
# Cleanup
k delete deployment dns-test
k delete svc dns-test

Drill 2: Check CoreDNS Health (Target: 2 minutes)

Terminal window
# Check pods
k get pods -n kube-system -l k8s-app=kube-dns -o wide
# Check service
k get svc kube-dns -n kube-system
# Check deployment
k get deployment coredns -n kube-system
# View logs
k logs -n kube-system -l k8s-app=kube-dns --tail=10

Drill 3: Cross-Namespace Resolution (Target: 3 minutes)

Terminal window
# Create services in two namespaces
k create namespace ns1
k create namespace ns2
k create deployment app1 -n ns1 --image=nginx
k create deployment app2 -n ns2 --image=nginx
k expose deployment app1 -n ns1 --port=80
k expose deployment app2 -n ns2 --port=80
# From ns1, reach ns2 (and vice versa)
k run test -n ns1 --rm -it --image=busybox:1.36 --restart=Never -- \
nslookup app2.ns2
k run test -n ns2 --rm -it --image=busybox:1.36 --restart=Never -- \
nslookup app1.ns1
# Cleanup
k delete namespace ns1 ns2

Drill 4: Inspect Pod DNS Config (Target: 2 minutes)

Terminal window
# Create a pod
k run dns-check --image=busybox:1.36 --command -- sleep 3600
# Check its DNS config
k exec dns-check -- cat /etc/resolv.conf
# Verify the nameserver matches kube-dns service
k get svc kube-dns -n kube-system -o jsonpath='{.spec.clusterIP}'
# Cleanup
k delete pod dns-check

Drill 5: CoreDNS ConfigMap (Target: 3 minutes)

Terminal window
# View the Corefile
k get configmap coredns -n kube-system -o jsonpath='{.data.Corefile}'
# Describe the configmap
k describe configmap coredns -n kube-system
# Check what plugins are enabled
k get configmap coredns -n kube-system -o yaml | grep -E "kubernetes|forward|cache"

Drill 6: Headless Service DNS (Target: 4 minutes)

Terminal window
# Create deployment
k create deployment headless-test --image=nginx --replicas=3
# Create headless service
cat << 'EOF' | k apply -f -
apiVersion: v1
kind: Service
metadata:
  name: headless-svc
spec:
  clusterIP: None
  selector:
    app: headless-test
  ports:
  - port: 80
EOF
# Regular service returns single IP
# Headless returns all pod IPs
k run test --rm -it --image=busybox:1.36 --restart=Never -- \
nslookup headless-svc
# Should return multiple IPs
# Cleanup
k delete deployment headless-test
k delete svc headless-svc

Drill 7: Custom DNS Policy (Target: 4 minutes)

Terminal window
# Create pod with custom DNS
cat << 'EOF' | k apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: custom-dns-pod
spec:
  dnsPolicy: None
  dnsConfig:
    nameservers:
    - 8.8.8.8
    searches:
    - custom.local
    options:
    - name: ndots
      value: "2"
  containers:
  - name: app
    image: busybox:1.36
    command: ["sleep", "3600"]
EOF
# Check the custom resolv.conf
k exec custom-dns-pod -- cat /etc/resolv.conf
# Should show 8.8.8.8 and custom.local
# Note: won't resolve cluster services!
k exec custom-dns-pod -- nslookup kubernetes
# Will fail
# Cleanup
k delete pod custom-dns-pod

Drill 8: Debug DNS Failure (Target: 4 minutes)

Terminal window
# Create service
k create deployment web --image=nginx
k expose deployment web --port=80
# Simulate debugging workflow
# Step 1: Test from pod
k run test --rm -it --image=busybox:1.36 --restart=Never -- \
nslookup web
# Should work
# Step 2: Test FQDN
k run test --rm -it --image=busybox:1.36 --restart=Never -- \
nslookup web.default.svc.cluster.local
# Should work
# Step 3: Check CoreDNS
k get pods -n kube-system -l k8s-app=kube-dns
# Step 4: Check logs
k logs -n kube-system -l k8s-app=kube-dns --tail=5
# Cleanup
k delete deployment web
k delete svc web

Drill 9: Challenge - Complete DNS Workflow


Without looking at solutions:

  1. Verify CoreDNS is running
  2. Create deployment challenge with nginx
  3. Expose it as a service
  4. Test DNS resolution with short name, namespace, and FQDN
  5. Create the same service in a new namespace test
  6. Resolve across namespaces
  7. View the CoreDNS logs
  8. Cleanup everything
Terminal window
# YOUR TASK: Complete in under 5 minutes
Solution
Terminal window
# 1. Verify CoreDNS
k get pods -n kube-system -l k8s-app=kube-dns
# 2. Create deployment
k create deployment challenge --image=nginx
# 3. Expose
k expose deployment challenge --port=80
# 4. Test DNS formats
k run test --rm -it --image=busybox:1.36 --restart=Never -- \
sh -c 'nslookup challenge; nslookup challenge.default; nslookup challenge.default.svc.cluster.local'
# 5. Create in new namespace
k create namespace test
k create deployment challenge -n test --image=nginx
k expose deployment challenge -n test --port=80
# 6. Cross-namespace resolution
k run test --rm -it --image=busybox:1.36 --restart=Never -- \
nslookup challenge.test
# 7. View logs
k logs -n kube-system -l k8s-app=kube-dns --tail=10
# 8. Cleanup
k delete deployment challenge
k delete svc challenge
k delete namespace test

Module 3.4: Ingress - HTTP routing and external access to services.