
Module 10.9: Zero Trust Architecture in Hybrid Cloud

Complexity: [COMPLEX] | Time to Complete: 2.5h | Prerequisites: Kubernetes Networking, Identity & Access Management, Service Mesh Basics

After completing this module, you will be able to:

  • Implement zero trust network architectures for Kubernetes using service mesh mTLS, network policies, and SPIFFE identities
  • Configure workload identity verification with SPIFFE/SPIRE across multi-cluster and multi-cloud environments
  • Deploy micro-segmentation policies that enforce least-privilege network access at the pod and service level
  • Design end-to-end zero trust architectures that cover ingress, east-west, and egress traffic in Kubernetes clusters

In February 2024, a pharmaceutical company with 4,500 employees and a traditional perimeter-based security model was breached through a contractor’s compromised VPN credentials. The attacker used the VPN to access the internal network, then moved laterally across 14 systems over 18 days before being detected. They exfiltrated clinical trial data, patient records, and intellectual property valued at an estimated $340 million. The investigation revealed that once inside the VPN perimeter, the attacker had access to 83% of internal services because the security model assumed that anything inside the network was trusted.

This is the fundamental flaw of perimeter security: it creates a hard outer shell and a soft interior. A VPN gives you an all-or-nothing binary: you are either outside (no access) or inside (access to almost everything). In a world where contractors, remote employees, cloud services, and Kubernetes clusters all need varying levels of access, the perimeter model is dangerously inadequate.

Zero Trust flips this model. Instead of “trust everything inside the network,” Zero Trust says “trust nothing, verify everything.” Every request — whether it comes from inside your data center, from a Kubernetes pod, from an employee’s laptop, or from a cloud service — must prove its identity, demonstrate it is authorized, and pass through policy evaluation before being granted access. In this module, you will learn the principles of Zero Trust architecture, how BeyondCorp and Identity-Aware Proxies work, how to implement micro-segmentation in Kubernetes, how to replace VPNs with modern access patterns, and how SLSA frameworks secure your CI/CD supply chain.


```mermaid
flowchart TD
    A["ZERO TRUST PILLARS"]
    A --> B["1. VERIFY EXPLICITLY"]
    A --> C["2. LEAST PRIVILEGE"]
    A --> D["3. ASSUME BREACH"]
    B --> B1[Identity]
    B --> B2[Device health]
    B --> B3[Location]
    B --> B4[Service ID]
    B --> B5[Risk score]
    C --> C1[Just-in-time]
    C --> C2[Just-enough]
    C --> C3[Time-limited]
    C --> C4[Scope-limited]
    C --> C5[Reviewed]
    D --> D1[Segment]
    D --> D2[Encrypt]
    D --> D3[Monitor]
    D --> D4[Detect]
    D --> D5[Respond]
```
| Aspect | Perimeter Security | Zero Trust |
| --- | --- | --- |
| Trust model | Inside network = trusted | Nothing trusted by default |
| Network access | VPN grants broad access | Per-resource access based on identity + context |
| Lateral movement | Easy once inside | Micro-segmented, each service independently secured |
| Authentication | Once at VPN login | Continuous, per-request |
| Authorization | Network-level (IP, VLAN) | Application-level (identity, role, context) |
| Encryption | At the perimeter (TLS termination) | Everywhere (mTLS between all services) |
| Monitoring | Perimeter logs (firewall) | Every transaction logged and analyzed |
| Kubernetes impact | Cluster accessible via VPN | Each pod/service independently authenticated |

BeyondCorp: Google’s Zero Trust Implementation


Stop and think: If there is no VPN, how do employees securely access internal applications without exposing those applications to the public internet?

Google pioneered Zero Trust at enterprise scale with BeyondCorp, their internal access model that eliminated the corporate VPN entirely. Every Google employee accesses internal applications the same way from any network — there is no “corporate network” that grants additional trust.

```mermaid
flowchart TD
    A["Employee (any network)"] -- "HTTPS (always encrypted)" --> B["Identity-Aware Proxy (IAP)"]
    B --> C["Checks:<br>1. Identity (OIDC/SAML)<br>2. Device trust (MDM enrolled?)<br>3. Context (location, time)<br>4. Risk score (behavioral)<br>5. Access policy (per-app)"]
    C --> D{ALLOW?}
    D -- Yes --> E[Proxy to backend]
    D -- No --> F[403 Forbidden]
    E --> G["Internal Application<br>(K8s Service, VM, SaaS)<br><br>No public endpoint needed<br>IAP handles all external access"]
```
| Provider | Service | How It Works |
| --- | --- | --- |
| GCP | Cloud IAP | Built-in proxy for GCE, GKE, App Engine. Checks Google Identity + device trust via Endpoint Verification. |
| AWS | Verified Access | Evaluates identity (IAM Identity Center) + device posture (Jamf, CrowdStrike) per request. Runs at the VPC level. |
| Azure | Azure AD Application Proxy | Proxies requests to on-prem/cloud apps. Evaluates Conditional Access policies per request. |
| Open Source | Pomerium, OAuth2-proxy, Teleport | Self-hosted proxies with OIDC integration. Full control, requires operational effort. |
```shell
# Create a Verified Access trust provider (connects to your IdP).
# The policy-reference-name ("okta" here) is how Cedar policies refer
# to claims from this provider.
VA_TRUST=$(aws ec2 create-verified-access-trust-provider \
  --trust-provider-type user \
  --user-trust-provider-type oidc \
  --policy-reference-name okta \
  --oidc-options '{
    "Issuer": "https://company.okta.com/oauth2/default",
    "AuthorizationEndpoint": "https://company.okta.com/oauth2/default/v1/authorize",
    "TokenEndpoint": "https://company.okta.com/oauth2/default/v1/token",
    "UserInfoEndpoint": "https://company.okta.com/oauth2/default/v1/userinfo",
    "ClientId": "0oa1234567abcdefg",
    "ClientSecret": "secret123",
    "Scope": "openid profile email groups"
  }' \
  --query 'VerifiedAccessTrustProvider.VerifiedAccessTrustProviderId' --output text)

# Create a Verified Access instance
VA_INSTANCE=$(aws ec2 create-verified-access-instance \
  --query 'VerifiedAccessInstance.VerifiedAccessInstanceId' --output text)

# Attach the trust provider to the instance
aws ec2 attach-verified-access-trust-provider \
  --verified-access-instance-id "$VA_INSTANCE" \
  --verified-access-trust-provider-id "$VA_TRUST"

# Create a group and an endpoint that points to your K8s ingress.
# Note: Verified Access policies are written in the Cedar policy language,
# not IAM JSON; "okta" matches the policy-reference-name set above.
VA_GROUP=$(aws ec2 create-verified-access-group \
  --verified-access-instance-id "$VA_INSTANCE" \
  --query 'VerifiedAccessGroup.VerifiedAccessGroupId' --output text)
aws ec2 create-verified-access-endpoint \
  --verified-access-group-id "$VA_GROUP" \
  --endpoint-type load-balancer \
  --attachment-type vpc \
  --domain-certificate-arn arn:aws:acm:us-east-1:123456789012:certificate/abc-123 \
  --application-domain dashboard.company.com \
  --endpoint-domain-prefix dashboard \
  --load-balancer-options '{
    "LoadBalancerArn": "arn:aws:elasticloadbalancing:us-east-1:123456789012:loadbalancer/app/k8s-ingress/abc123",
    "Port": 443,
    "Protocol": "https",
    "SubnetIds": ["subnet-aaa", "subnet-bbb"]
  }' \
  --policy-document 'permit(principal, action, resource) when { context.okta.groups.contains("engineering") };'
```
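Before pointing DNS at the new endpoint, confirm it has finished provisioning. A minimal status check, assuming the field names exposed by the `describe-verified-access-endpoints` API:

```shell
# Endpoints take a few minutes to provision; poll until Status.Code is "active"
aws ec2 describe-verified-access-endpoints \
  --query 'VerifiedAccessEndpoints[*].{id:VerifiedAccessEndpointId,domain:ApplicationDomain,status:Status.Code}' \
  --output table
```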

Pomerium: Open-Source Identity-Aware Proxy for Kubernetes

```yaml
# Deploy Pomerium as an IAP in front of Kubernetes services
apiVersion: v1
kind: ConfigMap
metadata:
  name: pomerium-config
  namespace: pomerium
data:
  config.yaml: |
    authenticate_service_url: https://authenticate.company.com
    identity_provider: oidc
    identity_provider_url: https://company.okta.com/oauth2/default
    identity_provider_client_id: 0oa1234567abcdefg
    identity_provider_client_secret_file: /secrets/idp-client-secret
    policy:
      # ArgoCD: only platform engineers
      - from: https://argocd.company.com
        to: http://argocd-server.argocd.svc.cluster.local:80
        allowed_groups:
          - platform-engineers
        cors_allow_preflight: true
        preserve_host_header: true
      # Grafana: all engineers, read-only for non-SRE
      - from: https://grafana.company.com
        to: http://grafana.monitoring.svc.cluster.local:3000
        allowed_groups:
          - all-engineers
        set_request_headers:
          X-Grafana-Role: |
            {{- if .Groups | has "sre-team" -}}Admin{{- else -}}Viewer{{- end -}}
      # Backstage: all engineers
      - from: https://backstage.company.com
        to: http://backstage.backstage.svc.cluster.local:7007
        allowed_groups:
          - all-engineers
      # Kubernetes Dashboard: platform team only, with device trust
      - from: https://k8s-dashboard.company.com
        to: http://kubernetes-dashboard.kubernetes-dashboard.svc.cluster.local:443
        tls_skip_verify: true
        allowed_groups:
          - platform-engineers
        allowed_idp_claims:
          device_trust:
            - "managed"
```
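A quick way to confirm Pomerium is actually gating a route: an unauthenticated request should be redirected to the authenticate service, never proxied through to the backend. A sketch, assuming the ArgoCD route above is live:

```shell
# Expect a 3xx status and a redirect_url pointing at authenticate.company.com,
# not an ArgoCD response body
curl -s -o /dev/null -w '%{http_code} %{redirect_url}\n' https://argocd.company.com
```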

Pause and predict: If an attacker compromises a frontend pod in a default Kubernetes cluster, what prevents them from reaching the database pod directly?

Micro-segmentation applies the Zero Trust principle of “assume breach” at the network level. Instead of a flat network where any pod can talk to any other pod, micro-segmentation restricts communication to only explicitly allowed paths.

```mermaid
flowchart TD
    subgraph "Layer 1: Namespace Isolation"
        A["payments NS<br>(default deny all)"]
        B["identity NS<br>(default deny all)"]
        C["search NS<br>(default deny all)"]
    end
    subgraph "Layer 2: Service-Level Policies"
        D["frontend<br>(port 80)"] -- "Only frontend can reach backend" --> E["backend<br>(port 8080)"]
        E -- "Only backend can reach database" --> F["database<br>(port 5432)"]
    end
    subgraph "Layer 3: mTLS (Service Mesh)"
        G["Every connection authenticated + encrypted<br>SPIFFE identities verified per request"]
    end
    subgraph "Layer 4: Application-Level Authorization"
        H["HTTP method + path + headers checked per request<br>Istio AuthorizationPolicy or OPA"]
    end
```
```yaml
# Layer 1: Default deny all ingress and egress in every namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: payments
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
---
# Layer 2: Allow DNS resolution (required for all pods)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
  namespace: payments
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - ports: # no "to" selector: DNS allowed to any destination
        - protocol: TCP
          port: 53
        - protocol: UDP
          port: 53
---
# Layer 2: Frontend can receive traffic from ingress controller
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress-to-frontend
  namespace: payments
spec:
  podSelector:
    matchLabels:
      app: payment-frontend
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx
          podSelector:
            matchLabels:
              app.kubernetes.io/name: ingress-nginx
      ports:
        - protocol: TCP
          port: 8080
---
# Layer 2: Frontend can talk to backend API only
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: frontend-to-backend
  namespace: payments
spec:
  podSelector:
    matchLabels:
      app: payment-frontend
  policyTypes:
    - Egress
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: payment-backend
      ports:
        - protocol: TCP
          port: 8080
---
# Layer 2: Backend can talk to database only
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-to-database
  namespace: payments
spec:
  podSelector:
    matchLabels:
      app: payment-backend
  policyTypes:
    - Egress
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: payment-database
      ports:
        - protocol: TCP
          port: 5432
---
# Layer 2: Backend can talk to external payment gateway
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-to-payment-gateway
  namespace: payments
spec:
  podSelector:
    matchLabels:
      app: payment-backend
  policyTypes:
    - Egress
  egress:
    - to:
        - ipBlock:
            cidr: 203.0.113.0/24 # Payment gateway IP range
      ports:
        - protocol: TCP
          port: 443
```
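Once the policies are applied, verify enforcement rather than trusting object creation. A sketch, assuming the Deployments are named after their `app` labels and the backend serves a health endpoint (`/healthz` here is a stand-in):

```shell
# Allowed path: frontend -> backend should answer
kubectl exec -n payments deploy/payment-frontend -- \
  curl -s --max-time 3 http://payment-backend:8080/healthz && echo "backend reachable"

# Denied path: frontend -> database should time out under default-deny
kubectl exec -n payments deploy/payment-frontend -- \
  curl -s --max-time 3 http://payment-database:5432 || echo "database blocked (expected)"
```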
```yaml
# Only the payment-frontend service account can call the payment-backend
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: payment-backend-authz
  namespace: payments
spec:
  selector:
    matchLabels:
      app: payment-backend
  action: ALLOW
  rules:
    - from:
        - source:
            principals:
              - "cluster.local/ns/payments/sa/payment-frontend"
      to:
        - operation:
            methods: ["GET", "POST"]
            paths: ["/api/v1/payments/*", "/api/v1/refunds/*"]
    - from:
        - source:
            principals:
              - "cluster.local/ns/monitoring/sa/prometheus"
      to:
        - operation:
            methods: ["GET"]
            paths: ["/metrics"]
---
# Deny all other access to payment-backend
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: payment-backend-deny-all
  namespace: payments
spec:
  selector:
    matchLabels:
      app: payment-backend
  action: DENY
  rules:
    - from:
        - source:
            notPrincipals:
              - "cluster.local/ns/payments/sa/payment-frontend"
              - "cluster.local/ns/monitoring/sa/prometheus"
```
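To see the policy fail closed, call the backend from a workload whose service account is not in the allow list (`deploy/some-other-service` is a placeholder). The sidecar's RBAC filter should reject the request:

```shell
# Expect an HTTP 403 from Envoy's RBAC filter, not a backend response
kubectl exec -n payments deploy/some-other-service -- \
  curl -s -o /dev/null -w '%{http_code}\n' http://payment-backend:8080/api/v1/payments
```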

Removing VPNs: The Path to Zero Trust Access

```mermaid
flowchart LR
    subgraph "BEFORE (VPN)"
        A["Employee Laptop"] -- "VPN Gateway" --> B["FLAT NETWORK<br>(access to 83% of internal services)"]
    end
    subgraph "AFTER (Zero Trust)"
        C["Employee Laptop<br><br>Checks:<br>- Device<br>- Posture<br>- Cert"] -- "Identity-Aware Proxy<br><br>Checks:<br>- Identity<br>- Authorization<br>- Context" --> D["Only the ONE service<br>they need access to<br><br>mTLS, logged per-request"]
    end
```
teleport-kube-agent.yaml

```yaml
# Teleport for Zero Trust Kubernetes access
apiVersion: apps/v1
kind: Deployment
metadata:
  name: teleport-kube-agent
  namespace: teleport
spec:
  replicas: 2
  selector:
    matchLabels:
      app: teleport-kube-agent
  template:
    metadata:
      labels:
        app: teleport-kube-agent
    spec:
      serviceAccountName: teleport-kube-agent
      containers:
        - name: teleport
          image: public.ecr.aws/gravitational/teleport-distroless:16
          args:
            - "--config=/etc/teleport/teleport.yaml"
          volumeMounts:
            - name: config
              mountPath: /etc/teleport
      volumes:
        - name: config
          configMap:
            name: teleport-config
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: teleport-config
  namespace: teleport
data:
  teleport.yaml: |
    version: v3
    teleport:
      join_params:
        token_name: kube-agent-token
        method: kubernetes
      proxy_server: teleport.company.com:443
    kubernetes_service:
      enabled: true
      listen_addr: 0.0.0.0:3027
      kube_cluster_name: eks-prod-east
      labels:
        environment: production
        provider: aws
        region: us-east-1
```
```shell
# Developer workflow: access kubectl without VPN
# 1. Login via browser-based SSO
tsh login --proxy=teleport.company.com

# 2. List available clusters
tsh kube ls
# Cluster             Labels
# ------------------- ----------------------------------
# eks-prod-east       environment=production provider=aws
# aks-staging-west    environment=staging provider=azure
# onprem-legacy       environment=production provider=onprem

# 3. Connect to a cluster
tsh kube login eks-prod-east

# 4. Use kubectl normally (proxied through Teleport)
kubectl get pods -n payments
# Every command is:
# - Authenticated via SSO (no static kubeconfig)
# - Authorized per Teleport RBAC (namespace/verb restrictions)
# - Logged with session recording
# - Time-limited (session expires after configured duration)
```
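On the Teleport side, the namespace and verb restrictions live in a role resource. A sketch of such a role using the v7 role spec; the role name, labels, and the mapping of `kubernetes_groups` to a cluster RBAC binding are assumptions for illustration:

```shell
tctl create -f - <<'EOF'
kind: role
version: v7
metadata:
  name: payments-developer
spec:
  allow:
    # Which registered clusters this role can reach
    kubernetes_labels:
      environment: production
    # Kubernetes group impersonated on the target cluster; bind it to a
    # (Cluster)Role there via a RoleBinding you define
    kubernetes_groups: ["payments-readers"]
    # Scope kubectl to read-only pod access in the payments namespace
    kubernetes_resources:
      - kind: pod
        namespace: payments
        name: "*"
        verbs: ["get", "list", "watch"]
  options:
    max_session_ttl: 8h
EOF
```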

Stop and think: Even with perfect network security, how could an attacker compromise a workload before it is even deployed to Kubernetes?

Supply chain security is a critical component of Zero Trust. SLSA (Supply-chain Levels for Software Artifacts) provides a framework for securing the CI/CD pipeline.

| Level | Requirement | What It Prevents |
| --- | --- | --- |
| SLSA 1 | Build process documented | Nothing directly, but "How was this built?" becomes answerable |
| SLSA 2 | Version-controlled source, authenticated provenance | Source tampering, unverifiable builds |
| SLSA 3 | Hardened build platform, non-falsifiable provenance | Compromised build system, forged attestations |
| SLSA 4 | Two-person review, hermetic builds | Insider threats, dependency confusion |
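At SLSA 3, the provenance attestation can be checked mechanically before an image is trusted. A sketch using the `slsa-verifier` CLI; the image name, digest variable, and source repository are placeholders:

```shell
# DIGEST is the sha256 digest of the pushed image (e.g. from the CI output)
slsa-verifier verify-image "ghcr.io/company/payment-service@${DIGEST}" \
  --source-uri github.com/company/payment-service
# Succeeds only if the provenance was generated by the trusted builder
# and the subject digest matches the image being verified
```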

Implementing SLSA for Kubernetes Deployments

```yaml
# GitHub Actions pipeline with SLSA provenance
name: Build and Deploy with SLSA
on:
  push:
    branches: [main]
permissions:
  contents: read
  packages: write
  id-token: write # Required for OIDC-based signing
jobs:
  build:
    runs-on: ubuntu-latest
    outputs:
      digest: ${{ steps.build.outputs.digest }}
    steps:
      - uses: actions/checkout@v4
      - name: Log in to registry
        run: echo "${{ secrets.GITHUB_TOKEN }}" | docker login ghcr.io -u ${{ github.actor }} --password-stdin
      - name: Build and push container image
        id: build
        run: |
          docker build -t ghcr.io/company/payment-service:${{ github.sha }} .
          docker push ghcr.io/company/payment-service:${{ github.sha }}
          # RepoDigests is only populated after the image has been pushed
          DIGEST=$(docker inspect --format='{{index .RepoDigests 0}}' ghcr.io/company/payment-service:${{ github.sha }} | cut -d@ -f2)
          echo "digest=$DIGEST" >> "$GITHUB_OUTPUT"
      - name: Install cosign
        uses: sigstore/cosign-installer@v3
      - name: Sign image with cosign (keyless)
        run: |
          cosign sign --yes \
            ghcr.io/company/payment-service@${{ steps.build.outputs.digest }}
  # The SLSA generator is a reusable workflow, so it must be invoked at the
  # job level (jobs.<id>.uses), not as a step inside another job
  provenance:
    needs: build
    permissions:
      actions: read
      id-token: write
      packages: write
    uses: slsa-framework/slsa-github-generator/.github/workflows/generator_container_slsa3.yml@v2.0.0
    with:
      image: ghcr.io/company/payment-service
      digest: ${{ needs.build.outputs.digest }}
    secrets:
      registry-username: ${{ github.actor }}
      registry-password: ${{ secrets.GITHUB_TOKEN }}
  deploy:
    needs: [build, provenance]
    runs-on: ubuntu-latest
    steps:
      - name: Install cosign
        uses: sigstore/cosign-installer@v3
      - name: Verify signature before deploy
        run: |
          cosign verify \
            --certificate-identity-regexp='https://github.com/company/.*' \
            --certificate-oidc-issuer='https://token.actions.githubusercontent.com' \
            ghcr.io/company/payment-service@${{ needs.build.outputs.digest }}
      - name: Deploy to Kubernetes
        run: |
          kubectl set image deployment/payment-service \
            payment-service=ghcr.io/company/payment-service@${{ needs.build.outputs.digest }} \
            -n payments
```
```yaml
# Kyverno policy: only allow signed images from our CI/CD
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: verify-slsa-provenance
spec:
  validationFailureAction: Enforce
  webhookTimeoutSeconds: 30
  rules:
    - name: verify-signature
      match:
        any:
          - resources:
              kinds:
                - Pod
      verifyImages:
        - imageReferences:
            - "ghcr.io/company/*"
          attestors:
            - entries:
                - keyless:
                    subject: "https://github.com/company/*"
                    issuer: "https://token.actions.githubusercontent.com"
                    rekor:
                      url: "https://rekor.sigstore.dev"
          mutateDigest: true
          verifyDigest: true
          required: true
```
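With the policy in Enforce mode, a negative test is the fastest sanity check: a pod referencing an unsigned image from the protected registry should be rejected at admission (the image name below is a placeholder):

```shell
# Expect the API server to refuse the Pod with a message from the
# verify-slsa-provenance ClusterPolicy, not to schedule it
kubectl run sigtest --image=ghcr.io/company/unsigned-test:latest
```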

  1. Google’s BeyondCorp project started in 2011 after Operation Aurora, a sophisticated cyberattack attributed to China that compromised Google’s internal systems via targeted spear-phishing and a browser zero-day — proof that the perimeter model fails once an attacker is inside. Google spent 8 years migrating from perimeter security to BeyondCorp, making the transition for over 100,000 employees. By 2019, no Google employee used a VPN for internal access. The total cost of the migration was estimated at over $500 million, but Google calculated it saved them $4 billion in prevented breach costs over the following 5 years.

  2. The SLSA framework was created by Google in 2021 based on their internal “Binary Authorization for Borg” (BAB) system, which has been mandatory for all Google production deployments since 2013. Every binary running in Google’s production environment must have verifiable provenance — a cryptographically signed attestation of how, when, and where it was built. This prevented multiple insider threats and supply chain attacks that would have otherwise succeeded.

  3. Network Policies in Kubernetes are implemented by the CNI plugin, not by Kubernetes itself. This means that if your CNI does not support Network Policies (like the default kubenet in some managed services or Flannel without extension), your NetworkPolicy resources are silently ignored — they exist as objects but have zero enforcement. Calico, Cilium, and Azure CNI all support Network Policies. Always verify enforcement, not just resource creation.

  4. Pomerium, the open-source Identity-Aware Proxy, was created by engineers who found that Google’s BeyondCorp papers described a brilliant architecture but provided no open-source implementation. Pomerium reached 10,000 GitHub stars in 2024 and is used by organizations ranging from 50-person startups to Fortune 500 companies. The average Pomerium deployment replaces 3-5 VPN appliances, saving approximately $120,000/year in licensing costs.


| Mistake | Why It Happens | How to Fix It |
| --- | --- | --- |
| Zero Trust without identity foundation | Teams jump to micro-segmentation and IAP without first establishing strong identity (OIDC, device trust, service accounts). | Start with identity: deploy OIDC for humans, SPIFFE for services, device trust for endpoints. Then layer on micro-segmentation and IAP. |
| Network Policies without default deny | Teams add "allow" policies but never set the default deny baseline. Pods can still communicate freely on paths without explicit policies. | Always start with a default-deny NetworkPolicy in every namespace. Then add explicit allow policies for each legitimate communication path. |
| mTLS in the mesh but plaintext sidecars | Service mesh provides mTLS between proxies, but the connection from the proxy to the application container inside the same pod is plaintext on localhost. | This is expected behavior — localhost traffic within a pod is considered trusted. If you need end-to-end encryption (e.g., for FIPS compliance), the application itself must implement TLS. |
| VPN removal without alternative | Security team removes the VPN before deploying IAP or Teleport. Developers cannot access anything. Shadow IT VPN tunnels appear. | Deploy the Zero Trust access layer first (IAP, Teleport). Run it in parallel with the VPN for 3-6 months. Only decommission the VPN after all access patterns are migrated. |
| Image signing without admission enforcement | CI/CD pipeline signs images with cosign, but no admission webhook verifies signatures. Unsigned images can still be deployed. | Deploy Kyverno or Gatekeeper with image verification policies. Signing without enforcement is security theater. |
| Overly broad Istio AuthorizationPolicies | Teams write policies with action: ALLOW that match too broadly, effectively allowing everything. The policy exists but does not restrict. | Use deny-by-default: start with an AuthorizationPolicy that denies all, then add specific allow rules for each legitimate path. Test with istioctl analyze. |

Question 1: A developer's laptop is stolen while logged into the corporate VPN with a valid kubeconfig file. Under a traditional perimeter security model, what happens next compared to a Zero Trust architecture using an Identity-Aware Proxy (IAP) like Teleport?

Under a perimeter security model, the attacker now has full network access to the corporate environment and the Kubernetes API server because the VPN provides a binary “inside/trusted” state. The valid kubeconfig file allows the attacker to authenticate to the cluster and execute commands with the developer’s broad RBAC permissions, potentially compromising the entire environment.

In a Zero Trust architecture, the stolen laptop and VPN connection are useless on their own. The IAP continuously verifies identity and context per request. Even if the attacker has the laptop, they would need the developer’s SSO credentials and physical MFA token to establish a new session. Furthermore, the IAP enforces device health checks (which might fail if the device is reported stolen) and limits access strictly to the namespaces the developer needs, minimizing the blast radius. Trust is never binary; it is continuously evaluated.

Question 2: A team has deployed Network Policies with a default-deny rule, but pods can still communicate freely. What is the most likely cause?

The most likely cause is that the CNI plugin does not support Network Policies. NetworkPolicy resources are processed by the CNI plugin, not by the Kubernetes API server. If the cluster uses a CNI that does not implement the NetworkPolicy API (like Flannel without the Calico integration, or AWS VPC CNI without the network policy controller), the NetworkPolicy objects are stored in etcd but have no enforcement. The pods see no firewalling because there is no component enforcing the rules. To diagnose: (1) Check which CNI is installed (kubectl get pods -n kube-system | grep -E 'calico|cilium|weave'). (2) Verify the CNI supports Network Policies (check documentation). (3) Test enforcement: create a default-deny policy and verify that pods actually cannot communicate. On EKS, you need to enable the VPC CNI network policy feature or install Calico alongside VPC CNI.
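The enforcement test described in the answer can be scripted in a throwaway namespace:

```shell
# Quick enforcement smoke test: apply default-deny, then try to reach a pod
kubectl create ns np-test
kubectl -n np-test run web --image=nginx --port=80 --expose
kubectl -n np-test apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
spec:
  podSelector: {}
  policyTypes: ["Ingress"]
EOF
kubectl -n np-test run probe --rm -it --image=busybox --restart=Never -- \
  wget -qO- --timeout=3 http://web
# If this still returns the nginx page, your CNI is not enforcing policies
```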

Question 3: A sophisticated attacker compromises your CI/CD worker node and injects malicious code during the build process of your payment service. How does SLSA Level 3 prevent this compromised container image from running in your production Kubernetes cluster?

SLSA Level 3 requires a hardened build platform and non-falsifiable provenance. The build platform is isolated so that individual builds cannot influence each other or tamper with the build process. Provenance is generated by the build platform itself (not by the build script), and it is cryptographically signed in a way that the build script cannot forge. If an attacker compromises a CI/CD worker (e.g., injects malicious code into a build), the provenance will either: (1) accurately reflect that the build used a modified source (because provenance is generated independently of the build script), or (2) be absent (if the attacker tries to skip provenance generation, the admission webhook rejects the artifact). The key insight is that at SLSA 3, provenance is a property of the build platform, not of the build. The build cannot lie about its own origin.

Question 4: The CISO mandates the removal of the corporate VPN within 6 months in favor of a Zero Trust architecture. The infrastructure team proposes shutting down the VPN next weekend and routing all traffic through a newly installed Identity-Aware Proxy (IAP) to force adoption. Why is this approach likely to fail, and what sequence of steps should be taken instead?

This “rip and replace” approach is highly likely to fail and cause a massive business disruption because it assumes all applications and access patterns are immediately compatible with the IAP. Without a strong identity foundation already in place, users will be locked out of critical services, leading to shadow IT workarounds and halted productivity.

Instead, the migration must be incremental and run in parallel. First, you must establish a strong identity foundation (SSO, MFA, device MDM). Second, deploy the IAP alongside the existing VPN without disrupting current workflows. Third, incrementally migrate applications starting with low-risk internal tools (like Grafana or Backstage) to test the IAP, before moving to production Kubernetes access. You must monitor access patterns over 3-6 months to ensure all legitimate traffic has shifted to the IAP before finally decommissioning the VPN. Rushing the VPN shutdown is the most common failure mode in Zero Trust migrations.

Question 5: A security auditor reviews your cluster and notices you are using Istio Authorization Policies to restrict traffic between services, but you have no Kubernetes Network Policies. They flag this as a vulnerability. Why would they require both if Istio already controls access?

Network Policies operate at Layer 3/4 (IP addresses and ports). They control which pods can establish TCP/UDP connections to which other pods. They are enforced by the CNI plugin and work without a service mesh. Istio Authorization Policies operate at Layer 7 (HTTP methods, paths, headers, service identities). They control what requests are allowed within an established connection. They require the Istio sidecar proxy.

You need both for defense in depth. Network Policies prevent unauthorized network connections from being established at all — even if Istio is misconfigured or the sidecar is bypassed. Istio Authorization Policies provide fine-grained control that Network Policies cannot: allowing GET but denying DELETE, or allowing /api/v1/payments but denying /api/v1/admin. Network Policies are the coarse guard at the door; Istio policies are the fine-grained access control inside the room.

Question 6: An engineer argues that implementing mTLS in your Istio service mesh makes Network Policies unnecessary because "mTLS already verifies identity and encrypts traffic." Why is this assertion dangerous in a Zero Trust environment?

mTLS verifies the identity of the communicating parties (via SPIFFE certificates) and encrypts the traffic. But it does not restrict which communications can happen. By default, Istio’s mTLS allows any service with a valid mesh certificate to communicate with any other service. mTLS ensures that the caller is who they claim to be; it does not ensure the caller is authorized for that specific action. You still need: (1) AuthorizationPolicies to restrict which identities can call which services (Layer 7). (2) Network Policies as a backup in case the Istio sidecar is bypassed (e.g., host-networked pods, init containers that run before the sidecar, or pods without the sidecar injected). mTLS is an authentication mechanism, not an authorization mechanism. Confusing the two is a common and dangerous mistake.


Hands-On Exercise: Implement Zero Trust Micro-Segmentation


In this exercise, you will implement a multi-layered Zero Trust architecture in a kind cluster with Network Policies, RBAC, and simulated identity-aware access.

Task 1: Create a Cluster with Network Policy Enforcement

Solution
```shell
# Create a cluster with Calico CNI for Network Policy enforcement
cat <<'EOF' > /tmp/zero-trust-cluster.yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: zero-trust-lab
networking:
  disableDefaultCNI: true
  podSubnet: 192.168.0.0/16
nodes:
  - role: control-plane
  - role: worker
  - role: worker
EOF
kind create cluster --config /tmp/zero-trust-cluster.yaml

# Install Calico for Network Policy enforcement
kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.28.0/manifests/calico.yaml

# Wait for Calico to be ready
kubectl wait --for=condition=ready pod -l k8s-app=calico-node -n kube-system --timeout=120s
kubectl wait --for=condition=ready pod -l k8s-app=calico-kube-controllers -n kube-system --timeout=120s
echo "Cluster ready with Calico CNI (Network Policy support enabled)"
```

Task 2: Deploy a Multi-Service Application

Solution
```shell
# Create namespaces
kubectl create namespace payments
kubectl create namespace monitoring

# Deploy a 3-tier application
cat <<'EOF' | kubectl apply -f -
# Frontend
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend
  namespace: payments
spec:
  replicas: 2
  selector:
    matchLabels:
      app: frontend
  template:
    metadata:
      labels:
        app: frontend
        tier: frontend
    spec:
      containers:
        - name: frontend
          image: nginx:1.27.3
          ports:
            - containerPort: 80
          resources:
            limits:
              cpu: 100m
              memory: 128Mi
---
apiVersion: v1
kind: Service
metadata:
  name: frontend
  namespace: payments
spec:
  selector:
    app: frontend
  ports:
    - port: 80
---
# Backend API
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend
  namespace: payments
spec:
  replicas: 2
  selector:
    matchLabels:
      app: backend
  template:
    metadata:
      labels:
        app: backend
        tier: backend
    spec:
      containers:
        - name: backend
          image: nginx:1.27.3
          ports:
            - containerPort: 80
          resources:
            limits:
              cpu: 100m
              memory: 128Mi
---
apiVersion: v1
kind: Service
metadata:
  name: backend
  namespace: payments
spec:
  selector:
    app: backend
  ports:
    - port: 80
---
# Database
apiVersion: apps/v1
kind: Deployment
metadata:
  name: database
  namespace: payments
spec:
  replicas: 1
  selector:
    matchLabels:
      app: database
  template:
    metadata:
      labels:
        app: database
        tier: database
    spec:
      containers:
        - name: database
          image: nginx:1.27.3
          ports:
            - containerPort: 80
          resources:
            limits:
              cpu: 100m
              memory: 128Mi
---
apiVersion: v1
kind: Service
metadata:
  name: database
  namespace: payments
spec:
  selector:
    app: database
  ports:
    - port: 80
EOF

kubectl wait --for=condition=ready pod -l app=frontend -n payments --timeout=60s
kubectl wait --for=condition=ready pod -l app=backend -n payments --timeout=60s
kubectl wait --for=condition=ready pod -l app=database -n payments --timeout=60s
```

Task 3: Verify Flat Network (Before Zero Trust)

Solution
Terminal window
echo "=== BEFORE ZERO TRUST: Flat Network ==="
echo ""
echo "Test: Frontend → Backend (should succeed - legitimate)"
kubectl exec -n payments deploy/frontend -- curl -s --max-time 3 backend.payments.svc.cluster.local || echo "FAILED"
echo ""
echo "Test: Frontend → Database (should succeed - PROBLEM: frontend should not access DB directly)"
kubectl exec -n payments deploy/frontend -- curl -s --max-time 3 database.payments.svc.cluster.local || echo "FAILED"
echo ""
echo "Test: Database → Frontend (should succeed - PROBLEM: DB should not call frontend)"
kubectl exec -n payments deploy/database -- curl -s --max-time 3 frontend.payments.svc.cluster.local || echo "FAILED"
echo ""
echo "CONCLUSION: Without Network Policies, every pod can talk to every other pod."
echo "This is the 'soft interior' problem of perimeter security."
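The "soft interior" can be made concrete with a toy model. A minimal Python sketch (illustrative only, plain Python rather than the Kubernetes API; service names match the lab) comparing which services a compromised pod can connect to directly in a flat network versus a micro-segmented one:

```python
# Illustrative model: direct reachability from each pod, flat vs. segmented.
from itertools import permutations

SERVICES = ["frontend", "backend", "database"]

# Flat network: every pod can open a connection to every other pod.
flat = set(permutations(SERVICES, 2))

# Micro-segmented: only the declared, legitimate paths remain.
segmented = {("frontend", "backend"), ("backend", "database")}

def reachable(src, edges):
    """Services `src` can connect to directly under the given edge set."""
    return sorted(dst for s, dst in edges if s == src)

for svc in SERVICES:
    print(f"{svc}: flat={reachable(svc, flat)} segmented={reachable(svc, segmented)}")
# frontend: flat=['backend', 'database'] segmented=['backend']
# backend: flat=['database', 'frontend'] segmented=['database']
# database: flat=['backend', 'frontend'] segmented=[]
```

Note how segmentation leaves the legitimate request chain intact while eliminating every other edge: a compromised database pod, in particular, can no longer initiate a connection to anything.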
Task 4: Apply Zero Trust Network Policies

Solution
cat <<'EOF' | kubectl apply -f -
# Step 1: Default deny ALL traffic
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: payments
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
---
# Step 2: Allow DNS for all pods
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
  namespace: payments
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - ports:
        - protocol: TCP
          port: 53
        - protocol: UDP
          port: 53
---
# Step 3: Frontend can receive from outside and send to backend only
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: frontend-policy
  namespace: payments
spec:
  podSelector:
    matchLabels:
      app: frontend
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - {} # Accept from any source (simulates ingress controller)
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: backend
      ports:
        - port: 80
    - ports:
        - protocol: TCP
          port: 53
        - protocol: UDP
          port: 53
---
# Step 4: Backend accepts from frontend, can reach database only
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-policy
  namespace: payments
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - port: 80
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: database
      ports:
        - port: 80
    - ports:
        - protocol: TCP
          port: 53
        - protocol: UDP
          port: 53
---
# Step 5: Database accepts from backend only; egress limited to DNS
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: database-policy
  namespace: payments
spec:
  podSelector:
    matchLabels:
      app: database
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: backend
      ports:
        - port: 80
  egress:
    - ports:
        - protocol: TCP
          port: 53
        - protocol: UDP
          port: 53
EOF
echo "Network Policies applied:"
kubectl get networkpolicy -n payments
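NetworkPolicies are additive and deny-by-default once a pod is selected: if any policy selects a pod for a direction, traffic in that direction is allowed only if it matches at least one rule across all policies selecting that pod. A simplified evaluator in Python (illustrative only; it models a labels-and-port subset of the lab's ingress policies, not the real CNI evaluation, which also covers namespaces, IP blocks, and named ports):

```python
# Simplified NetworkPolicy ingress model (not the real evaluator).
# A policy here is: {"podSelector": {labels}, "ingress": [{"from": {labels}, "port": N}, ...]}

def selects(selector, labels):
    """An empty selector matches every pod; otherwise every key must match."""
    return all(labels.get(k) == v for k, v in selector.items())

def ingress_allowed(policies, src_labels, dst_labels, port):
    selecting = [p for p in policies if selects(p["podSelector"], dst_labels)]
    if not selecting:
        return True  # No policy selects the pod: traffic is unrestricted.
    # Default deny: a selected pod accepts only traffic matching some rule
    # in the union of all policies that select it.
    return any(
        selects(rule["from"], src_labels) and rule["port"] == port
        for p in selecting
        for rule in p["ingress"]
    )

policies = [
    {"podSelector": {}, "ingress": []},  # default-deny-all (selects every pod)
    {"podSelector": {"app": "backend"},
     "ingress": [{"from": {"app": "frontend"}, "port": 80}]},
    {"podSelector": {"app": "database"},
     "ingress": [{"from": {"app": "backend"}, "port": 80}]},
]

print(ingress_allowed(policies, {"app": "frontend"}, {"app": "backend"}, 80))   # True
print(ingress_allowed(policies, {"app": "frontend"}, {"app": "database"}, 80))  # False
print(ingress_allowed(policies, {"app": "backend"}, {"app": "database"}, 80))   # True
```

This is why the default-deny-all policy and the per-tier allow policies compose cleanly: the deny-all policy contributes no allow rules, so it only flips every pod into "selected" (deny-by-default) mode, and each tier policy then re-opens exactly one path.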
Task 5: Verify Micro-Segmentation (After Zero Trust)

Solution
echo "=== AFTER ZERO TRUST: Micro-Segmented Network ==="
echo ""
echo "Test 1: Frontend → Backend (SHOULD PASS - legitimate path)"
kubectl exec -n payments deploy/frontend -- curl -s --max-time 3 backend.payments.svc.cluster.local && echo "PASS" || echo "BLOCKED"
echo ""
echo "Test 2: Frontend → Database (SHOULD BLOCK - frontend must go through backend)"
kubectl exec -n payments deploy/frontend -- curl -s --max-time 3 database.payments.svc.cluster.local 2>&1 && echo "PASS (BAD!)" || echo "BLOCKED (GOOD!)"
echo ""
echo "Test 3: Backend → Database (SHOULD PASS - legitimate path)"
kubectl exec -n payments deploy/backend -- curl -s --max-time 3 database.payments.svc.cluster.local && echo "PASS" || echo "BLOCKED"
echo ""
echo "Test 4: Database → Frontend (SHOULD BLOCK - DB should not initiate connections)"
kubectl exec -n payments deploy/database -- curl -s --max-time 3 frontend.payments.svc.cluster.local 2>&1 && echo "PASS (BAD!)" || echo "BLOCKED (GOOD!)"
echo ""
echo "Test 5: Database → external internet (SHOULD BLOCK - DB must not reach internet)"
kubectl exec -n payments deploy/database -- curl -s --max-time 3 https://example.com 2>&1 && echo "PASS (BAD!)" || echo "BLOCKED (GOOD!)"
echo ""
echo "CONCLUSION: Only legitimate communication paths are allowed."
echo "Lateral movement is prevented. The blast radius of a compromise is contained."
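A caveat worth knowing: NetworkPolicy objects have no effect unless the cluster's CNI plugin enforces them. If Tests 2, 4, and 5 still report PASS after applying the policies, your CNI is not enforcing them (kind's default CNI historically did not; newer kind releases add enforcement via kube-network-policies). The usual fix is to create the cluster with the built-in CNI disabled and install a policy-capable CNI such as Calico or Cilium. A sketch of such a kind config, assuming the kind.x-k8s.io/v1alpha4 API:

```yaml
# Hypothetical kind config: disable the built-in CNI so a policy-enforcing
# CNI (Calico, Cilium, ...) can be installed after cluster creation.
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: zero-trust-lab
networking:
  disableDefaultCNI: true
```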
Cleanup
kind delete cluster --name zero-trust-lab
rm /tmp/zero-trust-cluster.yaml
  • I deployed a multi-tier application in a flat network and verified unrestricted access
  • I applied default-deny Network Policies to enforce Zero Trust
  • I verified that only legitimate communication paths (frontend → backend → database) work
  • I confirmed that unauthorized paths (frontend → database, database → frontend) are blocked
  • I can explain the four layers of micro-segmentation
  • I can describe how an Identity-Aware Proxy replaces a VPN
  • I can explain how SLSA protects the CI/CD supply chain

With Zero Trust securing your infrastructure, it is time to optimize costs at enterprise scale. Head to Module 10.10: FinOps at Enterprise Scale to learn cloud economics, Enterprise Discount Programs, forecasting, chargeback models for shared clusters, and the true cost of multi-cloud operations.