Module 10.9: Zero Trust Architecture in Hybrid Cloud
Complexity: [COMPLEX] | Time to Complete: 2.5h | Prerequisites: Kubernetes Networking, Identity & Access Management, Service Mesh Basics
What You’ll Be Able to Do
After completing this module, you will be able to:
- Implement zero trust network architectures for Kubernetes using service mesh mTLS, network policies, and SPIFFE identities
- Configure workload identity verification with SPIFFE/SPIRE across multi-cluster and multi-cloud environments
- Deploy micro-segmentation policies that enforce least-privilege network access at the pod and service level
- Design end-to-end zero trust architectures that cover ingress, east-west, and egress traffic in Kubernetes clusters
Why This Module Matters
In February 2024, a pharmaceutical company with 4,500 employees and a traditional perimeter-based security model was breached through a contractor’s compromised VPN credentials. The attacker used the VPN to access the internal network, then moved laterally across 14 systems over 18 days before being detected. They exfiltrated clinical trial data, patient records, and intellectual property valued at an estimated $340 million. The investigation revealed that once inside the VPN perimeter, the attacker had access to 83% of internal services because the security model assumed that anything inside the network was trusted.
This is the fundamental flaw of perimeter security: it creates a hard outer shell and a soft interior. A VPN gives you an all-or-nothing binary: you are either outside (no access) or inside (access to almost everything). In a world where contractors, remote employees, cloud services, and Kubernetes clusters all need varying levels of access, the perimeter model is dangerously inadequate.
Zero Trust flips this model. Instead of “trust everything inside the network,” Zero Trust says “trust nothing, verify everything.” Every request — whether it comes from inside your data center, from a Kubernetes pod, from an employee’s laptop, or from a cloud service — must prove its identity, demonstrate it is authorized, and pass through policy evaluation before being granted access. In this module, you will learn the principles of Zero Trust architecture, how BeyondCorp and Identity-Aware Proxies work, how to implement micro-segmentation in Kubernetes, how to replace VPNs with modern access patterns, and how SLSA frameworks secure your CI/CD supply chain.
Zero Trust Principles
The Three Pillars
```mermaid
flowchart TD
    A[ZERO TRUST PILLARS]
    A --> B[1. VERIFY EXPLICITLY]
    A --> C[2. LEAST PRIVILEGE]
    A --> D[3. ASSUME BREACH]

    B --> B1[Identity]
    B --> B2[Device health]
    B --> B3[Location]
    B --> B4[Service ID]
    B --> B5[Risk score]

    C --> C1[Just-in-time]
    C --> C2[Just-enough]
    C --> C3[Time-limited]
    C --> C4[Scope-limited]
    C --> C5[Reviewed]

    D --> D1[Segment]
    D --> D2[Encrypt]
    D --> D3[Monitor]
    D --> D4[Detect]
    D --> D5[Respond]
```
Zero Trust vs Perimeter Security
| Aspect | Perimeter Security | Zero Trust |
|---|---|---|
| Trust model | Inside network = trusted | Nothing trusted by default |
| Network access | VPN grants broad access | Per-resource access based on identity + context |
| Lateral movement | Easy once inside | Micro-segmented, each service independently secured |
| Authentication | Once at VPN login | Continuous, per-request |
| Authorization | Network-level (IP, VLAN) | Application-level (identity, role, context) |
| Encryption | At the perimeter (TLS termination) | Everywhere (mTLS between all services) |
| Monitoring | Perimeter logs (firewall) | Every transaction logged and analyzed |
| Kubernetes impact | Cluster accessible via VPN | Each pod/service independently authenticated |
BeyondCorp: Google’s Zero Trust Implementation
Stop and think: If there is no VPN, how do employees securely access internal applications without exposing those applications to the public internet?
Google pioneered Zero Trust at enterprise scale with BeyondCorp, their internal access model that eliminated the corporate VPN entirely. Every Google employee accesses internal applications the same way from any network — there is no “corporate network” that grants additional trust.
BeyondCorp Architecture
```mermaid
flowchart TD
    A[Employee any network] -- "HTTPS (always encrypted)" --> B[Identity-Aware Proxy IAP]
    B --> C["Checks:<br>1. Identity (OIDC/SAML)<br>2. Device trust (MDM enrolled?)<br>3. Context (location, time)<br>4. Risk score (behavioral)<br>5. Access policy (per-app)"]
    C --> D{ALLOW?}
    D -- Yes --> E[Proxy to backend]
    D -- No --> F[403 Forbidden]
    E --> G["Internal Application<br>(K8s Service, VM, SaaS)<br><br>No public endpoint needed<br>IAP handles all external access"]
```
Identity-Aware Proxy Implementations
| Provider | Service | How It Works |
|---|---|---|
| GCP | Cloud IAP | Built-in proxy for GCE, GKE, App Engine. Checks Google Identity + device trust via Endpoint Verification. |
| AWS | Verified Access | Evaluates identity (IAM Identity Center) + device posture (Jamf, CrowdStrike) per request. Runs at the VPC level. |
| Azure | Azure AD Application Proxy | Proxies requests to on-prem/cloud apps. Evaluates Conditional Access policies per request. |
| Open Source | Pomerium, OAuth2-proxy, Teleport | Self-hosted proxies with OIDC integration. Full control, requires operational effort. |
AWS Verified Access for Kubernetes
```bash
# Create a Verified Access trust provider (connects to your IdP)
VA_TRUST=$(aws ec2 create-verified-access-trust-provider \
  --trust-provider-type user \
  --user-trust-provider-type oidc \
  --policy-reference-name okta \
  --oidc-options '{
    "Issuer": "https://company.okta.com/oauth2/default",
    "AuthorizationEndpoint": "https://company.okta.com/oauth2/default/v1/authorize",
    "TokenEndpoint": "https://company.okta.com/oauth2/default/v1/token",
    "UserInfoEndpoint": "https://company.okta.com/oauth2/default/v1/userinfo",
    "ClientId": "0oa1234567abcdefg",
    "ClientSecret": "secret123",
    "Scope": "openid profile email groups"
  }' \
  --query 'VerifiedAccessTrustProvider.VerifiedAccessTrustProviderId' --output text)

# Create a Verified Access instance
VA_INSTANCE=$(aws ec2 create-verified-access-instance \
  --query 'VerifiedAccessInstance.VerifiedAccessInstanceId' --output text)

# Attach the trust provider to the instance
aws ec2 attach-verified-access-trust-provider \
  --verified-access-instance-id $VA_INSTANCE \
  --verified-access-trust-provider-id $VA_TRUST

# Create an endpoint that points to your K8s ingress
VA_GROUP=$(aws ec2 create-verified-access-group \
  --verified-access-instance-id $VA_INSTANCE \
  --query 'VerifiedAccessGroup.VerifiedAccessGroupId' --output text)

# Verified Access policies are written in the Cedar language; the context key
# ("okta" here) matches the trust provider's --policy-reference-name.
aws ec2 create-verified-access-endpoint \
  --verified-access-group-id $VA_GROUP \
  --endpoint-type load-balancer \
  --attachment-type vpc \
  --domain-certificate-arn arn:aws:acm:us-east-1:123456789012:certificate/abc-123 \
  --application-domain dashboard.company.com \
  --endpoint-domain-prefix dashboard \
  --load-balancer-options '{
    "LoadBalancerArn": "arn:aws:elasticloadbalancing:us-east-1:123456789012:loadbalancer/app/k8s-ingress/abc123",
    "Port": 443,
    "Protocol": "https",
    "SubnetIds": ["subnet-aaa", "subnet-bbb"]
  }' \
  --policy-document 'permit(principal, action, resource) when { context.okta.groups.contains("engineering") };'
```
Pomerium: Open-Source Identity-Aware Proxy for Kubernetes
```yaml
# Deploy Pomerium as an IAP in front of Kubernetes services
apiVersion: v1
kind: ConfigMap
metadata:
  name: pomerium-config
  namespace: pomerium
data:
  config.yaml: |
    authenticate_service_url: https://authenticate.company.com
    identity_provider: oidc
    identity_provider_url: https://company.okta.com/oauth2/default
    identity_provider_client_id: 0oa1234567abcdefg
    identity_provider_client_secret_file: /secrets/idp-client-secret

    policy:
      # ArgoCD: only platform engineers
      - from: https://argocd.company.com
        to: http://argocd-server.argocd.svc.cluster.local:80
        allowed_groups:
          - platform-engineers
        cors_allow_preflight: true
        preserve_host_header: true

      # Grafana: all engineers, read-only for non-SRE
      - from: https://grafana.company.com
        to: http://grafana.monitoring.svc.cluster.local:3000
        allowed_groups:
          - all-engineers
        set_request_headers:
          X-Grafana-Role: |
            {{- if .Groups | has "sre-team" -}}Admin{{- else -}}Viewer{{- end -}}

      # Backstage: all engineers
      - from: https://backstage.company.com
        to: http://backstage.backstage.svc.cluster.local:7007
        allowed_groups:
          - all-engineers

      # Kubernetes Dashboard: platform team only, with device trust
      - from: https://k8s-dashboard.company.com
        to: http://kubernetes-dashboard.kubernetes-dashboard.svc.cluster.local:443
        tls_skip_verify: true
        allowed_groups:
          - platform-engineers
        allowed_idp_claims:
          device_trust:
            - "managed"
```
Micro-Segmentation in Kubernetes
Pause and predict: If an attacker compromises a frontend pod in a default Kubernetes cluster, what prevents them from reaching the database pod directly?
Micro-segmentation applies the Zero Trust principle of “assume breach” at the network level. Instead of a flat network where any pod can talk to any other pod, micro-segmentation restricts communication to only explicitly allowed paths.
Defense in Depth with Network Policies
```mermaid
flowchart TD
    subgraph "Layer 1: Namespace Isolation"
        A["payments NS<br>(default deny all)"]
        B["identity NS<br>(default deny all)"]
        C["search NS<br>(default deny all)"]
    end

    subgraph "Layer 2: Service-Level Policies"
        D["frontend<br>(port 80)"] -- "Only frontend can reach backend" --> E["backend<br>(port 8080)"]
        E -- "Only backend can reach database" --> F["database<br>(port 5432)"]
    end

    subgraph "Layer 3: mTLS (Service Mesh)"
        G["Every connection authenticated + encrypted<br>SPIFFE identities verified per request"]
    end

    subgraph "Layer 4: Application-Level Authorization"
        H["HTTP method + path + headers checked per request<br>Istio AuthorizationPolicy or OPA"]
    end
```
Comprehensive Network Policy Set
```yaml
# Layer 1: Default deny all ingress and egress in every namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: payments
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
---
# Layer 2: Allow DNS resolution (required for all pods)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
  namespace: payments
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to: []
      ports:
        - protocol: TCP
          port: 53
        - protocol: UDP
          port: 53
---
# Layer 2: Frontend can receive traffic from ingress controller
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress-to-frontend
  namespace: payments
spec:
  podSelector:
    matchLabels:
      app: payment-frontend
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx
          podSelector:
            matchLabels:
              app.kubernetes.io/name: ingress-nginx
      ports:
        - protocol: TCP
          port: 8080
---
# Layer 2: Frontend can talk to backend API only
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: frontend-to-backend
  namespace: payments
spec:
  podSelector:
    matchLabels:
      app: payment-frontend
  policyTypes:
    - Egress
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: payment-backend
      ports:
        - protocol: TCP
          port: 8080
---
# Layer 2: Backend can talk to database only
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-to-database
  namespace: payments
spec:
  podSelector:
    matchLabels:
      app: payment-backend
  policyTypes:
    - Egress
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: payment-database
      ports:
        - protocol: TCP
          port: 5432
---
# Layer 2: Backend can talk to external payment gateway
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-to-payment-gateway
  namespace: payments
spec:
  podSelector:
    matchLabels:
      app: payment-backend
  policyTypes:
    - Egress
  egress:
    - to:
        - ipBlock:
            cidr: 203.0.113.0/24 # Payment gateway IP range
      ports:
        - protocol: TCP
          port: 443
```
Istio Authorization Policies (Layer 4)
```yaml
# Only the payment-frontend service account can call the payment-backend
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: payment-backend-authz
  namespace: payments
spec:
  selector:
    matchLabels:
      app: payment-backend
  action: ALLOW
  rules:
    - from:
        - source:
            principals:
              - "cluster.local/ns/payments/sa/payment-frontend"
      to:
        - operation:
            methods: ["GET", "POST"]
            paths: ["/api/v1/payments/*", "/api/v1/refunds/*"]
    - from:
        - source:
            principals:
              - "cluster.local/ns/monitoring/sa/prometheus"
      to:
        - operation:
            methods: ["GET"]
            paths: ["/metrics"]
---
# Deny all other access to payment-backend
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: payment-backend-deny-all
  namespace: payments
spec:
  selector:
    matchLabels:
      app: payment-backend
  action: DENY
  rules:
    - from:
        - source:
            notPrincipals:
              - "cluster.local/ns/payments/sa/payment-frontend"
              - "cluster.local/ns/monitoring/sa/prometheus"
```
Removing VPNs: The Path to Zero Trust Access
Section titled “Removing VPNs: The Path to Zero Trust Access”The VPN Replacement Architecture
```mermaid
flowchart LR
    subgraph "BEFORE (VPN)"
        A["Employee Laptop"] -- "VPN Gateway" --> B["FLAT NETWORK<br>(access to 83% of internal services)"]
    end

    subgraph "AFTER (Zero Trust)"
        C["Employee Laptop<br><br>Checks:<br>- Device<br>- Posture<br>- Cert"] -- "Identity-Aware Proxy<br><br>Checks:<br>- Identity<br>- Authorization<br>- Context" --> D["Only the ONE service<br>they need access to<br><br>mTLS, logged per-request"]
    end
```
kubectl Access Without VPN
```yaml
# Teleport for Zero Trust Kubernetes access
apiVersion: apps/v1
kind: Deployment
metadata:
  name: teleport-kube-agent
  namespace: teleport
spec:
  replicas: 2
  selector:
    matchLabels:
      app: teleport-kube-agent
  template:
    metadata:
      labels:
        app: teleport-kube-agent
    spec:
      serviceAccountName: teleport-kube-agent
      containers:
        - name: teleport
          image: public.ecr.aws/gravitational/teleport-distroless:16
          args:
            - "--config=/etc/teleport/teleport.yaml"
          volumeMounts:
            - name: config
              mountPath: /etc/teleport
      volumes:
        - name: config
          configMap:
            name: teleport-config
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: teleport-config
  namespace: teleport
data:
  teleport.yaml: |
    version: v3
    teleport:
      join_params:
        token_name: kube-agent-token
        method: kubernetes
      proxy_server: teleport.company.com:443
    kubernetes_service:
      enabled: true
      listen_addr: 0.0.0.0:3027
      kube_cluster_name: eks-prod-east
      labels:
        environment: production
        provider: aws
        region: us-east-1
```

```bash
# Developer workflow: access kubectl without VPN

# 1. Login via browser-based SSO
tsh login --proxy=teleport.company.com

# 2. List available clusters
tsh kube ls
# Cluster             Labels
# ------------------- ----------------------------------
# eks-prod-east       environment=production provider=aws
# aks-staging-west    environment=staging provider=azure
# onprem-legacy       environment=production provider=onprem

# 3. Connect to a cluster
tsh kube login eks-prod-east

# 4. Use kubectl normally (proxied through Teleport)
kubectl get pods -n payments

# Every command is:
# - Authenticated via SSO (no static kubeconfig)
# - Authorized per Teleport RBAC (namespace/verb restrictions)
# - Logged with session recording
# - Time-limited (session expires after configured duration)
```
SLSA in Enterprise CI/CD
Stop and think: Even with perfect network security, how could an attacker compromise a workload before it is even deployed to Kubernetes?
Supply chain security is a critical component of Zero Trust. SLSA (Supply-chain Levels for Software Artifacts) provides a framework for securing the CI/CD pipeline.
SLSA Levels
| Level | Requirement | What It Prevents |
|---|---|---|
| SLSA 1 | Build process documented | “How was this built?” is answerable |
| SLSA 2 | Version-controlled build, authenticated provenance | Source tampering, build reproducibility |
| SLSA 3 | Hardened build platform, non-falsifiable provenance | Compromised build system, forged attestations |
| SLSA 4 | Two-person review, hermetic builds | Insider threats, dependency confusion |
Implementing SLSA for Kubernetes Deployments
```yaml
# GitHub Actions pipeline with SLSA provenance
name: Build and Deploy with SLSA
on:
  push:
    branches: [main]

permissions:
  contents: read
  packages: write
  id-token: write # Required for OIDC-based signing

jobs:
  build:
    runs-on: ubuntu-latest
    outputs:
      digest: ${{ steps.build.outputs.digest }}
    steps:
      - uses: actions/checkout@v4

      - name: Build container image
        run: docker build -t ghcr.io/company/payment-service:${{ github.sha }} .

      - name: Push to registry
        id: build
        run: |
          echo "${{ secrets.GITHUB_TOKEN }}" | docker login ghcr.io -u ${{ github.actor }} --password-stdin
          docker push ghcr.io/company/payment-service:${{ github.sha }}
          # RepoDigests is only populated after the image has been pushed
          DIGEST=$(docker inspect --format='{{index .RepoDigests 0}}' ghcr.io/company/payment-service:${{ github.sha }} | cut -d@ -f2)
          echo "digest=$DIGEST" >> $GITHUB_OUTPUT

      - name: Install cosign
        uses: sigstore/cosign-installer@v3

      - name: Sign image with cosign (keyless)
        run: |
          cosign sign --yes \
            ghcr.io/company/payment-service@${{ steps.build.outputs.digest }}

  # The SLSA generator is a reusable workflow, so it must be called at the
  # job level with `uses:`, not as a step inside another job.
  provenance:
    needs: build
    permissions:
      actions: read
      id-token: write
      packages: write
    uses: slsa-framework/slsa-github-generator/.github/workflows/generator_container_slsa3.yml@v2.0.0
    with:
      image: ghcr.io/company/payment-service
      digest: ${{ needs.build.outputs.digest }}
    secrets:
      registry-username: ${{ github.actor }}
      registry-password: ${{ secrets.GITHUB_TOKEN }}

  deploy:
    needs: [build, provenance]
    runs-on: ubuntu-latest
    steps:
      - name: Verify signature before deploy
        run: |
          cosign verify \
            --certificate-identity-regexp='https://github.com/company/.*' \
            --certificate-oidc-issuer='https://token.actions.githubusercontent.com' \
            ghcr.io/company/payment-service@${{ needs.build.outputs.digest }}

      - name: Deploy to Kubernetes
        run: |
          kubectl set image deployment/payment-service \
            payment-service=ghcr.io/company/payment-service@${{ needs.build.outputs.digest }} \
            -n payments
```

```yaml
# Kyverno policy: only allow signed images from our CI/CD
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: verify-slsa-provenance
spec:
  validationFailureAction: Enforce
  webhookTimeoutSeconds: 30
  rules:
    - name: verify-signature
      match:
        any:
          - resources:
              kinds:
                - Pod
      verifyImages:
        - imageReferences:
            - "ghcr.io/company/*"
          attestors:
            - entries:
                - keyless:
                    subject: "https://github.com/company/*"
                    issuer: "https://token.actions.githubusercontent.com"
                    rekor:
                      url: "https://rekor.sigstore.dev"
          mutateDigest: true
          verifyDigest: true
          required: true
```
Did You Know?
- Google’s BeyondCorp project started in 2011 after Operation Aurora, a sophisticated cyberattack from China that compromised Google’s internal systems through a VPN vulnerability. Google spent 8 years migrating from perimeter security to BeyondCorp, making the transition for over 100,000 employees. By 2019, no Google employee used a VPN for internal access. The total cost of the migration was estimated at over $500 million, but Google calculated it saved them $4 billion in prevented breach costs over the following 5 years.
- The SLSA framework was created by Google in 2021 based on their internal “Binary Authorization for Borg” (BAB) system, which has been mandatory for all Google production deployments since 2013. Every binary running in Google’s production environment must have verifiable provenance — a cryptographically signed attestation of how, when, and where it was built. This prevented multiple insider threats and supply chain attacks that would have otherwise succeeded.
- Network Policies in Kubernetes are implemented by the CNI plugin, not by Kubernetes itself. This means that if your CNI does not support Network Policies (like the default kubenet in some managed services or Flannel without extension), your NetworkPolicy resources are silently ignored — they exist as objects but have zero enforcement. Calico, Cilium, and Azure CNI all support Network Policies. Always verify enforcement, not just resource creation.
- Pomerium, the open-source Identity-Aware Proxy, was created by engineers who found that Google’s BeyondCorp papers described a brilliant architecture but provided no open-source implementation. Pomerium reached 10,000 GitHub stars in 2024 and is used by organizations ranging from 50-person startups to Fortune 500 companies. The average Pomerium deployment replaces 3-5 VPN appliances, saving approximately $120,000/year in licensing costs.
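One way to act on the CNI-enforcement caveat above is a canary policy: apply a default-deny NetworkPolicy in a scratch namespace and confirm traffic actually stops. A minimal sketch (the `netpol-canary` namespace name is illustrative):

```yaml
# Canary check: if pods in this namespace can still reach each other after
# this default-deny policy is applied, the CNI is NOT enforcing
# NetworkPolicy, and every policy object in the cluster is inert.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: canary-deny-all
  namespace: netpol-canary
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
```

Apply it, then curl between two pods in that namespace: a successful connection means your policies are being silently ignored.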
Common Mistakes
| Mistake | Why It Happens | How to Fix It |
|---|---|---|
| Zero Trust without identity foundation | Teams jump to micro-segmentation and IAP without first establishing strong identity (OIDC, device trust, service accounts). | Start with identity: deploy OIDC for humans, SPIFFE for services, device trust for endpoints. Then layer on micro-segmentation and IAP. |
| Network Policies without default deny | Teams add “allow” policies but never set the default deny baseline. Pods can still communicate freely on paths without explicit policies. | Always start with a default-deny NetworkPolicy in every namespace. Then add explicit allow policies for each legitimate communication path. |
| mTLS in the mesh but plaintext sidecars | Service mesh provides mTLS between proxies, but the connection from the proxy to the application container inside the same pod is plaintext on localhost. | This is expected behavior — localhost traffic within a pod is considered trusted. If you need end-to-end encryption (e.g., for FIPS compliance), the application itself must implement TLS. |
| VPN removal without alternative | Security team removes the VPN before deploying IAP or Teleport. Developers cannot access anything. Shadow IT VPN tunnels appear. | Deploy the Zero Trust access layer first (IAP, Teleport). Run it in parallel with the VPN for 3-6 months. Only decommission the VPN after all access patterns are migrated. |
| Image signing without admission enforcement | CI/CD pipeline signs images with cosign, but no admission webhook verifies signatures. Unsigned images can still be deployed. | Deploy Kyverno or Gatekeeper with image verification policies. Signing without enforcement is security theater. |
| Overly broad Istio AuthorizationPolicies | Teams write policies with action: ALLOW that match too broadly, effectively allowing everything. The policy exists but does not restrict. | Use deny-by-default: start with an AuthorizationPolicy that denies all, then add specific allow rules for each legitimate path. Test with istioctl analyze. |
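The deny-by-default baseline recommended in the last row is Istio’s well-known “allow-nothing” pattern: an ALLOW policy with no rules matches no requests, so everything is denied until explicit allow rules are layered on top. A minimal sketch (the policy name is illustrative):

```yaml
# An ALLOW policy with an empty spec matches nothing, which denies all
# requests to the workloads it covers. Placed in the root namespace
# (istio-system), it applies mesh-wide.
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: allow-nothing
  namespace: istio-system
spec: {}
```

With this in place, each service only becomes reachable once a specific ALLOW policy (like the payment-backend example earlier in this module) is added for it.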
Question 1: A developer's laptop is stolen while logged into the corporate VPN with a valid kubeconfig file. Under a traditional perimeter security model, what happens next compared to a Zero Trust architecture using an Identity-Aware Proxy (IAP) like Teleport?
Under a perimeter security model, the attacker now has full network access to the corporate environment and the Kubernetes API server because the VPN provides a binary “inside/trusted” state. The valid kubeconfig file allows the attacker to authenticate to the cluster and execute commands with the developer’s broad RBAC permissions, potentially compromising the entire environment.
In a Zero Trust architecture, the stolen laptop and VPN connection are useless on their own. The IAP continuously verifies identity and context per request. Even if the attacker has the laptop, they would need the developer’s SSO credentials and physical MFA token to establish a new session. Furthermore, the IAP enforces device health checks (which might fail if the device is reported stolen) and limits access strictly to the namespaces the developer needs, minimizing the blast radius. Trust is never binary; it is continuously evaluated.
Question 2: A team has deployed Network Policies with a default-deny rule, but pods can still communicate freely. What is the most likely cause?
The most likely cause is that the CNI plugin does not support Network Policies. NetworkPolicy resources are processed by the CNI plugin, not by the Kubernetes API server. If the cluster uses a CNI that does not implement the NetworkPolicy API (like Flannel without the Calico integration, or AWS VPC CNI without the network policy controller), the NetworkPolicy objects are stored in etcd but have no enforcement. The pods see no firewalling because there is no component enforcing the rules. To diagnose: (1) Check which CNI is installed (kubectl get pods -n kube-system | grep -E 'calico|cilium|weave'). (2) Verify the CNI supports Network Policies (check documentation). (3) Test enforcement: create a default-deny policy and verify that pods actually cannot communicate. On EKS, you need to enable the VPC CNI network policy feature or install Calico alongside VPC CNI.
Question 3: A sophisticated attacker compromises your CI/CD worker node and injects malicious code during the build process of your payment service. How does SLSA Level 3 prevent this compromised container image from running in your production Kubernetes cluster?
SLSA Level 3 requires a hardened build platform and non-falsifiable provenance. The build platform is isolated so that individual builds cannot influence each other or tamper with the build process. Provenance is generated by the build platform itself (not by the build script), and it is cryptographically signed in a way that the build script cannot forge. If an attacker compromises a CI/CD worker (e.g., injects malicious code into a build), the provenance will either: (1) accurately reflect that the build used a modified source (because provenance is generated independently of the build script), or (2) be absent (if the attacker tries to skip provenance generation, the admission webhook rejects the artifact). The key insight is that at SLSA 3, provenance is a property of the build platform, not of the build. The build cannot lie about its own origin.
Question 4: The CISO mandates the removal of the corporate VPN within 6 months in favor of a Zero Trust architecture. The infrastructure team proposes shutting down the VPN next weekend and routing all traffic through a newly installed Identity-Aware Proxy (IAP) to force adoption. Why is this approach likely to fail, and what sequence of steps should be taken instead?
This “rip and replace” approach is highly likely to fail and cause a massive business disruption because it assumes all applications and access patterns are immediately compatible with the IAP. Without a strong identity foundation already in place, users will be locked out of critical services, leading to shadow IT workarounds and halted productivity.
Instead, the migration must be incremental and run in parallel. First, you must establish a strong identity foundation (SSO, MFA, device MDM). Second, deploy the IAP alongside the existing VPN without disrupting current workflows. Third, incrementally migrate applications starting with low-risk internal tools (like Grafana or Backstage) to test the IAP, before moving to production Kubernetes access. You must monitor access patterns over 3-6 months to ensure all legitimate traffic has shifted to the IAP before finally decommissioning the VPN. Rushing the VPN shutdown is the most common failure mode in Zero Trust migrations.
Question 5: A security auditor reviews your cluster and notices you are using Istio Authorization Policies to restrict traffic between services, but you have no Kubernetes Network Policies. They flag this as a vulnerability. Why would they require both if Istio already controls access?
Network Policies operate at Layer 3/4 (IP addresses and ports). They control which pods can establish TCP/UDP connections to which other pods. They are enforced by the CNI plugin and work without a service mesh. Istio Authorization Policies operate at Layer 7 (HTTP methods, paths, headers, service identities). They control what requests are allowed within an established connection. They require the Istio sidecar proxy.
You need both for defense in depth. Network Policies prevent unauthorized network connections from being established at all — even if Istio is misconfigured or the sidecar is bypassed. Istio Authorization Policies provide fine-grained control that Network Policies cannot: allowing GET but denying DELETE, or allowing /api/v1/payments but denying /api/v1/admin. Network Policies are the coarse guard at the door; Istio policies are the fine-grained access control inside the room.
Question 6: An engineer argues that implementing mTLS in your Istio service mesh makes Network Policies unnecessary because "mTLS already verifies identity and encrypts traffic." Why is this assertion dangerous in a Zero Trust environment?
mTLS verifies the identity of the communicating parties (via SPIFFE certificates) and encrypts the traffic. But it does not restrict which communications can happen. By default, Istio’s mTLS allows any service with a valid mesh certificate to communicate with any other service. mTLS ensures that the caller is who they claim to be; it does not ensure the caller is authorized for that specific action. You still need: (1) AuthorizationPolicies to restrict which identities can call which services (Layer 7). (2) Network Policies as a backup in case the Istio sidecar is bypassed (e.g., host-networked pods, init containers that run before the sidecar, or pods without the sidecar injected). mTLS is an authentication mechanism, not an authorization mechanism. Confusing the two is a common and dangerous mistake.
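To make the separation concrete, here is a minimal sketch of the authentication half in Istio: a PeerAuthentication that requires mTLS for a namespace. Note that it says nothing about which identities may call which services; that remains the job of AuthorizationPolicy.

```yaml
# Authentication only: every workload in the payments namespace must
# present a valid mesh certificate (STRICT mTLS). Any workload with a
# valid certificate can still call any service here unless an
# AuthorizationPolicy restricts it.
apiVersion: security.istio.io/v1
kind: PeerAuthentication
metadata:
  name: require-mtls
  namespace: payments
spec:
  mtls:
    mode: STRICT
```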
Hands-On Exercise: Implement Zero Trust Micro-Segmentation
In this exercise, you will implement a multi-layered Zero Trust architecture in a kind cluster with Network Policies, RBAC, and simulated identity-aware access.
Task 1: Create the Zero Trust Lab Cluster
Solution
```bash
# Create a cluster with Calico CNI for Network Policy enforcement
cat <<'EOF' > /tmp/zero-trust-cluster.yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: zero-trust-lab
networking:
  disableDefaultCNI: true
  podSubnet: 192.168.0.0/16
nodes:
  - role: control-plane
  - role: worker
  - role: worker
EOF

kind create cluster --config /tmp/zero-trust-cluster.yaml

# Install Calico for Network Policy enforcement
kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.28.0/manifests/calico.yaml

# Wait for Calico to be ready
kubectl wait --for=condition=ready pod -l k8s-app=calico-node -n kube-system --timeout=120s
kubectl wait --for=condition=ready pod -l k8s-app=calico-kube-controllers -n kube-system --timeout=120s

echo "Cluster ready with Calico CNI (Network Policy support enabled)"
```
Task 2: Deploy a Multi-Service Application
Solution
# Create namespaceskubectl create namespace paymentskubectl create namespace monitoring
# Deploy a 3-tier applicationcat <<'EOF' | kubectl apply -f -# FrontendapiVersion: apps/v1kind: Deploymentmetadata: name: frontend namespace: paymentsspec: replicas: 2 selector: matchLabels: app: frontend template: metadata: labels: app: frontend tier: frontend spec: containers: - name: frontend image: nginx:1.27.3 ports: - containerPort: 80 resources: limits: cpu: 100m memory: 128Mi---apiVersion: v1kind: Servicemetadata: name: frontend namespace: paymentsspec: selector: app: frontend ports: - port: 80---# Backend APIapiVersion: apps/v1kind: Deploymentmetadata: name: backend namespace: paymentsspec: replicas: 2 selector: matchLabels: app: backend template: metadata: labels: app: backend tier: backend spec: containers: - name: backend image: nginx:1.27.3 ports: - containerPort: 80 resources: limits: cpu: 100m memory: 128Mi---apiVersion: v1kind: Servicemetadata: name: backend namespace: paymentsspec: selector: app: backend ports: - port: 80---# DatabaseapiVersion: apps/v1kind: Deploymentmetadata: name: database namespace: paymentsspec: replicas: 1 selector: matchLabels: app: database template: metadata: labels: app: database tier: database spec: containers: - name: database image: nginx:1.27.3 ports: - containerPort: 80 resources: limits: cpu: 100m memory: 128Mi---apiVersion: v1kind: Servicemetadata: name: database namespace: paymentsspec: selector: app: database ports: - port: 80EOF
```shell
kubectl wait --for=condition=ready pod -l app=frontend -n payments --timeout=60s
kubectl wait --for=condition=ready pod -l app=backend -n payments --timeout=60s
kubectl wait --for=condition=ready pod -l app=database -n payments --timeout=60s
```

Task 3: Verify Flat Network (Before Zero Trust)
Solution
```shell
echo "=== BEFORE ZERO TRUST: Flat Network ==="
echo ""
echo "Test: Frontend → Backend (should succeed - legitimate)"
kubectl exec -n payments deploy/frontend -- curl -s --max-time 3 backend.payments.svc.cluster.local || echo "FAILED"

echo ""
echo "Test: Frontend → Database (should succeed - PROBLEM: frontend should not access DB directly)"
kubectl exec -n payments deploy/frontend -- curl -s --max-time 3 database.payments.svc.cluster.local || echo "FAILED"

echo ""
echo "Test: Database → Frontend (should succeed - PROBLEM: DB should not call frontend)"
kubectl exec -n payments deploy/database -- curl -s --max-time 3 frontend.payments.svc.cluster.local || echo "FAILED"
```
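The spot checks above can be generalized. This sketch loops over every ordered pair of the lab's three deployments and prints the full connectivity matrix; the deployment names and namespace come from Task 2, and the `probe` helper is a hypothetical wrapper introduced here so the matrix logic is separate from the cluster call:

```shell
# probe <src> <dst>: exit 0 if <src> can reach <dst> over HTTP.
# Wraps the same kubectl exec + curl check used in the spot tests above.
probe() {
  kubectl exec -n payments "deploy/$1" -- \
    curl -s -o /dev/null --max-time 3 "$2.payments.svc.cluster.local"
}

# matrix: probe every ordered pair of the lab's three services and
# print one ALLOWED/BLOCKED line per pair (6 lines total).
matrix() {
  for src in frontend backend database; do
    for dst in frontend backend database; do
      [ "$src" = "$dst" ] && continue
      if probe "$src" "$dst"; then
        echo "$src -> $dst: ALLOWED"
      else
        echo "$src -> $dst: BLOCKED"
      fi
    done
  done
}

matrix
```

On the flat network every pair should print ALLOWED; rerunning the same loop after Task 4 should leave only frontend -> backend and backend -> database allowed.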
```shell
echo ""
echo "CONCLUSION: Without Network Policies, every pod can talk to every other pod."
echo "This is the 'soft interior' problem of perimeter security."
```

Task 4: Apply Zero Trust Network Policies
Solution
```shell
cat <<'EOF' | kubectl apply -f -
# Step 1: Default deny ALL traffic
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: payments
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
---
# Step 2: Allow DNS for all pods
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
  namespace: payments
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - ports:
    - protocol: TCP
      port: 53
    - protocol: UDP
      port: 53
---
# Step 3: Frontend can receive from outside and send to backend only
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: frontend-policy
  namespace: payments
spec:
  podSelector:
    matchLabels:
      app: frontend
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - {} # Accept from any source (simulates ingress controller)
  egress:
  - to:
    - podSelector:
        matchLabels:
          app: backend
    ports:
    - port: 80
  - ports:
    - protocol: TCP
      port: 53
    - protocol: UDP
      port: 53
---
# Step 4: Backend accepts from frontend, can reach database only
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-policy
  namespace: payments
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - port: 80
  egress:
  - to:
    - podSelector:
        matchLabels:
          app: database
    ports:
    - port: 80
  - ports:
    - protocol: TCP
      port: 53
    - protocol: UDP
      port: 53
---
# Step 5: Database accepts from backend only, no egress
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: database-policy
  namespace: payments
spec:
  podSelector:
    matchLabels:
      app: database
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: backend
    ports:
    - port: 80
  egress:
  - ports:
    - protocol: TCP
      port: 53
    - protocol: UDP
      port: 53
EOF
```
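One gap worth noting: Task 2 creates a `monitoring` namespace, but after default-deny nothing in it can reach the payments pods. If you later run a scraper such as Prometheus there, you would add a namespace-scoped ingress exception rather than loosening the per-tier policies. A sketch, not part of the lab's solution — the policy name is illustrative, and it relies on the `kubernetes.io/metadata.name` label that Kubernetes sets automatically on every namespace:

```yaml
# Hypothetical extension: let pods in the monitoring namespace reach
# payments pods. NetworkPolicies are additive, so this combines with
# (rather than replaces) the per-tier policies above.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-monitoring-ingress   # illustrative name
  namespace: payments
spec:
  podSelector: {}                  # applies to every pod in payments
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: monitoring
    ports:
    - port: 80                     # the only port these lab pods serve
```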
```shell
echo "Network Policies applied:"
kubectl get networkpolicy -n payments
```

Task 5: Verify Zero Trust Enforcement
Solution
```shell
echo "=== AFTER ZERO TRUST: Micro-Segmented Network ==="
echo ""

echo "Test 1: Frontend → Backend (SHOULD PASS - legitimate path)"
kubectl exec -n payments deploy/frontend -- curl -s --max-time 3 backend.payments.svc.cluster.local && echo "PASS" || echo "BLOCKED"

echo ""
echo "Test 2: Frontend → Database (SHOULD BLOCK - frontend must go through backend)"
kubectl exec -n payments deploy/frontend -- curl -s --max-time 3 database.payments.svc.cluster.local 2>&1 && echo "PASS (BAD!)" || echo "BLOCKED (GOOD!)"

echo ""
echo "Test 3: Backend → Database (SHOULD PASS - legitimate path)"
kubectl exec -n payments deploy/backend -- curl -s --max-time 3 database.payments.svc.cluster.local && echo "PASS" || echo "BLOCKED"

echo ""
echo "Test 4: Database → Frontend (SHOULD BLOCK - DB should not initiate connections)"
kubectl exec -n payments deploy/database -- curl -s --max-time 3 frontend.payments.svc.cluster.local 2>&1 && echo "PASS (BAD!)" || echo "BLOCKED (GOOD!)"

echo ""
echo "Test 5: Database → external internet (SHOULD BLOCK - DB must not reach internet)"
kubectl exec -n payments deploy/database -- curl -s --max-time 3 https://example.com 2>&1 && echo "PASS (BAD!)" || echo "BLOCKED (GOOD!)"
```
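Test 5 blocks all external egress from the database. A real database may still need one or two external destinations, such as a managed backup endpoint. The Zero Trust approach is to allow exactly that CIDR and port rather than reopening general egress. A sketch — the policy name is illustrative and the address is a placeholder from the RFC 5737 documentation range, not a real endpoint:

```yaml
# Hypothetical: permit database egress to a single backup endpoint only.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: database-allow-backup-egress   # illustrative name
  namespace: payments
spec:
  podSelector:
    matchLabels:
      app: database
  policyTypes:
  - Egress
  egress:
  - to:
    - ipBlock:
        cidr: 203.0.113.10/32          # placeholder documentation address
    ports:
    - protocol: TCP
      port: 443
```

Because policies are additive, this widens the database's egress by one destination while Test 5's general internet block still holds for everything else.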
```shell
echo ""
echo "CONCLUSION: Only legitimate communication paths are allowed."
echo "Lateral movement is prevented. The blast radius of a compromise is contained."
```

Clean Up
```shell
kind delete cluster --name zero-trust-lab
rm /tmp/zero-trust-cluster.yaml
```

Success Criteria
- I deployed a multi-tier application in a flat network and verified unrestricted access
- I applied default-deny Network Policies to enforce Zero Trust
- I verified that only legitimate communication paths (frontend → backend → database) work
- I confirmed that unauthorized paths (frontend → database, database → frontend) are blocked
- I can explain the four layers of micro-segmentation
- I can describe how an Identity-Aware Proxy replaces a VPN
- I can explain how SLSA protects the CI/CD supply chain
Next Module
With Zero Trust securing your infrastructure, it is time to optimize costs at enterprise scale. Head to Module 10.10: FinOps at Enterprise Scale to learn cloud economics, Enterprise Discount Programs, forecasting, chargeback models for shared clusters, and the true cost of multi-cloud operations.