Module 6.3: GKE Workload Identity and Security
Complexity: [MEDIUM] | Time to Complete: 2.5h | Prerequisites: Module 6.1 (GKE Architecture)
What You’ll Be Able to Do
After completing this module, you will be able to:
- Configure GKE Workload Identity to map Kubernetes service accounts to GCP IAM service accounts
- Implement Binary Authorization to enforce container image provenance and deploy-time attestation policies
- Deploy GKE security posture features to identify workload misconfigurations and enforce Pod Security Standards
- Integrate Google Cloud Secret Manager using the CSI driver for secure external secret management
Why This Module Matters
In January 2024, a logistics company discovered that every pod in their GKE cluster had read/write access to every Cloud Storage bucket and every Pub/Sub topic in their project. A junior developer had deployed a debug pod that scraped all Pub/Sub messages from the production order queue and wrote them to a personal GCS bucket for “testing.” The data included customer addresses, phone numbers, and delivery instructions for 2.1 million orders. The root cause was depressingly common: when the cluster was created, the default node service account was granted the Editor role on the project, and every pod on the cluster inherited that identity. No one had configured Workload Identity. The remediation cost $890,000 in legal fees, notification costs, and a GDPR fine. The fix, configuring Workload Identity and scoping IAM permissions per pod, took two days.
This incident illustrates the most dangerous default in GKE: without Workload Identity, every pod on a node shares the same GCP identity. A compromised pod, a rogue container, or even a developer with kubectl access can impersonate the node’s service account and access any GCP resource that account can reach. Workload Identity solves this by binding individual Kubernetes ServiceAccounts to individual GCP service accounts, giving each workload only the permissions it needs.
In this module, you will learn how Workload Identity Federation for GKE works, how to configure Binary Authorization to ensure only trusted container images run in your cluster, how Shielded and Confidential Nodes protect the node itself, and how to integrate Secret Manager with GKE. By the end, you will set up a pod that securely accesses Pub/Sub using Workload Identity and enforce a Binary Authorization policy that blocks unsigned images.
The Problem: Node-Level Identity
Without Workload Identity, GKE pods access GCP services using the node’s service account. Every VM (node) in a node pool runs with a GCP service account attached, and every pod on that node can access the metadata server to obtain OAuth tokens for that account.
```mermaid
flowchart TD
  subgraph Node ["Node (VM) - SA: node-sa@project.iam (Editor)"]
    A["App Pod<br/>(Needs: GCS read)"]
    B["Debug Pod<br/>(Needs: Nothing)"]
    C["Rogue Pod<br/>(Wants: EVERYTHING)"]
    M["Metadata Server (169.254.169.254)<br/>Returns: node-sa token (Editor access)"]
  end
  A --> M
  B --> M
  C --> M
```

This is a violation of the principle of least privilege. The app pod only needs GCS read access, but it gets Editor. The debug pod needs nothing, but it gets Editor. The rogue pod gets Editor too.
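To make the danger concrete: any process in any container can fetch the node’s token with one HTTP request to the metadata server. The endpoint path and `Metadata-Flavor` header below follow the standard GCE metadata conventions; the JSON is a canned stand-in for a live response, not real output.

```shell
#!/usr/bin/env bash
# From inside any pod WITHOUT Workload Identity, this single request
# returns an OAuth token for the node's service account:
#
#   curl -s -H "Metadata-Flavor: Google" \
#     "http://169.254.169.254/computeMetadata/v1/instance/service-accounts/default/token"
#
# A canned response stands in for the live call here so the parsing
# can be shown without cluster access:
RESPONSE='{"access_token":"ya29.EXAMPLE","expires_in":3599,"token_type":"Bearer"}'

# Extract the token lifetime without needing jq
EXPIRES_IN=$(printf '%s' "$RESPONSE" | sed -n 's/.*"expires_in":\([0-9]*\).*/\1/p')
echo "token valid for ${EXPIRES_IN}s"   # token valid for 3599s
```

If that node service account holds Editor, this three-line fetch is a project-wide compromise.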
Workload Identity Federation for GKE
Workload Identity Federation (WIF) for GKE maps Kubernetes ServiceAccounts to GCP IAM service accounts. Each pod gets credentials scoped to exactly the GCP resources it needs.
How It Works
```mermaid
flowchart TD
  subgraph Node ["Node (VM) - Node SA: restricted-node-sa (minimal permissions)"]
    A["App Pod<br/>(KSA: app-sa)"]
    B["Debug Pod<br/>(KSA: default)"]
    C["Batch Pod<br/>(KSA: batch-sa)"]
    subgraph Meta ["GKE Metadata Server (replaces default)"]
      M1["app-sa → gcs-reader@proj.iam"]
      M2["default → (no GCP SA, access denied)"]
      M3["batch-sa → pubsub-writer@proj.iam"]
    end
  end
  A --> M1
  B --> M2
  C --> M3
```

Setting Up Workload Identity
```bash
export PROJECT_ID=$(gcloud config get-value project)
export PROJECT_NUMBER=$(gcloud projects describe $PROJECT_ID --format="value(projectNumber)")

# Step 1: Enable Workload Identity on the cluster (if not already)
# Best practice: enable at cluster creation with --workload-pool
gcloud container clusters update my-cluster \
  --region=us-central1 \
  --workload-pool=$PROJECT_ID.svc.id.goog

# Step 2: Create a GCP service account for the workload
gcloud iam service-accounts create gcs-reader-sa \
  --display-name="GCS Reader for App Pod"

# Step 3: Grant the GCP SA only the permissions it needs
gcloud projects add-iam-policy-binding $PROJECT_ID \
  --member="serviceAccount:gcs-reader-sa@$PROJECT_ID.iam.gserviceaccount.com" \
  --role="roles/storage.objectViewer"

# Step 4: Create a Kubernetes ServiceAccount
kubectl create serviceaccount app-sa --namespace=default

# Step 5: Bind the Kubernetes SA to the GCP SA
gcloud iam service-accounts add-iam-policy-binding \
  gcs-reader-sa@$PROJECT_ID.iam.gserviceaccount.com \
  --role="roles/iam.workloadIdentityUser" \
  --member="serviceAccount:$PROJECT_ID.svc.id.goog[default/app-sa]"
```
```bash
# Step 6: Annotate the Kubernetes SA with the GCP SA email
kubectl annotate serviceaccount app-sa \
  --namespace=default \
  iam.gke.io/gcp-service-account=gcs-reader-sa@$PROJECT_ID.iam.gserviceaccount.com
```

Using Workload Identity in a Pod
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gcs-reader
  namespace: default
spec:
  serviceAccountName: app-sa  # This is the key line
  containers:
  - name: reader
    image: google/cloud-sdk:slim
    command: ["sleep", "infinity"]
    resources:
      requests:
        cpu: 100m
        memory: 128Mi
```

```bash
# Deploy and verify
kubectl apply -f gcs-reader-pod.yaml
```
```bash
# Exec into the pod and verify the identity
kubectl exec -it gcs-reader -- gcloud auth list
# Should show: gcs-reader-sa@PROJECT_ID.iam.gserviceaccount.com

# Test GCS access (should work)
kubectl exec -it gcs-reader -- gsutil ls gs://some-readable-bucket/
```
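The member string used in the `iam.workloadIdentityUser` binding has a fixed shape: `serviceAccount:PROJECT_ID.svc.id.goog[NAMESPACE/KSA_NAME]`. Getting the namespace or brackets wrong is a common cause of silent Workload Identity failures, so provisioning scripts often assemble it with a tiny helper. This function is illustrative, not a gcloud feature:

```shell
# Build the IAM principal that identifies a Kubernetes ServiceAccount.
# A typo here (wrong namespace, missing brackets) produces a binding
# that exists but never matches, which fails with no obvious error.
wi_member() {
  local project_id="$1" namespace="$2" ksa="$3"
  printf 'serviceAccount:%s.svc.id.goog[%s/%s]' \
    "$project_id" "$namespace" "$ksa"
}

wi_member my-project default app-sa
# serviceAccount:my-project.svc.id.goog[default/app-sa]
```

The output can be passed directly to `gcloud iam service-accounts add-iam-policy-binding --member=...`.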
```bash
# Test Pub/Sub access (should be denied)
kubectl exec -it gcs-reader -- gcloud pubsub topics list
# Should fail with permission denied
```

Fleet Workload Identity Federation (Cross-Project)
For multi-project setups, Fleet Workload Identity Federation allows pods in one project to access resources in another project without creating service accounts in every project.
```bash
# Register the cluster with a Fleet
gcloud container fleet memberships register my-cluster \
  --gke-cluster=$REGION/my-cluster \
  --enable-workload-identity
```
```bash
# Grant cross-project access using the Fleet identity
gcloud projects add-iam-policy-binding OTHER_PROJECT_ID \
  --member="serviceAccount:$PROJECT_ID.svc.id.goog[NAMESPACE/KSA_NAME]" \
  --role="roles/storage.objectViewer"
```

Stop and think: If a pod in the `default` namespace does not have a `serviceAccountName` specified in its spec, which Kubernetes ServiceAccount does it use? How does this impact Workload Identity if that ServiceAccount is not annotated?
Binary Authorization
Binary Authorization ensures that only trusted container images can be deployed to your GKE cluster. It works by requiring cryptographic attestations on images before they are allowed to run.
How Binary Authorization Works
```mermaid
flowchart TD
  A[Developer pushes code] --> B[Cloud Build builds image]
  B --> C[Image pushed to Artifact Registry]
  C --> D[Attestor signs the image digest<br/>Human review, vulnerability scan pass]
  D --> E[Developer creates Deployment]
  E --> F[GKE Admission Controller checks:<br/>1. Is Binary Authorization enabled?<br/>2. Does the image have a valid attestation?<br/>3. Does the image match the policy?]
  F -->|YES| G[Allow pod to start]
  F -->|NO| H[Block pod, log violation]
```

Setting Up Binary Authorization
```bash
# Step 1: Enable Binary Authorization API
gcloud services enable binaryauthorization.googleapis.com \
  --project=$PROJECT_ID

# Step 2: Enable Binary Authorization on the cluster
gcloud container clusters update my-cluster \
  --region=us-central1 \
  --binauthz-evaluation-mode=PROJECT_SINGLETON_POLICY_ENFORCE

# Step 3: View the default policy
gcloud container binauthz policy export
```
```bash
# Step 4: Create a policy that allows only Artifact Registry images
cat <<'EOF' > /tmp/binauthz-policy.yaml
admissionWhitelistPatterns:
- namePattern: gcr.io/google-containers/*
- namePattern: gcr.io/google-samples/*
- namePattern: us-docker.pkg.dev/google-samples/*
defaultAdmissionRule:
  enforcementMode: ENFORCED_BLOCK_AND_AUDIT_LOG
  evaluationMode: ALWAYS_DENY
globalPolicyEvaluationMode: ENABLE
clusterAdmissionRules:
  us-central1.my-cluster:
    enforcementMode: ENFORCED_BLOCK_AND_AUDIT_LOG
    evaluationMode: REQUIRE_ATTESTATION
    requireAttestationsBy:
    - projects/PROJECT_ID/attestors/build-attestor
EOF
```
```bash
# Step 5: Import the policy
gcloud container binauthz policy import /tmp/binauthz-policy.yaml
```

Creating an Attestor
```bash
# Create a key ring and key for signing
gcloud kms keyrings create binauthz-keyring \
  --location=global

gcloud kms keys create attestor-key \
  --keyring=binauthz-keyring \
  --location=global \
  --purpose=asymmetric-signing \
  --default-algorithm=ec-sign-p256-sha256
```
```bash
# Create a Container Analysis note
cat <<EOF > /tmp/note.json
{
  "attestation": {
    "hint": {
      "humanReadableName": "Build Attestor Note"
    }
  }
}
EOF

curl -X POST \
  "https://containeranalysis.googleapis.com/v1/projects/$PROJECT_ID/notes/?noteId=build-attestor-note" \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  -d @/tmp/note.json
```
```bash
# Create the attestor
gcloud container binauthz attestors create build-attestor \
  --attestation-authority-note=build-attestor-note \
  --attestation-authority-note-project=$PROJECT_ID

# Add the KMS key to the attestor
gcloud container binauthz attestors public-keys add \
  --attestor=build-attestor \
  --keyversion-project=$PROJECT_ID \
  --keyversion-location=global \
  --keyversion-keyring=binauthz-keyring \
  --keyversion-key=attestor-key \
  --keyversion=1
```
```bash
# Sign an image
IMAGE_PATH="us-central1-docker.pkg.dev/$PROJECT_ID/my-repo/my-app"
IMAGE_DIGEST=$(gcloud container images describe $IMAGE_PATH:latest \
  --format="value(image_summary.digest)")

gcloud container binauthz attestations sign-and-create \
  --artifact-url="$IMAGE_PATH@$IMAGE_DIGEST" \
  --attestor=build-attestor \
  --attestor-project=$PROJECT_ID \
  --keyversion-project=$PROJECT_ID \
  --keyversion-location=global \
  --keyversion-keyring=binauthz-keyring \
  --keyversion-key=attestor-key \
  --keyversion=1
```

Testing Binary Authorization
```bash
# This should succeed (signed image or whitelisted pattern)
kubectl run trusted --image=gcr.io/google-samples/hello-app:1.0

# This should be BLOCKED (unsigned image from Docker Hub)
kubectl run untrusted --image=nginx:latest
# Error: admission webhook "imagepolicywebhook.image-policy.k8s.io"
# denied the request: Image nginx:latest denied by Binary Authorization policy
```
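Why is `hello-app` allowed while `nginx` is blocked? The `admissionWhitelistPatterns` check runs before attestation evaluation. The behavior can be approximated locally with shell globs. This is a simplification for intuition only: Binary Authorization's real pattern semantics are richer than bash `case` matching.

```shell
# Return 0 if the image matches any whitelisted pattern, 1 otherwise.
# Mirrors the first check the admission controller performs.
image_allowed() {
  local image="$1"; shift
  local pattern
  for pattern in "$@"; do
    case "$image" in
      $pattern) return 0 ;;   # unquoted on purpose: glob match
    esac
  done
  return 1
}

PATTERNS=("gcr.io/google-samples/*" "us-docker.pkg.dev/google-samples/*")

image_allowed "gcr.io/google-samples/hello-app:1.0" "${PATTERNS[@]}" \
  && echo "allowed" || echo "denied by policy"       # allowed
image_allowed "docker.io/library/nginx:latest" "${PATTERNS[@]}" \
  && echo "allowed" || echo "denied by policy"       # denied by policy
```

An image that fails the whitelist check then falls through to the admission rules, where it must carry a valid attestation or be denied.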
```bash
# Check audit logs for denials
gcloud logging read \
  'resource.type="k8s_cluster" AND protoPayload.response.reason="BINARY_AUTHORIZATION"' \
  --limit=5
```

War Story: A team enabled Binary Authorization in enforce mode on a Friday afternoon. On Monday morning, their CI/CD pipeline had broken because Cloud Build was pushing images but not creating attestations. Every deployment for 48 hours was blocked. Start with DRYRUN_AUDIT_LOG_ONLY mode to identify what would be blocked before switching to enforce mode.
Pause and predict: You enable Binary Authorization in enforce mode with a policy requiring an attestation from a specific KMS key. A developer deploys an image signed by a different, older KMS key that was recently removed from the attestor. What will happen when the pod starts, and where would you look to verify this?
Shielded GKE Nodes and Confidential Nodes
Shielded GKE Nodes
Shielded nodes provide verifiable integrity for your cluster nodes, protecting against rootkits and boot-level tampering.
| Feature | Protection | How It Works |
|---|---|---|
| Secure Boot | Prevents unsigned kernel modules | Only Google-signed boot components load |
| vTPM | Measured boot integrity | Stores measurements for remote attestation |
| Integrity Monitoring | Detects runtime tampering | Compares boot measurements to known-good baseline |
```bash
# Shielded nodes are enabled by default on new GKE clusters
# Verify on an existing cluster:
gcloud container clusters describe my-cluster \
  --region=us-central1 \
  --format="yaml(shieldedNodes)"

# Explicitly enable if not set:
gcloud container clusters update my-cluster \
  --region=us-central1 \
  --enable-shielded-nodes
```

Confidential Nodes
Confidential Nodes go beyond Shielded Nodes by encrypting data in memory using AMD SEV (Secure Encrypted Virtualization). Even if an attacker has physical access to the server or can perform a cold-boot attack, they cannot read the node’s memory.
```bash
# Create a node pool with Confidential Nodes
gcloud container node-pools create confidential-pool \
  --cluster=my-cluster \
  --region=us-central1 \
  --machine-type=n2d-standard-4 \
  --num-nodes=1 \
  --enable-confidential-nodes
```
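Because the machine-type requirement is the most common reason this command fails, provisioning scripts sometimes pre-validate the machine family before calling gcloud. The helper below is a local sanity check only, limited to the N2D family this module discusses; GKE performs the authoritative validation and other SEV-capable families may exist.

```shell
# Rough pre-flight check: does the machine type belong to the
# AMD-based N2D family required for Confidential Nodes here?
# Treat this as a starting point, not an exhaustive list.
supports_confidential() {
  case "$1" in
    n2d-*) return 0 ;;
    *)     return 1 ;;
  esac
}

supports_confidential "n2d-standard-4" && echo "ok"
supports_confidential "e2-standard-2" || echo "unsupported machine type"
```

Running the check before `gcloud container node-pools create` turns an opaque API error into an immediate, explainable failure.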
```bash
# Note: Confidential Nodes require N2D machine types (AMD EPYC)
# and are available in limited regions
```

| Feature | Shielded Nodes | Confidential Nodes |
|---|---|---|
| Boot integrity | Yes | Yes |
| Memory encryption | No | Yes (AMD SEV) |
| Performance impact | None | ~2-6% overhead |
| Machine types | All | N2D only (AMD) |
| Cost | No additional cost | ~10% premium |
| Use case | All production clusters | Financial, healthcare, PII |
Stop and think: Your compliance team requires that data in use (in memory) must be encrypted. Which node type must you choose, and what specific CPU architecture is required to support this feature?
GKE Security Posture Dashboard
The Security Posture dashboard provides a centralized view of security issues across your GKE clusters. It scans for misconfigurations, vulnerability exposure, and policy violations.
What It Detects
```bash
# Enable Security Posture on the cluster
gcloud container clusters update my-cluster \
  --region=us-central1 \
  --security-posture=standard \
  --workload-vulnerability-scanning=standard

# Check security posture findings via gcloud
gcloud container security-posture findings list \
  --project=$PROJECT_ID \
  --format="table(finding.severity, finding.category, finding.description)"
```

The dashboard checks for:
- Workload configuration: Pods running as root, missing security contexts, privileged containers
- Container vulnerabilities: CVEs in container images from Artifact Registry
- Network exposure: Services exposed to the internet without authentication
- RBAC issues: Overly permissive ClusterRoleBindings
- Supply chain: Images not from trusted registries
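Some of the workload-configuration checks above can be approximated before a manifest ever reaches the cluster. The grep-based sketch below is far cruder than the real scanner (it ignores YAML structure entirely) and exists only to make two of the checks concrete:

```shell
# Flag two simple misconfigurations in a rendered manifest:
# privileged containers and a missing runAsNonRoot setting.
check_manifest() {
  local file="$1" findings=0
  if grep -q 'privileged: true' "$file"; then
    echo "FINDING: privileged container"
    findings=$((findings + 1))
  fi
  if ! grep -q 'runAsNonRoot: true' "$file"; then
    echo "FINDING: pod may run as root"
    findings=$((findings + 1))
  fi
  echo "total findings: $findings"
}

cat > /tmp/bad-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: risky
spec:
  containers:
  - name: app
    image: nginx
    securityContext:
      privileged: true
EOF

check_manifest /tmp/bad-pod.yaml
# FINDING: privileged container
# FINDING: pod may run as root
# total findings: 2
```

In practice this kind of pre-check belongs in CI; the Security Posture dashboard then catches what slips through at runtime.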
Hardening Pod Security
GKE supports Pod Security Standards (PSS) through the built-in Pod Security Admission controller:
```bash
# Enforce restricted Pod Security Standard on a namespace
kubectl label namespace production \
  pod-security.kubernetes.io/enforce=restricted \
  pod-security.kubernetes.io/warn=restricted \
  pod-security.kubernetes.io/audit=restricted
```

```yaml
# A pod that passes the "restricted" security standard
apiVersion: v1
kind: Pod
metadata:
  name: secure-pod
  namespace: production
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    fsGroup: 2000
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: app
    image: us-central1-docker.pkg.dev/my-project/repo/app:v1
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      capabilities:
        drop:
        - ALL
    resources:
      requests:
        cpu: 100m
        memory: 128Mi
      limits:
        cpu: 200m
        memory: 256Mi
```

Pause and predict: You apply the `restricted` Pod Security Standard to a namespace in `enforce` mode. A developer tries to deploy a pod with `runAsNonRoot: false`. Will the pod be created? What happens if the namespace was set to `warn` mode instead?
Secret Manager Integration
GKE integrates with Google Cloud Secret Manager through the Secret Manager add-on, which uses the Secrets Store CSI Driver to mount secrets as files in pods.
Setting Up Secret Manager CSI Driver
```bash
# Enable the Secret Manager add-on on the cluster
gcloud container clusters update my-cluster \
  --region=us-central1 \
  --enable-secret-manager

# Verify the driver is installed
kubectl get csidriver secrets-store.csi.k8s.io

# Create a secret in Secret Manager
echo -n "my-database-password" | gcloud secrets create db-password \
  --data-file=- \
  --replication-policy=automatic

# Grant the workload's GCP SA access to the secret
gcloud secrets add-iam-policy-binding db-password \
  --member="serviceAccount:app-sa@$PROJECT_ID.iam.gserviceaccount.com" \
  --role="roles/secretmanager.secretAccessor"
```

Mounting Secrets in Pods
```yaml
# SecretProviderClass defines which secrets to mount
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: gcp-secrets
spec:
  provider: gcp
  parameters:
    secrets: |
      - resourceName: "projects/PROJECT_NUMBER/secrets/db-password/versions/latest"
        path: "db-password"
      - resourceName: "projects/PROJECT_NUMBER/secrets/api-key/versions/latest"
        path: "api-key"
---
apiVersion: v1
kind: Pod
metadata:
  name: app-with-secrets
spec:
  serviceAccountName: app-sa  # Must have Workload Identity configured
  containers:
  - name: app
    image: us-central1-docker.pkg.dev/my-project/repo/app:v1
    volumeMounts:
    - name: secrets
      mountPath: /var/secrets
      readOnly: true
    resources:
      requests:
        cpu: 100m
        memory: 128Mi
  volumes:
  - name: secrets
    csi:
      driver: secrets-store.csi.k8s.io
      readOnly: true
      volumeAttributes:
        secretProviderClass: gcp-secrets
```

```bash
# After deploying, verify the secret is mounted
kubectl exec app-with-secrets -- cat /var/secrets/db-password
# Output: my-database-password

# Secrets are NOT stored in etcd, reducing the blast radius
# if the cluster's etcd encryption is compromised
```

Secret Manager vs Kubernetes Secrets
Section titled “Secret Manager vs Kubernetes Secrets”| Aspect | Kubernetes Secrets | Secret Manager + CSI |
|---|---|---|
| Storage | etcd (in cluster) | Google-managed (external) |
| Encryption at rest | Optional application-layer encryption with Cloud KMS | Automatic, Google-managed keys or CMEK |
| Versioning | No (replace only) | Full version history |
| Rotation | Manual (update + rollout) | Automatic with periodic sync |
| Audit logging | Kubernetes audit logs | Cloud Audit Logs (who accessed what, when) |
| Cross-cluster sharing | Not supported | Same secret across clusters/projects |
| Access control | RBAC (namespace-scoped) | IAM (project/org-scoped) |
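The encryption row is the one most often misread: native Kubernetes Secrets are base64-encoded in the API, and base64 is encoding, not encryption. A two-line demonstration:

```shell
# base64 is reversible by anyone who can read the Secret object;
# it provides zero confidentiality on its own.
ENCODED=$(printf 'my-database-password' | base64)
echo "stored in the Secret object as: $ENCODED"

# Anyone with RBAC read access recovers the plaintext instantly:
printf '%s' "$ENCODED" | base64 -d
```

This is why the table's "encryption at rest" protections (application-layer encryption, or keeping the secret out of the cluster entirely via Secret Manager) matter.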
Stop and think: A developer wants to roll back a deployment that uses Secret Manager for database credentials. The older version of the deployment needs an older password. How does the Secret Manager CSI driver handle versioning compared to native Kubernetes Secrets?
Did You Know?
- Before Workload Identity existed, the recommended workaround was to distribute service account JSON key files as Kubernetes Secrets. This meant private key material was stored in etcd, potentially logged, and visible to anyone with RBAC access to the namespace. Google internal security audits in 2019 found that 34% of GKE clusters in a sample had service account keys stored as Kubernetes Secrets. Workload Identity, launched in 2019, eliminated the need for key files entirely by using short-lived, automatically-rotated tokens.
- Binary Authorization attestations are immutable and tied to the exact image digest (SHA-256), not the tag. If someone pushes a new image with the tag `v1.0` (overwriting the old one), the attestation on the original image becomes invalid for the new image because the digest changed. This prevents a supply chain attack where an attacker replaces a trusted image with a malicious one while keeping the same tag. Always deploy by digest in production: `image: us-central1-docker.pkg.dev/proj/repo/app@sha256:abc123...`
- Confidential GKE Nodes encrypt each node’s memory with a unique key that changes on every boot. The key is generated inside the AMD Secure Processor and never leaves the CPU. Google’s hypervisor, host OS, and other VMs on the same physical host cannot read the node’s memory. The performance overhead is typically 2-6% for most workloads because the encryption happens in the CPU’s memory controller at hardware speed, not in software.
- The GKE metadata server that enables Workload Identity intercepts all traffic to 169.254.169.254 (the standard cloud metadata endpoint) from pods. When a pod with Workload Identity configured requests an access token, the GKE metadata server contacts Google’s Security Token Service (STS) to exchange the Kubernetes ServiceAccount token for a short-lived GCP access token scoped to the mapped GCP service account. These tokens expire after 1 hour and are automatically refreshed. Pods without Workload Identity receive a “permission denied” response instead of the node’s credentials.
Common Mistakes
| Mistake | Why It Happens | How to Fix It |
|---|---|---|
| Using the default Compute Engine service account for nodes | Cluster created without specifying a custom node SA | Create a dedicated node SA with minimal permissions; use --service-account flag |
| Not annotating the Kubernetes ServiceAccount | Workload Identity binding created but annotation forgotten | Always annotate: iam.gke.io/gcp-service-account=GSA@PROJECT.iam |
| Enabling Binary Authorization in enforce mode immediately | Wanting security without testing impact first | Start with DRYRUN_AUDIT_LOG_ONLY mode; review logs for 1-2 weeks before enforcing |
| Granting roles/editor to workload service accounts | "Editor" seems like a reasonable default | Use least-privilege roles: storage.objectViewer, pubsub.subscriber, etc. |
| Storing secrets as Kubernetes Secrets without encryption | Assuming K8s Secrets are encrypted by default | Enable application-layer encryption or use Secret Manager CSI driver |
| Forgetting to create the IAM binding for Workload Identity | Creating the KSA and GSA but not connecting them | The iam.workloadIdentityUser binding on the GSA is required for the mapping to work |
| Not setting pod-security.kubernetes.io labels | Assuming GKE blocks unsafe pods by default | Apply Pod Security Standards labels to namespaces; start with warn mode |
| Using image tags instead of digests with Binary Authorization | Tags are mutable and can be overwritten | Deploy by digest (@sha256:...) to ensure attestation matches the exact image |
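The last mistake in the table, mutable tags, is easy to demonstrate without a registry: a digest is just the SHA-256 of the content, so changing a single byte changes the digest even when the "tag" (here, a filename standing in for an image tag) stays the same. That is exactly why an attestation bound to a digest cannot be silently transferred to a tampered image.

```shell
# Simulate "same tag, different content" with local files.
printf 'trusted build' > /tmp/app-v1.0
ORIGINAL_DIGEST=$(sha256sum /tmp/app-v1.0 | awk '{print $1}')

printf 'trusted build!' > /tmp/app-v1.0   # attacker overwrites "v1.0"
TAMPERED_DIGEST=$(sha256sum /tmp/app-v1.0 | awk '{print $1}')

[ "$ORIGINAL_DIGEST" != "$TAMPERED_DIGEST" ] && echo "digest changed"

# Pinning by digest in a manifest fixes the exact bits that were attested:
echo "image: us-central1-docker.pkg.dev/proj/repo/app@sha256:$ORIGINAL_DIGEST"
```

A tag pin would still pull the tampered content; the digest pin cannot.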
1. Your security team discovers that three different applications running on the same GKE node can all read from a sensitive Cloud Storage bucket, even though only one application actually requires this access. You are tasked with implementing Workload Identity to fix this. How will configuring Workload Identity fundamentally change the way these pods authenticate with Google Cloud APIs?
Without Workload Identity, every pod on a node shares the node VM’s GCP service account, allowing any pod to retrieve an access token for that shared identity from the default metadata server. Workload Identity Federation solves this by running a specialized GKE metadata server as a DaemonSet on each node, which intercepts metadata requests from pods. Instead of returning the node’s credentials, the GKE metadata server checks the pod’s specific Kubernetes ServiceAccount and exchanges it for a short-lived GCP access token scoped only to the mapped GCP service account. This ensures that the two applications not requiring bucket access receive permission denied errors, while the authorized application successfully authenticates.
2. You are preparing to roll out Binary Authorization across all production GKE clusters. The lead developer is concerned that enabling this feature might block emergency hotfixes if the automated attestation pipeline fails during an incident. How should you configure the rollout to address this concern while still gaining visibility into unsigned images?
You should configure the Binary Authorization policy to use “dry run” mode (DRYRUN_AUDIT_LOG_ONLY) instead of the default enforce mode (ENFORCED_BLOCK_AND_AUDIT_LOG). In dry run mode, Binary Authorization evaluates every pod creation request against the policy but does not actually block the pod from starting, even if it lacks the required attestations. Instead, it logs a detailed violation event to Cloud Audit Logs indicating that the pod would have been blocked. This allows you to deploy the policy and observe its impact over time, ensuring emergency hotfixes can still deploy while you identify and fix pipeline gaps before switching to full enforcement.
3. During a routine audit, an inspector notices that your deployment manifests use tags like `image: frontend:v2.1` instead of SHA-256 digests. They flag this as a critical violation of your Binary Authorization policy, even though the images are successfully passing the attestation checks. Why is deploying by tag considered a security risk when using Binary Authorization?
Container image tags are mutable, meaning a developer or an attacker can push a new image with the exact same tag, overwriting the original content in the registry. Binary Authorization attestations are cryptographically bound to the immutable SHA-256 digest of the image, not the mutable tag. If you deploy by tag, Kubernetes might pull a modified image, and the original attestation will no longer be valid for the new digest, which could lead to unexpected deployment failures or bypasses if caching is involved. Deploying by digest guarantees that the cluster runs the exact identical bits that were scanned, tested, and attested during your secure supply chain process.
4. Your architecture currently uses native Kubernetes Secrets to store third-party API keys. A security review mandates that all secrets must have a verifiable access audit trail and must not be stored in the cluster's etcd database. Why is migrating to the Google Cloud Secret Manager CSI driver the correct architectural choice to meet these requirements?
With native Kubernetes Secrets, the secret values are stored directly within the cluster’s etcd database, meaning anyone with administrative access to the control plane or broad RBAC permissions can view them without generating a granular access log. The Secret Manager CSI driver fundamentally changes this architecture by storing the secrets externally in Google Cloud Secret Manager and mounting them into pods as temporary, in-memory files at runtime. Because the secrets are fetched directly from the external service, they never pass through or rest in etcd. Furthermore, every retrieval of the secret by a workload generates a distinct entry in Cloud Audit Logs, satisfying the requirement for a verifiable access audit trail.
5. Your company is hosting a multi-tenant SaaS application on GKE and needs to protect the node's boot sequence from being compromised by persistent rootkits. You are deciding between Shielded Nodes and Confidential Nodes. Why would Shielded Nodes be sufficient for this specific requirement?
Shielded Nodes are specifically designed to provide verifiable integrity for the node’s boot process by utilizing Secure Boot, which ensures that only Google-signed boot components and kernel modules are loaded. They also leverage a Virtual Trusted Platform Module (vTPM) to create a measured boot chain, continuously monitoring for any tampering against a known-good baseline. While Confidential Nodes offer these same boot protections, their primary differentiating feature is the encryption of data in use (in memory) using specialized hardware. Since your specific requirement is focused solely on protecting the boot sequence from rootkits rather than encrypting active memory, Shielded Nodes fully address the threat model without incurring the performance or cost overhead of Confidential Nodes.
6. You have just enabled Workload Identity on a legacy GKE cluster that has been running in production for two years. Immediately after the update completes, several applications begin crashing because they are receiving "403 Permission Denied" errors when trying to read from Cloud Storage. What architectural change caused this outage, and how do you resolve it?
Enabling Workload Identity on an existing cluster changes the behavior of the metadata server interception for all pods on the affected nodes. Pods that previously defaulted to the node’s underlying Compute Engine service account now have their metadata requests intercepted by the Workload Identity DaemonSet, which requires a specific mapping to grant access. Because these legacy applications lacked a configured Kubernetes ServiceAccount annotated with a GCP service account mapping, the metadata server denied them access to GCP credentials entirely. To resolve the outage, you must create the necessary GCP service accounts, bind them to Kubernetes ServiceAccounts with the iam.workloadIdentityUser role, annotate the KSAs, and update the application Deployments to explicitly reference these new ServiceAccounts.
Hands-On Exercise: Workload Identity for Pub/Sub and Binary Authorization
Objective
Configure Workload Identity to securely access Pub/Sub from a pod, and set up Binary Authorization to block untrusted images.
Prerequisites
- `gcloud` CLI installed and authenticated
- A GCP project with billing enabled
- GKE, Pub/Sub, Binary Authorization, and KMS APIs enabled
Task 1: Create a GKE Cluster with Workload Identity
Solution
```bash
export PROJECT_ID=$(gcloud config get-value project)
export REGION=us-central1

# Enable required APIs
gcloud services enable \
  container.googleapis.com \
  pubsub.googleapis.com \
  binaryauthorization.googleapis.com \
  cloudkms.googleapis.com \
  secretmanager.googleapis.com \
  --project=$PROJECT_ID

# Create cluster with Workload Identity
gcloud container clusters create security-demo \
  --region=$REGION \
  --num-nodes=1 \
  --machine-type=e2-standard-2 \
  --release-channel=regular \
  --enable-ip-alias \
  --workload-pool=$PROJECT_ID.svc.id.goog \
  --enable-shielded-nodes

# Get credentials
gcloud container clusters get-credentials security-demo --region=$REGION
```

Task 2: Set Up Workload Identity for Pub/Sub Access
Solution
```bash
# Create a Pub/Sub topic and subscription
gcloud pubsub topics create demo-orders
gcloud pubsub subscriptions create demo-orders-sub \
  --topic=demo-orders

# Create a GCP service account for the publisher
gcloud iam service-accounts create pubsub-publisher \
  --display-name="Pub/Sub Publisher"

# Grant Pub/Sub publisher role
gcloud pubsub topics add-iam-policy-binding demo-orders \
  --member="serviceAccount:pubsub-publisher@$PROJECT_ID.iam.gserviceaccount.com" \
  --role="roles/pubsub.publisher"

# Create Kubernetes ServiceAccount
kubectl create serviceaccount pubsub-sa

# Bind KSA to GSA
gcloud iam service-accounts add-iam-policy-binding \
  pubsub-publisher@$PROJECT_ID.iam.gserviceaccount.com \
  --role="roles/iam.workloadIdentityUser" \
  --member="serviceAccount:$PROJECT_ID.svc.id.goog[default/pubsub-sa]"

# Annotate KSA
kubectl annotate serviceaccount pubsub-sa \
  iam.gke.io/gcp-service-account=pubsub-publisher@$PROJECT_ID.iam.gserviceaccount.com
```

Task 3: Deploy a Pod That Publishes to Pub/Sub
Solution
```bash
# Deploy a pod with Workload Identity
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: publisher
spec:
  serviceAccountName: pubsub-sa
  containers:
  - name: publisher
    image: google/cloud-sdk:slim
    command: ["sleep", "3600"]
    resources:
      requests:
        cpu: 100m
        memory: 256Mi
EOF

# Wait for pod to be ready
kubectl wait --for=condition=Ready pod/publisher --timeout=120s

# Verify Workload Identity is working
kubectl exec publisher -- gcloud auth list
# Should show pubsub-publisher@PROJECT_ID.iam.gserviceaccount.com

# Publish a message
kubectl exec publisher -- \
  gcloud pubsub topics publish demo-orders \
  --message='{"order_id": "12345", "item": "widget", "qty": 3}'

# Verify the message was published
gcloud pubsub subscriptions pull demo-orders-sub --auto-ack --limit=1

# Try to access a resource NOT granted (should fail)
kubectl exec publisher -- gsutil ls gs://
# Should fail with permission denied (403)
```

Task 4: Enable Binary Authorization in Dry Run Mode
Solution
```shell
# Enable Binary Authorization on the cluster
gcloud container clusters update security-demo \
  --region=$REGION \
  --binauthz-evaluation-mode=PROJECT_SINGLETON_POLICY_ENFORCE

# Export and examine the current policy
gcloud container binauthz policy export

# Create a policy that blocks everything except Google images (dry run)
cat <<EOF > /tmp/binauthz-policy.yaml
admissionWhitelistPatterns:
- namePattern: gcr.io/google-containers/*
- namePattern: gcr.io/google-samples/*
- namePattern: gke.gcr.io/*
- namePattern: gcr.io/gke-release/*
- namePattern: $REGION-docker.pkg.dev/$PROJECT_ID/*
- namePattern: registry.k8s.io/*
- namePattern: google/cloud-sdk*
defaultAdmissionRule:
  enforcementMode: DRYRUN_AUDIT_LOG_ONLY
  evaluationMode: ALWAYS_DENY
globalPolicyEvaluationMode: ENABLE
EOF

gcloud container binauthz policy import /tmp/binauthz-policy.yaml

echo "Binary Authorization is now in DRY RUN mode."
echo "Unsigned images will be LOGGED but not blocked."
```
Task 5: Test Binary Authorization Behavior
Solution
```shell
# Deploy an image from Docker Hub (would be blocked in enforce mode)
kubectl run nginx-test --image=nginx:1.27 --restart=Never \
  --overrides='{"spec":{"containers":[{"name":"nginx-test","image":"nginx:1.27","resources":{"requests":{"cpu":"100m","memory":"64Mi"}}}]}}'

# In dry run mode, this will succeed but generate an audit log
kubectl get pod nginx-test

# Check audit logs for Binary Authorization dry-run violations
sleep 30  # Give logs time to propagate
gcloud logging read \
  'resource.type="k8s_cluster" AND protoPayload.methodName="io.k8s.core.v1.pods.create" AND labels."binaryauthorization.googleapis.com/decision"="DENIED"' \
  --limit=5 \
  --format="table(timestamp, protoPayload.resourceName)"

# Clean up test pod
kubectl delete pod nginx-test
```
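For reference, once the dry-run review period is over, moving to enforcement is a one-field change to the policy from Task 4. A sketch of the same policy in enforced mode, reusing the Task 4 whitelist patterns:

```yaml
# Same policy as Task 4, with dry run replaced by enforcement.
admissionWhitelistPatterns:
- namePattern: gcr.io/google-containers/*
- namePattern: gcr.io/google-samples/*
- namePattern: gke.gcr.io/*
- namePattern: gcr.io/gke-release/*
- namePattern: $REGION-docker.pkg.dev/$PROJECT_ID/*
- namePattern: registry.k8s.io/*
- namePattern: google/cloud-sdk*
defaultAdmissionRule:
  enforcementMode: ENFORCED_BLOCK_AND_AUDIT_LOG  # was DRYRUN_AUDIT_LOG_ONLY
  evaluationMode: ALWAYS_DENY
globalPolicyEvaluationMode: ENABLE
```

With this policy imported, the `nginx-test` pod above would be rejected at admission rather than merely logged.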
```shell
echo "In a real deployment, you would:"
echo "1. Review dry-run logs for 1-2 weeks"
echo "2. Whitelist or attest all legitimate images"
echo "3. Switch to ENFORCED_BLOCK_AND_AUDIT_LOG mode"
```
Task 6: Enable Security Posture and Test Pod Security Standards
Solution
```shell
# Enable Security Posture on the cluster
gcloud container clusters update security-demo \
  --region=$REGION \
  --security-posture=standard \
  --workload-vulnerability-scanning=standard

# Create a namespace with restricted Pod Security Standards
kubectl create namespace secure-ns
kubectl label namespace secure-ns \
  pod-security.kubernetes.io/enforce=restricted \
  pod-security.kubernetes.io/warn=restricted
```
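The same Pod Security Standards labels can be applied declaratively, which is preferable when namespaces are managed through GitOps. A sketch of an equivalent namespace manifest:

```yaml
# Declarative equivalent of the create/label commands above.
apiVersion: v1
kind: Namespace
metadata:
  name: secure-ns
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/warn: restricted
```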
```shell
# Try to deploy a privileged pod (should be blocked)
kubectl apply -n secure-ns -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: bad-pod
spec:
  containers:
  - name: nginx
    image: nginx:1.27
    securityContext:
      privileged: true
EOF
# The API server will reject this pod creation immediately

# Deploy a compliant pod
kubectl apply -n secure-ns -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: good-pod
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: nginx
    image: nginx:1.27
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL
EOF
# This pod will be created successfully
```
Task 7: Mount External Secrets using Secret Manager CSI Driver
Solution
```shell
# Enable Secret Manager add-on
gcloud container clusters update security-demo \
  --region=$REGION \
  --enable-secret-manager

# Create a secret in GCP Secret Manager
echo -n "super-secret-api-key" | gcloud secrets create demo-api-key \
  --data-file=- \
  --replication-policy=automatic

# Grant the Workload Identity SA access to the secret
gcloud secrets add-iam-policy-binding demo-api-key \
  --member="serviceAccount:pubsub-publisher@$PROJECT_ID.iam.gserviceaccount.com" \
  --role="roles/secretmanager.secretAccessor"

# Create SecretProviderClass in the cluster
kubectl apply -f - <<EOF
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: gcp-secrets
spec:
  provider: gcp
  parameters:
    secrets: |
      - resourceName: "projects/$PROJECT_ID/secrets/demo-api-key/versions/latest"
        path: "api-key.txt"
EOF

# Deploy a pod that mounts the secret
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: secret-reader
spec:
  serviceAccountName: pubsub-sa
  containers:
  - name: reader
    image: google/cloud-sdk:slim
    command: ["sleep", "3600"]
    volumeMounts:
    - name: secrets-volume
      mountPath: /var/secrets
      readOnly: true
  volumes:
  - name: secrets-volume
    csi:
      driver: secrets-store.csi.k8s.io
      readOnly: true
      volumeAttributes:
        secretProviderClass: gcp-secrets
EOF

# Verify the secret is mounted correctly
kubectl wait --for=condition=Ready pod/secret-reader --timeout=120s
kubectl exec secret-reader -- cat /var/secrets/api-key.txt
```
Task 8: Clean Up
Solution
```shell
# Delete the cluster
gcloud container clusters delete security-demo \
  --region=$REGION --quiet

# Delete Pub/Sub resources
gcloud pubsub subscriptions delete demo-orders-sub --quiet
gcloud pubsub topics delete demo-orders --quiet

# Delete the secret
gcloud secrets delete demo-api-key --quiet

# Delete the GCP service account
gcloud iam service-accounts delete \
  pubsub-publisher@$PROJECT_ID.iam.gserviceaccount.com --quiet

# Reset Binary Authorization policy to allow all
cat <<EOF > /tmp/binauthz-default.yaml
defaultAdmissionRule:
  enforcementMode: ENFORCED_BLOCK_AND_AUDIT_LOG
  evaluationMode: ALWAYS_ALLOW
globalPolicyEvaluationMode: ENABLE
EOF
gcloud container binauthz policy import /tmp/binauthz-default.yaml

# Clean up temp files
rm -f /tmp/binauthz-policy.yaml /tmp/binauthz-default.yaml /tmp/note.json

echo "Cleanup complete."
```
Success Criteria
- Cluster created with Workload Identity enabled
- Pub/Sub topic and subscription created
- Pod successfully publishes to Pub/Sub using Workload Identity (no key files)
- Pod cannot access resources not granted to its service account
- Binary Authorization enabled in dry run mode
- Audit logs show denied images from untrusted sources
- Security Posture enabled and Pod Security Standards enforce restricted mode
- Secret Manager CSI driver mounts an external secret successfully
- All resources cleaned up
Next Module
Next up: Module 6.4: GKE Storage --- Master Persistent Disk CSI drivers, regional PD failover, Filestore for shared NFS, Cloud Storage FUSE for object storage access, and Backup for GKE to protect your stateful workloads.