Module 6.4: GKE Storage
Complexity: [MEDIUM] | Time to Complete: 2h | Prerequisites: Module 6.1 (GKE Architecture)
What You’ll Be Able to Do
After completing this module, you will be able to:
- Configure Persistent Disks (pd-standard, pd-ssd, pd-balanced) and Filestore CSI driver for GKE workloads
- Implement volume snapshots and backup schedules using Backup for GKE for stateful application protection
- Deploy regional persistent disks for cross-zone high availability of stateful workloads on GKE
- Evaluate GKE storage options (Persistent Disk, Filestore, Cloud Storage FUSE) for different access patterns
Why This Module Matters
In August 2023, an online gaming company running on GKE lost 6 hours of player progress data for 180,000 active users. Their PostgreSQL database was running on a single-zone Persistent Disk attached to a StatefulSet pod. When us-central1-a experienced a partial zone outage, the node hosting the database went offline. Because the PD was zonal, it could not be attached to a node in another zone. The StatefulSet controller created a replacement pod in us-central1-b, but it could not mount the volume---zonal PDs are locked to their zone. The database was down for 6 hours until the zone recovered. The company’s VP of Engineering later estimated the revenue loss at $420,000 and the player trust damage as “incalculable.” The fix was straightforward: switch to a Regional Persistent Disk, which synchronously replicates data to two zones and can failover in under a minute. It cost 16 cents more per GB per month.
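The economics of this story are worth a quick back-of-the-envelope check. A short calculation (the 500 GB disk size is an illustrative assumption; the incident report cited only the per-GB premium and the total loss):

```python
# Regional PD premium vs. the cost of one zonal outage.
# Assumption (illustrative): a 500 GB database disk.
disk_size_gb = 500
regional_premium_per_gb = 0.16   # extra $/GB/month for regional replication
outage_cost = 420_000            # estimated revenue loss from the 6-hour outage

monthly_premium = disk_size_gb * regional_premium_per_gb
months_paid_for = outage_cost / monthly_premium

print(f"Extra cost of regional replication: ${monthly_premium:.2f}/month")
print(f"One outage equals {months_paid_for:,.0f} months of that premium")
```

At these rates, a single outage of this size pays for centuries of regional replication on one disk.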
Storage in Kubernetes is where the “cattle, not pets” philosophy meets reality. Stateless pods can be replaced instantly, but pods with Persistent Volumes carry data that must survive restarts, rescheduling, and zone failures. GKE offers multiple storage options---Persistent Disks (block storage), Filestore (managed NFS), Cloud Storage FUSE (object storage as a filesystem), and Backup for GKE (disaster recovery). Choosing the right storage backend and configuring it for resilience is often the difference between a minor disruption and a catastrophic data loss event.
In this module, you will learn the full GKE storage landscape: how the PD CSI driver provisions and attaches disks, when to use regional PDs for high availability, how Filestore provides shared NFS access across pods, how Cloud Storage FUSE mounts GCS buckets as local filesystems, and how Backup for GKE protects your stateful workloads.
Persistent Disk CSI Driver
The Compute Engine Persistent Disk CSI driver is the primary block storage driver for GKE. It is installed by default on all GKE clusters and provisions Google Cloud Persistent Disks as Kubernetes PersistentVolumes.
Disk Types
| Disk Type | Identifier | Read IOPS | Read Throughput | Use Case | Cost (us-central1) |
|---|---|---|---|---|---|
| Standard | pd-standard | 0.75/GB | 0.12 MB/s per GB | Logs, cold data, backups | ~$0.040/GB/mo |
| Balanced | pd-balanced | 6/GB | 0.28 MB/s per GB | General purpose (default) | ~$0.100/GB/mo |
| SSD | pd-ssd | 30/GB | 0.48 MB/s per GB | Databases, latency-sensitive | ~$0.170/GB/mo |
| Extreme | pd-extreme | Configurable (up to 120K) | Configurable (up to 2.4 GB/s) | SAP HANA, Oracle, high-IOPS | ~$0.125/GB/mo + IOPS |
| Hyperdisk Balanced | hyperdisk-balanced | Configurable | Configurable | Next-gen general purpose | Variable |
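Baseline PD performance scales linearly with provisioned capacity, so disk sizing is also performance tuning. A sketch of the per-GB scaling from the table (real disks are additionally subject to per-disk and per-VM caps not modeled here):

```python
# Uncapped baseline read IOPS per GB for each PD type (from the table above).
READ_IOPS_PER_GB = {"pd-standard": 0.75, "pd-balanced": 6.0, "pd-ssd": 30.0}

def baseline_read_iops(disk_type: str, size_gb: int) -> float:
    """Baseline read IOPS for a disk, ignoring per-disk and per-VM caps."""
    return READ_IOPS_PER_GB[disk_type] * size_gb

# A small pd-ssd out-performs a much larger pd-standard on IOPS:
print(baseline_read_iops("pd-ssd", 100))       # 3000.0
print(baseline_read_iops("pd-standard", 1000)) # 750.0
```

This is why over-provisioning a small pd-standard disk rarely helps a hot database: switching disk type moves the per-GB multiplier far more than adding capacity does.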
StorageClasses
GKE provides default StorageClasses, but you should define your own for production workloads.
```shell
# List default StorageClasses
kubectl get storageclasses
# NAME          PROVISIONER            RECLAIMPOLICY  VOLUMEBINDINGMODE
# premium-rwo   pd.csi.storage.gke.io  Delete         WaitForFirstConsumer
# standard      pd.csi.storage.gke.io  Delete         Immediate
# standard-rwo  pd.csi.storage.gke.io  Delete         WaitForFirstConsumer
```

```yaml
# Custom StorageClass for production databases
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-regional
provisioner: pd.csi.storage.gke.io
parameters:
  type: pd-ssd
  replication-type: regional-pd          # Synchronous replication across 2 zones
volumeBindingMode: WaitForFirstConsumer  # Bind when pod is scheduled
reclaimPolicy: Retain                    # Do NOT delete the disk when PVC is deleted
allowVolumeExpansion: true               # Allow resizing without downtime
---
# StorageClass for dev/test (cheaper, zonal)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: dev-standard
provisioner: pd.csi.storage.gke.io
parameters:
  type: pd-balanced
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete
allowVolumeExpansion: true
```

Volume Binding Modes
This is a subtle but important setting:
| Mode | Behavior | When to Use |
|---|---|---|
| Immediate | PV is provisioned as soon as PVC is created | Pre-provisioning, when zone does not matter |
| WaitForFirstConsumer | PV is provisioned when a pod mounts it | Regional clusters (ensures disk is in the same zone as the pod) |
Stop and think: You just created a PVC using a StorageClass with `Immediate` binding in a regional cluster spanning three zones. The disk provisions instantly in zone A. What happens if the Kubernetes scheduler later decides the only node with enough CPU for your pod is in zone B?
War Story: A team used Immediate binding mode in a regional cluster. The PD was provisioned in us-central1-a, but the pod was scheduled to us-central1-c. The pod hung in Pending with the error “disk is in zone us-central1-a, which does not match the zone of node us-central1-c.” Always use WaitForFirstConsumer in regional clusters.
Regional Persistent Disks
Regional PDs synchronously replicate data to two zones within the same region. This is the critical feature for high-availability stateful workloads.
How Regional PDs Work
```
 Zonal PD:                         Regional PD:
 ┌─────────────────┐              ┌─────────────────┐
 │ us-central1-a   │              │ us-central1-a   │
 │   ┌─────────┐   │              │   ┌─────────┐   │
 │   │ PD-SSD  │   │              │   │ PD-SSD  │◄──┼──── Synchronous
 │   │ (data)  │   │              │   │ (copy 1)│   │     replication
 │   └─────────┘   │              │   └─────────┘   │
 │                 │              └─────────────────┘
 └─────────────────┘
 If zone fails:                   ┌─────────────────┐
 DATA IS                          │ us-central1-b   │
 INACCESSIBLE                     │   ┌─────────┐   │
                                  │   │ PD-SSD  │◄──── If zone-a fails:
                                  │   │ (copy 2)│   │  pod restarts in
                                  │   └─────────┘   │  zone-b, mounts
                                  └─────────────────┘  copy 2 (<60 sec)
```

Provisioning Regional PDs
```yaml
# StatefulSet with Regional PD
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
      - name: postgres
        image: postgres:16
        ports:
        - containerPort: 5432
        env:
        - name: POSTGRES_PASSWORD
          value: "change-me-in-production"
        - name: PGDATA
          value: /var/lib/postgresql/data/pgdata
        volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
        resources:
          requests:
            cpu: 500m
            memory: 1Gi
          limits:
            cpu: "2"
            memory: 4Gi
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: fast-regional   # Uses our regional PD StorageClass
      resources:
        requests:
          storage: 50Gi
```

Failover Behavior
When a zone fails and the pod is rescheduled to another zone:
1. GKE detects the node is unhealthy (~5 minutes by default)
2. The StatefulSet controller creates a replacement pod
3. The pod is scheduled to a healthy zone
4. The Regional PD is detached from the failed zone and attached in the new zone (~30-60 seconds)
5. The pod starts with the same data
```shell
# Force-detach a stuck PD from its instance (emergency use only)
gcloud compute instances detach-disk failed-node \
  --disk=my-disk \
  --zone=us-central1-a

# Monitor PV/PVC status during failover
kubectl get pv,pvc -o wide
kubectl describe pv <pv-name> | grep -A 5 "Status"
```

Volume Snapshots
The PD CSI driver supports Kubernetes VolumeSnapshots for point-in-time backups.
```yaml
# Create a VolumeSnapshotClass
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: pd-snapshot-class
driver: pd.csi.storage.gke.io
deletionPolicy: Retain
parameters:
  storage-locations: us-central1
---
# Take a snapshot of the PostgreSQL PVC
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: postgres-snapshot-20240315
spec:
  volumeSnapshotClassName: pd-snapshot-class
  source:
    persistentVolumeClaimName: data-postgres-0
```

```shell
# Verify the snapshot
kubectl get volumesnapshot
kubectl describe volumesnapshot postgres-snapshot-20240315

# Restore from snapshot (create a new PVC from the snapshot)
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-restored
spec:
  storageClassName: fast-regional
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi
  dataSource:
    name: postgres-snapshot-20240315
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
EOF
```

Filestore CSI Driver (Managed NFS)
Filestore provides managed NFS file shares that can be mounted by multiple pods simultaneously with ReadWriteMany access. This is essential for workloads that need shared filesystem access.
When to Use Filestore
| Use Case | Why Filestore | Alternative |
|---|---|---|
| CMS shared uploads | Multiple pods write to the same directory | N/A (PD is ReadWriteOnce) |
| ML training data | Large dataset shared across training pods | Cloud Storage FUSE (cheaper) |
| Legacy apps requiring NFS | Application expects a POSIX filesystem | Refactor to use object storage |
| Build artifacts | CI/CD pods share build cache | Cloud Storage (if latency is okay) |
Filestore Tiers
| Tier | Min Capacity | IOPS | Throughput | Use Case |
|---|---|---|---|---|
| Basic HDD | 1 TiB | 600 (read) | 100 MB/s (read) | Cold data, infrequent access |
| Basic SSD | 2.5 TiB | 60K (read) | 1.2 GB/s (read) | General purpose shared storage |
| Zonal | 1 TiB | Up to 170K | Up to 3.6 GB/s | High-performance, single zone |
| Enterprise | 1 TiB | Up to 120K | Up to 2.4 GB/s | HA across zones, SLA-backed |
Setting Up Filestore CSI
```shell
# Enable the Filestore CSI driver on the cluster
gcloud container clusters update my-cluster \
  --region=us-central1 \
  --update-addons=GcpFilestoreCsiDriver=ENABLED

# Verify the driver is installed
kubectl get csidriver filestore.csi.storage.gke.io
```

```yaml
# StorageClass for Filestore
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: filestore-shared
provisioner: filestore.csi.storage.gke.io
parameters:
  tier: basic-ssd
  network: default
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Retain
allowVolumeExpansion: true
---
# PVC requesting shared storage
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data
spec:
  accessModes:
  - ReadWriteMany            # Multiple pods can write simultaneously
  storageClassName: filestore-shared
  resources:
    requests:
      storage: 2560Gi        # Minimum 2.5 TiB for basic-ssd
---
# Two Deployments sharing the same Filestore volume
apiVersion: apps/v1
kind: Deployment
metadata:
  name: writer
spec:
  replicas: 2
  selector:
    matchLabels:
      app: writer
  template:
    metadata:
      labels:
        app: writer
    spec:
      containers:
      - name: writer
        image: busybox
        command: ["sh", "-c", "while true; do echo $(hostname) $(date) >> /shared/log.txt; sleep 5; done"]
        volumeMounts:
        - name: shared
          mountPath: /shared
        resources:
          requests:
            cpu: 50m
            memory: 32Mi
      volumes:
      - name: shared
        persistentVolumeClaim:
          claimName: shared-data
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: reader
spec:
  replicas: 3
  selector:
    matchLabels:
      app: reader
  template:
    metadata:
      labels:
        app: reader
    spec:
      containers:
      - name: reader
        image: busybox
        command: ["sh", "-c", "while true; do tail -5 /shared/log.txt; sleep 10; done"]
        volumeMounts:
        - name: shared
          mountPath: /shared
          readOnly: true
        resources:
          requests:
            cpu: 50m
            memory: 32Mi
      volumes:
      - name: shared
        persistentVolumeClaim:
          claimName: shared-data
```

Cloud Storage FUSE CSI Driver
Cloud Storage FUSE mounts GCS buckets as local filesystems inside pods. This gives pods access to petabytes of object storage through standard file system operations.
How It Works
```
 ┌──────────────────────────────────────────────┐
 │ Pod                                          │
 │  ┌──────────────────────────────────────┐    │
 │  │ Application                          │    │
 │  │   open("/data/model.bin", "r")       │    │
 │  └──────────────┬───────────────────────┘    │
 │                 │ POSIX file operations      │
 │  ┌──────────────▼───────────────────────┐    │
 │  │ FUSE sidecar (gcsfuse)               │    │
 │  │ Translates file ops → GCS API calls  │    │
 │  └──────────────┬───────────────────────┘    │
 └─────────────────┼────────────────────────────┘
                   │ GCS JSON API
 ┌─────────────────▼──────────────────────────┐
 │ Cloud Storage Bucket                       │
 │   gs://my-ml-datasets/model.bin            │
 └────────────────────────────────────────────┘
```

Enabling Cloud Storage FUSE

```shell
# Enable the Cloud Storage FUSE CSI driver
gcloud container clusters update my-cluster \
  --region=us-central1 \
  --update-addons=GcsFuseCsiDriver=ENABLED

# Verify
kubectl get csidriver gcsfuse.csi.storage.gke.io
```

Using Cloud Storage FUSE in Pods
```yaml
# PersistentVolume pointing to a GCS bucket
apiVersion: v1
kind: PersistentVolume
metadata:
  name: gcs-pv
spec:
  accessModes:
  - ReadWriteMany
  capacity:
    storage: 5Ti               # Informational only (GCS is unlimited)
  storageClassName: ""
  mountOptions:
  - implicit-dirs              # Show directories from object prefixes
  - uid=1000                   # Map files to application user
  - gid=1000
  csi:
    driver: gcsfuse.csi.storage.gke.io
    volumeHandle: my-ml-datasets   # GCS bucket name
    readOnly: false
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: gcs-pvc
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 5Ti
  storageClassName: ""
  volumeName: gcs-pv
---
apiVersion: v1
kind: Pod
metadata:
  name: ml-training
  annotations:
    gke-gcsfuse/volumes: "true"    # Required annotation to inject sidecar
spec:
  serviceAccountName: ml-sa        # Must have Workload Identity with GCS access
  containers:
  - name: trainer
    image: us-central1-docker.pkg.dev/my-project/ml/trainer:v3
    volumeMounts:
    - name: datasets
      mountPath: /data
    resources:
      requests:
        cpu: "4"
        memory: 16Gi
  volumes:
  - name: datasets
    persistentVolumeClaim:
      claimName: gcs-pvc
```

Cloud Storage FUSE Limitations
| Limitation | Impact | Workaround |
|---|---|---|
| Not POSIX-compliant | No atomic renames, no file locking | Use for read-heavy workloads, not databases |
| Higher latency than PD | Each file op is a GCS API call | Enable file caching for repeated reads |
| Eventual consistency for listings | New files may not appear immediately in ls | Use --stat-cache-ttl=0 for real-time needs |
| No append support | Cannot append to existing files | Write new files instead of appending |
```yaml
# Enable file caching for better read performance
# Add to the pod annotations or PV mount options
metadata:
  annotations:
    gke-gcsfuse/volumes: "true"
    gke-gcsfuse/cpu-limit: "500m"
    gke-gcsfuse/memory-limit: "512Mi"
    gke-gcsfuse/ephemeral-storage-limit: "10Gi"   # Cache size
```

Backup for GKE
Backup for GKE provides managed backup and restore for entire GKE workloads---including both the Kubernetes configuration (Deployments, Services, ConfigMaps) and the persistent volume data.
Architecture
```
 ┌──────────────────────────────────────────────────┐
 │ Backup for GKE                                   │
 │                                                  │
 │  ┌────────────────┐        ┌────────────────┐    │
 │  │ Backup Plan    │        │ Backup Plan    │    │
 │  │ (daily, 30d    │        │ (weekly, 90d   │    │
 │  │  retention)    │        │  retention)    │    │
 │  └───────┬────────┘        └───────┬────────┘    │
 │          │                         │             │
 │          ▼                         ▼             │
 │  ┌──────────────────────────────────────┐        │
 │  │ Backups                              │        │
 │  │  backup-2024-03-15-0200 (config+data)│        │
 │  │  backup-2024-03-14-0200 (config+data)│        │
 │  │  backup-2024-03-13-0200 (config+data)│        │
 │  └──────────────────────────────────────┘        │
 │                                                  │
 │  Restores:                                       │
 │   - Same cluster (in-place rollback)             │
 │   - Different cluster (migration/DR)             │
 │   - Selective (specific namespaces only)         │
 └──────────────────────────────────────────────────┘
```

Setting Up Backup for GKE
```shell
# Enable the Backup for GKE API
gcloud services enable gkebackup.googleapis.com

# Enable the backup agent on the cluster
gcloud container clusters update my-cluster \
  --region=us-central1 \
  --update-addons=BackupRestore=ENABLED

# Create a backup plan (daily backups, 30-day retention)
gcloud beta container backup-restore backup-plans create daily-backup \
  --project=$PROJECT_ID \
  --location=$REGION \
  --cluster=projects/$PROJECT_ID/locations/$REGION/clusters/my-cluster \
  --all-namespaces \
  --include-volume-data \
  --include-secrets \
  --backup-retain-days=30 \
  --backup-delete-lock-days=7 \
  --cron-schedule="0 2 * * *" \
  --paused=false
```

Creating Manual Backups
```shell
# Create an on-demand backup (before a risky deployment)
gcloud beta container backup-restore backups create pre-deploy-backup \
  --project=$PROJECT_ID \
  --location=$REGION \
  --backup-plan=daily-backup \
  --wait-for-completion

# List backups
gcloud beta container backup-restore backups list \
  --project=$PROJECT_ID \
  --location=$REGION \
  --backup-plan=daily-backup \
  --format="table(name, state, completeTime, resourceCount, volumeCount)"
```

Restoring from Backup
```shell
# Create a restore plan (defines how backups are restored)
gcloud beta container backup-restore restore-plans create full-restore \
  --project=$PROJECT_ID \
  --location=$REGION \
  --cluster=projects/$PROJECT_ID/locations/$REGION/clusters/my-cluster \
  --backup-plan=projects/$PROJECT_ID/locations/$REGION/backupPlans/daily-backup \
  --all-namespaces \
  --volume-data-restore-policy=RESTORE_VOLUME_DATA_FROM_BACKUP \
  --cluster-resource-conflict-policy=USE_BACKUP_VERSION \
  --namespaced-resource-restore-mode=MERGE_SKIP_ON_CONFLICT

# Execute a restore
gcloud beta container backup-restore restores create restore-20240315 \
  --project=$PROJECT_ID \
  --location=$REGION \
  --restore-plan=full-restore \
  --backup=projects/$PROJECT_ID/locations/$REGION/backupPlans/daily-backup/backups/pre-deploy-backup \
  --wait-for-completion
```

Selective Namespace Restore
```shell
# Restore only the "payments" namespace from a backup
gcloud beta container backup-restore restore-plans create payments-restore \
  --project=$PROJECT_ID \
  --location=$REGION \
  --cluster=projects/$PROJECT_ID/locations/$REGION/clusters/my-cluster \
  --backup-plan=projects/$PROJECT_ID/locations/$REGION/backupPlans/daily-backup \
  --selected-namespaces=payments \
  --volume-data-restore-policy=RESTORE_VOLUME_DATA_FROM_BACKUP \
  --namespaced-resource-restore-mode=DELETE_AND_RESTORE
```

Storage Decision Matrix
Choosing the right storage for your workload:
```
Need block storage for a single pod?
│
├── YES → Is HA required?
│         ├── YES → Regional PD (pd-ssd or pd-balanced)
│         └── NO  → Zonal PD (cheaper, dev/test)
│
Need shared filesystem across pods?
│
├── YES → How much data?
│         ├── < 10 TiB, need POSIX → Filestore
│         └── > 10 TiB, read-heavy → Cloud Storage FUSE
│
Need object storage access from pods?
│
└── YES → Cloud Storage FUSE (or use GCS client libraries directly)
```

Pause and predict: You are deploying a highly available legacy CMS. The application requires three replicas of the web tier to share a single directory for user-uploaded media (images, PDFs), which currently totals around 2 TiB. Based on the decision matrix below, which GKE storage solution should you choose and why?
| Factor | PD (Block) | Filestore (NFS) | Cloud Storage FUSE |
|---|---|---|---|
| Access mode | ReadWriteOnce | ReadWriteMany | ReadWriteMany |
| Latency | Sub-ms | Low ms | 10-50ms per operation |
| Max size | 64 TiB | 100 TiB | Unlimited |
| POSIX compliant | Yes (ext4/xfs) | Yes | No (partial) |
| Best for | Databases, stateful apps | Shared data, CMS | ML datasets, logs, archives |
| Min size | 10 GiB | 1 TiB (HDD), 2.5 TiB (SSD) | N/A (bucket) |
| Cost | $0.04-0.17/GB/mo | $0.20-0.36/GB/mo | $0.020-0.026/GB/mo |
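As a rough summary, the decision logic above can be encoded as a small helper (a deliberate simplification of the tree and matrix; real decisions also weigh latency, POSIX needs, and cost):

```python
# A simplified encoding of the storage decision tree above.
# Thresholds mirror the tree; treat this as a first-pass filter only.
def choose_storage(shared: bool, ha: bool = False,
                   size_tib: float = 0, read_heavy: bool = False) -> str:
    if not shared:
        # Single-pod block storage: the only question is HA.
        return "Regional PD" if ha else "Zonal PD"
    if size_tib < 10 and not read_heavy:
        return "Filestore"            # shared POSIX filesystem
    return "Cloud Storage FUSE"       # large, read-heavy shared data

# The CMS scenario: 3 replicas sharing ~2 TiB of uploads
print(choose_storage(shared=True, size_tib=2))                    # Filestore
print(choose_storage(shared=False, ha=True))                      # Regional PD
print(choose_storage(shared=True, size_tib=50, read_heavy=True))  # Cloud Storage FUSE
```

For the CMS question, the helper agrees with the matrix: shared ReadWriteMany access with POSIX semantics at ~2 TiB points to Filestore.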
Did You Know?
- Regional Persistent Disks perform synchronous replication across exactly two zones in the same region. Every write to the primary copy must be acknowledged by the secondary copy before the write returns to the application. This adds approximately 1-2ms of write latency compared to a zonal PD, but guarantees zero data loss (RPO=0) during a zone failover. The two zones are chosen automatically by GKE based on the cluster’s node topology and cannot be manually selected.
- Cloud Storage FUSE was originally developed inside Google for Borg workloads that needed to read training data from Colossus (Google’s internal distributed storage system). The open-source version was released in 2015 and the GKE CSI driver followed in 2023. Internally, Google ML training jobs read petabytes of data per day through FUSE-like interfaces. The GKE CSI driver injects a sidecar container that runs the gcsfuse process, which is why pods need the `gke-gcsfuse/volumes: "true"` annotation.
- Backup for GKE does not just snapshot disks---it captures the full Kubernetes state. A backup includes all Kubernetes resource configurations (Deployments, Services, ConfigMaps, Secrets, CRDs, custom resources), PersistentVolume data (via disk snapshots), and namespace metadata. This means you can restore an entire application stack---not just the data---to a different cluster in a different region. This is what distinguishes it from simply taking PD snapshots manually.
- You can expand a Persistent Volume online without stopping the pod. The PD CSI driver supports volume expansion when the StorageClass has `allowVolumeExpansion: true`. You simply edit the PVC to request a larger size, and the driver resizes the underlying disk and expands the filesystem---all while the pod continues running. However, you can only increase size, never decrease. Shrinking a PV requires creating a new smaller PV, copying data, and switching over.
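For example, expanding the 50Gi claim from the earlier StatefulSet is just an edit to the PVC spec (a sketch; the claim name follows that example's volumeClaimTemplate naming):

```yaml
# Edited PVC spec for data-postgres-0: only the storage request changes.
# The CSI driver resizes the disk and filesystem online; shrinking back
# to 50Gi later would be rejected.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-postgres-0
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: fast-regional
  resources:
    requests:
      storage: 100Gi   # was 50Gi; increases are applied in place
```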
Common Mistakes
| Mistake | Why It Happens | How to Fix It |
|---|---|---|
| Using zonal PD for production databases | Default StorageClass creates zonal disks | Create a StorageClass with replication-type: regional-pd |
| Using Immediate volume binding in regional clusters | Copied from single-zone examples | Always use WaitForFirstConsumer to match disk zone with pod zone |
| Setting reclaim policy to Delete on production PVs | Default StorageClass behavior | Use Retain for production; manually delete PVs after confirming data is safe |
| Not planning IP ranges for pod CIDR (storage-related) | Forgetting that Filestore needs VPC access | Ensure the Filestore network matches the GKE cluster’s VPC |
| Choosing Filestore for object storage workloads | Assuming NFS is always better | Use Cloud Storage FUSE for read-heavy, large-scale data; it is 10x cheaper per GB |
| Skipping backup configuration for stateful workloads | “We have replication, we are fine” | Replication protects against hardware failure; backups protect against human error and data corruption |
| Not testing restore procedures | Creating backups but never testing restores | Schedule quarterly restore drills to a test cluster; an untested backup is not a backup |
| Using Cloud Storage FUSE for database storage | Seeing “ReadWriteMany” and assuming POSIX compliance | Cloud Storage FUSE lacks atomic renames and file locking; never use it for databases |
1. Your e-commerce database runs on a single GKE node with a standard zonal Persistent Disk attached. During a major sales event, the Google Cloud zone hosting that node experiences a complete power failure. The cluster has nodes in other healthy zones. What happens to your database, and how would configuring a Regional Persistent Disk have changed this outcome?
With a zonal Persistent Disk, your database goes completely offline and cannot be recovered until the specific zone is restored by Google Cloud, because the disk physically resides only in that failed zone. The Kubernetes scheduler might create a replacement pod in a healthy zone, but it will remain Pending because it cannot mount the zonal disk. If you had configured a Regional Persistent Disk, the data would have been synchronously replicated to a second zone in real-time. The scheduler would spin up the replacement pod in that second healthy zone, attach the replica disk within 60 seconds, and your database would resume operations with zero data loss (RPO=0).
2. You deploy a new application to a regional GKE cluster using a StorageClass with `Immediate` volume binding. The PersistentVolumeClaim binds successfully, but the pod remains in a `Pending` state indefinitely, with an error citing a zone mismatch. Why did this happen, and how does changing the binding mode resolve the underlying issue?
This happens because Immediate binding forces the Persistent Disk CSI driver to provision the storage instantly, picking a zone for the disk before the Kubernetes scheduler has decided where the pod will run. If the scheduler later places the pod on a node in a different zone than the newly created disk, the pod cannot mount it. By changing the StorageClass to use WaitForFirstConsumer, you instruct the CSI driver to delay volume creation until the pod is actually scheduled. This ensures the scheduler picks the optimal node first, and the disk is subsequently provisioned in the exact same zone, guaranteeing they are co-located.
3. A machine learning team needs to mount a 50 TB dataset of training images into 20 concurrent training pods. The data is read-only, and cost is a major concern. The DevOps team initially suggests Filestore Enterprise, but you propose Cloud Storage FUSE instead. Why is Cloud Storage FUSE the better architectural choice for this specific workload?
Cloud Storage FUSE is the better choice because the workload involves large-scale, read-heavy data access where cost is the primary constraint and full POSIX compliance (like file locking or atomic renames) is not required. Filestore Enterprise would cost significantly more (around $0.20-$0.36/GB/month) and is designed for low-latency, complex file operations that ML training typically doesn’t need. Cloud Storage FUSE leverages standard Google Cloud Storage buckets, dropping the cost to roughly $0.020/GB/month while easily scaling to 50 TB and supporting simultaneous ReadWriteMany access across all 20 pods. You can also enable FUSE file caching to mitigate the higher per-operation latency associated with object storage.
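The cost gap in this answer is worth making concrete (rates are the approximate per-GB figures from the comparison table; actual pricing varies by region, tier, and storage class):

```python
# Rough monthly storage cost for a 50 TB read-only ML dataset,
# using approximate per-GB rates from the comparison table.
size_gb = 50 * 1024          # 50 TiB expressed in GiB
filestore_rate = 0.30        # ~$0.20-0.36/GB/mo, midpoint estimate
gcs_rate = 0.02              # ~$0.020/GB/mo

filestore_cost = size_gb * filestore_rate
gcs_cost = size_gb * gcs_rate

print(f"Filestore Enterprise: ~${filestore_cost:,.0f}/month")
print(f"Cloud Storage:        ~${gcs_cost:,.0f}/month")
```

Even with generous caching overhead added to the Cloud Storage side, the object-storage option is roughly an order of magnitude cheaper at this scale.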
4. A junior engineer accidentally deletes an entire production namespace, including the StatefulSet, ConfigMaps, Secrets, and the associated PersistentVolumeClaims. You have daily PD snapshots enabled on the underlying disks. Why are these PD snapshots alone insufficient for a rapid recovery, and how would Backup for GKE have prevented a prolonged outage?
Persistent Disk snapshots only capture the raw block data residing on the physical disk; they do not back up any Kubernetes state or configuration. To recover using only PD snapshots, you would have to manually recreate the deleted namespace, reconstruct the ConfigMaps and Secrets, redefine the StatefulSet, and manually orchestrate creating new PVCs from the snapshots. Backup for GKE solves this by capturing both the Kubernetes resource configurations (the “state”) and the underlying volume data in a unified snapshot. In this disaster scenario, Backup for GKE would allow you to execute a single restore command to recreate the namespace, all its resources, and the populated volumes automatically, drastically reducing your Recovery Time Objective (RTO).
5. To handle a temporary spike in log generation, you edit a PersistentVolumeClaim to increase its storage request from 100Gi to 500Gi. A week later, log volume returns to normal, and you want to reduce the PVC back to 100Gi to save costs. Describe the exact process you must follow to achieve this size reduction.
You cannot simply edit the existing PersistentVolumeClaim to reduce its size, because Google Cloud Persistent Disks do not support shrinking and volume expansion is strictly a one-way operation. To achieve the size reduction, you must manually create a brand new PVC requesting the desired 100Gi size. You then need to deploy a temporary pod that mounts both the old 500Gi volume and the new 100Gi volume to copy the data across using tools like rsync. Finally, you must update your application’s deployment manifests to reference the new PVC, restart the application, and delete the original 500Gi PVC.
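A sketch of the temporary copy pod described above (all names are illustrative; `old-logs-500gi` and `new-logs-100gi` stand in for the hypothetical old and new claims):

```yaml
# Hypothetical one-shot pod that mounts both PVCs and copies data across.
# old-logs-500gi / new-logs-100gi are illustrative claim names.
apiVersion: v1
kind: Pod
metadata:
  name: pvc-shrink-copy
spec:
  restartPolicy: Never
  containers:
  - name: copier
    image: alpine:3.19
    # Install rsync, then copy everything preserving permissions/timestamps
    command: ["sh", "-c", "apk add --no-cache rsync && rsync -a /old/ /new/"]
    volumeMounts:
    - name: old
      mountPath: /old
      readOnly: true
    - name: new
      mountPath: /new
  volumes:
  - name: old
    persistentVolumeClaim:
      claimName: old-logs-500gi
  - name: new
    persistentVolumeClaim:
      claimName: new-logs-100gi
```

Once the pod completes, repoint the application's manifests at the new claim and delete the old PVC.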
6. You are tasked with providing shared filesystem storage for a small internal application that only generates about 50 GB of data. You decide to create a Basic SSD Filestore instance, but the provisioning command fails. Why does Filestore reject this configuration, and what is a more appropriate storage alternative for this workload?
Filestore rejects the configuration because it enforces hard minimum capacity requirements to accommodate its dedicated underlying infrastructure; a Basic SSD tier requires an absolute minimum of 2.5 TiB. Attempting to provision only 50 GB violates this boundary, and provisioning the full 2.5 TiB would be a massive waste of resources and budget for such a small dataset. A more appropriate alternative would be to deploy a lightweight, in-cluster NFS server backed by a single 50 GB regional Persistent Disk, or to rewrite the application to use Cloud Storage FUSE if it simply needs object storage without strict POSIX filesystem requirements.
Hands-On Exercise: Regional PD Failover and Backup for GKE
Objective
Deploy a stateful application with Regional PDs, simulate a zone failure to observe failover behavior, and use Backup for GKE to back up and restore the application.
Prerequisites
- `gcloud` CLI installed and authenticated
- A GCP project with billing enabled
- GKE and Backup for GKE APIs enabled
Task 1: Create a GKE Cluster and Regional StorageClass
Solution
```shell
export PROJECT_ID=$(gcloud config get-value project)
export REGION=us-central1

# Enable APIs
gcloud services enable container.googleapis.com gkebackup.googleapis.com

# Create a regional cluster
gcloud container clusters create storage-demo \
  --region=$REGION \
  --num-nodes=1 \
  --machine-type=e2-standard-2 \
  --release-channel=regular \
  --enable-ip-alias \
  --workload-pool=$PROJECT_ID.svc.id.goog \
  --addons=BackupRestore

# Get credentials
gcloud container clusters get-credentials storage-demo --region=$REGION

# Create the Regional PD StorageClass
kubectl apply -f - <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: regional-ssd
provisioner: pd.csi.storage.gke.io
parameters:
  type: pd-ssd
  replication-type: regional-pd
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Retain
allowVolumeExpansion: true
EOF

kubectl get storageclasses
```

Task 2: Deploy a Stateful Application with Regional PD
Solution
kubectl apply -f - <<'EOF'apiVersion: v1kind: Servicemetadata: name: counter-dbspec: clusterIP: None selector: app: counter-db ports: - port: 5432
---apiVersion: apps/v1kind: StatefulSetmetadata: name: counter-dbspec: serviceName: counter-db replicas: 1 selector: matchLabels: app: counter-db template: metadata: labels: app: counter-db spec: terminationGracePeriodSeconds: 10 containers: - name: postgres image: postgres:16 ports: - containerPort: 5432 env: - name: POSTGRES_DB value: counter - name: POSTGRES_USER value: app - name: POSTGRES_PASSWORD value: demo-password-change-me - name: PGDATA value: /var/lib/postgresql/data/pgdata volumeMounts: - name: data mountPath: /var/lib/postgresql/data resources: requests: cpu: 250m memory: 512Mi volumeClaimTemplates: - metadata: name: data spec: accessModes: ["ReadWriteOnce"] storageClassName: regional-ssd resources: requests: storage: 10GiEOF
# Wait for the StatefulSet to be ready
kubectl rollout status statefulset/counter-db --timeout=180s
# Insert test data
kubectl exec counter-db-0 -- psql -U app -d counter -c \
  "CREATE TABLE visits (id SERIAL PRIMARY KEY, ts TIMESTAMP DEFAULT NOW());"
kubectl exec counter-db-0 -- psql -U app -d counter -c \
  "INSERT INTO visits DEFAULT VALUES; INSERT INTO visits DEFAULT VALUES; INSERT INTO visits DEFAULT VALUES;"
kubectl exec counter-db-0 -- psql -U app -d counter -c \
  "SELECT count(*) FROM visits;"
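The underlying Compute Engine disk can also be inspected directly to confirm it is regional. A hedged sketch, assuming the CSI driver named the disk after the PV (its default behavior):

```shell
# Look up the PV backing the PVC, then describe the disk at the regional scope.
# replicaZones should list the two zones holding synchronous replicas.
PV_NAME=$(kubectl get pvc data-counter-db-0 -o jsonpath='{.spec.volumeName}')
gcloud compute disks describe "$PV_NAME" \
  --region="$REGION" \
  --format="value(replicaZones)"
```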
# Verify the PV is a regional PD
PV_NAME=$(kubectl get pvc data-counter-db-0 -o jsonpath='{.spec.volumeName}')
kubectl get pv $PV_NAME -o yaml | grep -A 5 "nodeAffinity"

Task 3: Simulate Zone Failure and Observe Failover
Solution
# Find which node and zone the pod is running on
NODE=$(kubectl get pod counter-db-0 -o jsonpath='{.spec.nodeName}')
ZONE=$(kubectl get node $NODE -o jsonpath='{.metadata.labels.topology\.kubernetes\.io/zone}')
echo "Pod is on node: $NODE in zone: $ZONE"
# Cordon the node and delete the pod to simulate a zone failure
kubectl cordon $NODE
kubectl delete pod counter-db-0 --grace-period=10
# Watch the pod reschedule to another zone
echo "Watching pod reschedule..."
kubectl get pods -w -l app=counter-db &
WATCH_PID=$!
sleep 60
kill $WATCH_PID 2>/dev/null
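The backgrounded watch above is for human eyes; in a script, `kubectl wait` is a more robust way to block until the replacement pod is healthy:

```shell
# Exits 0 once the pod reports Ready, non-zero on timeout -- easy to script against.
kubectl wait pod/counter-db-0 --for=condition=Ready --timeout=180s
```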
# Verify the pod restarted in a different zone
NEW_NODE=$(kubectl get pod counter-db-0 -o jsonpath='{.spec.nodeName}')
NEW_ZONE=$(kubectl get node $NEW_NODE -o jsonpath='{.metadata.labels.topology\.kubernetes\.io/zone}')
echo "Pod is now on node: $NEW_NODE in zone: $NEW_ZONE"
# Verify data survived the failover
kubectl exec counter-db-0 -- psql -U app -d counter -c \
  "SELECT count(*) FROM visits;"
# Should still show 3 rows
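For an unattended check, the count can be captured and compared rather than eyeballed. A small sketch using psql's tuples-only (-t) and unaligned (-A) flags to get a bare number:

```shell
# Capture the row count without headers or padding, then assert on it.
COUNT=$(kubectl exec counter-db-0 -- psql -U app -d counter -tA -c \
  "SELECT count(*) FROM visits;")
if [ "$COUNT" -eq 3 ]; then
  echo "OK: data survived failover ($COUNT rows)"
else
  echo "FAIL: expected 3 rows, found $COUNT" >&2
fi
```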
# Uncordon the original node
kubectl uncordon $NODE

Task 4: Set Up Backup for GKE
Solution
# Create a backup plan
gcloud beta container backup-restore backup-plans create storage-demo-backup \
  --project=$PROJECT_ID \
  --location=$REGION \
  --cluster=projects/$PROJECT_ID/locations/$REGION/clusters/storage-demo \
  --all-namespaces \
  --include-volume-data \
  --include-secrets \
  --backup-retain-days=7
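The plan above only permits on-demand backups. Backup for GKE plans also support a cron schedule; a hedged sketch (confirm the exact flag with `gcloud beta container backup-restore backup-plans update --help`):

```shell
# Add a daily 03:00 schedule to the existing plan.
gcloud beta container backup-restore backup-plans update storage-demo-backup \
  --project=$PROJECT_ID \
  --location=$REGION \
  --cron-schedule="0 3 * * *"
```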
# Create a manual backup
gcloud beta container backup-restore backups create manual-backup-1 \
  --project=$PROJECT_ID \
  --location=$REGION \
  --backup-plan=storage-demo-backup \
  --wait-for-completion
# Verify the backup
gcloud beta container backup-restore backups describe manual-backup-1 \
  --project=$PROJECT_ID \
  --location=$REGION \
  --backup-plan=storage-demo-backup \
  --format="yaml(state, resourceCount, volumeCount, sizeBytes)"

Task 5: Simulate Data Loss and Restore from Backup
Solution
# Simulate accidental data deletion
kubectl exec counter-db-0 -- psql -U app -d counter -c \
  "DROP TABLE visits;"
kubectl exec counter-db-0 -- psql -U app -d counter -c \
  "SELECT count(*) FROM visits;" 2>&1 || echo "Table is gone!"
# Delete the StatefulSet and PVC to simulate total loss
kubectl delete statefulset counter-db
kubectl delete pvc data-counter-db-0
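Because the StorageClass was created with `reclaimPolicy: Retain`, deleting the PVC releases the PV but does not delete the underlying disk. An optional check before restoring:

```shell
# The old PV should now show STATUS "Released" rather than being deleted;
# the regional disk it points to still exists in Compute Engine.
kubectl get pv
```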
# Create a restore plan
gcloud beta container backup-restore restore-plans create full-restore \
  --project=$PROJECT_ID \
  --location=$REGION \
  --cluster=projects/$PROJECT_ID/locations/$REGION/clusters/storage-demo \
  --backup-plan=projects/$PROJECT_ID/locations/$REGION/backupPlans/storage-demo-backup \
  --all-namespaces \
  --volume-data-restore-policy=RESTORE_VOLUME_DATA_FROM_BACKUP \
  --namespaced-resource-restore-mode=DELETE_AND_RESTORE \
  --cluster-resource-conflict-policy=USE_BACKUP_VERSION
# Execute the restore
gcloud beta container backup-restore restores create restore-1 \
  --project=$PROJECT_ID \
  --location=$REGION \
  --restore-plan=full-restore \
  --backup=projects/$PROJECT_ID/locations/$REGION/backupPlans/storage-demo-backup/backups/manual-backup-1 \
  --wait-for-completion
# Wait for the StatefulSet to come back
kubectl rollout status statefulset/counter-db --timeout=300s
# Verify data is restored
kubectl exec counter-db-0 -- psql -U app -d counter -c \
  "SELECT count(*) FROM visits;"
# Should show 3 rows again

Task 6: Clean Up
Solution
# Delete backup resources first
gcloud beta container backup-restore restore-plans delete full-restore \
  --project=$PROJECT_ID --location=$REGION --quiet 2>/dev/null
gcloud beta container backup-restore backups delete manual-backup-1 \
  --project=$PROJECT_ID --location=$REGION \
  --backup-plan=storage-demo-backup --quiet 2>/dev/null
gcloud beta container backup-restore backup-plans delete storage-demo-backup \
  --project=$PROJECT_ID --location=$REGION --quiet
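If the restore-plan deletion fails because the restore record from Task 5 still exists, delete the restore first; a hedged sketch of the command shape (verify with `gcloud beta container backup-restore restores delete --help`):

```shell
# Restores are children of the restore plan and may block its deletion.
gcloud beta container backup-restore restores delete restore-1 \
  --project=$PROJECT_ID --location=$REGION \
  --restore-plan=full-restore --quiet 2>/dev/null
```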
# Delete the cluster
gcloud container clusters delete storage-demo \
  --region=$REGION --quiet
# Check for orphaned regional PDs (reclaim policy was Retain)
gcloud compute disks list --filter="name~pvc" \
  --format="table(name, zone, sizeGb, status)"
# Delete any orphaned disks manually if needed
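Note that regional PDs are scoped to a region rather than a zone, so the zone column in a custom format may be blank for them. A hedged companion check using the default list output, which includes location columns:

```shell
# List without a custom format so regional disks show their location scope.
gcloud compute disks list --filter="name~pvc"
# A leftover regional disk is deleted with --region (zonal disks use --zone):
# gcloud compute disks delete DISK_NAME --region=$REGION --quiet
```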
echo "Cleanup complete."

Success Criteria
- Regional PD StorageClass created with replication-type: regional-pd
- StatefulSet deployed with data written to PostgreSQL
- Pod successfully failed over to a different zone with data intact
- Backup created with Backup for GKE (includes volume data)
- Data deleted and StatefulSet destroyed to simulate total loss
- Application restored from backup with all data intact
- All resources cleaned up
Next Module
Next up: Module 6.5: GKE Observability and Fleet Management --- Learn how to monitor GKE with Cloud Operations Suite and Managed Prometheus, manage multiple clusters with Fleet, enable cross-cluster communication with Multi-Cluster Services, and implement cost allocation.