Skip to content

Module 9.1: Relational Database Integration (RDS / Cloud SQL / Flexible Server)

Complexity: [MEDIUM] | Time to Complete: 2h | Prerequisites: Cloud Essentials (any provider), Kubernetes networking basics

After completing this module, you will be able to:

  • Configure private connectivity from Kubernetes pods to managed databases (RDS, Cloud SQL, Flexible Server) using VPC-native networking
  • Implement connection pooling with PgBouncer or ProxySQL sidecars to optimize database connection management from pods
  • Deploy automated credential rotation for database secrets using cloud-native rotation with Kubernetes External Secrets Operator
  • Design high-availability database architectures with cross-AZ failover that Kubernetes workloads survive transparently

In September 2022, a Series B fintech startup ran their PostgreSQL database as a StatefulSet inside EKS. They had read every blog post about “running databases on Kubernetes” and felt confident. One Friday at 4:47 PM, a node auto-scaling event drained the database pod. The PersistentVolume was in us-east-1a, but the replacement node landed in us-east-1b. The pod sat in Pending for 22 minutes. During those 22 minutes, their payment processing pipeline — which served 14,000 transactions per hour — was completely dead. The postmortem estimated $89,000 in lost revenue and two enterprise customers who never came back.

The startup migrated to Amazon RDS the following Monday. Not because Kubernetes cannot run databases — it absolutely can — but because managed databases handle the hardest parts of database operations: automated failover, point-in-time recovery, patching, and cross-AZ replication. The real engineering challenge shifted from “keeping PostgreSQL alive” to “connecting Kubernetes workloads to managed databases securely, efficiently, and reliably.”

This module teaches you the second part. You will learn how to connect Kubernetes pods to managed relational databases across all three major clouds using private networking, connection pooling, credential rotation, schema migrations in a GitOps workflow, and high-availability patterns that survive AZ failures without your on-call engineer losing sleep.


The first rule of database connectivity from Kubernetes: never expose your database to the public internet. Every cloud provider offers private endpoint mechanisms that keep traffic on the provider’s backbone network.

+---------------------------+ +---------------------------+
| Kubernetes VPC | | Database Service |
| | | |
| +------+ +------+ | | +------------------+ |
| | Pod A | | Pod B | | VPC | | Primary (AZ-a) | |
| +---+---+ +---+---+ | Peering/ | +------------------+ |
| | | | Private | |
| +---+-----------+---+ | Endpoint | +------------------+ |
| | ClusterIP Service |----+----------+-->| Replica (AZ-b) | |
| +-------------------+ | | +------------------+ |
+---------------------------+ +---------------------------+

Stop and think: If your pod in us-east-1a queries a database in us-east-1b, the traffic is private and secure. However, what other consequence does crossing an Availability Zone boundary have? (Hint: Think about your cloud provider’s monthly billing statement).

On AWS, your EKS cluster and RDS instance should share the same VPC or use VPC peering. RDS instances deployed into private subnets are accessible from any resource within the VPC.

Terminal window
# Create a DB subnet group using private subnets
aws rds create-db-subnet-group \
--db-subnet-group-name eks-database-subnets \
--db-subnet-group-description "Private subnets for RDS from EKS" \
--subnet-ids subnet-0a1b2c3d4e5f00001 subnet-0a1b2c3d4e5f00002
# Create a security group allowing traffic from EKS node CIDR
aws ec2 create-security-group \
--group-name rds-from-eks \
--description "Allow PostgreSQL from EKS nodes" \
--vpc-id vpc-0abc123def456
SG_ID=$(aws ec2 describe-security-groups \
--filters "Name=group-name,Values=rds-from-eks" \
--query 'SecurityGroups[0].GroupId' --output text)
# Allow port 5432 from EKS pod CIDR (check your VPC CNI config)
aws ec2 authorize-security-group-ingress \
--group-id $SG_ID \
--protocol tcp --port 5432 \
--cidr 10.0.0.0/16
# Create RDS instance in private subnets
aws rds create-db-instance \
--db-instance-identifier app-postgres \
--db-instance-class db.r6g.large \
--engine postgres --engine-version 16.4 \
--master-username appadmin \
--manage-master-user-password \
--allocated-storage 100 --storage-type gp3 \
--db-subnet-group-name eks-database-subnets \
--vpc-security-group-ids $SG_ID \
--multi-az --storage-encrypted \
--no-publicly-accessible

The --manage-master-user-password flag tells RDS to store the master password in AWS Secrets Manager automatically. No human ever sees or handles the password.

GCP: Cloud SQL with Private Service Connect

Section titled “GCP: Cloud SQL with Private Service Connect”
Terminal window
# Allocate IP range for Private Service Connect
gcloud compute addresses create google-managed-services \
--global --purpose=VPC_PEERING \
--addresses=10.100.0.0 --prefix-length=16 \
--network=my-vpc
# Create the private connection
gcloud services vpc-peerings connect \
--service=servicenetworking.googleapis.com \
--ranges=google-managed-services \
--network=my-vpc
# Create Cloud SQL with private IP only
gcloud sql instances create app-postgres \
--database-version=POSTGRES_16 \
--tier=db-custom-4-16384 \
--region=us-central1 \
--network=my-vpc \
--no-assign-ip \
--availability-type=REGIONAL \
--storage-type=SSD --storage-size=100GB \
--storage-auto-increase
# Get the private IP
gcloud sql instances describe app-postgres \
--format='value(ipAddresses.filter("type=PRIVATE").ipAddress)'

Azure: Flexible Server with Private Endpoint

Section titled “Azure: Flexible Server with Private Endpoint”
Terminal window
# Create a private DNS zone for PostgreSQL
az network private-dns zone create \
--resource-group myRG \
--name privatelink.postgres.database.azure.com
# Link DNS zone to the VNET
az network private-dns zone vnet-link create \
--resource-group myRG \
--zone-name privatelink.postgres.database.azure.com \
--name aks-link --virtual-network aks-vnet \
--registration-enabled false
# Create Flexible Server with VNET integration
az postgres flexible-server create \
--resource-group myRG --name app-postgres \
--version 16 --sku-name Standard_D4ds_v5 \
--storage-size 128 \
--vnet aks-vnet --subnet db-subnet \
--private-dns-zone privatelink.postgres.database.azure.com \
--high-availability ZoneRedundant

Regardless of cloud, create an ExternalName or headless Service so your application code uses a Kubernetes-native DNS name:

apiVersion: v1
kind: Service
metadata:
name: app-database
namespace: production
spec:
type: ExternalName
externalName: app-postgres.abc123.us-east-1.rds.amazonaws.com

Your application connects to app-database.production.svc.cluster.local. If you migrate from RDS to Cloud SQL, you change the Service — not every application config.


Every database connection consumes memory on the server (roughly 5-10 MB per connection for PostgreSQL). Kubernetes makes this worse because pods scale horizontally. If you have 20 replicas, each maintaining a pool of 10 connections, that is 200 connections. During a rolling deployment, both old and new pods exist simultaneously — suddenly 400 connections.

Managed databases have connection limits. An RDS db.r6g.large instance supports roughly 1,600 connections, but performance degrades well before that ceiling. The answer is connection pooling.

The sidecar pattern places PgBouncer in the same pod as your application. Each pod gets its own pooler.

apiVersion: apps/v1
kind: Deployment
metadata:
name: api-server
namespace: production
spec:
replicas: 10
selector:
matchLabels:
app: api-server
template:
metadata:
labels:
app: api-server
spec:
containers:
- name: api
image: mycompany/api-server:2.1.0
ports:
- containerPort: 8080
env:
- name: DATABASE_URL
value: "postgresql://appuser:$(DB_PASSWORD)@localhost:6432/appdb?sslmode=require"
- name: DB_PASSWORD
valueFrom:
secretKeyRef:
name: db-credentials
key: password
- name: pgbouncer
image: bitnami/pgbouncer:1.23.0
ports:
- containerPort: 6432
env:
- name: PGBOUNCER_DATABASE
value: appdb
- name: POSTGRESQL_HOST
value: app-postgres.abc123.us-east-1.rds.amazonaws.com
- name: POSTGRESQL_PORT
value: "5432"
- name: POSTGRESQL_USERNAME
valueFrom:
secretKeyRef:
name: db-credentials
key: username
- name: POSTGRESQL_PASSWORD
valueFrom:
secretKeyRef:
name: db-credentials
key: password
- name: PGBOUNCER_POOL_MODE
value: transaction
- name: PGBOUNCER_DEFAULT_POOL_SIZE
value: "5"
- name: PGBOUNCER_MAX_CLIENT_CONN
value: "100"
resources:
requests:
cpu: 50m
memory: 64Mi
limits:
cpu: 200m
memory: 128Mi

For larger clusters, a centralized PgBouncer Deployment is more efficient:

apiVersion: apps/v1
kind: Deployment
metadata:
name: pgbouncer
namespace: database
spec:
replicas: 3
selector:
matchLabels:
app: pgbouncer
template:
metadata:
labels:
app: pgbouncer
spec:
topologySpreadConstraints:
- maxSkew: 1
topologyKey: topology.kubernetes.io/zone
whenUnsatisfiable: DoNotSchedule
labelSelector:
matchLabels:
app: pgbouncer
containers:
- name: pgbouncer
image: bitnami/pgbouncer:1.23.0
ports:
- containerPort: 6432
env:
- name: PGBOUNCER_POOL_MODE
value: transaction
- name: PGBOUNCER_DEFAULT_POOL_SIZE
value: "25"
- name: PGBOUNCER_MAX_CLIENT_CONN
value: "1000"
- name: PGBOUNCER_MAX_DB_CONNECTIONS
value: "150"
readinessProbe:
tcpSocket:
port: 6432
initialDelaySeconds: 5
periodSeconds: 10
---
apiVersion: v1
kind: Service
metadata:
name: pgbouncer
namespace: database
spec:
selector:
app: pgbouncer
ports:
- port: 5432
targetPort: 6432
Pool ModeHow It WorksBest ForWatch Out
sessionConnection assigned for entire client sessionLegacy apps using PREPARE/LISTENFewest pooling benefits
transactionConnection returned after each transactionMost web applicationsCannot use session-level features
statementConnection returned after each statementSimple read workloadsBreaks multi-statement transactions

Pause and predict: If you use session pooling with a modern microservice that opens and closes database connections rapidly for each HTTP request, what will happen to the backend connections on your PostgreSQL server?

For 90% of Kubernetes workloads, transaction mode is the correct choice. It provides the best balance of connection reuse and compatibility.


Hardcoded database passwords in Kubernetes Secrets are a ticking time bomb. When you need to rotate them — and you will — you face a coordination problem: update the password in the database, update the Secret in Kubernetes, restart every pod that uses it, and do all of this without downtime.

External Secrets Operator (ESO) with Rotation

Section titled “External Secrets Operator (ESO) with Rotation”

ESO syncs secrets from cloud provider secret managers into Kubernetes Secrets automatically.

apiVersion: external-secrets.io/v1
kind: ExternalSecret
metadata:
name: db-credentials
namespace: production
spec:
refreshInterval: 5m
secretStoreRef:
name: aws-secrets-manager
kind: ClusterSecretStore
target:
name: db-credentials
creationPolicy: Owner
data:
- secretKey: username
remoteRef:
key: production/database/credentials
property: username
- secretKey: password
remoteRef:
key: production/database/credentials
property: password
- secretKey: host
remoteRef:
key: production/database/credentials
property: host

When the secret rotates in Secrets Manager (via an AWS Lambda rotation function or equivalent), ESO picks up the new value within the refreshInterval window.

The safest rotation pattern uses two database users, alternating between them:

Time 0: user_a (active) user_b (standby)
Time 1: Rotate user_b password in Secrets Manager
Time 2: Update K8s Secret to point to user_b
Time 3: Rolling restart -- pods pick up user_b credentials
Time 4: user_a (standby) user_b (active)
Time 5: Rotate user_a password (safe -- nobody using it)

This ensures zero-downtime rotation because the old credentials remain valid throughout the entire rollout.

Terminal window
# AWS Secrets Manager rotation with dual-user strategy
aws secretsmanager rotate-secret \
--secret-id production/database/credentials \
--rotation-lambda-arn arn:aws:lambda:us-east-1:123456789:function:db-rotation \
--rotation-rules '{"AutomaticallyAfterDays": 30}'

Use Reloader or stakater/Reloader to automatically trigger rolling restarts:

apiVersion: apps/v1
kind: Deployment
metadata:
name: api-server
annotations:
reloader.stakater.com/auto: "true"
spec:
# ... Reloader watches for Secret changes and triggers rolling updates

Running ALTER TABLE in production is nerve-wracking enough. Doing it automatically through a GitOps pipeline requires careful design to avoid breaking running applications.

Never make breaking schema changes in a single step. Instead:

Phase 1: EXPAND - Add new column (nullable or with default)
Phase 2: MIGRATE - Application writes to both old and new columns
Phase 3: CONTRACT - Remove old column after all pods use new schema
+-------------------+------------------------------------------+-----------------------+
| Phase | Database Schema | Application Behavior |
+-------------------+------------------------------------------+-----------------------+
| 1: EXPAND | [ id | name | email (NEW, nullable) ] | App v1: Writes [name] |
+-------------------+------------------------------------------+-----------------------+
| 2: MIGRATE | [ id | name | email ] | App v2: Writes both |
| | (Backfill script populates email) | Reads [email] |
+-------------------+------------------------------------------+-----------------------+
| 3: CONTRACT | [ id | email ] | App v3: Writes [email]|
| | (name column dropped) | |
+-------------------+------------------------------------------+-----------------------+
apiVersion: batch/v1
kind: Job
metadata:
name: db-migrate-v42
namespace: production
annotations:
argocd.argoproj.io/hook: PreSync
argocd.argoproj.io/hook-delete-policy: BeforeHookCreation
spec:
backoffLimit: 0
template:
spec:
restartPolicy: Never
containers:
- name: migrate
image: mycompany/api-server:2.1.0
command: ["./migrate", "--direction=up", "--steps=1"]
env:
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: db-credentials
key: connection-string
resources:
requests:
cpu: 100m
memory: 128Mi
serviceAccountName: db-migrator

The argocd.argoproj.io/hook: PreSync annotation tells Argo CD to run this Job before deploying new application pods. The migration runs, the schema updates, then the new application version rolls out.

RuleReason
Never drop columns in the same release that removes their usageOld pods still running during rollout will crash
Always add columns as nullable or with defaultsINSERT statements from old code won’t fail
Use advisory locks in migration scriptsPrevents two migration Jobs from running simultaneously
Set a statement timeoutA single ALTER TABLE locking for 10 minutes will block all queries
Test rollback before applyingmigrate down should always work
-- Safe migration example with timeout and lock
SET lock_timeout = '5s';
SET statement_timeout = '30s';
ALTER TABLE orders ADD COLUMN shipping_method VARCHAR(50) DEFAULT 'standard';
CREATE INDEX CONCURRENTLY idx_orders_shipping ON orders(shipping_method);

All three clouds support Multi-AZ deployments for managed databases. The failover mechanics differ:

FeatureAWS RDS Multi-AZGCP Cloud SQL RegionalAzure Flexible Server ZR
Failover time60-120 seconds~30 seconds~60 seconds
Read from standbyNo (Multi-AZ), Yes (Multi-AZ Cluster)NoNo
Cross-regionSeparate feature (Read Replicas)Cross-region replicasGeo-replication
Endpoint changes on failoverNo (DNS CNAME updated)No (IP stays same)No (DNS updated)

Create separate Services for read and write traffic:

# Write endpoint (primary)
apiVersion: v1
kind: Service
metadata:
name: db-write
namespace: production
spec:
type: ExternalName
externalName: app-postgres.abc123.us-east-1.rds.amazonaws.com
---
# Read endpoint (replicas)
apiVersion: v1
kind: Service
metadata:
name: db-read
namespace: production
spec:
type: ExternalName
externalName: app-postgres-ro.abc123.us-east-1.rds.amazonaws.com

Your application then uses two connection strings:

# Application configuration
WRITE_DB = "postgresql://user:pass@db-write.production.svc:5432/appdb"
READ_DB = "postgresql://user:pass@db-read.production.svc:5432/appdb"

This catches many teams off guard. Cross-AZ data transfer costs money on every cloud:

  • AWS: $0.01/GB per direction between AZs
  • GCP: $0.01/GB between zones in the same region
  • Azure: Free within the same region (as of 2025)

If your application in AZ-a talks to a database in AZ-b, every query and response crosses AZ boundaries. For a chatty application doing 10,000 queries per second, each returning 1 KB, that is roughly 864 GB/day — about $17/day just in cross-AZ transfer.

Mitigation strategies:

  1. Use topology-aware routing to prefer same-AZ replicas
  2. Use connection pooling to reduce round-trips
  3. Batch reads where possible
  4. Cache frequently-accessed data (see Module 9.5)

  1. Amazon RDS manages over 1.2 million active database instances as of 2024, making it by far the largest managed database fleet in the world. The service handles more than 350 billion transactions per day across all engines.

  2. PostgreSQL’s maximum connection limit is not a hard cap — it is a function of available memory. Each connection uses a dedicated backend process consuming 5-10 MB of RAM. A db.r6g.xlarge instance (32 GB RAM) could theoretically support 3,200 connections but would have no memory left for actual query processing.

  3. Google Cloud SQL’s “Private Service Connect” replaced the older VPC peering approach in 2024 because VPC peering does not support transitive routing. If you had a hub-and-spoke network topology, Cloud SQL was unreachable from spoke VPCs — a painful limitation that caught many multi-project architectures.

  4. Schema migration tools have been the #1 cause of production outages at companies surveyed by the DORA team, ahead of bad deployments. The most common failure: a migration adds an index on a 500-million-row table without CONCURRENTLY, locking writes for 45 minutes.


MistakeWhy It HappensHow to Fix It
Exposing the database with a public IP “for debugging”Developers need to query from laptopsUse kubectl port-forward to a pod with database access
Not setting volumeBindingMode: WaitForFirstConsumer when self-hostingDefault StorageClass creates volumes immediatelyDoes not apply to managed DBs, but remember for dev environments
Allowing unlimited connections from podsNo connection pooling configuredDeploy PgBouncer (sidecar or centralized) with explicit limits
Storing database passwords in ConfigMapsConfusion between ConfigMap and SecretUse Secrets, and preferably ESO with a cloud secret manager
Running migrations in application startup codeSeems convenient — every pod migrates on bootUse a dedicated Job (PreSync hook) so migration runs exactly once
Ignoring cross-AZ data transfer costsNot visible until the bill arrivesMonitor with VPC Flow Logs and use topology-aware routing
Using session pool mode in PgBouncer by defaultIt is the default settingExplicitly set transaction mode for web workloads
Not testing database failover”Multi-AZ handles it”Schedule quarterly failover tests using aws rds reboot-db-instance --force-failover

1. Your team is migrating a legacy application to Kubernetes. The application currently hardcodes the RDS endpoint `prod-db.abc123.us-east-1.rds.amazonaws.com` in its configuration files. You suggest creating a Kubernetes Service to represent the database instead. If the database is still hosted in RDS, how does introducing a Kubernetes Service improve the architecture, and what specific type of Service should you use?

An ExternalName Service provides a layer of indirection, decoupling the application’s configuration from the physical database location. By using an ExternalName Service, the application connects to a stable internal DNS name like db-write.production.svc.cluster.local. If you need to migrate the database, promote a read replica to primary, or switch to a different cloud provider, you only update the Service definition once. The application pods do not need to be reconfigured or restarted, minimizing risk and operational overhead during database maintenance.

2. A high-traffic e-commerce API is experiencing latency spikes. You notice the PostgreSQL database is hitting its maximum connection limit. The API is written in Go and opens a connection, runs a quick SELECT query, and closes it for every request. You deploy PgBouncer, but the database connection count doesn't drop significantly. You realize PgBouncer is using `session` mode. Why did `session` mode fail to solve the problem, and how would switching to `transaction` mode fix it?

In session mode, PgBouncer assigns a backend server connection to a client for the entire duration of the client’s session. Because the Go API opens and closes connections rapidly, each request ties up a backend connection, providing minimal pooling benefits. Switching to transaction mode resolves this by returning the backend connection to the pool immediately after each transaction completes. This allows PgBouncer to multiplex thousands of brief client transactions over a small, stable pool of backend database connections, drastically reducing memory overhead and connection churn on the PostgreSQL server.

3. Your team needs to rename the `user_status` column to `account_state` in the primary database. The lead developer plans to run `ALTER TABLE users RENAME COLUMN user_status TO account_state;` during the next Argo CD sync. You block the PR, explaining that this will cause an outage during the rolling deployment. Why will a simple rename cause an outage in Kubernetes, and how should the team apply the expand-contract pattern to execute this change safely?

A simple rename causes an outage because Kubernetes rolling deployments run old and new pod versions simultaneously. The old pods still running during the rollout will attempt to query the user_status column, which no longer exists, causing them to crash immediately. The expand-contract pattern solves this by breaking the change into phases. First, you expand by adding the new account_state column. Next, you deploy application code that writes to both columns. Finally, once all pods are updated and data is backfilled, you contract by removing the old user_status column. This ensures every version of the application can safely interact with the database schema at any given moment.

4. At 3:00 AM, the primary RDS instance in `us-east-1a` suffers a hardware failure. The database is configured for Multi-AZ, and a standby exists in `us-east-1b`. The failover completes in 60 seconds, but your Kubernetes pods continue throwing connection errors for 5 minutes before recovering. Assuming the pods are using an ExternalName Service pointing to the RDS endpoint, what caused this extended downtime, and how does Kubernetes eventually resolve the connection?

During an RDS Multi-AZ failover, AWS promotes the standby instance and updates the DNS CNAME record of the database endpoint to point to the new primary’s IP address. However, Kubernetes pods and nodes often cache DNS lookups based on the Time-To-Live (TTL) of the record. The extended downtime occurs because the pods continue sending traffic to the old, dead IP address until their local DNS cache expires. Once the TTL expires, the pods re-resolve the ExternalName Service, receive the new IP address of the promoted instance, and successfully re-establish their database connections.

5. Your monthly cloud bill shows a massive spike in "Cross-AZ Data Transfer" costs. Your EKS nodes are spread across `us-west-2a`, `2b`, and `2c`, while your RDS instance is primarily in `us-west-2a`. The application makes thousands of small queries per second. Why is this architecture generating data transfer charges, and what are two architectural changes you could make to reduce this specific line item on the bill?

Cloud providers charge for data transfer that crosses Availability Zone boundaries, even within the same region. Because your pods are distributed across three AZs but the database is in one, roughly two-thirds of your application queries and their corresponding result sets are crossing AZ boundaries, incurring bilateral charges. To reduce this cost, you can implement topology-aware routing to force pods to prefer reading from a read replica in their local AZ. Alternatively, you can implement connection pooling or application-level caching to drastically reduce the total volume of round-trips made to the database.

6. A developer notices that a database migration Job deployed via an Argo CD PreSync hook occasionally fails due to a timeout. To ensure the deployment eventually succeeds, they propose changing the Job's `backoffLimit` from `0` to `3`. You reject this change. What is the danger of automatically retrying a failed database migration Job, and why is failing the entire deployment process the safer alternative?

Automatically retrying a database migration Job is dangerous because migrations are rarely idempotent by default. If a migration script fails halfway through—for example, it successfully creates a table but times out creating an index—retrying the Job will cause it to attempt creating the table again, resulting in a fatal error that requires manual database surgery to fix. By keeping backoffLimit: 0, a failure immediately stops the Argo CD sync process. This fail-fast behavior preserves the state of the database and forces an engineer to investigate the partial migration, manually rectify the schema, and safely resume the deployment.

7. Your security team mandates that database passwords be rotated every 30 days. You write a script that updates the password in RDS, then updates the Kubernetes Secret, and finally triggers a rolling restart of the application Deployments. During the next rotation, the application experiences 45 seconds of downtime where database authentication fails. How would implementing a dual-user rotation strategy eliminate this downtime window?

The downtime occurs because there is an unavoidable race condition: old pods still running during the rolling restart have the old password, but the database only accepts the new password. The dual-user rotation strategy eliminates this by maintaining two active database users. When rotation occurs, you change the password of the standby user, update Kubernetes to use the standby user, and trigger the rolling restart. Because the original user’s password was never changed, the old pods continue to function perfectly while the new pods seamlessly connect using the newly rotated credentials.


Hands-On Exercise: Connect Kind Cluster to Local PostgreSQL

Section titled “Hands-On Exercise: Connect Kind Cluster to Local PostgreSQL”

Since managed databases require cloud accounts, we will simulate the architecture locally using Docker and kind.

Terminal window
# Create a Docker network shared between kind and PostgreSQL
docker network create db-lab
# Start PostgreSQL in Docker
docker run -d --name lab-postgres \
--network db-lab \
-e POSTGRES_USER=appadmin \
-e POSTGRES_PASSWORD=lab-secret-123 \
-e POSTGRES_DB=appdb \
-p 5432:5432 \
postgres:16
# Create a kind cluster attached to the same Docker network
cat > /tmp/kind-db-lab.yaml << 'EOF'
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
- role: worker
EOF
kind create cluster --name db-lab --config /tmp/kind-db-lab.yaml
# Connect kind nodes to the db-lab network
docker network connect db-lab db-lab-control-plane
docker network connect db-lab db-lab-worker
docker network connect db-lab db-lab-worker2
# Get PostgreSQL's IP on the db-lab network
PG_IP=$(docker inspect lab-postgres \
--format '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' | head -1)
echo "PostgreSQL IP: $PG_IP"

Create a Service that points to the PostgreSQL container.

Solution

Since ExternalName requires a DNS name (not an IP), use a headless Service with Endpoints:

apiVersion: v1
kind: Service
metadata:
name: app-database
namespace: default
spec:
clusterIP: None
ports:
- port: 5432
targetPort: 5432
---
apiVersion: v1
kind: Endpoints
metadata:
name: app-database
namespace: default
subsets:
- addresses:
- ip: "${PG_IP}" # Replace with actual IP from setup
ports:
- port: 5432
Terminal window
# Apply (replace PG_IP with actual value)
sed "s/\${PG_IP}/$PG_IP/" /tmp/db-service.yaml | k apply -f -

Task 2: Deploy PgBouncer as a Centralized Proxy

Section titled “Task 2: Deploy PgBouncer as a Centralized Proxy”

Deploy a PgBouncer Deployment with 2 replicas and a ClusterIP Service.

Solution
apiVersion: v1
kind: Secret
metadata:
name: db-credentials
stringData:
username: appadmin
password: lab-secret-123
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: pgbouncer
spec:
replicas: 2
selector:
matchLabels:
app: pgbouncer
template:
metadata:
labels:
app: pgbouncer
spec:
containers:
- name: pgbouncer
image: bitnami/pgbouncer:1.23.0
ports:
- containerPort: 6432
env:
- name: PGBOUNCER_DATABASE
value: appdb
- name: POSTGRESQL_HOST
value: app-database
- name: POSTGRESQL_PORT
value: "5432"
- name: POSTGRESQL_USERNAME
valueFrom:
secretKeyRef:
name: db-credentials
key: username
- name: POSTGRESQL_PASSWORD
valueFrom:
secretKeyRef:
name: db-credentials
key: password
- name: PGBOUNCER_POOL_MODE
value: transaction
- name: PGBOUNCER_DEFAULT_POOL_SIZE
value: "10"
readinessProbe:
tcpSocket:
port: 6432
initialDelaySeconds: 5
periodSeconds: 10
---
apiVersion: v1
kind: Service
metadata:
name: pgbouncer
spec:
selector:
app: pgbouncer
ports:
- port: 5432
targetPort: 6432
Terminal window
k apply -f /tmp/pgbouncer.yaml
k wait --for=condition=ready pod -l app=pgbouncer --timeout=60s

Task 3: Test Connectivity Through PgBouncer

Section titled “Task 3: Test Connectivity Through PgBouncer”

Run a test pod that connects through PgBouncer and creates a table.

Solution
Terminal window
k run db-test --rm -it --image=postgres:16 --restart=Never -- \
psql "postgresql://appadmin:lab-secret-123@pgbouncer:5432/appdb" \
-c "CREATE TABLE test_connection (id serial PRIMARY KEY, created_at timestamp DEFAULT now());
INSERT INTO test_connection DEFAULT VALUES;
SELECT * FROM test_connection;"

Create a Kubernetes Job that runs a migration script.

Solution
apiVersion: batch/v1
kind: Job
metadata:
name: migration-v1
spec:
backoffLimit: 0
template:
spec:
restartPolicy: Never
containers:
- name: migrate
image: postgres:16
command:
- psql
- "postgresql://appadmin:lab-secret-123@pgbouncer:5432/appdb"
- -c
- |
BEGIN;
SET lock_timeout = '5s';
CREATE TABLE IF NOT EXISTS users (
id SERIAL PRIMARY KEY,
email VARCHAR(255) NOT NULL UNIQUE,
name VARCHAR(255),
created_at TIMESTAMP DEFAULT NOW()
);
INSERT INTO users (email, name) VALUES
('alice@example.com', 'Alice'),
('bob@example.com', 'Bob');
COMMIT;
Terminal window
k apply -f /tmp/migration-job.yaml
k wait --for=condition=complete job/migration-v1 --timeout=30s
k logs job/migration-v1

Create a second endpoint Service simulating a read replica and test routing.

Solution
Terminal window
# Create read-only Service (same PostgreSQL in this lab, but separate Service)
cat <<'EOF' | k apply -f -
apiVersion: v1
kind: Service
metadata:
name: db-read
spec:
clusterIP: None
ports:
- port: 5432
EOF
# Create Endpoints pointing to same PG (simulating a read replica)
cat <<EOF | k apply -f -
apiVersion: v1
kind: Endpoints
metadata:
name: db-read
subsets:
- addresses:
- ip: "$PG_IP"
ports:
- port: 5432
EOF
# Test reading from the "replica"
k run read-test --rm -it --image=postgres:16 --restart=Never -- \
psql "postgresql://appadmin:lab-secret-123@db-read:5432/appdb" \
-c "SELECT * FROM users;"

Implement a manual credential rotation to see how workloads behave when secrets change.

Solution
Terminal window
# 1. Create a dummy Deployment using the secret
cat <<EOF | k apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
name: api-worker
spec:
replicas: 1
selector:
matchLabels:
app: api-worker
template:
metadata:
labels:
app: api-worker
spec:
containers:
- name: worker
image: postgres:16
command: ["sleep", "3600"]
env:
- name: DB_PASSWORD
valueFrom:
secretKeyRef:
name: db-credentials
key: password
EOF
k wait --for=condition=available deployment/api-worker --timeout=30s
# 2. Update the secret in Kubernetes (simulating an external rotation)
k create secret generic db-credentials \
--from-literal=username=appadmin \
--from-literal=password=new-rotated-secret-456 \
--dry-run=client -o yaml | k apply -f -
# 3. Notice the pod doesn't automatically get the new password
# In a real environment, you need Reloader to trigger this automatically
k rollout restart deployment api-worker
k rollout status deployment api-worker
# 4. Verify the new pod has the new password
k exec deploy/api-worker -- env | grep DB_PASSWORD
  • ExternalName/headless Service resolves to PostgreSQL container
  • PgBouncer Deployment has 2 ready replicas
  • Test pod connects through PgBouncer successfully
  • Migration Job completes and creates the users table
  • Read endpoint returns data from the simulated replica
  • Credential rotation successfully triggers new pod creation via rollout
Terminal window
kind delete cluster --name db-lab
docker rm -f lab-postgres
docker network rm db-lab

Next Module: Module 9.2: Managed Message Brokers & Event-Driven Kubernetes — Learn how to integrate SQS, Pub/Sub, and Service Bus with Kubernetes workloads, and use KEDA to autoscale consumers based on queue depth.