Module 9.5: Advanced Caching Services (ElastiCache / Memorystore)
Complexity: [COMPLEX] | Time to Complete: 2h | Prerequisites: Module 9.1 (Databases), Module 9.4 (Object Storage), Redis fundamentals
What You’ll Be Able to Do
After completing this module, you will be able to:
- Configure Kubernetes pods to connect to managed caching services (ElastiCache, Memorystore, Azure Cache for Redis)
- Implement cache-aside, write-through, and write-behind patterns for applications running on Kubernetes
- Deploy Redis Sentinel or Cluster mode configurations via managed services with private endpoint connectivity
- Diagnose cache performance issues including connection pooling, serialization overhead, and hot key distribution
Why This Module Matters
In November 2023, an e-commerce platform ran their product catalog API on GKE. Every product page required three database queries: product details, pricing, and reviews. During a flash sale, traffic jumped from 3,000 to 45,000 requests per second. Cloud SQL hit its connection limit at 15,000 connections. The auto-scaler was adding pods, but each new pod opened more database connections, making the problem worse. The site went down for 28 minutes during peak sale time. Estimated lost revenue: $2.3 million.
The team had a Redis cluster running in GKE — but it was a self-managed StatefulSet with 3 nodes and no monitoring. They did not discover until the postmortem that Redis had silently evicted 60% of the cache 20 minutes before the sale started due to a memory limit that nobody had reviewed since initial deployment. The database was hit with the full 45,000 RPS because the cache was effectively empty.
They migrated to Google Memorystore for Redis with proper memory sizing, connection limits, and eviction monitoring. The next sale handled 62,000 RPS with the database seeing only 800 QPS — a 98% cache hit rate. Caching is not optional for production Kubernetes workloads. It is the difference between your database being a bottleneck and your database being a safety net.
Redis vs Memcached: Choosing Your Engine
Decision Matrix
| Factor | Redis | Memcached |
|---|---|---|
| Data structures | Strings, hashes, lists, sets, sorted sets, streams | Strings only (key-value) |
| Persistence | Optional (RDB snapshots, AOF) | None (pure cache) |
| Replication | Primary-replica with automatic failover | None (each node independent) |
| Clustering | Redis Cluster (data sharding) | Client-side sharding |
| Pub/Sub | Built-in | Not available |
| Lua scripting | Yes | No |
| Max item size | 512 MB | 1 MB (default) |
| Multi-threaded | Single-threaded (I/O threads since 6.0) | Multi-threaded |
| Best for | Complex caching, sessions, leaderboards, pub/sub | Simple key-value, large working sets, multi-threaded reads |
For 90% of Kubernetes workloads, Redis is the right choice. Memcached is simpler but far less capable. Choose Memcached only when you need pure key-value caching at extremely high throughput with no need for data structures, persistence, or replication.
Managed Service Comparison
| Feature | AWS ElastiCache Redis | GCP Memorystore Redis | Azure Cache for Redis |
|---|---|---|---|
| Max memory | 6.1 TB (cluster mode) | 300 GB (standard) | 1.2 TB (Enterprise) |
| Cluster mode | Yes (up to 500 shards) | Yes (since 2024) | Yes (Premium/Enterprise) |
| Multi-AZ failover | Automatic | Automatic (Standard tier) | Automatic (Premium+) |
| Encryption at rest | Yes (KMS) | Yes (CMEK) | Yes (managed keys) |
| Encryption in transit | TLS | TLS | TLS |
| VPC integration | VPC subnets | VPC network | VNET injection |
Caching Strategies
The “right” caching strategy depends on your read/write ratio and consistency requirements.
Cache-Aside (Lazy Loading)
The most common pattern. The application checks the cache first; on a miss, it reads from the database and populates the cache.

```mermaid
flowchart TD
    A[Client Request] --> B{Check Cache}
    B -- hit --> C[Return cached data]
    B -- miss --> D[Query Database]
    D --> E[Write to Cache]
    E --> F[Return data]
```

```python
import redis
import json

r = redis.Redis(host='redis-master.cache.svc', port=6379, decode_responses=True)

def get_product(product_id):
    # Step 1: Check cache
    cache_key = f"product:{product_id}"
    cached = r.get(cache_key)
    if cached:
        return json.loads(cached)

    # Step 2: Cache miss -- query database
    product = db.query("SELECT * FROM products WHERE id = %s", product_id)

    # Step 3: Populate cache with TTL
    r.setex(cache_key, 300, json.dumps(product))  # 5-minute TTL

    return product
```

Pros: Only caches data that is actually requested. Simple to implement.
Cons: Cache miss penalty (extra latency on first request). Data can become stale until TTL expires.
Write-Through
Every write updates both the cache and the database synchronously, before the caller receives a response.

```mermaid
flowchart TD
    A[Write Request] --> B[Write to Cache]
    B --> C[Write to Database]
    C --> D[Return success]
```

The example below writes the database first, so a failed database write never leaves unpersisted data in the cache.

```python
def update_product_price(product_id, new_price):
    # Write to database first
    db.execute("UPDATE products SET price = %s WHERE id = %s", new_price, product_id)

    # Update the cache in the same request. Redis is not part of the database
    # transaction, so a failure here leaves the old value cached until its TTL expires.
    product = db.query("SELECT * FROM products WHERE id = %s", product_id)
    cache_key = f"product:{product_id}"
    r.setex(cache_key, 300, json.dumps(product))

    return product
```

Pros: The cache is refreshed on every write, so reads that follow a successful write do not see stale data.
Cons: Write latency increases (two writes per operation). Caches data that may never be read.
Write-Behind (Write-Back)
Writes go to the cache immediately and are asynchronously flushed to the database.

```mermaid
flowchart TD
    A[Write Request] --> B[Write to Cache]
    B --> C[Return success fast!]
    B -. async, batched .-> D[Flush to Database]
```

Pros: Extremely fast writes. Database load is smoothed by batching.
Cons: Risk of data loss if cache fails before flush. Complex to implement correctly. Not suitable for critical data.
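Write-behind is the only pattern here without a code example, so the following is a minimal in-process sketch of the idea. It assumes a plain dict stands in for Redis and a hypothetical `flush_batch` callable stands in for the batched database write; `WriteBehindBuffer` and its method names are illustrative and not part of any library.

```python
import threading
from collections import OrderedDict


class WriteBehindBuffer:
    """Toy write-behind buffer: writes land in memory, then flush in batches."""

    def __init__(self, flush_batch, max_batch=100):
        self._flush_batch = flush_batch  # hypothetical: list[(key, value)] -> None
        self._max_batch = max_batch
        self._cache = {}                 # stands in for Redis
        self._dirty = OrderedDict()      # keys not yet persisted
        self._lock = threading.Lock()

    def write(self, key, value):
        with self._lock:
            self._cache[key] = value
            self._dirty[key] = value     # repeated writes to one key coalesce
            self._dirty.move_to_end(key)

    def read(self, key):
        with self._lock:
            return self._cache.get(key)

    def flush(self):
        """Persist up to max_batch dirty keys; return how many were flushed."""
        with self._lock:
            batch = list(self._dirty.items())[: self._max_batch]
            for key, _ in batch:
                del self._dirty[key]
        if not batch:
            return 0
        try:
            self._flush_batch(batch)
        except Exception:
            # Re-queue on failure so the data is not silently lost;
            # setdefault keeps any newer value written in the meantime.
            with self._lock:
                for key, value in batch:
                    self._dirty.setdefault(key, value)
            raise
        return len(batch)


database = {}
buf = WriteBehindBuffer(lambda batch: database.update(batch))
buf.write("counter:page-views", 1)
buf.write("counter:page-views", 2)   # coalesces with the previous write
buf.write("counter:signups", 7)
flushed = buf.flush()
```

In a real service `flush()` would run on a timer or background worker. Anything still in the dirty set when the process or cache dies is lost, which is the data-loss risk listed under Cons.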
Strategy Selection Guide
| Scenario | Strategy | Why |
|---|---|---|
| Product catalog (read-heavy) | Cache-aside | Most reads, occasional writes |
| User sessions | Write-through | Must be consistent after login/logout |
| Analytics counters | Write-behind | High write volume, eventual consistency OK |
| API rate limiting | Cache-aside + TTL | Natural expiration, no DB needed |
| Shopping cart | Write-through | Consistency critical for commerce |
| Leaderboard scores | Cache-aside + sorted sets | Redis sorted sets are purpose-built for this |
Stop and think: You are designing a shopping cart service where every item added must be securely recorded, but users frequently refresh the page to view their cart. Which caching strategy provides the necessary consistency while handling the read traffic?
Connecting Kubernetes to Managed Redis
ElastiCache Redis from EKS

```bash
# Create ElastiCache Redis cluster
aws elasticache create-replication-group \
  --replication-group-id app-cache \
  --replication-group-description "App caching layer" \
  --engine redis --engine-version 7.1 \
  --cache-node-type cache.r7g.large \
  --num-cache-clusters 3 \
  --multi-az-enabled \
  --automatic-failover-enabled \
  --at-rest-encryption-enabled \
  --transit-encryption-enabled \
  --cache-subnet-group-name eks-cache-subnets \
  --security-group-ids sg-0abc123def456
```

```yaml
# Kubernetes Service for Redis endpoint
apiVersion: v1
kind: Service
metadata:
  name: redis-primary
  namespace: cache
spec:
  type: ExternalName
  externalName: app-cache.abc123.ng.0001.use1.cache.amazonaws.com
---
apiVersion: v1
kind: Service
metadata:
  name: redis-reader
  namespace: cache
spec:
  type: ExternalName
  externalName: app-cache-ro.abc123.ng.0001.use1.cache.amazonaws.com
```

Application Deployment with Redis

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
  namespace: production
spec:
  replicas: 10
  selector:
    matchLabels:
      app: api-server
  template:
    metadata:
      labels:
        app: api-server
    spec:
      containers:
        - name: api
          image: mycompany/api-server:3.0.0
          env:
            - name: REDIS_PRIMARY_HOST
              value: redis-primary.cache.svc.cluster.local
            - name: REDIS_READER_HOST
              value: redis-reader.cache.svc.cluster.local
            - name: REDIS_PORT
              value: "6379"
            - name: REDIS_TLS_ENABLED
              value: "true"
            - name: REDIS_AUTH_TOKEN
              valueFrom:
                secretKeyRef:
                  name: redis-auth
                  key: token
            - name: REDIS_MAX_CONNECTIONS
              value: "20"
            - name: REDIS_CONNECT_TIMEOUT_MS
              value: "2000"
            - name: REDIS_COMMAND_TIMEOUT_MS
              value: "500"
          resources:
            requests:
              cpu: 500m
              memory: 512Mi
```

Pause and predict: If you provision an ElastiCache Redis cluster with 3 nodes (1 primary, 2 replicas), how should your Kubernetes application route write commands versus read commands?
Cache Stampede Prevention
A cache stampede (also called “thundering herd”) happens when a popular cache key expires and hundreds of pods simultaneously query the database to rebuild it.

```mermaid
flowchart LR
    subgraph Normal
        A[100 pods] -->|Cache HIT| B[Return cached data]
        B -.- C[DB: 0 queries]
    end
    subgraph Stampede
        D[100 pods] -->|Cache MISS| E[100 database queries]
        E --> F[100 cache writes]
        F -.- G[DB: 100 queries]
    end
```

Prevention Strategies
1. Probabilistic Early Expiration (PER)
Refresh the cache before it expires, with a probability that increases as the TTL decreases:
```python
import random
import time

def get_with_per(key, ttl=300, beta=1.0):
    """Probabilistic early refresh to prevent stampedes."""
    cached = r.get(key)
    if cached:
        data = json.loads(cached)
        remaining_ttl = r.ttl(key)
        # As TTL decreases, probability of refresh increases
        # beta controls aggressiveness (higher = earlier refresh)
        # This linear rule is a simplification: with beta=1.0 a request refreshes
        # with probability 1 - remaining_ttl/ttl, so about half of all requests
        # refresh at the halfway point. A lower beta (e.g. 0.1) confines early
        # refreshes to the last 10% of the TTL.
        delta = ttl * beta * random.random()
        if remaining_ttl < delta:
            # This pod refreshes the cache early
            return refresh_cache(key, ttl)
        return data
    return refresh_cache(key, ttl)

def refresh_cache(key, ttl):
    data = db.query_product(key.split(':')[1])
    r.setex(key, ttl, json.dumps(data))
    return data
```

2. Distributed Lock (Mutex)
Only one pod rebuilds the cache; others wait or serve stale data:
```python
def get_with_lock(key, ttl=300):
    cached = r.get(key)
    if cached:
        return json.loads(cached)

    lock_key = f"lock:{key}"
    # Try to acquire lock (NX = set if not exists, EX = expiry)
    acquired = r.set(lock_key, "1", nx=True, ex=10)

    if acquired:
        # This pod rebuilds the cache
        try:
            data = db.query_product(key.split(':')[1])
            r.setex(key, ttl, json.dumps(data))
            return data
        finally:
            # Caveat: if the rebuild outlives the 10 s lock expiry, this delete can
            # remove a lock that another pod now holds. Storing a unique token and
            # comparing it before deleting avoids that (redis-py's r.lock() does so).
            r.delete(lock_key)
    else:
        # Another pod is rebuilding -- wait briefly, then retry
        time.sleep(0.1)
        cached = r.get(key)
        if cached:
            return json.loads(cached)
        # Fallback: query database directly (rare)
        return db.query_product(key.split(':')[1])
```

3. Background Refresh (Never Expire)
From the reader's point of view, cache entries never expire: a background process refreshes them on a schedule shorter than their TTL, and the TTL remains only as a safety net:
```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cache-warmer
  namespace: production
spec:
  schedule: "*/4 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: warmer
              image: mycompany/cache-warmer:1.0.0
              env:
                - name: REDIS_HOST
                  value: redis-primary.cache.svc.cluster.local
              command:
                - python
                - -c
                - |
                  import redis, json
                  r = redis.Redis(host='redis-primary.cache.svc.cluster.local')

                  # Refresh top 1000 products
                  products = db.query("SELECT id FROM products ORDER BY view_count DESC LIMIT 1000")
                  for p in products:
                      data = db.query_product(p['id'])
                      r.setex(f"product:{p['id']}", 600, json.dumps(data))
                  print(f"Warmed {len(products)} products")
```

Diagnosing Hot Key Distribution
A “hot key” occurs when a single Redis key receives a disproportionate amount of traffic. Because Redis is single-threaded for command execution, a hot key on a clustered Redis setup will overwhelm a single shard, causing high CPU utilization on one node while other nodes remain idle.
Diagnosis
- CPU Monitoring: Monitor CPU utilization per shard. If one shard is at 99% CPU while others are at 10%, you likely have a hot key.
- Redis CLI: Use `redis-cli --hotkeys` (requires the `maxmemory-policy` to be an LFU policy like `allkeys-lfu`).
- Command Monitoring: Alternatively, use `OBJECT FREQ <key>` to check access frequencies (this also requires an LFU policy). Avoid running the `MONITOR` command in production as it drastically reduces performance.
Mitigation
- Local Caching: Cache the hot key in the application’s memory (e.g., using a local in-memory cache variable) for a few seconds to absorb the read spike before it hits Redis.
- Key Duplication: Create copies of the key (e.g., `product:123:1`, `product:123:2`) and have clients randomly read from one of the copies to distribute load across multiple shards.
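The two mitigations above can be sketched in a few lines. This is an illustration under stated assumptions, not a library API: `LocalTTLCache`, `replica_key`, and the `fake_redis_get` loader are hypothetical names, and the loader stands in for a real Redis read.

```python
import random
import time


class LocalTTLCache:
    """Per-process cache that holds values for a few seconds."""

    def __init__(self, ttl_seconds=2.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]                 # served locally, Redis untouched
        value = loader(key)                 # e.g. a Redis GET
        self._entries[key] = (now + self._ttl, value)
        return value


def replica_key(base_key, copies=4, rng=random):
    """Pick one of N duplicated keys, e.g. product:123:1 .. product:123:4."""
    return f"{base_key}:{rng.randint(1, copies)}"


# Usage with a fake clock so the behaviour is deterministic.
now = [0.0]
redis_reads = []

def fake_redis_get(key):
    redis_reads.append(key)
    return {"id": 123}

local = LocalTTLCache(ttl_seconds=2.0, clock=lambda: now[0])
for _ in range(1000):
    local.get_or_load("product:123", fake_redis_get)
reads_during_spike = len(redis_reads)

now[0] = 3.0                                # local entry has expired
local.get_or_load("product:123", fake_redis_get)
chosen = replica_key("product:123", copies=4)
```

With key duplication, writers have to update every copy, so it suits values that change rarely. The local cache trades a few seconds of staleness for a large reduction in reads against the hot shard.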
Stop and think: If a celebrity tweets a link to a specific product, creating a sudden massive read spike on that single product’s cache key, why won’t simply adding more Redis cluster nodes solve the performance issue?
Connection Limits and Pool Management
Managed Redis instances have maximum connection limits based on instance size. Exceeding them causes connection refused errors.
Connection Budget
| Instance Type | Max Connections | With 50 pods (20 conn each) | Remaining |
|---|---|---|---|
| cache.r7g.large | 65,000 | 1,000 | 64,000 |
| cache.r7g.xlarge | 65,000 | 1,000 | 64,000 |
| cache.t4g.micro | 65,000 | 1,000 | 64,000 |
Redis connection limits are generous, but the bottleneck is often on the client side. Each connection consumes memory and a file descriptor in the pod.
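The budget in the table is steady-state arithmetic. A rough sketch of the peak during a rolling update follows, assuming the Deployment uses a percentage `maxSurge` (Kubernetes defaults to 25% and rounds up). The function name `peak_connections` and the `overhead` parameter are illustrative.

```python
import math


def peak_connections(replicas, pool_size, max_surge=0.25, overhead=0):
    """Worst-case Redis connections while old and new pods overlap.

    max_surge is the Deployment's maxSurge as a fraction of replicas;
    overhead covers extra clients such as exporters or admin sessions.
    """
    surge_pods = math.ceil(replicas * max_surge)
    return (replicas + surge_pods) * pool_size + overhead


steady_state = 50 * 20                                   # the table's figure
with_default_surge = peak_connections(50, 20)            # 63 pods at peak
doubled_fleet = peak_connections(100, 50, max_surge=1.0)
```

Here `with_default_surge` is 1,260 rather than 1,000, and `doubled_fleet` is 10,000. Comparing the peak figure, not the steady-state one, against the instance limit is the safer habit.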
Redis Connection Pooling
```python
# Good: Connection pool (shared connections)
import redis

pool = redis.ConnectionPool(
    host='redis-primary.cache.svc.cluster.local',
    port=6379,
    max_connections=20,          # Per pod
    socket_timeout=2.0,          # Fail fast
    socket_connect_timeout=1.0,
    retry_on_timeout=True,
    health_check_interval=30,
    # TLS: a ConnectionPool takes a connection class; ssl=True is only
    # understood by redis.Redis(...) and redis.from_url(...)
    connection_class=redis.SSLConnection,
)
r = redis.Redis(connection_pool=pool)

# Bad: New connection per request (connection leak)
# r = redis.Redis(host='redis-primary.cache.svc.cluster.local')  # DON'T
```

Monitoring Connection Usage

```yaml
# PrometheusRule for Redis connection alerts
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: redis-alerts
  namespace: monitoring
spec:
  groups:
    - name: redis
      rules:
        - alert: RedisConnectionsHigh
          expr: redis_connected_clients / redis_config_maxclients > 0.8
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "Redis connection usage above 80%"
        - alert: RedisCacheHitRateLow
          expr: |
            rate(redis_keyspace_hits_total[5m]) /
            (rate(redis_keyspace_hits_total[5m]) + rate(redis_keyspace_misses_total[5m])) < 0.9
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "Redis cache hit rate below 90%"
```

Pause and predict: Your application is scaling up during a Black Friday event. If each of your 100 pods opens 50 concurrent Redis connections, and your Redis instance limit is 65,000, why might you still see connection errors during a rolling deployment?
Envoy Sidecar Caching
For HTTP-based APIs, you can add caching at the proxy layer using Envoy as a sidecar. This caches responses without modifying application code.
Architecture
```mermaid
flowchart TD
    A[Client] --> B[K8s Service]
    B --> C[Pod]
    subgraph Pod
        D[Envoy Sidecar port 8080]
        E[(Local Cache)]
        F[App Container port 8081]
        D -->|hit| E
        D -->|miss| F
    end
```

Envoy Cache Filter Configuration

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: envoy-cache-config
  namespace: production
data:
  envoy.yaml: |
    static_resources:
      listeners:
        - name: listener_0
          address:
            socket_address:
              address: 0.0.0.0
              port_value: 8080
          filter_chains:
            - filters:
                - name: envoy.filters.network.http_connection_manager
                  typed_config:
                    "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                    stat_prefix: ingress
                    http_filters:
                      - name: envoy.filters.http.cache
                        typed_config:
                          "@type": type.googleapis.com/envoy.extensions.filters.http.cache.v3.CacheConfig
                          typed_config:
                            "@type": type.googleapis.com/envoy.extensions.http.cache.simple_http_cache.v3.SimpleHttpCacheConfig
                      - name: envoy.filters.http.router
                        typed_config:
                          "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
                    route_config:
                      virtual_hosts:
                        - name: backend
                          domains: ["*"]
                          routes:
                            - match:
                                prefix: "/"
                              route:
                                cluster: local_app
      clusters:
        - name: local_app
          type: STATIC
          load_assignment:
            cluster_name: local_app
            endpoints:
              - lb_endpoints:
                  - endpoint:
                      address:
                        socket_address:
                          address: 127.0.0.1
                          port_value: 8081
```

Your application must return proper Cache-Control headers:

```python
from flask import Flask, jsonify

app = Flask(__name__)

@app.route('/api/products/<product_id>')
def get_product(product_id):
    product = fetch_product(product_id)
    response = jsonify(product)
    response.headers['Cache-Control'] = 'public, max-age=60'
    return response
```

Stop and think: What HTTP headers are absolutely essential for the Envoy sidecar cache filter to know how long to retain a response?
Did You Know?
- Redis processes over 100,000 operations per second on a single core in typical workloads. Despite being single-threaded for command execution, Redis achieves this throughput because it operates entirely in memory and uses efficient data structures like skip lists and hash tables. Since version 6.0, I/O threading handles network reads/writes on separate threads while command execution remains single-threaded.
- Cache stampedes were formally studied in the 2015 paper “Optimal Probabilistic Cache Stampede Prevention” by Vattani, Chierichetti, and Lowenstein. The “probabilistic early expiration” technique from that paper is now standard practice at companies like Facebook, where cache stampedes on popular content could otherwise bring down entire database clusters.
- AWS ElastiCache Serverless (launched 2023) automatically scales Redis with no capacity planning required. It charges per ECPU (ElastiCache Processing Unit) and per GB of storage, eliminating the need to choose instance types. For workloads with variable traffic, this can reduce costs by 40-60% compared to provisioned instances.
- Google Memorystore for Redis Cluster became generally available in 2024, supporting up to 25 shards with 250 GB total capacity. Before this, GCP customers who needed Redis clustering had to self-manage Redis on GKE or use third-party services, a gap that existed for over five years.
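For reference, the early-expiration rule from the Vattani, Chierichetti, and Lowenstein paper mentioned above draws its jitter from an exponential distribution scaled by how long the value takes to recompute. The sketch below shows only that decision rule; the function and parameter names are illustrative, and `rand` is injectable purely so the example is deterministic.

```python
import math
import random


def should_refresh_early(now, expiry, recompute_seconds, beta=1.0, rand=random.random):
    """Return True if this request should recompute the value before it expires.

    now and expiry are timestamps in seconds; recompute_seconds is the measured
    cost of rebuilding the value; beta > 1 favours earlier refreshes.
    """
    # 1.0 - rand() lies in (0, 1], so the logarithm is defined and <= 0.
    jitter = -recompute_seconds * beta * math.log(1.0 - rand())
    return now + jitter >= expiry


# Fixed "random" draw that makes the logarithm equal to about -1,
# so the jitter is about recompute_seconds * beta = 2 seconds.
fixed = lambda: 1.0 - math.exp(-1.0)

near_expiry = should_refresh_early(100.0, 101.0, recompute_seconds=2.0, rand=fixed)
far_from_expiry = should_refresh_early(100.0, 110.0, recompute_seconds=2.0, rand=fixed)
already_expired = should_refresh_early(100.0, 99.0, recompute_seconds=2.0)
```

Because the jitter is usually small relative to the TTL, refreshes cluster in the last moments before expiry instead of being spread across the whole lifetime of the key.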
Common Mistakes
| Mistake | Why It Happens | How to Fix It |
|---|---|---|
| Not setting TTL on cache entries | “We will invalidate manually” | Always set TTL as a safety net, even with manual invalidation |
| Using the same Redis for cache and persistent data | “One cluster is simpler” | Separate cache (can be flushed) from persistent data (sessions, queues) |
| Ignoring memory eviction policy | Default is `noeviction` (errors when full) | Set `maxmemory-policy allkeys-lru` for cache workloads |
| Opening new Redis connection per request | Framework default or developer habit | Use connection pooling; configure `max_connections` per pod |
| No monitoring of cache hit rate | “It is just a cache, it either works or it does not” | Track hit rate, memory usage, evictions, and connection count |
| Caching errors/null results | Cache miss returns null, null gets cached | Check for valid data before caching; use “negative cache” with short TTL only intentionally |
| No circuit breaker when Redis is down | Redis failure cascades to database overload | Implement circuit breaker; serve stale data or degrade gracefully |
| Storing serialized objects larger than 100 KB | Convenient to cache entire API responses | Cache individual fields or use Redis hashes; large values cause latency spikes |
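The circuit-breaker row in the table can be sketched as follows. This is a minimal illustration, not a production implementation: `CircuitBreaker`, its thresholds, and the `flaky_redis_get` function are hypothetical, and the fallback stands in for serving stale or degraded data.

```python
import time


class CircuitBreaker:
    """Stop calling a failing dependency for a cool-down period."""

    def __init__(self, failure_threshold=3, reset_after=5.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at = None

    def call(self, func, fallback):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after:
                return fallback()       # open: do not touch Redis at all
            self._opened_at = None      # half-open: allow one trial call
        try:
            result = func()
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            return fallback()
        self._failures = 0              # success closes the breaker
        return result


# Usage with a fake clock and a Redis call that always fails.
now = [0.0]
attempts = []

def flaky_redis_get():
    attempts.append(1)
    raise ConnectionError("redis unavailable")

breaker = CircuitBreaker(failure_threshold=3, reset_after=5.0, clock=lambda: now[0])
results = [breaker.call(flaky_redis_get, lambda: "stale-value") for _ in range(10)]
attempts_while_open = len(attempts)

now[0] = 6.0                            # cool-down elapsed, one trial call is allowed
breaker.call(flaky_redis_get, lambda: "stale-value")
```

The fallback should be cheap. If it simply queries the database, the breaker protects Redis but moves the overload onto the database, which is the failure described in the table.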
1. An e-commerce site experiences heavy read traffic on its product catalog, but product details rarely change. They also have a shopping cart service that updates constantly. Which caching strategies should they apply to each service, and why?
For the product catalog, they should use the cache-aside (lazy loading) pattern. This pattern only caches data when it is requested, making it ideal for read-heavy workloads where most data is rarely accessed or updated, thus saving memory and reducing initial write overhead. For the shopping cart, they should use the write-through pattern. This pattern writes to both the cache and the database on every write operation. It ensures strict consistency between the cache and database, which is critical for commerce where reading stale cart data could lead to lost sales or customer frustration. The higher write latency is an acceptable trade-off for this consistency.
2. Your marketing team sends out a push notification to 5 million users about a 90% off flash sale on a specific gaming console. The console's cache key expires exactly as the notification lands. Your database instantly crashes. What caused this, and how could you have architected the application to prevent it?
This was caused by a cache stampede. When the popular cache key expired, thousands of concurrent requests all missed the cache and simultaneously queried the database to rebuild it, overwhelming its connection limits. To prevent this, you could implement a distributed locking strategy. When the cache miss occurs, the first request acquires a Redis lock and queries the database, while all other requests either wait briefly or return slightly stale data. Alternatively, you could use probabilistic early expiration, where requests have an increasing chance of refreshing the cache before it actually expires, spreading the database load over time.
3. A junior engineer proposes saving money by running the application's user session data and its rendered HTML page cache on the exact same Redis cluster, as "they both just store key-value pairs." Why is this architectural decision dangerous for production reliability?
This decision is dangerous because caches and persistent data stores have fundamentally different lifecycles and memory requirements. A cache is designed to be ephemeral and can be safely flushed or evicted without data loss, as the database remains the source of truth. User sessions, however, are persistent data that cannot be easily regenerated; losing them logs out users. If placed on the same cluster, the heavy memory pressure from the HTML page cache would trigger Redis’s eviction policies (like allkeys-lru), potentially deleting active user sessions to make room for cached pages.
4. You are provisioning an ElastiCache Redis instance that supports up to 65,000 connections. Your application has 100 pods, each configured with a connection pool size of 50. During a rolling deployment, the database operations team alerts you that Redis connections are being refused. What went wrong with your connection budgeting?
Your connection budget failed to account for the overlapping pods during a rolling deployment. In steady state, 100 pods with 50 connections each hold 5,000 connections, well below the 65,000 limit. During a rolling update, old and new pods run side by side: the default `maxSurge` of 25% adds up to 25 extra pods (6,250 connections), and a `maxSurge` of 100% doubles the fleet to 200 pods (10,000 connections). Because that arithmetic alone still fits under the limit, refused connections point to connections that are not being released: applications that leak connections, timeouts that are configured improperly, or old pods that do not release their connections promptly while terminating. You must always calculate the budget based on the maximum possible simultaneous pods (including surge pods during deployments) plus overhead for monitoring and sidecars.
5. Your company acquired a startup running a monolithic legacy API written in a proprietary language that no one knows how to safely modify. The API is crushing its backend database under read load. How can you implement caching for this API without touching a single line of its code?
You can implement caching by injecting an Envoy sidecar proxy into the legacy application’s Kubernetes pods. Envoy can be configured with an HTTP cache filter that intercepts incoming requests before they reach the application container. If a request matches a cached response, Envoy serves it directly, entirely bypassing the application and the database. This approach requires no code changes, relying instead on standard HTTP Cache-Control headers (if the app emits them) or custom routing rules defined in the Envoy configuration to cache the REST API responses at the network layer.
6. Your cache-aside implementation is throwing intermittent timeouts, and the database is seeing elevated load. You check the Redis cluster and see it is at 100% memory utilization with the `maxmemory-policy` set to `noeviction`. How is this policy directly causing your application's symptoms?
The noeviction policy tells Redis to return an Out of Memory (OOM) error for any write command when it is full, rather than making space. Because your application uses the cache-aside pattern, every cache miss results in a database query followed by an attempt to write the result to Redis. Since Redis rejects the write, the data is never cached. Subsequent requests for the same data result in more cache misses and more database queries, causing the elevated database load. For a cache workload, you must use a policy like allkeys-lru or volatile-lru so Redis automatically deletes old entries to make room for new ones.
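The difference between the two policies in the last answer can be seen in a toy bounded store. This sketch counts keys instead of bytes and uses exact LRU ordering, whereas Redis limits memory and approximates LRU by sampling. `BoundedStore` and `OutOfMemory` are illustrative names.

```python
from collections import OrderedDict


class OutOfMemory(Exception):
    """Raised when a write is rejected because the store is full."""


class BoundedStore:
    def __init__(self, max_keys, policy="noeviction"):
        if policy not in ("noeviction", "allkeys-lru"):
            raise ValueError(f"unsupported policy: {policy}")
        self.max_keys = max_keys
        self.policy = policy
        self.evicted = 0
        self._data = OrderedDict()  # oldest access first

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def set(self, key, value):
        if key not in self._data and len(self._data) >= self.max_keys:
            if self.policy == "noeviction":
                raise OutOfMemory("write rejected: store is full")
            self._data.popitem(last=False)  # drop the least recently used key
            self.evicted += 1
        self._data[key] = value
        self._data.move_to_end(key)


lru = BoundedStore(max_keys=2, policy="allkeys-lru")
lru.set("a", 1)
lru.set("b", 2)
lru.get("a")          # "a" is now more recently used than "b"
lru.set("c", 3)       # evicts "b"

strict = BoundedStore(max_keys=2, policy="noeviction")
strict.set("a", 1)
strict.set("b", 2)
try:
    strict.set("c", 3)
    rejected = False
except OutOfMemory:
    rejected = True
```

Under `noeviction` the cache-aside write in step 3 fails, so the value is never cached and every later request for it becomes another database query.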
Hands-On Exercise: Redis Caching with Stampede Prevention
```bash
# Create kind cluster
kind create cluster --name cache-lab

# The commands in this exercise assume k is an alias for kubectl
alias k=kubectl
```

Task 1: Provision Managed Redis via CLI Simulation
Before deploying the application, let’s practice provisioning a managed Redis instance. Helm stands in for the managed service in the local lab; the commented AWS CLI command shows what the equivalent cloud provisioning call looks like.
Solution

```bash
# In an AWS environment, you would use:
# aws elasticache create-replication-group \
#   --replication-group-id cache-lab-cluster \
#   --replication-group-description "Cache lab" \
#   --engine redis --cache-node-type cache.t4g.micro \
#   --num-cache-clusters 1

# For our local Kubernetes lab, we simulate the managed service via Helm:
helm repo add bitnami https://charts.bitnami.com/bitnami
helm install redis bitnami/redis \
  --namespace cache --create-namespace \
  --set architecture=standalone \
  --set auth.password=cache-lab-pass \
  --set master.persistence.enabled=false \
  --set master.resources.requests.memory=128Mi

k wait --for=condition=ready pod -l app.kubernetes.io/name=redis \
  --namespace cache --timeout=120s
```

Task 2: Implement Cache-Aside Pattern
Deploy a pod that demonstrates cache-aside with Redis.
Solution
```bash
cat <<'EOF' | k apply -n cache -f -
apiVersion: v1
kind: Pod
metadata:
  name: cache-aside-demo
spec:
  restartPolicy: Never
  containers:
    - name: demo
      image: python:3.12-slim
      command:
        - /bin/sh
        - -c
        - |
          pip install redis -q
          python3 << 'PYEOF'
          import redis
          import json
          import time

          r = redis.Redis(
              host='redis-master.cache.svc.cluster.local',
              port=6379,
              password='cache-lab-pass',
              decode_responses=True,
              socket_timeout=2.0,
              max_connections=10,
          )

          # Simulated database
          DATABASE = {
              "prod-101": {"name": "Widget Pro", "price": 29.99, "stock": 150},
              "prod-102": {"name": "Gadget Max", "price": 49.99, "stock": 75},
              "prod-103": {"name": "Tool Kit", "price": 89.99, "stock": 200},
          }

          def get_product(product_id):
              """Cache-aside pattern."""
              cache_key = f"product:{product_id}"

              # Step 1: Check cache
              cached = r.get(cache_key)
              if cached:
                  print(f"  CACHE HIT: {product_id}")
                  return json.loads(cached)

              # Step 2: Cache miss -- "query database"
              print(f"  CACHE MISS: {product_id} (querying DB)")
              time.sleep(0.05)  # Simulate DB latency
              product = DATABASE.get(product_id)

              if product:
                  # Step 3: Populate cache (TTL = 60 seconds)
                  r.setex(cache_key, 60, json.dumps(product))

              return product

          # Demo: First call is a miss, second is a hit
          print("=== Cache-Aside Demo ===")
          for round_num in range(1, 4):
              print(f"\nRound {round_num}:")
              for pid in ["prod-101", "prod-102", "prod-103"]:
                  result = get_product(pid)

          # Show cache stats
          info = r.info("stats")
          print(f"\nHits: {info['keyspace_hits']}, Misses: {info['keyspace_misses']}")
          hit_rate = info['keyspace_hits'] / (info['keyspace_hits'] + info['keyspace_misses']) * 100
          print(f"Hit rate: {hit_rate:.1f}%")
          PYEOF
EOF

k wait --for=condition=ready pod/cache-aside-demo -n cache --timeout=60s
sleep 5
k logs cache-aside-demo -n cache
```

Task 3: Demonstrate Cache Stampede
Simulate a stampede by launching many concurrent requests after a cache key expires.
Solution
```bash
cat <<'EOF' | k apply -n cache -f -
apiVersion: v1
kind: Pod
metadata:
  name: stampede-demo
spec:
  restartPolicy: Never
  containers:
    - name: demo
      image: python:3.12-slim
      command:
        - /bin/sh
        - -c
        - |
          pip install redis -q
          python3 << 'PYEOF'
          import redis
          import json
          import time
          import threading

          r = redis.Redis(
              host='redis-master.cache.svc.cluster.local',
              port=6379,
              password='cache-lab-pass',
              decode_responses=True,
          )

          db_queries = {"count": 0}
          lock = threading.Lock()

          def simulate_db_query():
              """Simulate an expensive database query."""
              with lock:
                  db_queries["count"] += 1
              time.sleep(0.1)  # 100ms DB latency
              return {"name": "Popular Product", "price": 99.99}

          def get_without_protection(product_id):
              """No stampede protection -- every miss hits DB."""
              cached = r.get(f"product:{product_id}")
              if cached:
                  return json.loads(cached)
              data = simulate_db_query()
              r.setex(f"product:{product_id}", 5, json.dumps(data))
              return data

          def get_with_lock_protection(product_id):
              """Distributed lock prevents stampede."""
              cache_key = f"product:{product_id}"
              cached = r.get(cache_key)
              if cached:
                  return json.loads(cached)

              lock_key = f"lock:{cache_key}"
              acquired = r.set(lock_key, "1", nx=True, ex=5)

              if acquired:
                  try:
                      data = simulate_db_query()
                      r.setex(cache_key, 5, json.dumps(data))
                      return data
                  finally:
                      r.delete(lock_key)
              else:
                  time.sleep(0.15)  # Wait for rebuilder
                  cached = r.get(cache_key)
                  return json.loads(cached) if cached else simulate_db_query()

          # Test 1: Without protection
          r.flushall()
          db_queries["count"] = 0
          threads = []
          for i in range(50):
              t = threading.Thread(target=get_without_protection, args=("hot-product",))
              threads.append(t)
              t.start()
          for t in threads:
              t.join()
          print(f"WITHOUT protection: {db_queries['count']} DB queries from 50 requests")

          # Test 2: With lock protection
          r.flushall()
          db_queries["count"] = 0
          threads = []
          for i in range(50):
              t = threading.Thread(target=get_with_lock_protection, args=("hot-product",))
              threads.append(t)
              t.start()
          for t in threads:
              t.join()
          print(f"WITH lock protection: {db_queries['count']} DB queries from 50 requests")
          PYEOF
EOF

k wait --for=condition=ready pod/stampede-demo -n cache --timeout=60s
sleep 10
k logs stampede-demo -n cache
```

Task 4: Monitor Redis Metrics
Create a pod that reports Redis statistics.
Solution
```bash
cat <<'EOF' | k apply -n cache -f -
apiVersion: v1
kind: Pod
metadata:
  name: redis-monitor
spec:
  restartPolicy: Never
  containers:
    - name: monitor
      image: redis:7
      command:
        - /bin/sh
        - -c
        - |
          echo "=== Redis Health Report ==="
          echo ""

          echo "--- Memory ---"
          redis-cli -h redis-master -a cache-lab-pass INFO memory 2>/dev/null | grep -E "used_memory_human|maxmemory_human|mem_fragmentation"

          echo ""
          echo "--- Clients ---"
          redis-cli -h redis-master -a cache-lab-pass INFO clients 2>/dev/null | grep -E "connected_clients|blocked_clients|maxclients"

          echo ""
          echo "--- Stats ---"
          redis-cli -h redis-master -a cache-lab-pass INFO stats 2>/dev/null | grep -E "keyspace_hits|keyspace_misses|evicted_keys|total_commands"

          echo ""
          echo "--- Keyspace ---"
          redis-cli -h redis-master -a cache-lab-pass INFO keyspace 2>/dev/null

          echo ""
          echo "--- Eviction Policy ---"
          redis-cli -h redis-master -a cache-lab-pass CONFIG GET maxmemory-policy 2>/dev/null

          echo ""
          echo "=== Report Complete ==="
EOF

k wait --for=condition=ready pod/redis-monitor -n cache --timeout=30s
sleep 3
k logs redis-monitor -n cache
```

Task 5: Configure Eviction Policy
Change the Redis eviction policy and demonstrate eviction behavior.
Solution
```bash
cat <<'EOF' | k apply -n cache -f -
apiVersion: v1
kind: Pod
metadata:
  name: eviction-demo
spec:
  restartPolicy: Never
  containers:
    - name: demo
      image: python:3.12-slim
      command:
        - /bin/sh
        - -c
        - |
          pip install redis -q
          python3 << 'PYEOF'
          import redis

          r = redis.Redis(
              host='redis-master.cache.svc.cluster.local',
              port=6379,
              password='cache-lab-pass',
              decode_responses=True,
          )

          # Set a small maxmemory for demonstration
          r.config_set('maxmemory', '1mb')
          r.config_set('maxmemory-policy', 'allkeys-lru')
          print("Set maxmemory=1MB, policy=allkeys-lru")

          # Fill cache until evictions happen
          evicted_before = int(r.info('stats')['evicted_keys'])
          for i in range(5000):
              r.set(f"item:{i}", "x" * 200)  # ~200 bytes each

          evicted_after = int(r.info('stats')['evicted_keys'])
          total_keys = r.dbsize()

          print(f"Attempted to write 5000 keys")
          print(f"Keys in Redis: {total_keys}")
          print(f"Keys evicted: {evicted_after - evicted_before}")
          print(f"Eviction policy working correctly: {evicted_after > evicted_before}")

          # Reset maxmemory
          r.config_set('maxmemory', '0')
          r.flushall()
          PYEOF
EOF

k wait --for=condition=ready pod/eviction-demo -n cache --timeout=60s
sleep 8
k logs eviction-demo -n cache
```

Success Criteria
- Redis cluster is successfully provisioned via CLI simulation
- Cache-aside demo shows cache hits on second and third rounds
- Stampede demo shows fewer DB queries with lock protection
- Redis monitor reports memory, clients, and keyspace stats
- Eviction demo shows allkeys-lru evicting keys when memory is full
Cleanup
```bash
kind delete cluster --name cache-lab
```

Next Module: Module 9.6: Search & Analytics Engines (OpenSearch / Elasticsearch). Learn how to ingest Kubernetes logs into managed search engines, configure index lifecycle management, and optimize queries for operational analytics.