Caching Demystified - Complete Series Index
A five-part guide to caching, from first principles through production-grade architecture to interview preparation.
Series Overview
| Part | Title | Topics Covered |
|---|---|---|
| Part 1 | Fundamentals | What is caching, memory hierarchy, core terminology, types of caches, eviction policies, metrics, invalidation, sizing |
| Part 2 | Strategies and Patterns | Cache-Aside, Read-Through, Write-Through, Write-Behind, Write-Around, Refresh-Ahead, multi-level caching, consistent hashing |
| Part 3 | Technologies and Tools | Caffeine, Redis (deep dive), Memcached, HTTP caching, CDN caching, database caching, Spring Boot |
| Part 4 | Pitfalls and Solutions | Cache stampede, penetration, avalanche, breakdown, hot key, stale data, poisoning, cold start, serialization |
| Part 5 | Interview Questions | 48+ Q&As ordered by frequency, from fundamentals through system design to tricky edge cases |
Part 1: Fundamentals
caching-demystified-part1-fundamentals.md
- What is caching and why it exists (with kitchen counter, brain, and desk analogies)
- The memory hierarchy - CPU L1/L2/L3, RAM, SSD, HDD, Network (with latency numbers)
- Why caching matters - the math behind Effective Access Time (EAT)
- Complete terminology reference: Cache Hit, Miss, Cold Miss, Capacity Miss, TTL, Eviction, Invalidation, Stale Data, Warm/Cold cache, Cache Coherence, Stampede
- All types of caches: CPU caches, OS page cache, browser cache, DNS cache, in-process, distributed, CDN, database (InnoDB Buffer Pool, Materialized Views), ORM (Hibernate L1/L2)
- All eviction policies: LRU, LFU, FIFO, MRU, Random, ARC, Redis-specific policies
- Cache metrics: Hit Rate, Miss Rate, Eviction Rate, Memory Utilization, Latency percentiles
- Cache invalidation strategies: TTL-based, Event-driven, Polling, Tag-based, CDC
- The double-write race condition and how to solve it
- Cache sizing and capacity planning - the 80/20 rule in practice
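The Effective Access Time item above is commonly written as a weighted average of the hit and miss paths. The symbols $h$ (hit ratio), $t_{\text{cache}}$ (latency on a hit) and $t_{\text{miss}}$ (total latency on a miss), as well as the sample numbers, are illustrative assumptions rather than figures taken from Part 1:

```latex
\[
\mathrm{EAT} = h \cdot t_{\text{cache}} + (1 - h) \cdot t_{\text{miss}}
\]
\[
h = 0.95,\; t_{\text{cache}} = 1\,\text{ms},\; t_{\text{miss}} = 50\,\text{ms}
\;\Rightarrow\;
\mathrm{EAT} = 0.95 \cdot 1 + 0.05 \cdot 50 = 3.45\,\text{ms}
\]
```

Even with a 95% hit ratio, the miss path contributes 2.5 ms of the 3.45 ms average in this example, which is why small changes in hit ratio matter so much.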
Part 2: Strategies and Patterns
caching-demystified-part2-strategies.md
- Cache-Aside (Lazy Loading): read flow, write flow, Java implementation, pros/cons, when to use
- Read-Through: how it differs from Cache-Aside, Java implementation with Caffeine LoadingCache
- Write-Through: synchronous write to cache + DB, @CachePut in Spring Boot, pros/cons
- Write-Behind (Write-Back): async DB writes, data loss risk, Hazelcast MapStore example, pros/cons
- Write-Around: bypassing cache on writes, combining with Cache-Aside for reads
- Refresh-Ahead: proactive refresh before expiry, Caffeine refreshAfterWrite, XFetch algorithm
- Comprehensive strategy comparison table
- Multi-level tiered caching: L1 (Caffeine) + L2 (Redis) + L3 (Database) with Java code
- Local cache coherence problem and Redis Pub/Sub solution
- Consistent hashing: naive hashing failure, ring-based consistent hashing, virtual nodes
- Redis Cluster hash slots (16,384 slots, CRC16 partitioning)
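As a rough illustration of the ring-based consistent hashing and virtual nodes listed above, here is a minimal sketch in plain Java. The class name `HashRing`, the `node#i` naming of virtual nodes, and the use of CRC32 as the hash are assumptions made for this example; it is not the implementation from Part 2, and it is not how Redis Cluster assigns its hash slots.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;

/** Minimal consistent-hash ring with virtual nodes (illustrative sketch, not thread-safe). */
class HashRing {
    private final TreeMap<Long, String> ring = new TreeMap<>();
    private final int virtualNodes;

    HashRing(int virtualNodes) {
        this.virtualNodes = virtualNodes;
    }

    // CRC32 is used only because it ships with the JDK; a production ring
    // would typically pick a hash with better dispersion.
    private static long hash(String s) {
        CRC32 crc = new CRC32();
        crc.update(s.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    /** Places virtualNodes points for this node on the ring. */
    void addNode(String node) {
        for (int i = 0; i < virtualNodes; i++) {
            // putIfAbsent keeps the existing owner if two points collide.
            ring.putIfAbsent(hash(node + "#" + i), node);
        }
    }

    /** Removes only the points that are owned by this node. */
    void removeNode(String node) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.remove(hash(node + "#" + i), node);
        }
    }

    /** Returns the first node clockwise from the key's position, wrapping around at the end. */
    String nodeFor(String key) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("ring has no nodes");
        }
        Map.Entry<Long, String> entry = ring.ceilingEntry(hash(key));
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }
}
```

Removing a node only reassigns the keys that node owned; every other key keeps its owner, which is the property that naive `hash(key) % N` lacks.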
Part 3: Technologies and Tools
caching-demystified-part3-technologies.md
- Caffeine Cache - why it beats Guava, W-TinyLFU algorithm, size-based, weight-based, time-based eviction, LoadingCache, AsyncLoadingCache, Micrometer metrics
- Redis Deep Dive:
  - Architecture (single-threaded event loop, multi-threaded I/O in Redis 6+)
  - All data structures with real-world use cases: String, Hash, List, Set, Sorted Set, Bitmap, HyperLogLog, Stream
  - Key expiration: EXPIRE, PEXPIRE, EXPIREAT, lazy vs active expiration
  - Persistence: RDB snapshots, AOF (always/everysec/no), combined RDB+AOF
  - High Availability: Replication, Sentinel (quorum, failover), Cluster (hash slots)
  - Memory management: maxmemory policies, fragmentation, defragmentation
  - Performance: Pipelining, Transactions (MULTI/EXEC/WATCH), Lua scripting
- Memcached - architecture, limitations, vs Redis
- HTTP Caching - all Cache-Control directives (max-age, no-cache, no-store, public, private, s-maxage, must-revalidate, immutable, stale-while-revalidate), ETags, Last-Modified, Vary header, conditional requests, Spring Boot HTTP caching
- CDN Caching - edge servers, PoP, hierarchy, cache invalidation (purging), CloudFront/Cloudflare
- Database Caching - InnoDB Buffer Pool tuning, Materialized Views in PostgreSQL
- Spring Boot Caching - @Cacheable, @CachePut, @CacheEvict, @Caching, @CacheConfig, Caffeine config, RedisCacheManager config for Redis, custom KeyGenerator
Part 4: Pitfalls and Solutions
caching-demystified-part4-pitfalls.md
- Cache Stampede / Thundering Herd: mutex lock solution, stale-while-revalidate pattern, Caffeine refreshAfterWrite, randomized TTL, @Cacheable(sync=true)
- Cache Penetration: cache null values (negative caching), Bloom filter (Guava BloomFilter), Bloom filter update on creation
- Cache Avalanche: randomized TTL, circuit breaker (Resilience4j), multi-level caching as buffer, staggered warmup
- Cache Breakdown (Hot Key Expiry): logical expiry pattern (never-expire Redis key), background refresh with distributed lock
- Hot Key Problem: local L1 cache for hot keys, key sharding across N shards, Redis read replicas
- Stale Data and Inconsistency: @TransactionalEventListener(AFTER_COMMIT) for safe invalidation, versioned cache keys
- Cache Poisoning: input validation, namespace isolation, HMAC integrity checking
- Memory Pressure: Redis maxmemory policies, active defragmentation, JVM heap impact
- Cold Start Problem: ApplicationReadyEvent warmup, staggered TTL loading, Caffeine LoadingCache coalescing
- Over-caching vs Under-caching: decision framework (5 questions before caching any data)
- Split-Brain: Redis Sentinel quorum, min-replicas-to-write
- Serialization Pitfalls: Java native serialization dangers, Jackson JSON with FAIL_ON_UNKNOWN_PROPERTIES=false, circular reference handling, JPA lazy loading in cache
- TTL Misconfiguration: too short vs too long, TTL guidelines by data type, dynamic TTL
- Key Design Anti-Patterns: no namespace, SQL as key, inconsistent formats, missing TTL, per-user keys for shared data
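The mutex idea behind the Cache Stampede entry above (one caller loads, the others wait for its result) can be sketched within a single JVM as follows. `SingleFlightCache`, `loadCount`, and the loader function are hypothetical names for this example; TTLs and distributed locking are deliberately left out, so this is a sketch of the idea rather than the solution presented in Part 4.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** In-process cache where concurrent misses on one key trigger a single load (sketch, no TTL). */
class SingleFlightCache<K, V> {
    private final ConcurrentHashMap<K, V> store = new ConcurrentHashMap<>();
    private final Function<K, V> loader;
    private final AtomicInteger loads = new AtomicInteger();

    SingleFlightCache(Function<K, V> loader) {
        this.loader = loader;
    }

    /**
     * computeIfAbsent runs the mapping function at most once per absent key;
     * other threads asking for the same key block until the value is stored.
     * A null result from the loader is not cached.
     */
    V get(K key) {
        return store.computeIfAbsent(key, k -> {
            loads.incrementAndGet();
            return loader.apply(k);
        });
    }

    /** Drops the entry so that the next get() reloads it. */
    void invalidate(K key) {
        store.remove(key);
    }

    /** Number of times the loader was actually invoked. */
    int loadCount() {
        return loads.get();
    }
}
```

Because the loader runs while the map holds an internal lock for that key, it should be short and must not touch the same cache; a production variant would more likely store futures or rely on a library such as Caffeine.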
Part 5: Interview Questions
caching-demystified-part5-interview-questions.md
Section 1: Core Fundamentals (Most Frequently Asked)
Q1. What is caching and why do we use it?
Q2. Cache hit vs cache miss. What is hit ratio?
Q3. What is TTL? Absolute vs sliding TTL?
Q4. Cache eviction policies. Explain LRU.
Q5. What is cache invalidation? Why is it hard?
Q6. Cache vs database - key differences
Q7. In-process cache vs distributed cache
Q8. Main caching strategies overview
Q9. What is a distributed cache? Give examples.
Q10. Caching in Spring Boot (@Cacheable etc.)
Q11. Redis vs Memcached
Q12. Redis persistence (RDB vs AOF)
Section 2: Caching Patterns
Q13. Cache-Aside vs Read-Through comparison
Q14. Write-Through vs Write-Behind comparison
Q15. Consistent hashing - what it is and why it is needed
Q16. Multi-level caching with real example
Section 3: Redis Deep Dive
Q17. Which Redis data structure for which use case (leaderboard, cart, rate limiting, etc.)
Q18. How does Redis expire keys?
Q19. Redis Sentinel vs Redis Cluster
Q20. Redis pipelining and when to use it
Q21. Distributed lock using Redis (SETNX + Lua)
Section 4: Distributed Systems and Scale
Q22. Cache invalidation in microservices (event-driven, Pub/Sub, CDC)
Q23. CAP theorem applied to caching
Q24. Rate limiter design using Redis (fixed window + sliding window)
Section 5: Cache Pathologies
Q25. Cache stampede - what it is and all prevention methods
Q26. Cache penetration - Bloom filter and null caching
Q27. Cache avalanche vs cache stampede
Q28. Cache breakdown vs cache stampede
Q29. Hot key problem and all solutions
Section 6: HTTP and CDN Caching
Q30. Cache-Control headers (no-cache vs no-store distinction)
Q31. ETags - how they work, vs Last-Modified
Section 7: System Design
Q32. Caching layer for high-traffic e-commerce product catalog
Q33. Session management system using Redis
Section 8: Tricky Questions
Q34. Write-through hidden performance trade-offs
Q35. Nightly batch job + LRU cache pollution problem
Q36. Write-Behind and data loss on crash
Q37. Caching null values creates what new problem?
Q38. Achieving strong consistency in distributed cache - what is the cost?
Q39. 10 app servers with local caches - force invalidation
Q40. Two services with different TTLs for same data - problems
Q41. Forced logout of all sessions for a specific user
Section 9: Spring Boot and Java
Q42. @Cacheable vs @CachePut vs @CacheEvict - when to use each
Q43. N+1 problem in JPA and how caching helps (and does not help)
Section 10: Must-Know Deep Dives
Q44. Two-phase invalidation pattern
Q45. Cache coherence vs cache consistency (precise definitions)
Q46. Production cache degradation investigation (incident response approach)
Q47. MESI protocol for CPU cache coherence
Q48. Cache hit rate drops from 97% to 40% - systematic investigation walkthrough
Key Concepts Quick Reference
The 10 Things to Know Cold for Any Caching Interview
- Hit Ratio formula (hits / (hits + misses)) and what counts as good (< 80% = investigate, > 95% = excellent)
- LRU vs LFU - LRU for general use, LFU for skewed access patterns, ARC self-tunes
- Cache-Aside vs Write-Through - Cache-Aside is the default; Write-Through when the cache must stay in sync with the DB on every write
- Cache Stampede - mutex lock (one fetches, others wait), or Refresh-Ahead (refresh before expiry)
- Cache Penetration - Bloom filter (best at scale), null caching (simple but stale-null risk)
- Cache Avalanche - Randomize TTLs at population time, circuit breaker for DB protection
- Redis Sentinel vs Cluster - Sentinel = HA for single shard, Cluster = sharding + HA
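- Consistent hashing - adding or removing a node remaps only about 1/N of the keys, versus nearly all keys with naive modulo hashing; virtual nodes improve balance
- HTTP no-cache vs no-store - no-cache means "may store, but must revalidate before reuse"; no-store means "never store"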
- Distributed invalidation - Event-driven (Kafka/Pub/Sub) is most robust, TTL is safety net
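For the LRU item in the list above, a compact way to see the policy in action is `java.util.LinkedHashMap` in access-order mode. The class name `LruCache` is an assumption for this sketch; it is not thread-safe and is meant only to show the eviction rule, not to replace Caffeine.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Least-recently-used cache built on LinkedHashMap's access ordering (sketch, not thread-safe). */
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        // accessOrder = true: get() and put() move the entry to the "most recent" end.
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    // Called by LinkedHashMap after each insertion; returning true evicts the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }
}
```

With a capacity of 2, inserting `a` and `b`, reading `a`, and then inserting `c` evicts `b`, because `b` is the entry that was touched least recently.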
The Caching Decision Framework
Ask these 5 questions before caching any piece of data:
1. Is this data read significantly more than it is written? (Target: > 5:1)
2. Is the same data requested by multiple users or services?
3. Is fetching this data expensive (> 10ms, external API, complex query)?
4. Can the business tolerate data being N seconds/minutes stale?
5. Is cache invalidation on write feasible and implemented?
If 3+ are YES: Cache it.
If fewer than 3: Question whether caching adds enough value for the added complexity.
Caching Strategy Selection at a Glance
| I need... | Use... |
|---|---|
| Default caching for reads | Cache-Aside + Write-Invalidate |
| Near-zero miss rate for hot keys | Refresh-Ahead (Caffeine refreshAfterWrite) |
| Write consistency (cache and DB always match) | Write-Through |
| High write throughput, some loss OK | Write-Behind |
| Write-once data that should not pollute the cache | Write-Around |
| Multiple cache levels | Caffeine (L1) + Redis (L2) tiered |
| Shared session across instances | Redis with sliding TTL |
| Rate limiting | Redis INCR + EXPIRE (fixed window) or Sorted Set (sliding window) |
| Leaderboard | Redis Sorted Set |
| Distributed lock | Redis SETNX + Lua release script |
| Unique count approximation | Redis HyperLogLog |
| Bit-level tracking (DAU, feature flags) | Redis Bitmap |
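The first row of the selection guide above, Cache-Aside for reads combined with invalidation on write, can be sketched as follows. Two in-memory maps stand in for the database and the cache, and the names `CacheAsideRepository` and `dbReads` are assumptions for this example; a real setup would use Redis or Caffeine and an actual data store.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Cache-Aside reads with write-invalidate (sketch; maps stand in for the DB and the cache). */
class CacheAsideRepository {
    private final Map<String, String> database = new ConcurrentHashMap<>(); // stand-in for the DB
    private final Map<String, String> cache = new ConcurrentHashMap<>();    // stand-in for Redis/Caffeine
    private final AtomicInteger dbReads = new AtomicInteger();

    /** Read path: try the cache first, fall back to the DB on a miss, then populate the cache. */
    Optional<String> read(String key) {
        String cached = cache.get(key);
        if (cached != null) {
            return Optional.of(cached);
        }
        dbReads.incrementAndGet();
        String fromDb = database.get(key);
        if (fromDb != null) {
            cache.put(key, fromDb);
        }
        return Optional.ofNullable(fromDb);
    }

    /** Write path: update the DB, then invalidate (not update) the cached entry. */
    void write(String key, String value) {
        database.put(key, value);
        cache.remove(key);
    }

    /** Number of reads that had to go to the DB. */
    int dbReads() {
        return dbReads.get();
    }
}
```

Invalidating instead of updating the cache on write keeps the write path simple and lets the next read repopulate the entry; a TTL on cached entries would normally act as a safety net on top of this.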
This series is part of the System Design and Backend Engineering learning collection.