Caching Demystified - Part 4: Pitfalls, Problems, and Solutions

Cache Stampede / Thundering Herd
Cache Penetration
Cache Avalanche
Cache Breakdown (Hot Key Expiry)
Hot Key Problem (Cache Hotspot)
Stale Data and Cache Inconsistency
Cache Poisoning (Security)
Memory Pressure and Out-Of-Memory Situations
Cold Start Problem
Over-Caching vs Under-Caching
Distributed Cache Split-Brain
Serialization and Deserialization Pitfalls
TTL Misconfiguration
Cache Key Design Anti-Patterns

1. Cache Stampede / Thundering Herd

What Is It?

A cache stampede occurs when a highly popular cache entry expires, and many concurrent requests simultaneously experience a cache miss. All of them, independently, rush to the database at the same time to fetch the same data. This creates a sudden spike in database load that can overwhelm the database, slow down responses, and in severe cases, trigger a complete system outage.

Visual Representation

T=0s:   1,000 requests/second hit cache. Entry "top_products" has TTL=300s.
        Cache serving all requests at 1ms each. Database load: near zero.

T=300s: Entry "top_products" EXPIRES.
        All concurrent requests: CACHE MISS.
        All 1,000 requests/second rush to database simultaneously.
        Database: Suddenly gets 1,000 identical queries in 1 second.
        Database: Overloaded. Latency spikes to 5,000ms. Connections exhausted.
        Result: Cascading failure.

The Problem in Code

// This code has a stampede vulnerability:
public List<Product> getTopProducts() {
    List<Product> products = cache.get("top_products");
    if (products == null) {
        // ALL concurrent threads reach here simultaneously after cache expires
        // ALL of them execute this expensive query at the same time
        products = productRepository.findTopProducts();  // 500ms query
        cache.set("top_products", products, Duration.ofMinutes(5));
    }
    return products;
}

Solutions

Solution 1: Mutex / Distributed Lock (Only One Thread Refreshes)

@Service
public class ProductService {
 
    @Autowired
    private RedisTemplate<String, Object> redisTemplate;
 
    @Autowired
    private ProductRepository productRepository;
 
    public List<Product> getTopProducts() {
        String cacheKey = "top_products";
        String lockKey = "lock:top_products";
 
        // Fast path: cache hit (no lock needed)
        List<Product> products = (List<Product>) redisTemplate.opsForValue().get(cacheKey);
        if (products != null) return products;
 
        // Cache miss: try to acquire lock
        Boolean acquired = redisTemplate.opsForValue()
                .setIfAbsent(lockKey, "locked", Duration.ofSeconds(5));
 
        if (Boolean.TRUE.equals(acquired)) {
            try {
                // Double-check after acquiring lock (another thread may have populated it)
                products = (List<Product>) redisTemplate.opsForValue().get(cacheKey);
                if (products != null) return products;
 
                // We have the lock - fetch from DB and populate cache
                products = productRepository.findTopProducts();
                redisTemplate.opsForValue().set(cacheKey, products, Duration.ofMinutes(5));
                return products;
            } finally {
                redisTemplate.delete(lockKey);  // Always release lock
            }
        } else {
            // Another thread is refreshing - poll the cache until it is populated.
            // A single 50ms wait is shorter than the ~500ms query, so retry several times
            // (up to ~1s here); otherwise every waiting thread falls through to the database.
            for (int attempt = 0; attempt < 20; attempt++) {
                try {
                    Thread.sleep(50);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break;
                }
                products = (List<Product>) redisTemplate.opsForValue().get(cacheKey);
                if (products != null) return products;
            }

            // Last resort if the refresh failed or timed out: query the database directly
            return productRepository.findTopProducts();
        }
    }
}

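Trade-off: Waiting threads pile up during the refresh window. With thousands of concurrent requests, the thread pool can be exhausted. Note also that the lock is released with an unconditional delete: if the refresh takes longer than the 5-second lock TTL, this thread can delete a lock that another thread has since acquired.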
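One common way to make lock release safe is to store a unique token as the lock value and delete the lock only if it still holds that token. The sketch below illustrates the idea with an in-memory map standing in for Redis; TokenLock and its methods are hypothetical names, not part of any library. Against a real Redis, the compare-and-delete step is typically done atomically with a small Lua script.

```java
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical in-memory stand-in for a Redis lock; names are illustrative only.
class TokenLock {
    private final ConcurrentMap<String, String> locks = new ConcurrentHashMap<>();

    // Mirrors SET key token NX: succeeds only if nobody holds the lock.
    Optional<String> tryAcquire(String lockKey) {
        String token = UUID.randomUUID().toString();
        return locks.putIfAbsent(lockKey, token) == null ? Optional.of(token) : Optional.empty();
    }

    // Compare-and-delete: removes the lock only if it still holds OUR token.
    boolean release(String lockKey, String token) {
        return locks.remove(lockKey, token);
    }

    // Simulates the lock TTL firing while the holder is still working.
    void expire(String lockKey) {
        locks.remove(lockKey);
    }
}

class TokenLockDemo {
    public static void main(String[] args) {
        TokenLock lock = new TokenLock();
        String a = lock.tryAcquire("lock:top_products").orElseThrow();
        lock.expire("lock:top_products");                          // A's lock times out mid-refresh
        String b = lock.tryAcquire("lock:top_products").orElseThrow();
        System.out.println(lock.release("lock:top_products", a));  // false: A no longer owns the lock
        System.out.println(lock.release("lock:top_products", b));  // true: B releases its own lock
    }
}
```

With an unconditional delete, the first release call would have removed B's lock and let a third thread start a duplicate refresh.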

Solution 2: Stale-While-Revalidate (Return Stale, Refresh in Background)

@Service
public class ProductService {
 
    // Cache entry wrapper with logical expiry
    @Data
    @AllArgsConstructor
    public static class CacheWrapper<T> implements Serializable {
        private T data;
        private long logicalExpiry;  // When we WANT to refresh (before Redis TTL)
 
        public boolean isLogicallyExpired() {
            return System.currentTimeMillis() > logicalExpiry;
        }
    }
 
    @Autowired
    private RedisTemplate<String, CacheWrapper<List<Product>>> redisTemplate;
 
    @Autowired
    private ProductRepository productRepository;
 
    private final ExecutorService backgroundRefresher = Executors.newFixedThreadPool(5);
    // Ensures at most one background refresh is in flight per application instance
    private final AtomicBoolean refreshInProgress = new AtomicBoolean(false);

    public List<Product> getTopProducts() {
        String cacheKey = "top_products";
        CacheWrapper<List<Product>> wrapper =
            (CacheWrapper<List<Product>>) redisTemplate.opsForValue().get(cacheKey);

        if (wrapper == null) {
            // True cold miss - must load synchronously
            return loadAndCacheTopProducts();
        }

        if (wrapper.isLogicallyExpired() && refreshInProgress.compareAndSet(false, true)) {
            // Logically expired - return stale data immediately, refresh in background.
            // Without the guard, every request in the expiry window would queue its own refresh.
            backgroundRefresher.submit(() -> {
                try {
                    loadAndCacheTopProducts();  // Refresh asynchronously
                } catch (Exception e) {
                    // Log error, stale data will continue to be served until next refresh attempt
                } finally {
                    refreshInProgress.set(false);
                }
            });
        }
 
        return wrapper.getData();  // Return current (possibly stale) data immediately
    }
 
    private List<Product> loadAndCacheTopProducts() {
        List<Product> products = productRepository.findTopProducts();
        long logicalExpiry = System.currentTimeMillis() + Duration.ofMinutes(5).toMillis();
        CacheWrapper<List<Product>> wrapper = new CacheWrapper<>(products, logicalExpiry);
        // Redis TTL is 10 minutes (physical), logical expiry is 5 minutes
        // This gives us 5 extra minutes to serve stale data during background refresh
        redisTemplate.opsForValue().set("top_products", wrapper, Duration.ofMinutes(10));
        return products;
    }
}

Solution 3: Refresh-Ahead with Caffeine (Automatic)

@Bean
public LoadingCache<String, List<Product>> topProductsCache(ProductRepository repository) {
    return Caffeine.newBuilder()
        .maximumSize(100)
        .expireAfterWrite(5, TimeUnit.MINUTES)
        .refreshAfterWrite(4, TimeUnit.MINUTES)  // Proactive refresh before expiry
        .build(key -> repository.findTopProducts());
}

Caffeine handles the background refresh automatically. The first thread to access the key after refreshAfterWrite triggers an async reload. All concurrent requests continue getting the (slightly stale) cached value.

Solution 4: Randomized TTL (Distribute Expiry Times)

// Instead of all keys having the same TTL, add random jitter
private Duration jitteredTTL(Duration baseTTL) {
    long jitterMs = (long) (Math.random() * 30_000);  // Up to 30 seconds jitter
    return baseTTL.plusMillis(jitterMs);
}
 
// Usage
cache.set("top_products", products, jitteredTTL(Duration.ofMinutes(5)));
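// Different calls will have TTLs between 5:00 and 5:30 minutes
// Keys written at the same time no longer expire together, reducing simultaneous misses.
// Note: jitter does not protect a single hot key on its own; combine it with Solutions 1-3.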

2. Cache Penetration

What Is It?

Cache penetration occurs when requests are made for keys that do not exist in either the cache OR the database. Because the cache always misses (there is nothing to cache), every request falls through to the database. An attacker or a bug can exploit this to generate massive database load with requests for non-existent IDs.

Request: GET /users/999999999 (this user does not exist)

Flow without protection:
  Cache check: MISS (key "user:999999999" doesn't exist)
  DB query: SELECT * FROM users WHERE id = 999999999 -> empty result
  Cache: Nothing to store
  Next request for 999999999: MISS again
  DB query: Again -> empty result
  (Repeat for every request. Database gets hammered.)

With 1,000 requests/second for fake IDs:
  1,000 DB queries/second, all returning empty results.
  Database overloaded, legitimate requests slow down.

Solution 1: Cache Null Values (Negative Caching)

public User getUser(Long userId) {
    String cacheKey = "user:" + userId;
 
    // Check cache - distinguish between null (not cached) and NULL_SENTINEL (cached as not existing)
    Object cached = cache.get(cacheKey);
 
    if (NULL_SENTINEL.equals(cached)) {  // equals(), not ==: a deserialized value is a different String instance
        return null;  // Cached "not found" result
    }
 
    if (cached != null) {
        return (User) cached;  // Cache hit
    }
 
    // Cache miss - check database
    User user = userRepository.findById(userId).orElse(null);
 
    if (user != null) {
        cache.set(cacheKey, user, Duration.ofMinutes(10));  // Cache real user
    } else {
        // Cache the "not found" result with a SHORT TTL
        // Short TTL because the user might be created soon
        cache.set(cacheKey, NULL_SENTINEL, Duration.ofSeconds(30));
    }
 
    return user;
}
 
private static final String NULL_SENTINEL = "NULL";

Caveat: A negative entry can outlive the fact it records. If a user is created while "not found" is still cached for that ID, reads return an incorrect empty response for up to 30 seconds. Keep the null TTL short, and delete the negative entry when the user is created.
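A minimal sketch of that eviction-on-create idea, assuming in-memory maps in place of Redis and the users table. NegativeCachingUserStore is a hypothetical class, TTLs are left out for brevity, and Optional.empty() plays the role of the "not found" marker.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical in-memory sketch; the maps stand in for the cache and the database.
class NegativeCachingUserStore {
    private final Map<Long, String> database = new ConcurrentHashMap<>();
    // Optional.empty() is the cached "not found" marker; a missing map entry means "not cached".
    private final Map<Long, Optional<String>> cache = new ConcurrentHashMap<>();
    private final AtomicInteger databaseReads = new AtomicInteger();

    Optional<String> getUser(long id) {
        Optional<String> cached = cache.get(id);
        if (cached != null) {
            return cached;  // hit: either a real user or a cached "not found"
        }
        databaseReads.incrementAndGet();
        Optional<String> loaded = Optional.ofNullable(database.get(id));
        cache.put(id, loaded);  // caches the negative result too
        return loaded;
    }

    void createUser(long id, String name) {
        database.put(id, name);
        cache.remove(id);  // drop any negative entry so the new user is visible immediately
    }

    int databaseReads() {
        return databaseReads.get();
    }
}
```

Repeated lookups of a missing ID reach the database once, and a later createUser call is visible on the next read instead of after the negative TTL.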

Solution 2: Bloom Filter (More Scalable)

A Bloom filter is a space-efficient probabilistic data structure that can definitively say "this key DEFINITELY DOES NOT EXIST" or "this key MIGHT EXIST."

False negatives: IMPOSSIBLE. If Bloom filter says not in set, it is definitely not in set.
False positives: POSSIBLE. Bloom filter may say "might exist" for something that does not.
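To make the two guarantees concrete, here is a minimal Bloom filter in plain Java. It is an illustrative sketch only: SimpleBloomFilter is a hypothetical class, and the bit mixing and double-hashing scheme are simple choices made for this example, not what Guava uses internally.

```java
import java.util.BitSet;

// Minimal illustrative Bloom filter for long keys; not tuned for production use.
class SimpleBloomFilter {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    SimpleBloomFilter(int numBits, int numHashes) {
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    // Sets one bit per hash function.
    void put(long key) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(index(key, i));
        }
    }

    // false => the key was definitely never added; true => the key may have been added.
    boolean mightContain(long key) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(index(key, i))) {
                return false;
            }
        }
        return true;
    }

    // Double hashing: derive the i-th bit position from two mixed hashes of the key.
    private int index(long key, int i) {
        long h1 = mix(key);
        long h2 = mix(h1) | 1L;
        return (int) Math.floorMod(h1 + i * h2, (long) numBits);
    }

    // 64-bit finalizer-style bit mixer, used here only to spread the key bits.
    private static long mix(long x) {
        x ^= x >>> 33;
        x *= 0xff51afd7ed558ccdL;
        x ^= x >>> 33;
        x *= 0xc4ceb9fe1a85ec53L;
        x ^= x >>> 33;
        return x;
    }
}
```

Every bit set by put stays set, which is why a key that was added can never be reported as absent. A key that was never added can still find all of its bits set by other keys, which is the source of false positives.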

@Configuration
public class BloomFilterConfig {
 
    @Bean
    public BloomFilter<Long> userBloomFilter(UserRepository repository) {
        // Create filter with expected insertions and false positive rate
        BloomFilter<Long> filter = BloomFilter.create(
            Funnels.longFunnel(),
            10_000_000,  // Expected number of users
            0.01         // 1% false positive rate (uses ~11.5 MB)
        );
 
        // Pre-populate with all existing user IDs at startup
        repository.findAllIds().forEach(filter::put);
        return filter;
    }
}
 
@Service
public class UserService {
 
    @Autowired
    private BloomFilter<Long> userBloomFilter;
 
    @Autowired
    private RedisTemplate<String, User> redisTemplate;
 
    @Autowired
    private UserRepository userRepository;
 
    public User getUser(Long userId) {
        // Step 1: Quick Bloom Filter check (~0.001ms, no network)
        if (!userBloomFilter.mightContain(userId)) {
            // DEFINITELY does not exist - skip cache and DB entirely
            return null;
        }
 
        // Step 2: Check cache
        User user = (User) redisTemplate.opsForValue().get("user:" + userId);
        if (user != null) return user;
 
        // Step 3: Check database (Bloom filter said "might exist")
        user = userRepository.findById(userId).orElse(null);
        if (user != null) {
            redisTemplate.opsForValue().set("user:" + userId, user, Duration.ofMinutes(10));
        }
        return user;
    }
 
    // When a new user is created, add to Bloom filter
    public User createUser(CreateUserRequest request) {
        User user = userRepository.save(new User(request));
        userBloomFilter.put(user.getId());  // Keep Bloom filter current
        return user;
    }
}

Bloom Filter trade-offs:

~11.5 MB memory (about 96 million bits) for 10 million items at 1% false positive rate
False positives pass through to the cache/DB (1% of non-existent IDs still cause a DB miss)
Bloom filter does not support deletions (use Counting Bloom Filter if deletions needed)
Perfect for read-heavy systems with many invalid key requests
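The memory figure follows from the standard sizing formulas for a Bloom filter: m = -n * ln(p) / (ln 2)^2 bits and k = (m / n) * ln 2 hash functions. The helper below is a small sketch for checking such numbers; BloomFilterSizing is a hypothetical class, and a given library may round slightly differently.

```java
class BloomFilterSizing {
    // Optimal number of bits: m = -n * ln(p) / (ln 2)^2
    static long optimalBits(long expectedInsertions, double falsePositiveRate) {
        return (long) Math.ceil(-expectedInsertions * Math.log(falsePositiveRate)
                / (Math.log(2) * Math.log(2)));
    }

    // Optimal number of hash functions: k = (m / n) * ln 2
    static int optimalHashFunctions(long expectedInsertions, long bits) {
        return Math.max(1, (int) Math.round((double) bits / expectedInsertions * Math.log(2)));
    }

    public static void main(String[] args) {
        long n = 10_000_000L;
        long bits = optimalBits(n, 0.01);
        System.out.printf("bits=%d, MiB=%.1f, hashes=%d%n",
                bits, bits / 8.0 / (1024 * 1024), optimalHashFunctions(n, bits));
    }
}
```

For 10 million items at a 1% false positive rate this works out to roughly 96 million bits and 7 hash functions. Halving the false positive rate adds about 1.44 bits per item.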

3. Cache Avalanche

What Is It?

A cache avalanche occurs when a large number of cache entries expire simultaneously (or the entire cache becomes unavailable), causing a massive flood of cache misses and corresponding database requests all at once.

Triggers:

Many keys set with the same TTL at the same time (e.g., during a startup cache load).
Cache server restart or failure (all data lost).
Cache flush (intentional or accidental FLUSHDB).

Scenario: Cache warmed at startup. 100,000 keys all set with TTL=3600s.

T=0s:    Cache fully loaded. Hit rate: 99%.
T=3600s: ALL 100,000 keys expire simultaneously.
         100,000 concurrent requests: all MISS.
         Database: Receives 100,000 queries in a few seconds.
         Database: Overwhelmed. Response time: 30,000ms. Connections exhausted.
         Result: Service outage.

Solutions

Solution 1: Randomized TTL (Primary Prevention)

// Do NOT do this:
cache.set("product:" + id, product, Duration.ofHours(1));  // All expire at exactly same time
 
// Do this instead:
private Duration randomTTL(Duration base, Duration maxJitter) {
    long jitterMs = (long) (Math.random() * maxJitter.toMillis());
    return base.plusMillis(jitterMs);
}
 
// All "products" entries expire over a 10-minute window instead of all at once
cache.set("product:" + id, product, randomTTL(Duration.ofHours(1), Duration.ofMinutes(10)));
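To get a feel for how much jitter flattens the expiry spike, the following sketch counts how many keys expire within the same second, with and without jitter. It is a self-contained simulation under assumed numbers (the 100,000 keys and 1-hour TTL from the scenario, a 10-minute jitter window); JitterSpreadDemo is a hypothetical name.

```java
import java.util.Random;

class JitterSpreadDemo {
    // Returns the largest number of keys that expire within the same second.
    static int maxExpiriesPerSecond(int keyCount, int baseTtlSeconds, int maxJitterSeconds, long seed) {
        Random random = new Random(seed);
        int[] expiriesPerSecond = new int[baseTtlSeconds + Math.max(maxJitterSeconds, 1)];
        int max = 0;
        for (int i = 0; i < keyCount; i++) {
            int jitter = maxJitterSeconds > 0 ? random.nextInt(maxJitterSeconds) : 0;
            int expirySecond = baseTtlSeconds + jitter;
            max = Math.max(max, ++expiriesPerSecond[expirySecond]);
        }
        return max;
    }

    public static void main(String[] args) {
        System.out.println("no jitter:      " + maxExpiriesPerSecond(100_000, 3600, 0, 42L));
        System.out.println("10 min jitter:  " + maxExpiriesPerSecond(100_000, 3600, 600, 42L));
    }
}
```

Without jitter all 100,000 keys expire in the same second. With a 600-second window the average is about 167 keys per second, so the database sees a steady trickle of reloads instead of one burst.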

Solution 2: Circuit Breaker (Protect the Database)

// Using Resilience4j CircuitBreaker
@Service
public class ProductService {
 
    @Autowired
    private ProductRepository productRepository;
 
    @Autowired
    private CircuitBreakerRegistry circuitBreakerRegistry;
 
    public Product getProduct(Long productId) {
        String cacheKey = "product:" + productId;
        Product cached = (Product) cache.get(cacheKey);
        if (cached != null) return cached;
 
        CircuitBreaker breaker = circuitBreakerRegistry.circuitBreaker("database");
 
        // If circuit is OPEN (database overloaded), fail fast with fallback
        Supplier<Product> dbCall = CircuitBreaker.decorateSupplier(
            breaker,
            () -> productRepository.findById(productId).orElse(null)
        );
 
        try {
            Product product = dbCall.get();
            if (product != null) {
                cache.set(cacheKey, product, Duration.ofMinutes(10));
            }
            return product;
        } catch (CallNotPermittedException e) {
            // Circuit is OPEN - return stale data or a graceful degradation response
            return getStaleProductFromBackupCache(productId);
        }
    }
}

Solution 3: Multi-Level Cache (L1 Absorbs the Blow)

If a Redis cache avalanche occurs, an in-process L1 cache (even with a very short TTL) absorbs the initial burst of requests:

// L1 local cache - short TTL but very fast
Cache<Long, Product> localCache = Caffeine.newBuilder()
        .maximumSize(10_000)
        .expireAfterWrite(30, TimeUnit.SECONDS)  // Very short TTL
        .build();
 
public Product getProduct(Long productId) {
    // L1 hit (sub-millisecond)
    Product product = localCache.getIfPresent(productId);
    if (product != null) return product;
 
    // L2 (Redis) - only L1 misses reach this point, so during an avalanche the
    // hottest keys are still served locally for up to 30 seconds
    product = (Product) redisTemplate.opsForValue().get("product:" + productId);
    if (product != null) {
        localCache.put(productId, product);
        return product;
    }
 
    // L3: Database (only reached if both L1 and L2 miss)
    product = productRepository.findById(productId).orElse(null);
    if (product != null) {
        redisTemplate.opsForValue().set("product:" + productId, product, Duration.ofMinutes(10));
        localCache.put(productId, product);
    }
    return product;
}

Solution 4: Cache Warmup with Staggered Loading

@Component
public class CacheWarmup implements ApplicationListener<ApplicationReadyEvent> {
 
    @Autowired
    private ProductRepository productRepository;
 
    @Autowired
    private RedisTemplate<String, Product> redisTemplate;
 
    @Override
    public void onApplicationEvent(ApplicationReadyEvent event) {
        // Load hot products with staggered TTLs. This issues one query per product;
        // batch the lookups if startup time matters.
        List<Long> hotProductIds = productRepository.findHotProductIds(10_000);
        Duration baseTTL = Duration.ofHours(1);
        Random random = new Random();
 
        for (Long productId : hotProductIds) {
            Product product = productRepository.findById(productId).orElse(null);
            if (product != null) {
                // Each entry gets a TTL within a 10-minute jitter window
                long jitterSeconds = random.nextInt(600);
                Duration ttl = baseTTL.plusSeconds(jitterSeconds);
                redisTemplate.opsForValue().set("product:" + productId, product, ttl);
            }
        }
    }
}

4. Cache Breakdown (Hot Key Expiry Under Concurrency)

What Is It?

Cache breakdown is similar to cache stampede but specifically refers to a single, extremely hot cache key that expires while being accessed by thousands of concurrent requests. The difference from a general stampede: this is one specific key that is a critical bottleneck.

Scenario: A celebrity's profile page gets 10,000 requests/second.
          Cache key "user:celebrity123" expires.
          10,000 concurrent requests simultaneously:
          - All miss the cache for that one key
          - All query the database for the same user
          Result: Database overloaded by one key expiry

Solutions

Solution 1: Logical Expiry (Never Actually Expire the Redis Key)

The key never expires in Redis (no TTL). Instead, the expiry time is stored INSIDE the value. On access, if the logical expiry has passed, a background refresh is triggered - but the current (stale) value is returned immediately.

@Data
@AllArgsConstructor
@NoArgsConstructor
public class CacheEntry<T> implements Serializable {
    private T data;
    private long logicalExpiryTimestamp;  // Unix milliseconds
 
    public boolean isExpired() {
        return System.currentTimeMillis() > logicalExpiryTimestamp;
    }
}
 
public User getUser(Long userId) {
    String cacheKey = "user:" + userId;
    CacheEntry<User> entry = (CacheEntry<User>) redisTemplate.opsForValue().get(cacheKey);
 
    if (entry == null) {
        // True miss - load synchronously
        return loadAndCache(userId);
    }
 
    if (entry.isExpired()) {
        // Logically expired - return stale data, trigger async refresh
        triggerAsyncRefresh(userId, cacheKey);
    }
 
    return entry.getData();  // Always return immediately
}
 
private void triggerAsyncRefresh(Long userId, String cacheKey) {
    // Use a lock to prevent multiple concurrent refreshes for the same key
    String lockKey = "refresh:lock:" + userId;
    Boolean acquired = redisTemplate.opsForValue()
            .setIfAbsent(lockKey, "1", Duration.ofSeconds(5));
 
    if (Boolean.TRUE.equals(acquired)) {
        CompletableFuture.runAsync(() -> {
            try {
                loadAndCache(userId);
            } finally {
                redisTemplate.delete(lockKey);
            }
        });
    }
    // If not acquired, another thread is already refreshing - do nothing
}
 
private User loadAndCache(Long userId) {
    User user = userRepository.findById(userId).orElse(null);
    if (user != null) {
        long logicalExpiry = System.currentTimeMillis() + Duration.ofMinutes(10).toMillis();
        CacheEntry<User> entry = new CacheEntry<>(user, logicalExpiry);
        // No TTL on the Redis key itself - it never expires at the Redis level
        redisTemplate.opsForValue().set("user:" + userId, entry);
    }
    return user;
}

Solution 2: Distributed Lock (Same as Stampede Prevention)

The same mutex approach from Cache Stampede works here. Only one thread refreshes the hot key; the others wait briefly and then re-read the cache.

5. Hot Key Problem (Cache Hotspot)

What Is It?

A hot key is a cache key that receives a disproportionately high number of requests. Even with a cache cluster, all requests for a hot key go to the single Redis node that owns that key, creating a bottleneck.

Redis Cluster: 10 nodes, each handling 100,000 req/second (balanced)
Hot Key: "trending:global" gets 800,000 req/second alone
         All 800,000 requests go to Node 3 (which owns that hash slot)
         Node 3: Overloaded. CPU 100%. Other keys on Node 3 also slow down.

Solutions

Solution 1: Local Cache Replication of Hot Keys

Replicate hot keys to all application instances' local caches:

@Service
public class TrendingService {
 
    // L1: Local cache for hot keys only - short TTL
    private final Cache<String, List<String>> localHotCache = Caffeine.newBuilder()
            .maximumSize(100)
            .expireAfterWrite(5, TimeUnit.SECONDS)  // Short TTL - hot keys update frequently
            .build();
 
    @Autowired
    private RedisTemplate<String, List<String>> redisTemplate;
 
    public List<String> getTrendingTopics() {
        // Check L1 first (no network, no Redis bottleneck)
        List<String> topics = localHotCache.getIfPresent("trending:global");
        if (topics != null) return topics;
 
        // L2: Redis
        topics = (List<String>) redisTemplate.opsForValue().get("trending:global");
        if (topics != null) {
            localHotCache.put("trending:global", topics);  // Backfill L1
        }
        return topics;
    }
}

Solution 2: Key Sharding (Read Replicas for a Key)

Write to multiple shards of the key, read from a random shard:

@Service
public class HotKeyService {
 
    private static final int SHARD_COUNT = 10;
 
    @Autowired
    private RedisTemplate<String, Object> redisTemplate;
 
    // Write to all shards
    public void setHotValue(String key, Object value, Duration ttl) {
        for (int i = 0; i < SHARD_COUNT; i++) {
            redisTemplate.opsForValue().set(key + ":shard:" + i, value, ttl);
        }
    }
 
    // Read from a random shard (each shard key hashes to its own slot, so reads spread out)
    public Object getHotValue(String key) {
        int shard = ThreadLocalRandom.current().nextInt(SHARD_COUNT);
        return redisTemplate.opsForValue().get(key + ":shard:" + shard);
    }
}
// Effect: "trending:global" load is split across 10 keys, each receiving about 1/10th of the reads.
// The keys map to different hash slots; how many distinct nodes they land on depends on the cluster's slot assignment.
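To check where shard keys would land, the hash slot of each key can be computed locally. Redis Cluster maps a key to one of 16384 slots using CRC16 of the key, or of its {hash tag} if one is present. The sketch below is a plain-Java reimplementation for illustration; RedisSlotCalculator and its methods are hypothetical names, and which node serves a given slot still depends on the cluster's slot assignment.

```java
import java.nio.charset.StandardCharsets;
import java.util.Set;
import java.util.TreeSet;

class RedisSlotCalculator {
    // CRC16 (XMODEM variant: polynomial 0x1021, initial value 0), the variant Redis Cluster uses.
    static int crc16(byte[] bytes) {
        int crc = 0;
        for (byte b : bytes) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = ((crc & 0x8000) != 0) ? ((crc << 1) ^ 0x1021) : (crc << 1);
                crc &= 0xFFFF;
            }
        }
        return crc;
    }

    // Slot = CRC16(key) mod 16384. If the key contains a non-empty {hash tag}, only the tag is hashed.
    static int slot(String key) {
        int open = key.indexOf('{');
        if (open >= 0) {
            int close = key.indexOf('}', open + 1);
            if (close > open + 1) {
                key = key.substring(open + 1, close);
            }
        }
        return crc16(key.getBytes(StandardCharsets.UTF_8)) % 16384;
    }

    // Slots used by the shard keys "<key>:shard:0" .. "<key>:shard:<n-1>".
    static Set<Integer> shardSlots(String key, int shardCount) {
        Set<Integer> slots = new TreeSet<>();
        for (int i = 0; i < shardCount; i++) {
            slots.add(slot(key + ":shard:" + i));
        }
        return slots;
    }

    public static void main(String[] args) {
        System.out.println(shardSlots("trending:global", 10));
    }
}
```

If several shard keys turn out to share a node, a different suffix scheme or more shards can be tried. Avoid wrapping the base key in braces, because a shared hash tag would send every shard to the same slot.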

Solution 3: Redis Read Replicas

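For very high-read hot keys, route reads to Redis replicas. Replication is asynchronous, so replica reads can be slightly stale:

# application.yml - Spring Boot Redis Cluster with read-from-replica
# NOTE: whether (and under which key) read-from can be set as a property depends on the
# Spring Boot version. Where it is not supported, set ReadFrom.REPLICA_PREFERRED through
# a LettuceClientConfigurationBuilderCustomizer bean instead.
spring:
  data:
    redis:
      cluster:
        nodes:
          - redis-master:6379
        read-from: replica-preferred # Route reads to replicas when possible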

6. Stale Data and Cache Inconsistency

The Problem

Cache inconsistency occurs when the cached value no longer matches the database value. This happens because:

The cache was not invalidated when the database was updated.
There is a race condition between the invalidation and a subsequent read.
Multiple services independently cache the same data with different TTLs.

The Dangerous Race Condition

Thread A (Read):                          Thread B (Write):
T1: A reads user:123 - cache miss
T2: A reads user:123 from DB (OLD value)  <- A reads BEFORE B's DB write commits
T3:                                       B writes user:123 to DB (new email)
T4:                                       B deletes cache key user:123
T5: A writes OLD value to cache           <- Stale data in cache!
T6: B's delete already ran at T4, so nothing removes the stale entry.
    Cache has OLD value until TTL expires
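The interleaving is easier to reason about when replayed step by step. The sketch below runs the reader and writer steps sequentially in the problematic order, with plain maps standing in for the database and the cache; StaleCacheRaceReplay and the email values are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class StaleCacheRaceReplay {
    // Replays the race deterministically and returns what ends up in the cache.
    static String replay() {
        Map<String, String> db = new HashMap<>();
        Map<String, String> cache = new HashMap<>();
        db.put("user:123", "old@example.com");

        // Reader A: cache miss, then reads the row before the writer commits.
        String readByA = cache.containsKey("user:123") ? cache.get("user:123") : db.get("user:123");

        // Writer B: updates the database, then invalidates the cache.
        db.put("user:123", "new@example.com");
        cache.remove("user:123");

        // Reader A resumes and populates the cache with the value it read earlier.
        cache.put("user:123", readByA);
        return cache.get("user:123");
    }

    public static void main(String[] args) {
        System.out.println(replay());  // prints old@example.com although the DB holds the new email
    }
}
```

The invalidation did run, but it ran before the stale write, which is why a TTL is still needed as a safety net even when writes invalidate the cache.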

Solutions

Solution 1: Write-Invalidate with Short TTL Safety Net

@Transactional
public User updateUser(Long userId, UserUpdateRequest request) {
    User user = userRepository.findById(userId).orElseThrow();
    user.setEmail(request.getEmail());
    userRepository.save(user);  // DB update
 
    // Use @TransactionalEventListener to ensure cache is evicted AFTER
    // the transaction commits (not during, where rollback could still happen)
    applicationEventPublisher.publishEvent(new UserUpdatedEvent(userId));
    return user;
}
 
@Component
public class CacheInvalidationListener {
 
    @Autowired
    private RedisTemplate<String, Object> redisTemplate;
 
    @TransactionalEventListener(phase = TransactionPhase.AFTER_COMMIT)
    public void onUserUpdated(UserUpdatedEvent event) {
        // Only runs after transaction successfully commits
        redisTemplate.delete("user:" + event.getUserId());
    }
}

Key insight: Using @TransactionalEventListener(phase = AFTER_COMMIT) ensures cache invalidation only happens after the database transaction is committed, not during. This prevents the case where you evict the cache but the transaction rolls back, leaving the cache empty for valid data.

Solution 2: Versioned Cache Keys

public User getUser(Long userId) {
    // Version is stored separately in a fast counter
    Long version = getVersion("user:" + userId);  // e.g., Redis GET or ZooKeeper
    String cacheKey = "user:" + userId + ":v" + version;
 
    User user = (User) cache.get(cacheKey);
    if (user == null) {
        user = userRepository.findById(userId).orElseThrow();
        cache.set(cacheKey, user, Duration.ofMinutes(10));
    }
    return user;
}
 
public User updateUser(Long userId, UserUpdateRequest request) {
    User user = userRepository.save(buildUpdatedUser(userId, request));
    // Incrementing version "invalidates" old cache entry without explicit delete
    // Old versioned key is no longer read and will expire via its TTL
    incrementVersion("user:" + userId);
    return user;
}

Solution 3: Cache-Aside with Optimistic Locking

public User getUser(Long userId) {
    String cacheKey = "user:" + userId;
 
    // Try cache first
    Object[] cached = (Object[]) redisTemplate.opsForValue().get(cacheKey);
    if (cached != null) {
        Long cachedVersion = (Long) cached[0];
        User cachedUser = (User) cached[1];
 
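        // Optional: version check against the DB to detect staleness.
        // This costs a DB round-trip on every cache hit, so it only pays off when the
        // version lookup is much cheaper than loading the full row
        // (only for critical consistency scenarios)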
        Long dbVersion = userRepository.findVersionById(userId);
        if (cachedVersion.equals(dbVersion)) {
            return cachedUser;
        }
        // Version mismatch - cache is stale, fall through to DB
    }
 
    User user = userRepository.findById(userId).orElseThrow();
    redisTemplate.opsForValue().set(cacheKey,
        new Object[] { user.getVersion(), user }, Duration.ofMinutes(5));
    return user;
}

7. Cache Poisoning (Security)

What Is It?

Cache poisoning occurs when an attacker manipulates the cache to store malicious or incorrect data that is then served to other users.

Attack vectors:

HTTP Response Splitting:
If an application caches HTTP responses based on URL, and the URL construction includes user input that is not properly sanitized, an attacker can craft a URL that stores malicious content:

Vulnerable URL construction:
cache_key = "response:" + request.getHeader("Host")

Attacker sends:
Host: legitimate.com\r\nX-Forwarded-Host: evil.com

If the header value is used in the cache key without validation, the injected CR/LF sequence can corrupt the key or poison entries that are later served to other users.

Cache Key Injection:
If cache keys are built using user-controlled input without sanitization:

// VULNERABLE:
String cacheKey = "user:" + request.getParameter("userId");
// Attacker sends userId = "123:admin" -> cacheKey = "user:123:admin" (collision!)
 
// SAFE:
String cacheKey = "user:" + sanitize(request.getParameter("userId"));
// sanitize() validates that userId is a numeric integer

Prevention

Never trust user input in cache key construction. Always validate and sanitize.
Namespace cache keys by tenant/service to prevent cross-tenant pollution.
Use signed/HMAC cache values for sensitive data (verify integrity before serving).
Use separate cache instances for different security domains (anonymous vs authenticated).
Validate response integrity when serving from cache for sensitive operations.
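As a sketch of the signed-value idea, the class below prefixes each cached payload with an HMAC-SHA256 signature and refuses to return a payload whose signature does not match. SignedCacheValue, the "signature.payload" storage format, and the way the secret is supplied are assumptions made for this example; key management is out of scope.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper: signs values before caching and verifies them before serving.
class SignedCacheValue {
    private final SecretKeySpec key;

    SignedCacheValue(byte[] secret) {
        this.key = new SecretKeySpec(secret, "HmacSHA256");
    }

    // Produces "<base64 signature>.<payload>" for storage in the cache.
    String sign(String payload) {
        return Base64.getEncoder().encodeToString(hmac(payload)) + "." + payload;
    }

    // Returns the payload if the signature matches, otherwise null (treat it as a cache miss).
    String verify(String stored) {
        int dot = stored.indexOf('.');
        if (dot < 0) {
            return null;
        }
        byte[] claimed;
        try {
            claimed = Base64.getDecoder().decode(stored.substring(0, dot));
        } catch (IllegalArgumentException e) {
            return null;  // signature part is not valid Base64
        }
        String payload = stored.substring(dot + 1);
        // MessageDigest.isEqual compares in constant time
        return MessageDigest.isEqual(claimed, hmac(payload)) ? payload : null;
    }

    private byte[] hmac(String payload) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(key);
            return mac.doFinal(payload.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 unavailable", e);
        }
    }
}
```

An attacker who can write to the cache but does not know the secret cannot produce a value that passes verify, so a tampered entry degrades to a cache miss rather than being served.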

// Safer cache key construction
public String buildSafeCacheKey(String prefix, Long id) {
    // Validate that id is present and positive (a null Long would otherwise throw on unboxing)
    if (id == null || id <= 0) throw new IllegalArgumentException("Invalid id");
    return prefix + ":" + id;  // Known-safe format
}

8. Memory Pressure and Out-Of-Memory Situations

Redis Hitting maxmemory

When Redis reaches its maxmemory limit, behavior depends on the configured maxmemory-policy:

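noeviction: Write commands that need more memory fail with an OOM error, while reads keep working. Every application path that writes to the cache starts failing.
allkeys-lru: Least recently used keys are evicted silently (Redis approximates LRU by sampling). Hit rate may drop.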
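The effect of an LRU policy can be illustrated with a small in-process cache. This sketch uses LinkedHashMap in access order, which gives exact LRU bounded by entry count; Redis itself bounds by bytes and uses an approximated, sampled LRU, so treat this only as a model of the behavior. LruCache is a hypothetical class.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Exact LRU bounded by entry count; a simplified model of an LRU eviction policy.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        super(16, 0.75f, true);  // accessOrder = true: get() marks an entry as most recently used
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;  // evict silently once the limit is exceeded
    }
}

class LruCacheDemo {
    public static void main(String[] args) {
        LruCache<String, String> cache = new LruCache<>(2);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.get("a");       // "a" is now more recently used than "b"
        cache.put("c", "3");  // over the limit: "b" is evicted
        System.out.println(cache.keySet());  // prints [a, c]
    }
}
```

The write of "c" succeeds and "b" disappears without any error, which is why eviction shows up as a falling hit rate rather than as failures.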

Monitoring Redis memory:

redis-cli INFO memory

used_memory:               4294967296  # ~4 GB currently used
used_memory_human:         4.00G
maxmemory:                 4294967296  # 4 GB limit
maxmemory_human:           4.00G
mem_fragmentation_ratio:   1.25        # 25% fragmentation - acceptable

redis-cli INFO stats

evicted_keys:              12345       # Keys evicted due to maxmemory

When fragmentation ratio is high (> 1.5):

# Enable active defragmentation (Redis 4.0+, requires a build that uses jemalloc)
redis-cli CONFIG SET activedefrag yes
redis-cli CONFIG SET active-defrag-ignore-bytes 100mb
redis-cli CONFIG SET active-defrag-threshold-lower 10

JVM In-Process Cache Memory Pressure

Too much data in a Caffeine/Guava cache can cause:

Heap pressure: Objects fill up the JVM heap.
Increased GC pressure: More live objects = more GC pauses.
OOM (OutOfMemoryError): Application crashes.

// Best practices for in-process cache memory management
Cache<String, LargeObject> cache = Caffeine.newBuilder()
    .maximumWeight(200 * 1024 * 1024)  // 200 MB weight limit (better than count limit for variable-size objects)
    // Explicit lambda parameter types let the builder infer <String, LargeObject>
    .weigher((String key, LargeObject value) -> value.getSerializedSize())
    .softValues()  // Use soft references: JVM can GC these under memory pressure
    // Note: softValues slows down cache because GC may clear values unexpectedly
    .build();

Monitoring cache memory impact:

Use JVM heap dumps to see what percentage of heap is cache data.
Track GC pause times (long pauses after adding caches = too much heap usage).
Use Micrometer + Actuator to expose cache.size metric.

9. Cold Start Problem

What Is It?

When a new application instance starts (or after a cache flush), the cache is empty (cold). Every incoming request results in a cache miss and hits the database. If traffic is high at startup, the database can be overwhelmed during the cold start period.

Cold start scenario:
- Service instance restarts after a deployment
- Cache is empty
- 5,000 req/second immediately hit the service
- 5,000 req/second hit the database (all cache misses)
- Database: overwhelmed, latency spikes to 10+ seconds
- Other services calling this one: timeouts, cascading failure

Solutions

Solution 1: Pre-Warming at Startup

@Component
@Slf4j
public class CacheWarmupService implements ApplicationListener<ApplicationReadyEvent> {
 
    @Autowired
    private ProductRepository productRepository;
 
    @Autowired
    private UserRepository userRepository;
 
    @Autowired
    private RedisTemplate<String, Object> redisTemplate;
 
    @Override
    public void onApplicationEvent(ApplicationReadyEvent event) {
        log.info("Starting cache warmup...");
        warmupHotProducts();
        warmupActiveUsers();
        log.info("Cache warmup complete");
    }
 
    private void warmupHotProducts() {
        // Load top 10,000 products by view count
        List<Product> hotProducts = productRepository.findTopByViewCount(10_000);
        hotProducts.parallelStream().forEach(product -> {
            Duration ttl = Duration.ofMinutes(30)
                .plusSeconds(ThreadLocalRandom.current().nextInt(600)); // Jittered TTL
            redisTemplate.opsForValue().set("product:" + product.getId(), product, ttl);
        });
        log.info("Warmed {} hot products", hotProducts.size());
    }
 
    private void warmupActiveUsers() {
        // Load recently active users
        LocalDateTime since = LocalDateTime.now().minusDays(7);
        List<User> activeUsers = userRepository.findActiveUsersSince(since, 50_000);
        activeUsers.forEach(user -> {
            redisTemplate.opsForValue().set("user:" + user.getId(), user,
                Duration.ofMinutes(10).plusSeconds(ThreadLocalRandom.current().nextInt(300)));
        });
        log.info("Warmed {} active users", activeUsers.size());
    }
}

Solution 2: Gradual Traffic Ramp-Up (Load Balancer Level)

Configure the load balancer to gradually increase traffic to a new instance:

# Slow start: new instances receive a small fraction of traffic initially, increasing over time.
# This gives the cache time to warm before full traffic hits.
# AWS ALB: set the target group attribute slow_start.duration_seconds.

# Nginx upstream slow start (the slow_start parameter requires NGINX Plus):
upstream backend {
    server new_instance:8080 slow_start=60s;  # 60 seconds to ramp to full weight
    server existing_instance:8080;
}

Solution 3: Read-Through with Request Coalescing

When many requests come in for the same uncached key simultaneously, only fetch the data once:

// Caffeine's LoadingCache handles request coalescing automatically
LoadingCache<Long, Product> productCache = Caffeine.newBuilder()
    .maximumSize(100_000)
    .expireAfterWrite(10, TimeUnit.MINUTES)
    .build(id -> productRepository.findById(id).orElseThrow());
 
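// Even if 1,000 threads in the same JVM simultaneously request product 123 during cold start,
// Caffeine runs the loader once for that key. The other threads wait for that one result.
// Coalescing is per application instance: N instances can still issue up to N queries.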
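
To make the coalescing idea concrete without depending on Caffeine, here is a minimal JDK-only sketch. It is not how Caffeine is implemented; CoalescingLoader and countLoads are hypothetical names introduced for illustration, and the helper only deduplicates in-flight loads (it does not store results afterwards).

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical helper (not part of Caffeine): at most one in-flight load per key. */
final class CoalescingLoader<K, V> {
    private final ConcurrentMap<K, CompletableFuture<V>> inFlight = new ConcurrentHashMap<>();
    private final Function<K, V> loader;

    CoalescingLoader(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        CompletableFuture<V> mine = new CompletableFuture<>();
        CompletableFuture<V> existing = inFlight.putIfAbsent(key, mine);
        if (existing != null) {
            return existing.join();           // another thread is already loading this key
        }
        try {
            V value = loader.apply(key);      // only the first caller reaches the loader
            mine.complete(value);
            return value;
        } catch (RuntimeException | Error e) {
            mine.completeExceptionally(e);    // waiting threads see the failure too
            throw e;
        } finally {
            inFlight.remove(key, mine);       // later requests trigger a fresh load
        }
    }

    /** Demo: fires `threads` concurrent requests for one key and returns how many loads ran. */
    static int countLoads(int threads) {
        AtomicInteger loads = new AtomicInteger();
        CountDownLatch arrived = new CountDownLatch(threads);
        CoalescingLoader<Long, String> products = new CoalescingLoader<>(id -> {
            loads.incrementAndGet();
            try {
                arrived.await();              // keep the simulated query open until all callers arrive
                Thread.sleep(100);            // give the last callers time to queue on the future
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "product-" + id;
        });
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                arrived.countDown();
                products.get(123L);
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return loads.get();
    }

    public static void main(String[] args) {
        System.out.println("database loads: " + countLoads(50));
    }
}
```

In a real service, the loader would be the database call and the result would then be written to the cache, so that requests arriving after the load completes are served as hits.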

10. Over-Caching vs Under-Caching

Under-Caching Problems

Application is slow because data that SHOULD be cached is not.
Database is under heavy load from repeated identical queries.
High infrastructure costs from unnecessarily large database clusters.

Signs of under-caching:

Database CPU and query throughput at high levels for simple read queries.
Application response times much higher than expected.
Many identical queries visible in slow query log.

Over-Caching Problems

Stale data: Users see outdated information because TTLs are too long.
Memory waste: Cache holds data that is never re-read.
Increased complexity: More cache entries to invalidate on data changes.
Debugging difficulty: Hard to reproduce bugs when state is spread across cache and DB.
Cache churn: Constantly caching data that expires before being read.

Decision Framework: Should This Be Cached?

Use this checklist before adding a cache entry:

Question                                                        Yes        No
-------------------------------------------------------------------------------------
Is this data read at least 10x more often than it is written?   Cache it   Skip
Is this data requested by multiple users?                       Cache it   Skip
Is the retrieval expensive (>10ms)?                             Cache it   Skip
Is the data's rate of change predictable?                       Cache it   Skip
Can the business tolerate N seconds of staleness?               Cache it   Skip
Is cache invalidation on write feasible?                        Cache it   Reconsider

If 3 or more "Yes" answers: Cache it.
If 2 or fewer: Question whether caching adds enough value to justify complexity.

What Should Always Be Cached

Reference data (country codes, timezone lists, currency tables) - changes rarely
Configuration data fetched from external services - expensive to re-fetch
Results of expensive database aggregations - report data, dashboard totals
Results of external API calls with rate limits - geocoding, payments SDKs
User session data - frequent access, simple structure

What Should Generally NOT Be Cached

One-time generated tokens (password reset, email verification) - security risk
User-specific personalized data where each user sees different content (unless scoped by user ID)
Data that must always be consistent (account balance, inventory stock for purchasing)
Administrative/audit logs - must be durable and accurate
Data that changes more frequently than the cache TTL

11. Distributed Cache Split-Brain

What Is It?

In a distributed cache cluster, a network partition can isolate groups of nodes from each other, so that more than one node believes it is the authoritative master. This is called split-brain.

Normal state:
[Master] <-> [Replica 1] <-> [Replica 2]

Network partition:
[Master] X [Replica 1] <-> [Replica 2]

- Master thinks it is still the master, serves writes.
- Replica 1 + 2 elect Replica 1 as the new master, also serving writes.
- Two "masters" exist, accepting conflicting writes.
- When partition heals: data conflict. Which master's data wins?

Impact on Caching

In a cache context, split-brain typically means:

Some clients write to the old master (stale node).
Some clients write to the new master (correct node).
When partition heals, one node's data is discarded.
Result: Cache inconsistency (different clients may have cached different values).

Redis Sentinel Protection

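Sentinel's quorum mechanism reduces the risk of split-brain, and min-replicas-to-write limits the damage if it still happens:

At least (quorum) Sentinels must agree that the master is unreachable before a failover is started, and a majority of all Sentinels must authorize it.
With 3 Sentinels (quorum = 2), a network partition that isolates 1 Sentinel cannot trigger a false failover.
min-replicas-to-write 1: Master refuses writes if it cannot replicate to at least 1 replica, so an isolated master stops accepting writes once its replicas have been unreachable for longer than min-replicas-max-lag.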

# redis.conf - protection against split-brain writes to isolated master
# (redis.conf comments must be on their own line, not after a directive)

# Refuse writes if fewer than 1 replica can be reached
min-replicas-to-write 1
# A replica counts as reachable only if its lag is at most 10 seconds
min-replicas-max-lag 10
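
The interaction of the two settings can be pictured with a small model. This is a simplified sketch of the decision, not Redis source code: MinReplicasCheck and acceptsWrites are hypothetical names, and each lag value stands for the seconds since that replica last acknowledged the master.

```java
import java.util.List;

/** Simplified, hypothetical model of the min-replicas-to-write check on a master. */
final class MinReplicasCheck {
    private MinReplicasCheck() {}

    static boolean acceptsWrites(List<Integer> replicaLagSeconds,
                                 int minReplicasToWrite,
                                 int maxLagSeconds) {
        if (minReplicasToWrite == 0) {
            return true; // feature disabled: the master always accepts writes
        }
        // A replica counts as healthy only if it acknowledged recently enough.
        long healthy = replicaLagSeconds.stream()
                .filter(lag -> lag <= maxLagSeconds)
                .count();
        return healthy >= minReplicasToWrite;
    }

    public static void main(String[] args) {
        // Normal operation: both replicas acknowledged within the last 2 seconds.
        System.out.println(acceptsWrites(List.of(1, 2), 1, 10));
        // Partitioned master: no acknowledgement for 15+ seconds, writes are refused.
        System.out.println(acceptsWrites(List.of(15, 30), 1, 10));
    }
}
```

Under this model, an isolated master can still accept writes for up to min-replicas-max-lag seconds after the partition begins, so the setting bounds the window of conflicting writes rather than eliminating it.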

12. Serialization and Deserialization Pitfalls

Problem 1: Java Native Serialization

// AVOID: Java serialization is slow, brittle, and produces large output
redisTemplate.opsForValue().set("user:123", user);  // Uses Java Serialization by default!
 
// The serialized class must match exactly on deserialization.
// Without an explicit serialVersionUID, adding or removing a field in the User class changes
// the auto-generated ID, and deserialization FAILS for all existing cache entries.
// In effect, every deployment that changes a cached class invalidates the whole cache.

Problem 2: Class Version Mismatch on Deployment

// Version 1 of User class (in cache)
public class User implements Serializable {
    private Long id;
    private String name;
    private static final long serialVersionUID = 1L;
}
 
// Version 2 of User class (new deployment)
public class User implements Serializable {
    private Long id;
    private String name;
    private String email;  // New field added
    private static final long serialVersionUID = 1L;  // unchanged - compatible
}
// Result: Old cache entries deserialized, email field is null. Acceptable.
 
// But if serialVersionUID is not explicitly set and class changes:
// Java auto-generates a new serialVersionUID -> InvalidClassException on deserialization
// All cached entries become unreadable. Mass cache miss on deployment.

Best Practice: Use JSON Serialization

@Configuration
public class RedisConfig {
 
    @Bean
    public RedisTemplate<String, Object> redisTemplate(RedisConnectionFactory factory) {
        RedisTemplate<String, Object> template = new RedisTemplate<>();
        template.setConnectionFactory(factory);
 
        // Use String serializer for keys
        template.setKeySerializer(new StringRedisSerializer());
        template.setHashKeySerializer(new StringRedisSerializer());
 
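        // Use Jackson JSON for values (human-readable, schema-flexible)
        ObjectMapper objectMapper = new ObjectMapper()
                .registerModule(new JavaTimeModule())
                // Ignore JSON properties that the current class does not declare
                .configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false);
        // Note: with a custom ObjectMapper, type information is written only if default typing
        // is enabled on that mapper; without it, values are read back as LinkedHashMap.
        GenericJackson2JsonRedisSerializer jsonSerializer =
                new GenericJackson2JsonRedisSerializer(objectMapper);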
 
        template.setValueSerializer(jsonSerializer);
        template.setHashValueSerializer(jsonSerializer);
        return template;
    }
}

Key settings:

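FAIL_ON_UNKNOWN_PROPERTIES = false: JSON properties that the current class does not declare are ignored, so removing or renaming a field (or reading entries written by a newer version during a rolling deployment) does not break deserialization. Fields newly added to the class are simply left null when old cache entries are read.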
JavaTimeModule: Handles Java 8+ date/time types (LocalDate, Instant, etc.).

Problem 3: Caching Objects with Circular References

// PROBLEM: User -> Orders -> User (circular reference)
public class User {
    private List<Order> orders;  // Each Order references back to User
}
 
// Jackson recurses through the cycle until it fails with a JsonMappingException (infinite recursion)
// Fix: break the cycle with @JsonIgnore or use DTOs for caching
@JsonIgnoreProperties({"user"})
public class Order {
    @JsonIgnore
    private User user;  // Don't serialize the back-reference
    private Long userId;  // Store ID instead of object reference
}

Problem 4: Caching Lazy-Loaded JPA Entities

// PROBLEM: Caching a JPA entity with a lazy-loaded collection
@Entity
public class Product {
    @OneToMany(fetch = FetchType.LAZY)
    private List<Review> reviews;  // NOT loaded until accessed within a session
}
 
// When you cache the entity and later retrieve it from cache,
// the reviews are not loaded AND there is no active Hibernate session to load them.
// Accessing reviews on a cached entity throws LazyInitializationException.
 
// FIX 1: Use EAGER loading for cached entities (can cause N+1 problems)
// FIX 2: Cache a DTO (Data Transfer Object) with all needed data already populated
// FIX 3: Initialize collections before caching: Hibernate.initialize(product.getReviews())
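
FIX 2 can be sketched as follows. This is a minimal illustration under stated assumptions: Product and Review here are plain stand-ins for the JPA entities (annotations omitted so the example compiles on its own), the text field on Review is hypothetical, and ProductCacheDto is a name introduced for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Stand-ins for the JPA entities; in the real code these carry @Entity / @OneToMany.
class Review {
    String text; // hypothetical field

    Review(String text) {
        this.text = text;
    }
}

class Product {
    Long id;
    String name;
    List<Review> reviews = new ArrayList<>(); // lazy-loaded in the real entity
}

/** Hypothetical cache DTO: plain values only, no Hibernate proxies or entity references. */
final class ProductCacheDto {
    private final Long id;
    private final String name;
    private final List<String> reviewTexts;

    private ProductCacheDto(Long id, String name, List<String> reviewTexts) {
        this.id = id;
        this.name = name;
        this.reviewTexts = reviewTexts;
    }

    /** Call while the persistence session is open so the lazy collection can still load. */
    static ProductCacheDto from(Product product) {
        List<String> texts = new ArrayList<>();
        for (Review review : product.reviews) { // iterating forces the collection to load
            texts.add(review.text);
        }
        return new ProductCacheDto(product.id, product.name,
                Collections.unmodifiableList(texts));
    }

    Long getId() { return id; }
    String getName() { return name; }
    List<String> getReviewTexts() { return reviewTexts; }
}
```

Because the DTO copies the values it needs at construction time, the cached object no longer depends on a session, and later changes to the entity do not leak into the cached copy.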

13. TTL Misconfiguration

Too Short TTL

Problem: Cache is constantly expiring. Every few seconds, entries expire and must be re-fetched from the database. High cache miss rate despite having a cache.

Signs: High cache miss rate, high database query rate, cache not providing meaningful latency reduction.

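Example: Setting TTL to 5 seconds on data that takes 200ms to fetch. Every 5 seconds the entry expires and the next request for it goes back to the database. For keys that are requested only once every 20 to 45 seconds, the hit rate is roughly 10-20%.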
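
How low the hit rate falls depends on how often each key is requested relative to its TTL. The sketch below gives a rough per-key estimate. It assumes requests arrive randomly (Poisson arrivals) and a fixed, non-sliding TTL that starts on each miss; TtlHitRate and estimate are hypothetical names, and real traffic will deviate from this model.

```java
/** Hypothetical helper: rough per-key hit-ratio estimate for a fixed (non-sliding) TTL. */
final class TtlHitRate {
    private TtlHitRate() {}

    /**
     * Each expiry cycle consists of 1 miss (which repopulates the entry) followed by
     * an expected (requestsPerSecond * ttlSeconds) hits before the entry expires again.
     */
    static double estimate(double requestsPerSecond, double ttlSeconds) {
        double expectedHitsPerCycle = requestsPerSecond * ttlSeconds;
        return expectedHitsPerCycle / (1.0 + expectedHitsPerCycle);
    }

    public static void main(String[] args) {
        // A key requested once every 20 seconds with a 5-second TTL: about 20% hits.
        System.out.printf("TTL 5s:   %.2f%n", estimate(1.0 / 20, 5));
        // The same key with a 5-minute TTL: about 94% hits.
        System.out.printf("TTL 300s: %.2f%n", estimate(1.0 / 20, 300));
    }
}
```

The estimate also shows why a short TTL hurts the long tail of keys far more than the hottest ones: a key requested 1,000 times per second still hits almost every time with a 5-second TTL, although each of its expiries remains a stampede risk.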

Too Long TTL

Problem: Stale data persists in cache. Users see outdated information for extended periods.

Example: Setting TTL to 7 days on user profile data. If a user changes their email, every service will show the old email for up to 7 days unless explicit invalidation is implemented.

Guidelines for TTL Selection

Data Type                         Recommended TTL
--------------------------------  ----------------
Static reference data             1 hour to 1 day (or indefinitely with event-driven invalidation)
Product catalog (rarely changes)  30 minutes to 2 hours
User session data                 30 minutes (sliding TTL)
User profile (non-critical)       5 to 30 minutes
Real-time data (stock price)      1 to 5 seconds
API rate limit counters           60 seconds (per minute window)
Search results                    2 to 10 minutes
Dashboard aggregates              1 to 5 minutes
Authentication tokens             Match token expiry
Null values (negative caching)    30 seconds to 2 minutes

Dynamic TTL Based on Data Volatility

public Duration calculateTTL(String dataType, Object data) {
    return switch (dataType) {
        case "product_price" -> {
            // Shorter TTL during sale hours, longer at night
            int hour = LocalTime.now().getHour();
            yield (hour >= 9 && hour <= 21)
                    ? Duration.ofMinutes(2)
                    : Duration.ofMinutes(15);
        }
        case "user_profile" -> Duration.ofMinutes(10);
        case "reference_data" -> Duration.ofHours(4);
        default -> Duration.ofMinutes(5);
    };
}

14. Cache Key Design Anti-Patterns

Anti-Pattern 1: No Namespace

// BAD: No namespace - easy to collide between services
cache.set("123", user);           // Is this a user? product? order?
cache.set("config", settings);   // Which service's config?
 
// GOOD: Namespaced keys
cache.set("users-svc:user:123", user);
cache.set("catalog-svc:config", settings);

Anti-Pattern 2: Using Full SQL as Cache Key

// BAD: SQL query as cache key
String key = "SELECT * FROM products WHERE category='Electronics' AND price < 500 ORDER BY rating";
// Security risk: user-controlled input in SQL = SQL injection AND cache poisoning
// Key is too long (Redis max key size is 512 MB, but performance degrades for large keys)
// Slight SQL formatting difference = cache miss despite same semantics
 
// GOOD: Structured key from canonicalized parameters
String key = "catalog:products:category=electronics:maxPrice=500:sort=rating";
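
One way to produce such keys consistently is a small builder that sorts parameter names and normalizes values. This is a sketch, not a prescribed API: CacheKeys and queryKey are hypothetical names, parameter names are assumed to be developer-chosen constants, and the allowed-character rule for values is an assumption you should adapt to your own data.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;

/** Hypothetical helper that builds canonical cache keys from query parameters. */
final class CacheKeys {
    // Assumed allowlist: rejects ':' and '=' so a value cannot forge extra key segments.
    private static final Pattern SAFE_VALUE = Pattern.compile("[a-z0-9_.-]+");

    private CacheKeys() {}

    static String queryKey(String namespace, Map<String, String> params) {
        StringBuilder key = new StringBuilder(namespace);
        // TreeMap sorts by parameter name, so the caller's insertion order does not matter.
        for (Map.Entry<String, String> param : new TreeMap<>(params).entrySet()) {
            String value = param.getValue().trim().toLowerCase(Locale.ROOT);
            if (!SAFE_VALUE.matcher(value).matches()) {
                throw new IllegalArgumentException(
                        "Unsafe cache key value for parameter " + param.getKey());
            }
            key.append(':').append(param.getKey()).append('=').append(value);
        }
        return key.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params =
                Map.of("sort", "rating", "category", "Electronics", "maxPrice", "500");
        // prints catalog:products:category=electronics:maxPrice=500:sort=rating
        System.out.println(queryKey("catalog:products", params));
    }
}
```

Sorting and normalizing means that semantically identical queries map to the same key, and rejecting unexpected characters keeps user-supplied values from colliding with other keys.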

Anti-Pattern 3: Inconsistent Key Format

// BAD: Multiple formats for the same data
cache.set("user_123", user);
cache.set("user:123", user);    // Which one is correct?
cache.set("USER:123", user);    // Case inconsistency
 
// GOOD: Consistent format, defined in a constant or factory
public static String userKey(Long userId) {
    return "users:user:" + userId;
}

Anti-Pattern 4: Missing TTL (Infinite Cache Entries)

// BAD: No TTL - key lives forever
cache.set("user:123", user);
 
// GOOD: Always set a TTL
cache.set("user:123", user, Duration.ofMinutes(10));

Anti-Pattern 5: Per-User Keys for Shared Data

// BAD: Caching the same global data under per-user keys
cache.set("user:123:top_products", topProducts);  // Same for all users!
cache.set("user:456:top_products", topProducts);  // Duplicate data, N copies!
 
// GOOD: One key for globally shared data
cache.set("catalog:top_products", topProducts);

Summary

Understanding these pitfalls is what separates a developer who "adds a cache" from one who builds a reliable, high-performance caching system:

Cache Stampede: Use mutex locks, stale-while-revalidate, or Refresh-Ahead for hot keys.
Cache Penetration: Cache null values with short TTL, or use a Bloom filter for scale.
Cache Avalanche: Randomize TTLs at cache population time. Use circuit breakers.
Cache Breakdown: Use logical expiry (never-expire Redis key) for extreme hot keys.
Hot Key: Local L1 cache for top hot keys. Key sharding across replicas.
Stale Data: Use @TransactionalEventListener(AFTER_COMMIT) for post-commit invalidation.
Cache Poisoning: Validate all inputs used in key construction. Never trust user data.
Memory Pressure: Set maxmemory with allkeys-lru. Monitor fragmentation ratio.
Cold Start: Pre-warm cache at startup with staggered TTLs.
Serialization: Use JSON (not Java native serialization). Set FAIL_ON_UNKNOWN_PROPERTIES=false.

Previous: Part 3 - Technologies
Next: Part 5 - Interview Questions

Series: Caching Demystified
