Consistency Models - Part 2: All Models Deep Dive

Linearizability (Strong Consistency)
Serializability
Strict Serializability
Sequential Consistency
Causal Consistency
Eventual Consistency
Strong Eventual Consistency and CRDTs
Read-Your-Writes Consistency
Monotonic Reads
Monotonic Writes
Consistent Prefix Reads
Bounded Staleness
Session Consistency
Transaction Isolation Levels (MySQL Deep Dive)
Multi-Version Concurrency Control (MVCC)
Quorum-Based Consistency
Vector Clocks and Causal Ordering
Conflict Resolution Strategies
Consistency Model Comparison Matrix

1. Linearizability (Strong Consistency)

Definition

Linearizability is the strongest consistency model. It guarantees that:

Every operation appears to execute instantaneously at some point between its invocation and its completion
All operations are globally ordered -- every client in the system sees the same single, consistent view of history
Operations respect real-time ordering -- if operation A completes before operation B begins, then A appears before B in the global order

The Analogy

Think of a bank teller at a physical branch. There is one teller (the single source of truth). Every transaction happens at a specific moment in real time. If you deposit $500 at 10:00 AM, anyone who checks the balance after 10:00 AM sees the$ 500. There is no ambiguity about ordering. All clients share the same reality.

Formal Intuition

In a linearizable system, you can draw a timeline where every operation has a single point (its "linearization point") where it takes effect. All operations are consistent with this timeline.

Client A: [WRITE x=5 |-----------| done]
Client B:                   [READ x -------> 5]    (sees the write)
Client C:              [READ x --> 5 or old]        (may see either, but consistent forever after)

Once Client C reads x=5, any subsequent read by any client must also return 5 (or a newer value). No "going back" in time.

When to Use

Distributed locks (a lock must be exclusive -- no two clients can hold it)
Leader election (only one node should be leader at a time)
Unique ID generation
Counter operations where correctness is critical
Configuration management (ZooKeeper, etcd)
Sequence numbers for ordered operations

Real-World Systems

System	Linearizable?
ZooKeeper	Yes -- all writes go through a single leader
etcd	Yes -- Raft consensus provides linearizability
MySQL (single node, SERIALIZABLE)	Yes -- for a single node
MySQL with synchronous replication	Yes -- if read from primary only
Google Spanner	Yes -- uses TrueTime
DynamoDB with strong reads	Yes -- within a single partition
Redis (single master)	Yes -- for a single master
Redis Cluster	No -- different slots on different masters

Cost of Linearizability

Latency: Every operation must coordinate with all replicas
Throughput: Serialization of operations limits parallelism
Availability: If the coordinator or a quorum is unavailable, operations block

2. Serializability

Definition

Serializability is a property of transactions (not individual operations). A concurrent execution of transactions is serializable if the result is equivalent to some serial (one-at-a-time) execution of those transactions.

Key difference from linearizability: Serializability does not require real-time ordering. The equivalent serial order may not match the real-time order in which transactions ran.

Example

Three transactions T1, T2, T3 ran concurrently. The actual execution was interleaved. The result is serializable if you can find some ordering T1 -> T2 -> T3 (or T2 -> T3 -> T1, etc.) that would produce the same final state.

Why It Matters

Serializability prevents all concurrency anomalies:

Dirty reads
Non-repeatable reads
Phantom reads
Lost updates
Write skew

MySQL achieves serializability at ISOLATION LEVEL SERIALIZABLE.

In Spring Boot

@Transactional(isolation = Isolation.SERIALIZABLE)
public void criticalOperation() {
    // This transaction is fully serializable
    // Slowest, safest isolation level
    // Uses range locks to prevent phantom reads
}

3. Strict Serializability

Definition

Strict Serializability = Linearizability + Serializability

It requires both:

Transactions are equivalent to some serial execution (serializability)
That serial execution respects real-time ordering (linearizability)

This is the gold standard of consistency but extremely expensive to achieve.

Used by: Google Spanner, CockroachDB

4. Sequential Consistency

Definition

Sequential consistency is weaker than linearizability but still quite strong:

All operations appear to execute in some sequential order
The operations from each individual client appear in the order they were issued by that client
But the global order need not respect real-time -- the sequential order can differ from wall-clock time

The Analogy

Imagine a message board in an office. Each person posts messages in the order they write them. But you do not know the exact time each message was posted. The board shows a sequence of messages that is consistent with each person's posting order, but you cannot determine who posted first based on real time.

Difference from Linearizability

Linearizability: If A completed before B started, A must appear before B in global order.
Sequential:      Each client's operations are in order, but different clients' orders
                 can be interleaved arbitrarily -- real-time not required.

Example

Client 1: WRITE x=1, then WRITE x=2
Client 2: WRITE y=1, then WRITE y=2

Sequential consistent orderings (all valid):
  x=1, x=2, y=1, y=2
  y=1, y=2, x=1, x=2
  x=1, y=1, x=2, y=2
  x=1, y=1, y=2, x=2

NOT valid (violates Client 1's order):
  x=2, x=1, y=1, y=2  <-- Client 1's writes are reversed

Used By

CPU memory models (multiprocessor systems), some distributed systems research.

5. Causal Consistency

Definition

Causal consistency tracks causal relationships between operations. Operations that are causally related must be seen in causal order. Operations that are concurrent (not causally related) can be seen in any order.

Operation B is causally dependent on Operation A if:

B reads a value written by A
B and A are operations from the same client (program order)
B is causally dependent on some C that depends on A (transitivity)

The Analogy

Think of a conversation thread in Slack. If someone posts a message and someone replies to it, the reply must always appear after the original message -- because the reply is causally dependent on it. But two independent messages sent at the same time can appear in any order.

Why Causal Consistency Matters

Consider a social media example:

User A posts: "I'm going to delete my account"
User B replies: "Don't do it!"
User C reads the thread

Without causal consistency:
  User C might see: "Don't do it!" (reply)
                    ... no original post yet ...

With causal consistency:
  User C always sees the original post BEFORE the reply

Implementation: Vector Clocks

Causal consistency is typically implemented using vector clocks or hybrid logical clocks (HLC). See Section 17 for details.

Systems Using Causal Consistency

MongoDB (causal sessions)
Cassandra (lightweight transactions with lightweight consistency)
COPS (research system)
Some configurations of DynamoDB Streams

6. Eventual Consistency

Definition

Eventual consistency is the weakest useful consistency model:

If no new updates are made to a data item, all replicas will eventually converge to the same value.

No guarantee about:

When convergence happens (could be milliseconds, seconds, or longer)
What value is returned during the convergence window
Whether reads within the convergence window are "fresh"

The Analogy

Think of DNS propagation. When you update a DNS record (e.g., point your domain to a new IP), the change propagates across DNS servers worldwide. During propagation (which can take minutes to hours), different users resolve the domain to different IPs. Eventually, all DNS servers have the updated record and all users get the same IP.

What "Eventually" Means in Practice

In most well-engineered systems, "eventually" is:

Within the same data center: milliseconds to tens of milliseconds
Across availability zones: tens to hundreds of milliseconds
Across regions (global tables): hundreds of milliseconds to seconds

When Eventual Consistency Is Acceptable

Product catalogs (price showing $9.99 vs$ 10.00 for a second is fine)
Social media timelines (post appears 1 second later is fine)
Recommendation engines (stale recommendations are fine)
Analytics dashboards (near-real-time is acceptable)
Configuration flags with low urgency (feature flags propagating over seconds)
User preferences that are not security-critical

When Eventual Consistency Is NOT Acceptable

Financial balances (you cannot read $0 when the balance is$ 1000)
Inventory counts (cannot sell items you don't have)
Authentication tokens (must be valid or invalid immediately)
Distributed locks (two clients cannot both believe they hold the lock)
Unique constraint enforcement (two users cannot get the same username)

Conflict Resolution in Eventual Consistency

When replicas accept concurrent writes to the same key, conflicts arise. Resolution strategies:

Strategy	How It Works	Risk
Last-Write-Wins (LWW)	Higher timestamp wins	Data loss if clocks are skewed
Multi-Value (siblings)	Return all conflicting versions	Application must resolve
Application-Level	Application merges conflicts	Most flexible, most work
CRDTs	Math-based auto-merge	Limited to specific data types

7. Strong Eventual Consistency and CRDTs

Definition

Strong Eventual Consistency (SEC) adds one guarantee to eventual consistency:

All replicas that have received the same set of updates are in the same state, regardless of the order in which they received updates.

With SEC, there are no conflicts because the data structure is designed to commute -- the result is the same regardless of operation order.

Conflict-Free Replicated Data Types (CRDTs)

CRDTs are data structures mathematically designed to merge concurrent updates without conflict. They achieve SEC.

Types of CRDTs

G-Counter (Grow-only Counter)

Each node maintains its own counter. The global counter is the sum of all node counters. Increments are always safe -- no conflicts possible.

// Simplified G-Counter CRDT
public class GCounter {
    private final String nodeId;
    private final Map<String, Long> counts = new ConcurrentHashMap<>();
 
    public GCounter(String nodeId) {
        this.nodeId = nodeId;
        this.counts.put(nodeId, 0L);
    }
 
    public void increment() {
        counts.merge(nodeId, 1L, Long::sum);
    }
 
    public long value() {
        return counts.values().stream().mapToLong(Long::longValue).sum();
    }
 
    // Merge another replica's state -- always safe, commutative, idempotent
    public void merge(GCounter other) {
        other.counts.forEach((node, count) ->
            counts.merge(node, count, Math::max)
        );
    }
}

PN-Counter (Positive-Negative Counter)

Two G-Counters: one for increments (P) and one for decrements (N). Value = P - N. Used for distributed counters that can go up and down.

public class PNCounter {
    private final GCounter positive;
    private final GCounter negative;
 
    public PNCounter(String nodeId) {
        this.positive = new GCounter(nodeId);
        this.negative = new GCounter(nodeId);
    }
 
    public void increment() { positive.increment(); }
    public void decrement() { negative.increment(); }
    public long value() { return positive.value() - negative.value(); }
 
    public void merge(PNCounter other) {
        positive.merge(other.positive);
        negative.merge(other.negative);
    }
}

LWW-Register (Last-Write-Wins Register)

Stores a single value with a timestamp. Merge always picks the value with the higher timestamp. Simple but risks data loss with clock skew.

OR-Set (Observed-Remove Set)

A set that handles concurrent add/remove without conflicts. Each add gets a unique tag. Remove removes specific tagged elements. If you add then remove and add concurrently, the element stays.

CRDTs in Production

System	CRDT Usage
Redis	HyperLogLog (approximate counting CRDT)
Riak	Full CRDT support
Cassandra	Counters (PN-Counter like behavior)
SoundCloud	Conflict resolution in distributed systems
Akka Distributed Data	Full CRDT library

8. Read-Your-Writes Consistency

Definition

After a client performs a write, any subsequent read by that same client will always see the written value. Other clients may or may not see it immediately.

This is a session-level guarantee, not a global one.

Why It Matters

Imagine a user changes their profile photo. They hit "save" and are immediately shown their profile. If the read goes to a stale replica, they see their old photo and think the save failed. They hit save again. Frustrating.

With read-your-writes:

The user always sees their own changes immediately
Other users might see the old photo for a second -- acceptable
The user experience is consistent

Implementation Strategies

Strategy 1: Route Writes and Reads to Primary

Simplest solution: after a write, always read from the primary (not the replica).

@Service
public class UserService {
 
    @Transactional  // routes to primary (write datasource)
    public void updateProfile(Long userId, ProfileUpdateRequest request) {
        User user = userRepository.findById(userId).orElseThrow();
        user.updateFrom(request);
        userRepository.save(user);
    }
 
    @Transactional  // also routes to primary for read-your-writes
    public UserProfile getProfile(Long userId) {
        return userRepository.findById(userId)
            .map(UserProfile::from)
            .orElseThrow();
    }
}

Strategy 2: Track Last Write Timestamp

After a write, store the timestamp in the user's session. For subsequent reads, if the replica's replication position is behind the timestamp, route to primary.

@Service
public class ProfileService {
 
    @Autowired
    private SessionContext sessionContext;
 
    @Transactional
    public void updateProfile(Long userId, ProfileUpdateRequest request) {
        userRepository.save(/* ... */);
        // Record that this session has a pending read-your-writes guarantee
        sessionContext.setLastWriteTimestamp(Instant.now());
    }
 
    public UserProfile getProfile(Long userId) {
        // If we wrote recently, read from primary
        if (sessionContext.hasRecentWrite(Duration.ofSeconds(5))) {
            return readFromPrimary(userId);
        }
        return readFromReplica(userId);
    }
}

Strategy 3: Sticky Sessions

Route all requests from the same user session to the same replica. That replica will have the user's writes. This is a simpler but less precise approach.

9. Monotonic Reads

Definition

If a client reads a value of X, any subsequent read of X by that same client will return the same value or a more recent value. The client will never read an older value than what it has already seen.

The Problem Without Monotonic Reads

t=1: Replica A has: x = 5 (latest)
t=2: Replica B has: x = 3 (lagging)

Client reads x from Replica A --> gets 5
Client reads x from Replica B --> gets 3  (GOES BACKWARDS!)

This is deeply confusing to users. Imagine a bank balance going from $1000 (after deposit) back to$ 500 (before deposit). Users would think the deposit was reversed.

Implementation

The common approach is sticky reads -- route a client's reads to the same replica consistently. Since replicas are monotonically applying the replication log, once the client has seen a value, they will never see an older one from the same replica.

In AWS Aurora, this is achieved by reading from a specific reader endpoint or by using the writer endpoint for critical reads.

10. Monotonic Writes

Definition

Writes from the same client are applied to all replicas in the order they were issued. If client writes W1 then W2, then W1 must be reflected on any replica before W2.

Why It Matters

Client writes:
  W1: INSERT INTO orders (id, status) VALUES (100, 'PENDING');
  W2: UPDATE orders SET status = 'APPROVED' WHERE id = 100;

Without monotonic writes:
  W2 might arrive at Replica before W1
  --> UPDATE on a non-existent row does nothing
  --> Row is eventually inserted with status 'PENDING' (wrong!)

With monotonic writes:
  W1 always arrives and applies before W2

MySQL Replication and Monotonic Writes

MySQL replication is inherently monotonic writes within a single thread (single client). The binary log is sequential. However, with multi-threaded replication (parallel apply), writes from the same client can be applied out of order if not configured carefully.

Configuration:

-- Ensure writes from the same client are applied in order
SET GLOBAL slave_preserve_commit_order = ON;  -- MySQL 5.7+
-- In MySQL 8.0:
SET GLOBAL replica_preserve_commit_order = ON;

11. Consistent Prefix Reads

Definition

If a sequence of writes happened in a specific order (W1, W2, W3), a reader will never see a state that violates the order -- they might see (W1), or (W1, W2), or (W1, W2, W3), but never (W2 without W1) or (W3 without W2).

The Analogy

Think of reading a comic strip. You might only see the first three panels, but you never see panel 5 before panel 4. The story always makes sense up to the point you've read.

Why It Matters

A user posts:  "I'm going to order pizza"
A user posts:  "Actually, I ordered sushi"

Consistent prefix read scenarios (valid):
  Reader sees: Nothing yet
  Reader sees: "I'm going to order pizza"
  Reader sees: Both posts in order

Inconsistent prefix (violates this model):
  Reader sees: "Actually, I ordered sushi" -- without the first post
  (This is confusing without the context)

In MySQL

Consistent prefix reads are guaranteed within a single transaction using REPEATABLE READ isolation (MVCC snapshot). Within the same transaction, you always see a consistent snapshot -- never a partial write.

12. Bounded Staleness

Definition

Reads are guaranteed to be no more than K versions behind or T time units behind the most recent write. A weaker form of strong consistency with a defined staleness window.

Practical Value

Instead of "reads may be arbitrarily stale" (eventual consistency), bounded staleness says "reads are at most 5 seconds stale." This is much more predictable and allows for SLA-based guarantees.

AWS DynamoDB and Bounded Staleness

DynamoDB does not natively support bounded staleness, but you can approximate it:

// Read with explicit consistency level
GetItemRequest request = GetItemRequest.builder()
    .tableName("Products")
    .key(Map.of("productId", AttributeValue.builder().s(productId).build()))
    .consistentRead(false)  // eventually consistent (cheaper)
    // For critical operations:
    // .consistentRead(true)  // strongly consistent (2x read cost)
    .build();

For Aurora MySQL read replicas, you can check replication lag before routing:

@Service
public class SmartReadRouter {
 
    private static final Duration MAX_ACCEPTABLE_LAG = Duration.ofSeconds(5);
 
    public Connection getReadConnection(boolean acceptStaleness) {
        if (!acceptStaleness) {
            return primaryConnection();
        }
        long replicaLagSeconds = getReplicaLag();
        if (replicaLagSeconds <= MAX_ACCEPTABLE_LAG.getSeconds()) {
            return replicaConnection();
        }
        // Replica is too stale, fallback to primary
        return primaryConnection();
    }
 
    private long getReplicaLag() {
        // Query replica: SHOW SLAVE STATUS
        // or use CloudWatch metric: AuroraReplicaLag
        return auroraMontioringService.getReplicaLagSeconds();
    }
}

13. Session Consistency

Definition

Session consistency is a bundle of four guarantees, all scoped to a single user session:

Read-Your-Writes
Monotonic Reads
Monotonic Writes
Consistent Prefix Reads

This is the practical sweet spot for most user-facing applications. It gives users a consistent experience (their own actions are always visible and sensible) without requiring global strong consistency.

Real-World Systems

DynamoDB with client-side session token tracking
MongoDB sessions (causal consistency mode)
AWS ElastiCache with session-pinned reads

In Spring Boot with DynamoDB

// DynamoDB SDK v2 with client token for causal consistency
@Service
public class CartService {
 
    @Autowired
    private DynamoDbClient dynamoDbClient;
 
    @Autowired
    private SessionTokenStore sessionTokenStore;
 
    public void addToCart(String sessionId, String productId, int quantity) {
        // Write operation
        PutItemRequest request = PutItemRequest.builder()
            .tableName("Cart")
            .item(buildCartItem(sessionId, productId, quantity))
            .build();
 
        PutItemResponse response = dynamoDbClient.putItem(request);
        // Store the sequence token to enable read-your-writes
        sessionTokenStore.updateToken(sessionId, response.consumedCapacity());
    }
}

14. Transaction Isolation Levels (MySQL Deep Dive)

MySQL InnoDB supports four isolation levels. Understanding them is critical for production Java applications.

READ UNCOMMITTED

What it allows: A transaction can read uncommitted (dirty) data from other transactions.

Anomalies possible: Dirty reads, non-repeatable reads, phantom reads

Real risk: You read data from a transaction that is later rolled back. You made a decision based on data that never officially existed.

SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

In Production: Almost never use this. The only valid use case is approximate counts on very large tables where locking overhead is unacceptable and the approximate result is fine.

READ COMMITTED

What it allows: Only reads committed data. Each statement within a transaction sees the latest committed snapshot at the time of that statement.

Anomalies possible: Non-repeatable reads, phantom reads
Prevents: Dirty reads

Non-repeatable read example:

T1: SELECT balance FROM accounts WHERE id=1;  --> 1000
T2: UPDATE accounts SET balance=500 WHERE id=1; COMMIT;
T1: SELECT balance FROM accounts WHERE id=1;  --> 500 (changed!)

T1 read the same row twice and got different values. Non-repeatable.

@Transactional(isolation = Isolation.READ_COMMITTED)
public void processOrder(Long orderId) {
    // Each SELECT sees the latest committed state
    // Two reads of the same row may return different values
}

Good for: Most OLTP operations where you want to see the latest committed data but don't need strict consistency within a transaction. Used in many financial systems for non-critical reads.

REPEATABLE READ (MySQL Default)

What it allows: Within a single transaction, every read of the same data returns the same value (the snapshot at transaction start).

Anomalies possible: Phantom reads (but InnoDB uses gap locks to prevent most)
Prevents: Dirty reads, non-repeatable reads

How it works: InnoDB uses MVCC. At transaction start, a consistent read view (snapshot) is created. All reads use this snapshot. Writes by other transactions are invisible.

@Transactional(isolation = Isolation.REPEATABLE_READ)  // This is the DEFAULT
public void auditTransaction(Long accountId) {
    // All reads within this transaction see the same snapshot
    BigDecimal balance1 = accountRepository.findBalance(accountId);
    // ... some processing ...
    BigDecimal balance2 = accountRepository.findBalance(accountId);
    // balance1 == balance2 guaranteed (even if another transaction updated it)
}

Write Skew under REPEATABLE READ:

T1: SELECT * FROM doctors WHERE on_call = true;  --> 5 doctors
T1: If count >= 2: UPDATE doctors SET on_call=false WHERE id=1;

T2: SELECT * FROM doctors WHERE on_call = true;  --> 5 doctors (same snapshot)
T2: If count >= 2: UPDATE doctors SET on_call=false WHERE id=2;

Result: Both doctors go off call. Only 3 doctors remain on call.
The constraint "at least 2 doctors on call" is violated.

This is write skew -- a concurrency anomaly that REPEATABLE READ does not prevent.

SERIALIZABLE

What it allows: Full isolation. Transactions are serialized. Uses range locks (gap locks + next-key locks) to prevent all anomalies.

Anomalies possible: None
Prevents: All anomalies including write skew and phantom reads

@Transactional(isolation = Isolation.SERIALIZABLE)
public void criticalScheduleUpdate(Long departmentId) {
    // Full serialization -- no concurrent anomalies possible
    // Also the slowest -- use only when necessary
    List<Doctor> onCall = doctorRepository.findOnCall(departmentId);
    if (onCall.size() >= 2) {
        onCall.get(0).setOnCall(false);
        doctorRepository.save(onCall.get(0));
    }
}

Performance impact of SERIALIZABLE:

Acquires shared range locks on every read
Can cause deadlocks more frequently
Significantly reduces concurrency
Use only for critical, low-volume operations

Isolation Level Summary

Level               | Dirty | Non-Repeatable | Phantom | Write Skew
--------------------+-------+----------------+---------+-----------
READ UNCOMMITTED    |  Yes  |      Yes       |   Yes   |    Yes
READ COMMITTED      |  No   |      Yes       |   Yes   |    Yes
REPEATABLE READ     |  No   |      No        |   No*   |    Yes
SERIALIZABLE        |  No   |      No        |   No    |    No

* InnoDB prevents most phantom reads in REPEATABLE READ using gap locks

Setting Isolation Level in application.yml

spring:
  datasource:
    url: jdbc:mysql://your-rds-endpoint:3306/yourdb?useSSL=true&serverTimezone=UTC
    username: ${DB_USERNAME}
    password: ${DB_PASSWORD}
 
  jpa:
    properties:
      hibernate:
        connection:
          isolation: 2 # READ_COMMITTED = 2, REPEATABLE_READ = 4, SERIALIZABLE = 8

15. Multi-Version Concurrency Control (MVCC)

What Is MVCC?

MVCC is a concurrency control mechanism that allows readers and writers to coexist without blocking each other. Instead of locking data for reads, the database keeps multiple versions of the same data. Readers see a consistent snapshot; writers create new versions.

Core Principle: "Readers don't block writers. Writers don't block readers."

How MySQL InnoDB Implements MVCC

Every row in InnoDB has two hidden columns:

DB_TRX_ID: The transaction ID of the last transaction that modified this row
DB_ROLL_PTR: Pointer to the undo log for the previous version

When a transaction starts, InnoDB records the current transaction ID as the "read view." The read view determines which row versions are visible.

Versions of row (id=1, balance):

Version 3: balance=700 | trx_id=500 | roll_ptr --> Version 2
Version 2: balance=1000| trx_id=400 | roll_ptr --> Version 1
Version 1: balance=800 | trx_id=300 | roll_ptr --> null

Transaction T1 (started at trx_id=450):
  - Can see Version 2 (trx_id=400 < 450) -- VISIBLE
  - Cannot see Version 3 (trx_id=500 > 450) -- TOO NEW
  - T1 reads balance = 1000

MVCC in Practice (Spring Boot)

@Service
public class ReportService {
 
    // This long-running transaction sees a consistent snapshot throughout
    // Even if other transactions modify data while this runs, the snapshot is unchanged
    @Transactional(readOnly = true)
    public FinancialReport generateReport(LocalDate date) {
        // All reads in this method see the SAME snapshot
        // No blocking of writes by other transactions
        List<Transaction> transactions = transactionRepository.findByDate(date);
        BigDecimal total = transactions.stream()
            .map(Transaction::getAmount)
            .reduce(BigDecimal.ZERO, BigDecimal::add);
        return new FinancialReport(date, transactions, total);
    }
}

MVCC Benefits and Costs

Benefits:

High concurrency: readers and writers don't block each other
Consistent snapshots for long-running queries/transactions
Efficient for read-heavy workloads

Costs:

Storage overhead: multiple versions of rows stored in undo logs
Cleanup overhead: old versions must be purged (VACUUM in PostgreSQL, InnoDB purge thread in MySQL)
Long-running transactions prevent cleanup of old versions (undo log bloat)

Production Warning: Long-running transactions in MySQL hold open their read view, preventing InnoDB from purging old row versions. This causes undo log bloat. A single transaction open for hours can cause the undo tablespace to grow to gigabytes. Always set transaction timeouts.

# In application.yml
spring:
  transaction:
    default-timeout: 30  # seconds -- fail transactions running longer than 30s

16. Quorum-Based Consistency

The Concept

In a leaderless distributed system with N replicas, you can guarantee consistency by requiring:

W replicas to acknowledge a write before it is considered successful
R replicas to be read before returning a value
When W + R > N, every read is guaranteed to overlap with at least one node that has the latest write

The Formula

W + R > N  ==>  Guaranteed to read latest write (Strong Consistency)
W + R <= N ==>  Potentially stale reads (Eventual Consistency)

Example with N=3

R	W	W+R	Consistency	Notes
3	3	6 > 3	Strong	All nodes agree -- very slow
2	2	4 > 3	Strong	Most common production setting
1	3	4 > 3	Strong	All writes confirmed, fast reads
3	1	4 > 3	Strong	Fast writes, slow reads
1	1	2 < 3	Eventual	Fastest, may return stale
2	1	3 = 3	Borderline	Not safe -- must be > N

DynamoDB Quorum (Simplified)

DynamoDB uses a quorum-based approach internally. When you request a strongly consistent read, DynamoDB reads from a majority of replicas (quorum) and returns the value agreed upon by the majority.

// Strongly consistent read (reads from quorum)
GetItemRequest strongRead = GetItemRequest.builder()
    .tableName("Inventory")
    .key(Map.of("productId", AttributeValue.fromS(productId)))
    .consistentRead(true)  // quorum read
    .build();
 
// Eventually consistent read (reads from one replica, possibly stale)
GetItemRequest eventualRead = GetItemRequest.builder()
    .tableName("Inventory")
    .key(Map.of("productId", AttributeValue.fromS(productId)))
    .consistentRead(false)  // single replica read -- costs 50% less
    .build();

17. Vector Clocks and Causal Ordering

The Problem with Physical Timestamps

Physical clocks across servers are never perfectly synchronized. Two events with the same millisecond timestamp could have happened in either order, or concurrently. You cannot reliably use physical time to order distributed events.

Lamport Timestamps (Logical Clocks)

A simple logical clock:

Each node maintains a counter
Increment counter before each event
When sending a message, include the counter
When receiving a message, set counter to max(local, received) + 1

Guarantee: If A happens before B (causally), then timestamp(A) < timestamp(B).

Limitation: If timestamp(A) < timestamp(B), you CANNOT conclude A happened before B. They might be concurrent.

Vector Clocks

Vector clocks solve the limitation. Each node maintains a vector of counters, one for each node in the system.

3-node system: [NodeA, NodeB, NodeC]

Initial state:   [0, 0, 0]

NodeA does event: [1, 0, 0]
NodeA sends to B: B receives [1, 0, 0]
  NodeB increments: [1, 1, 0]

NodeC does event: [0, 0, 1]  (concurrent with A and B's events)

Comparison rules:

[1,2,3] happens-before [2,2,3] if all components of first <= second AND at least one is strictly less
[1,2,3] and [2,1,3] are concurrent (neither dominates)

Java Implementation:

public class VectorClock {
    private final String nodeId;
    private final Map<String, Long> clock;
 
    public VectorClock(String nodeId, List<String> allNodes) {
        this.nodeId = nodeId;
        this.clock = new HashMap<>();
        allNodes.forEach(node -> clock.put(node, 0L));
    }
 
    // Increment this node's counter before an event
    public void tick() {
        clock.merge(nodeId, 1L, Long::sum);
    }
 
    // Merge received vector clock (when receiving a message)
    public void merge(Map<String, Long> received) {
        received.forEach((node, time) ->
            clock.merge(node, time, Math::max)
        );
        tick(); // also increment own counter
    }
 
    // Check if this clock happens-before other
    public boolean happensBefore(VectorClock other) {
        // All entries in this clock must be <= other
        // At least one entry must be strictly less
        boolean allLeq = clock.entrySet().stream()
            .allMatch(e -> e.getValue() <= other.clock.getOrDefault(e.getKey(), 0L));
        boolean oneStrict = clock.entrySet().stream()
            .anyMatch(e -> e.getValue() < other.clock.getOrDefault(e.getKey(), 0L));
        return allLeq && oneStrict;
    }
 
    // Check if concurrent (neither happens-before the other)
    public boolean isConcurrentWith(VectorClock other) {
        return !this.happensBefore(other) && !other.happensBefore(this);
    }
 
    public Map<String, Long> getClock() {
        return Collections.unmodifiableMap(clock);
    }
}

18. Conflict Resolution Strategies

When concurrent writes happen to the same data (multi-leader or leaderless replication), conflicts must be resolved.

Last-Write-Wins (LWW)

The write with the highest timestamp wins. All other concurrent writes are discarded.

Pro: Simple, automatic
Con: Data loss -- valid writes can be silently discarded. Clock skew can cause incorrect ordering.

Used by: Cassandra (default), DynamoDB (within a single partition, LWW with server-side timestamps)

// DynamoDB conditional write to implement optimistic LWW safely
Map<String, AttributeValue> expressionValues = Map.of(
    ":currentVersion", AttributeValue.fromN(String.valueOf(currentVersion)),
    ":newVersion", AttributeValue.fromN(String.valueOf(currentVersion + 1))
);
 
UpdateItemRequest request = UpdateItemRequest.builder()
    .tableName("Products")
    .key(Map.of("productId", AttributeValue.fromS(productId)))
    .updateExpression("SET price = :price, version = :newVersion")
    .conditionExpression("version = :currentVersion")  // optimistic locking
    .expressionAttributeValues(expressionValues)
    .build();

Merge Functions

Application-specific merge logic. Given two conflicting versions, a merge function produces the correct result.

Example: Shopping cart -- if a user adds item A on device 1 and item B on device 2 concurrently, the merge result should contain both A and B (union semantics).

public ShoppingCart merge(ShoppingCart cart1, ShoppingCart cart2) {
    // Union: take all items from both carts
    Map<String, CartItem> merged = new HashMap<>(cart1.getItems());
    cart2.getItems().forEach((productId, item) ->
        merged.merge(productId, item, (a, b) ->
            a.getQuantity() > b.getQuantity() ? a : b  // take higher quantity
        )
    );
    return new ShoppingCart(merged);
}

CRDTs (See Section 7)

Mathematical data structures where all merges are conflict-free. The preferred solution for systems that need SEC.

19. Consistency Model Comparison Matrix

Model	Ordering	Scope	Real-Time	Cost	Use Case
Linearizability	Global	All clients	Yes	Very High	Locks, leader election, unique IDs
Strict Serializability	Global + Transactional	All clients	Yes	Very High	Banking, financial systems
Serializability	Transactional	All clients	No	High	Critical business transactions
Sequential	Global	All clients	No	High	Shared memory systems
Causal	Causal dependencies	All clients	No	Medium	Chat, social, collaborative editing
Eventual	None	All clients	No	Low	Catalog, analytics, social feeds
Strong Eventual (SEC)	Merge-based	All clients	No	Low-Medium	Collaborative docs, distributed counters
Read-Your-Writes	Client's own writes	Single client	Yes (self)	Low	User profiles, session data
Monotonic Reads	Monotonically forward	Single client	No	Low	Any user-facing read
Session	Multiple guarantees	Single session	Partial	Low-Medium	Most user-facing apps
Bounded Staleness	Within time window	All clients	Approximate	Medium	Inventory, near-real-time dashboards

Next: Part 3: Java and Spring Boot Implementation -- Production-ready code patterns for every consistency challenge.

Part of the Consistency Models Demystified series
Stack: Java 17, Spring Boot 3.x, MySQL 8.0, AWS

Series: Consistency Models Demystified

Consistency Models - Part 2: All Models Deep Dive

Table of Contents

1. Linearizability (Strong Consistency)

Definition

The Analogy

Formal Intuition

When to Use

Real-World Systems

Cost of Linearizability

2. Serializability

Definition

Example

Why It Matters

In Spring Boot

3. Strict Serializability

Definition

4. Sequential Consistency

Definition

The Analogy

Difference from Linearizability

Example

Used By

5. Causal Consistency

Definition

The Analogy

Why Causal Consistency Matters

Implementation: Vector Clocks

Systems Using Causal Consistency

6. Eventual Consistency

Definition

The Analogy

What "Eventually" Means in Practice

When Eventual Consistency Is Acceptable

When Eventual Consistency Is NOT Acceptable

Conflict Resolution in Eventual Consistency

7. Strong Eventual Consistency and CRDTs

Definition

Conflict-Free Replicated Data Types (CRDTs)

Types of CRDTs

G-Counter (Grow-only Counter)

PN-Counter (Positive-Negative Counter)

LWW-Register (Last-Write-Wins Register)

OR-Set (Observed-Remove Set)

CRDTs in Production

8. Read-Your-Writes Consistency

Definition

Why It Matters

Implementation Strategies

Strategy 1: Route Writes and Reads to Primary

Strategy 2: Track Last Write Timestamp

Strategy 3: Sticky Sessions

9. Monotonic Reads

Definition

The Problem Without Monotonic Reads

Implementation

10. Monotonic Writes

Definition

Why It Matters

MySQL Replication and Monotonic Writes

11. Consistent Prefix Reads

Definition

The Analogy

Why It Matters

In MySQL

12. Bounded Staleness

Definition

Practical Value

AWS DynamoDB and Bounded Staleness

13. Session Consistency

Definition

Real-World Systems

In Spring Boot with DynamoDB

14. Transaction Isolation Levels (MySQL Deep Dive)

READ UNCOMMITTED

READ COMMITTED

REPEATABLE READ (MySQL Default)

SERIALIZABLE

Isolation Level Summary

Setting Isolation Level in application.yml