Consistency Models - Part 1: Fundamentals
Table of Contents
- What Is Consistency?
- Why Single-Node Systems Are Easy
- Why Distributed Systems Are Hard
- The 8 Fallacies of Distributed Computing
- ACID Properties Deep Dive
- BASE Properties Deep Dive
- ACID vs BASE - The Full Comparison
- CAP Theorem - The Deep Dive
- PACELC Theorem - Beyond CAP
- Consistency vs Isolation - Critical Distinction
- Replication and Why It Creates Consistency Challenges
- The Consistency Spectrum Overview
- Key Takeaways
1. What Is Consistency?
Informal Definition
Consistency is the guarantee that when you read data, you see a correct, agreed-upon, and up-to-date view of the world.
More Precisely
In the context of distributed systems, consistency is about the agreement between multiple copies (replicas) of data. If you update data on one server, consistency models define when and how other servers reflect that update.
Why Multiple Definitions Exist
The word "consistency" is overloaded in computer science:
| Context | What "Consistency" Means |
|---|---|
| ACID (databases) | Data integrity constraints are maintained after every transaction |
| CAP Theorem | Every read reflects the most recent write (no stale data) |
| Consistency Models | A contract defining what a system guarantees about the order and visibility of operations |
| Eventual Consistency | Given no new updates, replicas will eventually converge |
Critical Point: The "C" in ACID and the "C" in CAP refer to completely different things. This confusion trips up engineers in interviews constantly. See Section 10 for a detailed explanation.
2. Why Single-Node Systems Are Easy
Consider a single-server MySQL database with no replicas:
[Client] --> [MySQL Single Node] --> [Disk]
On a single node:
- There is exactly one copy of the data
- All reads and writes go to the same place
- The database engine serializes concurrent access with locks
- After a COMMIT, the data is immediately visible to all subsequent reads
- Consistency is guaranteed by the database engine itself
This is the world most developers assume they live in. It is clean, simple, and largely correct. But it does not scale.
The moment you add a replica, a cache, a second service, or a second data center, the assumptions of the single-node world shatter.
3. Why Distributed Systems Are Hard
Problem 1: Network Partitions
A network partition occurs when nodes in a distributed system cannot communicate with each other. This is not theoretical -- it happens in production due to:
- Router failures
- Misconfigured firewall rules
- Data center outages
- Cable cuts between availability zones
- Software bugs in network stack
- Overloaded network interfaces dropping packets
When a partition occurs, nodes that cannot communicate must decide: do I stop accepting requests (stay consistent) or continue with potentially stale data (stay available)?
Problem 2: Partial Failures
In a distributed system, components can fail partially. A message might be delivered once, twice, or never. A write might succeed on 2 out of 3 replicas. A node might crash mid-transaction after writing to disk but before sending acknowledgment. These partial states are impossible in a single-node system.
Problem 3: Latency and Asynchrony
Distributed systems communicate over networks with non-zero, variable latency. The speed of light is a real constraint:
- Within an AWS Availability Zone: ~0.5ms
- Between Availability Zones in the same region: ~1-2ms
- Between AWS regions (US to Europe): ~80-150ms
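This means that if you want every read to see the latest write, each write must wait synchronously for replicas (all of them, or a quorum) to confirm. Write latency is then set by the slowest replica whose acknowledgment is required, so it grows with replica distance and, when every replica must confirm, with the chance that any one of them is slow.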
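To make the cost concrete, here is a minimal sketch that estimates commit latency, assuming the primary contacts all replicas in parallel and returns once enough acknowledgments arrive. The class and method names are hypothetical, and the round-trip times in the usage note are illustrative values taken from the ranges above.

```java
import java.util.Arrays;

/** Hypothetical helper: estimates commit latency when acknowledgments are gathered in parallel. */
final class ReplicationLatency {
    private ReplicationLatency() {}

    /**
     * Returns the time until acksRequired replicas have acknowledged,
     * given each replica's round-trip time in milliseconds.
     */
    static double commitLatencyMs(double[] replicaRttMs, int acksRequired) {
        if (acksRequired < 1 || acksRequired > replicaRttMs.length) {
            throw new IllegalArgumentException("acksRequired must be between 1 and the replica count");
        }
        double[] sorted = replicaRttMs.clone();
        Arrays.sort(sorted);
        // The k-th fastest acknowledgment is the one that completes the write.
        return sorted[acksRequired - 1];
    }
}
```

With replicas at 0.5 ms (same AZ), 1.5 ms (another AZ), and 90 ms (another region), waiting for all three costs 90 ms per write, while waiting for any two costs 1.5 ms. That gap is why cross-region replication is so often asynchronous.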
Problem 4: Concurrent Operations
Multiple clients reading and writing the same data simultaneously creates race conditions. Without careful coordination:
- Two users might both read a quantity of 1, both decide to purchase it, and both successfully decrement -- resulting in quantity of -1
- A write might overwrite another write that happened slightly earlier
- A read might see data halfway through a multi-step update
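The overselling race in the first bullet comes from checking the quantity and decrementing it as two separate steps. One way to close the gap, sketched here with an in-memory counter, is to make the check and the decrement a single atomic compare-and-set. The Inventory class is hypothetical; a real system would usually push the same idea into the database, for example with a conditional UPDATE that only succeeds while the quantity is still positive.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical in-memory stock counter used only to illustrate the race. */
final class Inventory {
    private final AtomicInteger quantity;

    Inventory(int initial) {
        this.quantity = new AtomicInteger(initial);
    }

    /** Decrements only if stock remains; check and update happen as one atomic step. */
    boolean purchase() {
        while (true) {
            int current = quantity.get();
            if (current <= 0) {
                return false; // sold out: never go negative
            }
            if (quantity.compareAndSet(current, current - 1)) {
                return true;
            }
            // Another thread changed the value between get() and compareAndSet(); retry.
        }
    }

    int remaining() {
        return quantity.get();
    }
}
```

With one item in stock, the first call to purchase() succeeds and the second returns false, regardless of how the two callers interleave.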
Problem 5: Clock Skew
Each machine has its own physical clock. Even with NTP synchronization, clocks across servers differ by milliseconds. You cannot reliably use timestamps to determine the order of events in a distributed system. Two events with the same millisecond timestamp could have happened in either order.
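A small model shows how skew reverses the apparent order of events. The sketch assumes each node's clock has a fixed offset from true time; SkewedClock and its methods are hypothetical names used only for illustration.

```java
/** Hypothetical model of a node whose wall clock is offset from true time. */
final class SkewedClock {
    private final long offsetMs;

    SkewedClock(long offsetMs) {
        this.offsetMs = offsetMs;
    }

    /** Timestamp this node would record for an event at the given true time. */
    long stamp(long trueTimeMs) {
        return trueTimeMs + offsetMs;
    }

    /** True when comparing recorded timestamps contradicts the real order of two events. */
    static boolean orderIsReversed(long trueA, SkewedClock nodeA, long trueB, SkewedClock nodeB) {
        boolean reallyBefore = trueA < trueB;
        boolean stampedBefore = nodeA.stamp(trueA) < nodeB.stamp(trueB);
        return reallyBefore != stampedBefore;
    }
}
```

If node A runs 5 ms fast and node B runs 5 ms slow, an event on A at true time 100 is stamped 105, and a later event on B at true time 102 is stamped 97. Sorting by timestamp puts B first, which is the wrong order.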
Problem 6: The Observability Gap
In a distributed system, you cannot observe the global state at a single point in time. By the time you observe all nodes, some of them have already changed. This makes reasoning about "current state" fundamentally difficult.
4. The 8 Fallacies of Distributed Computing
L. Peter Deutsch and others at Sun Microsystems identified 8 assumptions that newcomers to distributed systems incorrectly make:
| Fallacy | What You Assume | The Reality |
|---|---|---|
| 1. The network is reliable | Messages always arrive | Packets drop, connections timeout |
| 2. Latency is zero | Communication is instant | Network latency is real and variable |
| 3. Bandwidth is infinite | You can send unlimited data | Bandwidth is finite and shared |
| 4. The network is secure | Your messages are safe | Assume breach; encrypt in transit |
| 5. Topology never changes | Network layout is static | IPs change, nodes scale up/down |
| 6. There is one administrator | One team manages everything | Multi-team ownership, conflicting configs |
| 7. Transport cost is zero | Serialization is free | JSON/Protobuf has CPU and network cost |
| 8. The network is homogeneous | All parts use same protocols | Mixed versions, protocols, operating systems |
Every fallacy above has direct implications for consistency. For example:
- Fallacy 1 (reliable network) means your distributed lock might not be released if the network drops during release
- Fallacy 2 (zero latency) is why synchronous replication is expensive
- Fallacy 6 (one admin) is why replicas can end up running with mismatched settings (timeouts, versions, consistency levels) when different teams manage different nodes
5. ACID Properties Deep Dive
ACID is a set of properties that guarantee reliable processing of database transactions. MySQL with InnoDB engine is ACID-compliant.
A - Atomicity
Definition: A transaction is all-or-nothing. Either all operations in the transaction succeed and are committed, or none of them are.
In MySQL: Implemented via InnoDB's undo log. If a transaction is rolled back (or the server crashes mid-transaction), the undo log is used to reverse any partial changes. (InnoDB's write-ahead log is the redo log, which serves durability rather than rollback.)
Example:
BEGIN;
UPDATE accounts SET balance = balance - 500 WHERE id = 1; -- debit
UPDATE accounts SET balance = balance + 500 WHERE id = 2; -- credit
COMMIT; -- both happen, or neither
If the server crashes between the two UPDATEs, MySQL rolls back the debit. The account never ends up in a state where $500 was debited but not credited.
Java / Spring Boot:
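import java.math.BigDecimal;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class TransferService {
    private final AccountRepository accountRepository; // assumed Spring Data repository for the Account entity
    public TransferService(AccountRepository accountRepository) {
        this.accountRepository = accountRepository;
    }

    @Transactional // atomicity guaranteed
    public void transfer(Long fromId, Long toId, BigDecimal amount) {
        Account from = accountRepository.findById(fromId).orElseThrow();
        Account to = accountRepository.findById(toId).orElseThrow();
        from.debit(amount); // both happen atomically
        to.credit(amount);  // or neither happens
    }
}
C - Consistency (ACID Context)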
Definition: A transaction brings the database from one valid state to another valid state. All data integrity constraints (foreign keys, unique constraints, check constraints) must be satisfied before and after the transaction.
IMPORTANT: This is completely different from the "C" in CAP theorem. ACID Consistency is about data integrity constraints. CAP Consistency is about replicas having the same data.
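Example: If you have a constraint that balance >= 0, no transaction should leave the balance negative. MySQL 8.0.16 and later enforce CHECK constraints when each statement runs, so a violating UPDATE fails immediately rather than at commit. Earlier versions parsed CHECK clauses but ignored them.
ALTER TABLE accounts ADD CONSTRAINT chk_balance CHECK (balance >= 0);
I - Isolation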
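Definition: Concurrent transactions do not interfere with each other. At the strongest level (SERIALIZABLE) they behave as if they ran one at a time; weaker levels permit the anomalies in the table below in exchange for more concurrency. Except under READ UNCOMMITTED, intermediate states of a transaction are not visible to other transactions.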
MySQL Isolation Levels (from weakest to strongest):
| Level | Dirty Read | Non-Repeatable Read | Phantom Read |
|---|---|---|---|
| READ UNCOMMITTED | Possible | Possible | Possible |
| READ COMMITTED | Prevented | Possible | Possible |
| REPEATABLE READ (default) | Prevented | Prevented | Possible* |
| SERIALIZABLE | Prevented | Prevented | Prevented |
*In REPEATABLE READ, InnoDB prevents most phantom reads: plain SELECTs read from a consistent snapshot (MVCC), and locking reads use next-key (gap) locks.
Setting isolation level in Spring Boot:
@Transactional(isolation = Isolation.SERIALIZABLE)
public void criticalOperation() {
    // Full isolation - no concurrent anomalies possible
}

@Transactional(isolation = Isolation.READ_COMMITTED)
public void standardOperation() {
    // Good default for most non-financial operations
}
D - Durability
Definition: Once a transaction is committed, it remains committed even in the case of system failure. The data is persisted to non-volatile storage.
In MySQL InnoDB: Durability is achieved through:
- innodb_flush_log_at_trx_commit = 1 -- flushes redo log to disk on every commit (safest)
- sync_binlog = 1 -- syncs binary log to disk on every commit
- Double-write buffer -- prevents partial page writes during a crash
AWS RDS Configuration for Durability:
# Defaults in AWS RDS for MySQL
innodb_flush_log_at_trx_commit = 1
sync_binlog = 1
Production Tip: AWS RDS for MySQL defaults to innodb_flush_log_at_trx_commit = 1 and sync_binlog = 1, which maximizes durability. Both are controlled through the DB parameter group, so verify the effective values on the parameter group attached to your instance. If you see innodb_flush_log_at_trx_commit set to 0 or 2, or sync_binlog set to anything other than 1, it is a durability risk: a crash can lose transactions that were already reported as committed.
6. BASE Properties Deep Dive
BASE emerged as the alternative philosophy for distributed systems where ACID guarantees are too expensive to maintain at scale.
BA - Basically Available
The system guarantees availability (not 100%, but "basic" availability). During partial failures, the system continues to respond, possibly with degraded or stale data instead of errors.
Example: Amazon's shopping cart continues to function (you can add items) even if the inventory service is temporarily unreachable. You might add an item that turns out to be out of stock -- the system resolves this later (during checkout).
S - Soft State
The state of the system may change over time, even without any new input. Data is not guaranteed to be consistent at all times. Replicas might diverge temporarily.
Example: A DynamoDB table has 3 replicas. After a write to one replica, the other two are in a "soft state" -- they are being updated asynchronously. During this window, they may have older data.
E - Eventually Consistent
Given no new updates, all replicas will eventually converge to the same value. The system guarantees convergence, but not when.
Example: DNS is the canonical example. When you update a DNS record, the change propagates across DNS servers globally over minutes to hours. During propagation, different DNS servers return different values. Eventually, all return the new value.
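As a rough illustration of convergence, the sketch below models two replicas that exchange state in the background and keep whichever value carries the higher version number. This last-writer-wins rule is an assumption chosen for simplicity; it is only one of several reconciliation strategies and is not how DNS propagation works. VersionedReplica and its methods are hypothetical names.

```java
/** Hypothetical replica holding a single versioned value. */
final class VersionedReplica {
    private long version;
    private String value;

    VersionedReplica(long version, String value) {
        this.version = version;
        this.value = value;
    }

    /** A client write that lands on this replica only. */
    void write(long newVersion, String newValue) {
        merge(newVersion, newValue);
    }

    /** Keeps whichever value has the higher version. */
    private void merge(long otherVersion, String otherValue) {
        if (otherVersion > version) {
            version = otherVersion;
            value = otherValue;
        }
    }

    /** One round of background synchronization between two replicas. */
    static void sync(VersionedReplica a, VersionedReplica b) {
        long aVersion = a.version;
        String aValue = a.value;
        a.merge(b.version, b.value);
        b.merge(aVersion, aValue);
    }

    String read() {
        return value;
    }
}
```

After a write to one replica, the other keeps returning the old value until sync runs. Once it has run and no new writes arrive, both replicas return the same value.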
7. ACID vs BASE - The Full Comparison
| Dimension | ACID | BASE |
|---|---|---|
| Consistency Guarantee | Strong, immediate | Eventual |
| Availability | May sacrifice availability for consistency | Prioritizes availability |
| Performance | Lower throughput due to coordination | Higher throughput |
| Scalability | Harder to scale horizontally | Designed for horizontal scale |
| Complexity | Managed by database engine | Application must handle conflicts |
| Use Cases | Financial, medical, legal data | Social media, catalog, analytics |
| Examples | MySQL, PostgreSQL, Oracle | Cassandra, DynamoDB (eventual), Redis |
The Spectrum in Practice
Real-world systems rarely pick pure ACID or pure BASE. They use different models for different data:
E-Commerce Platform:
- Orders table: ACID (MySQL) - money involved
- Product catalog: BASE (DynamoDB) - eventual is fine
- User sessions: BASE (Redis) - speed matters
- Inventory counts: ACID with atomic operations - prevent overselling
- Product reviews: BASE - seconds of lag is fine
- Payment records: ACID (MySQL SERIALIZABLE) - strictest
8. CAP Theorem - The Deep Dive
What CAP Actually Says
Eric Brewer's CAP theorem (proved formally by Gilbert and Lynch in 2002):
In any distributed data system, it is impossible to simultaneously guarantee all three of:
- Consistency: Every read receives the most recent write or an error
- Availability: Every request receives a non-error response
- Partition Tolerance: The system continues operating despite network partitions
The Key Insight Everyone Gets Wrong
"P is not optional." Network partitions will happen in any real distributed system. You cannot design them away. Therefore:
The real choice is not "pick 2 of 3." The real choice is:
When a network partition occurs, what do you sacrifice: Consistency or Availability?
- CP systems choose Consistency: stop serving requests (or return errors) during a partition to avoid returning stale/inconsistent data. ZooKeeper, etcd.
- AP systems choose Availability: keep serving requests during a partition, potentially returning stale data. Cassandra, CouchDB, DynamoDB (default).
A Concrete Example
Imagine you have a distributed key-value store with two nodes, A and B, and a client:
                +--------+
                | Client |
                +--------+
               /          \
          writes          reads
             /              \
   +--------+                +--------+
   | Node A |                | Node B |
   +--------+                +--------+
        |                         |
        +--- Network Partition ---+
             (nodes cannot talk)
Scenario: Client writes x = 5 to Node A. A network partition occurs before the write replicates to Node B. Client then reads x from Node B.
- CP choice: Node B detects the partition. It refuses to answer (returns error). Consistency preserved. Client frustrated.
- AP choice: Node B answers with the old value of x. Client gets stale data. System keeps running.
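The two choices can be sketched as a read path on Node B. This is a toy model under stated assumptions: the node already knows it is partitioned, and an empty Optional stands in for an error response. NodeB, Mode, and readX are hypothetical names.

```java
import java.util.Optional;

/** Hypothetical Node B from the example: holds a possibly stale local copy of x. */
final class NodeB {
    enum Mode { CP, AP }

    private final Mode mode;
    private final int localX; // last value replicated before the partition
    private boolean partitioned;

    NodeB(Mode mode, int localX) {
        this.mode = mode;
        this.localX = localX;
    }

    void setPartitioned(boolean partitioned) {
        this.partitioned = partitioned;
    }

    /** Empty models an error response; a present value may be stale under AP. */
    Optional<Integer> readX() {
        if (partitioned && mode == Mode.CP) {
            return Optional.empty(); // refuse rather than risk a stale answer
        }
        return Optional.of(localX); // answer from local state
    }
}
```

During the partition, a CP-mode node returns no value at all, while an AP-mode node returns the old value of x. With no partition, both modes answer normally.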
Where Common Systems Fall
| System | Choice | Reasoning |
|---|---|---|
| MySQL RDS (single node) | Not applicable | No replication = no partition |
| MySQL with synchronous replicas | CP | Waits for replica acknowledgment |
| MySQL Aurora (writes) | CP | Single writer enforces consistency |
| MySQL Aurora (reads from replicas) | AP | Replicas may lag |
| DynamoDB (eventual reads) | AP | Returns possibly stale data |
| DynamoDB (strong reads) | CP | Served by the partition's leader replica; may fail if it is unreachable |
| Cassandra (quorum reads/writes) | CP | With tuning |
| Cassandra (default) | AP | Prioritizes availability |
| ZooKeeper | CP | Stops serving if cannot reach majority |
| Redis (single master) | CP (sort of) | Single node = no partition |
| Redis Cluster | AP (default) | Can accept writes to minority partitions |
CAP Misconceptions to Avoid
Misconception 1: "CA systems exist (without P)."
- Reality: In a true distributed system (multiple nodes), partition tolerance is a physical reality. A "CA" system is just a single node -- which is not distributed.
Misconception 2: "CAP is a permanent choice."
- Reality: Systems can be tuned per operation. DynamoDB lets you choose per-request. You can do strongly consistent reads for critical data and eventually consistent for others.
Misconception 3: "Consistency in CAP = Consistency in ACID."
- Reality: They are completely different. See Section 10.
Misconception 4: "CAP means you can only have 2 properties."
- Reality: You always have P (in a real distributed system). You always have A and C during normal operation (no partition). CAP only matters during a partition.
9. PACELC Theorem - Beyond CAP
Why CAP Is Not Enough
CAP only addresses what happens during a network partition. But partitions are the exception. What about the rest of the time, when everything is healthy?
Daniel Abadi (2012) extended CAP with PACELC:
If Partition (P):
choose between Availability (A) and Consistency (C)
Else (E, normal operation):
choose between Latency (L) and Consistency (C)
The Latency-Consistency Trade-Off
In normal operation (no partition), synchronous replication means:
- Write must be acknowledged by all (or quorum) replicas before returning success
- Every write has higher latency (set by the slowest replica that must acknowledge, so it grows with network distance)
- But reads are always consistent
Asynchronous replication means:
- Write returns as soon as the primary writes
- Replication happens in the background
- Lower write latency
- But reads from replicas may be stale
PACELC Classification of Common Systems
| System | During Partition | During Normal Operation | Classification |
|---|---|---|---|
| MySQL RDS Multi-AZ (sync) | CP | EC (higher latency for consistency) | PC/EC |
| Aurora (writer) | CP | EC | PC/EC |
| Aurora (reader endpoints) | AP | EL | PA/EL |
| DynamoDB (strong reads) | CP | EC | PC/EC |
| DynamoDB (eventual reads) | AP | EL | PA/EL |
| Cassandra | AP | EL | PA/EL |
| ZooKeeper | CP | EC | PC/EC |
| MongoDB (primary reads) | CP | EC | PC/EC |
Key Insight for Architects: Most systems that are AP during partitions are also EL during normal operation (they consistently prefer availability/low-latency over consistency). Most CP systems are also EC. The choice is usually consistent across both scenarios.
10. Consistency vs Isolation - Critical Distinction
This is one of the most commonly confused topics in interviews and system design discussions. Let us be absolutely clear.
ACID Consistency (The "C" in ACID)
What it is: Data integrity constraints are satisfied before and after every transaction. The database never enters a state that violates its defined rules.
Who enforces it: The database engine, based on constraints you define.
Examples:
- UNIQUE constraint: No two rows can have the same email address
- FOREIGN KEY constraint: An order must reference a valid customer
- CHECK constraint: Balance cannot be negative
- Application-level constraints enforced within a transaction
This is about data validity, not about replicas.
CAP Consistency (The "C" in CAP)
What it is: In a distributed system, every read returns the most recent write across all replicas, or an error. No stale reads.
Who enforces it: The distributed system's replication and coordination protocol.
This is about replica agreement, not data validity.
Isolation (The "I" in ACID)
What it is: Concurrent transactions are isolated from each other. Each transaction sees a consistent snapshot of the data, as if it were the only transaction running.
This is about concurrency anomalies within a single database, not about replicas.
Side-by-Side Comparison
| Property | Question It Answers | Scope | Enforced By |
|---|---|---|---|
| ACID Consistency | Is the data valid according to defined rules? | Single node, data integrity | DB constraints |
| CAP Consistency | Do all replicas agree on the current value? | Multiple nodes, replica state | Replication protocol |
| Isolation | Are concurrent transactions separated? | Single node, concurrency | Locking / MVCC |
Why This Confusion Matters in Production
A common mistake: "We use MySQL, which is ACID, so we have consistency."
This is true for ACID Consistency (data constraints are enforced). But if you have read replicas with asynchronous replication, you do not have CAP Consistency -- your replicas may lag and return stale data.
// This might return stale data if routed to a replica
@Transactional(readOnly = true)
public UserProfile getProfile(Long userId) {
    // ACID Consistency: profile satisfies all constraints (valid)
    // CAP Consistency: may be seconds behind the primary (stale)
    return userRepository.findById(userId).orElseThrow();
}
11. Replication and Why It Creates Consistency Challenges
Why Replication Exists
Replication (having multiple copies of data) serves three goals:
- High Availability: If one node fails, others take over
- Read Scaling: Multiple replicas serve read requests, reducing load on primary
- Disaster Recovery: Replicas in different regions survive regional failures
Types of Replication
Synchronous Replication
Client
  |
  | Write request
  v
Primary -----(replicate)-----> Replica 1
  |                               |
  |<-------(acknowledge)----------|
  |
  | Respond to client (after replica confirms)
- Consistency: Strong -- data is on all nodes before success
- Latency: Higher -- must wait for replica
- Availability: Lower -- if replica is slow or down, writes stall
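- Used by: Aurora (writes go synchronously to a quorum of storage nodes across AZs), RDS Multi-AZ standby replication, MySQL semisynchronous replication (waits for the replica to receive the change, not to apply it)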
Asynchronous Replication
Client
|
| Write request
v
Primary -----> Respond to client (immediately)
|
| (background replication, may lag 10ms to 10 seconds)
v
Replica 1, Replica 2, ...
- Consistency: Eventual -- replicas may lag
- Latency: Lower -- no wait for replicas
- Availability: Higher -- replica failure does not block writes
- Used by: MySQL RDS read replicas (default), Aurora read replicas, DynamoDB global tables
Leaderless Replication (Quorum-Based)
No designated primary. Client writes to multiple nodes and reads from multiple nodes. Uses quorum (majority) to determine consistency.
Client writes to 3 of 5 nodes (W = 3)
Client reads from 3 of 5 nodes (R = 3)
W + R > N (3 + 3 > 5) -- guaranteed to overlap with the write
- Used by: Cassandra, Riak, and the original Amazon Dynamo design (DynamoDB itself replicates each partition through a leader)
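The overlap rule can be checked with simple arithmetic. The sketch below is a minimal helper, assuming N replicas, a write quorum of W, and a read quorum of R; the Quorum class and its method names are hypothetical.

```java
/** Hypothetical helper for reasoning about read/write quorum sizes. */
final class Quorum {
    private Quorum() {}

    /** True when every read set of size r must share at least one node with every write set of size w. */
    static boolean overlaps(int n, int w, int r) {
        if (n < 1 || w < 1 || r < 1 || w > n || r > n) {
            throw new IllegalArgumentException("require 1 <= w <= n and 1 <= r <= n");
        }
        return w + r > n;
    }

    /** Smallest number of nodes guaranteed to be in both sets (0 means no guarantee). */
    static int minOverlap(int n, int w, int r) {
        return Math.max(0, w + r - n);
    }
}
```

For N = 5, W = 3, R = 3 the sets overlap in at least one node, so a read always contacts some node that holds the latest acknowledged write. Dropping to W = 2, R = 3 removes the guarantee, because the two sets can be disjoint.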
The Replication Lag Problem
With asynchronous replication, a replica can trail the primary at any moment. This lag creates a window in which reads from the replica return stale data:
t=0: Primary has: balance = 1000
t=1: Client writes balance = 500 to primary
t=2: Client reads from replica --> gets 1000 (STALE!)
t=3: Replication completes
t=4: Client reads from replica --> gets 500 (CURRENT)
The window between t=1 and t=3 is the replication lag window. During this window, reads from the replica are stale.
Typical replication lag in AWS Aurora: 10-100 milliseconds
In RDS MySQL read replicas under heavy load: Can be seconds or even minutes
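The timeline above can be reproduced with a toy model in which the primary acknowledges writes immediately and queues them for the replica. This is a sketch under simplifying assumptions (one replica, one value, replication triggered by an explicit call); AsyncReplicationDemo and its methods are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical primary/replica pair with asynchronous replication. */
final class AsyncReplicationDemo {
    private int primaryBalance;
    private int replicaBalance;
    // Changes acknowledged by the primary but not yet applied on the replica.
    private final Deque<Integer> pending = new ArrayDeque<>();

    AsyncReplicationDemo(int initialBalance) {
        this.primaryBalance = initialBalance;
        this.replicaBalance = initialBalance;
    }

    /** The primary applies the write and returns at once; the replica is updated later. */
    void writeToPrimary(int newBalance) {
        primaryBalance = newBalance;
        pending.addLast(newBalance);
    }

    /** Background replication step: applies queued changes on the replica in order. */
    void replicate() {
        while (!pending.isEmpty()) {
            replicaBalance = pending.removeFirst();
        }
    }

    int readFromPrimary() {
        return primaryBalance;
    }

    int readFromReplica() {
        return replicaBalance;
    }
}
```

Starting from a balance of 1000, a write of 500 is visible on the primary straight away, but the replica still returns 1000 until replicate() runs. Reads that must see their own writes therefore need to go to the primary or wait out the lag.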
MySQL Binary Log (Binlog) and Replication
MySQL replication works via the binary log:
- Primary records all changes to the binlog
- Replica's I/O thread reads the binlog from the primary
- Replica's SQL thread applies the changes
Primary:
  [Transaction] --> [InnoDB Commit] --> [Binlog Write]
                                              |
                                      (async transfer)
                                              v
Replica:
                                         [Relay Log] --> [SQL Thread Apply]
The time between "Binlog Write" on primary and "SQL Thread Apply" on replica is the replication lag.
12. The Consistency Spectrum Overview
Consistency models form a spectrum from strongest to weakest. Each model is a trade-off:
STRONGER CONSISTENCY
 |
 |  Linearizability (Atomic Consistency)
 |  "Real-time globally ordered"
 |
 |  Sequential Consistency
 |  "Same order, not real-time"
 |
 |  Causal Consistency
 |  "Causally related ops in order"
 |
 |  FIFO Consistency (Monotonic Writes)
 |  "Per-client ordering preserved"
 |
 |  Eventual Consistency
 |  "Will converge, no time guarantee"
 |
 v
WEAKER CONSISTENCY
Session-level models (often layered on top):
- Read-Your-Writes
- Monotonic Reads
- Consistent Prefix Reads
- Bounded Staleness
Deep dive into each model is in Part 2.
13. Key Takeaways
- Consistency is not binary. It exists on a spectrum from linearizability (strongest) to eventual consistency (weakest).
- ACID Consistency != CAP Consistency != Isolation. These are three different concepts. Mixing them up is a serious mistake.
- CAP theorem means P is mandatory. The real choice is Consistency vs Availability during a partition.
- PACELC extends CAP. Even without partitions, you trade consistency for lower latency.
- Replication is the root cause. Consistency challenges arise because replication is asynchronous in most real systems.
- There is no free lunch. Stronger consistency = more coordination = higher latency = lower availability.
- Domain matters. Financial data needs strong consistency. Social media feeds work fine with eventual consistency.
- Most production systems use a mix. Different services, tables, and operations within the same application use different consistency models.
Next: Part 2: Consistency Models Deep Dive -- Every model explained with analogies, trade-offs, and real system examples.
Part of the Consistency Models Demystified series
Stack: Java 17, Spring Boot 3.x, MySQL 8.0, AWS