Part 2: Database Modelling and ID Strategies
Navigation: Index | Part 1 | Part 2 | Part 3 | Part 4 | Part 5 | Part 6
Table of Contents
- JPA Entity Fundamentals
- Entity and Table Annotations
- Column Mappings and Data Types
- Primary Key Strategies
- UUID as a Primary Key - Deep Dive
- Custom ID Generators: ULID and Snowflake
- Composite Primary Keys
- Natural Keys vs Surrogate Keys
- Enum Mapping in JPA
- Database Design and Normalization
- Index Strategies for MySQL
- Flyway Database Migrations
- The Base Auditable Entity
- Tips and Best Practices
- Anti-Patterns and Pitfalls
1. JPA Entity Fundamentals
A JPA entity is a plain Java class that maps to a database table. Every row in the table becomes an instance of the entity class.
The Persistence Context
Before diving into entities, you must understand the Persistence Context (also called the Hibernate Session). It is a first-level cache and a unit of work.
PERSISTENT STATE: Entity exists in the database AND in the Persistence Context.
Changes are automatically tracked and flushed on commit.
TRANSIENT STATE: Entity was created with new(), not associated with any session.
Not tracked, not saved unless you call persist() or save().
DETACHED STATE: Entity was once managed but the session was closed.
Changes are NOT tracked. Need to merge() to re-attach.
REMOVED STATE: Entity is scheduled for deletion on next flush/commit.
Understanding this lifecycle is essential for debugging unexpected behavior in JPA.
2. Entity and Table Annotations
package com.company.ecommerce.domain.product;
import com.company.ecommerce.common.audit.AuditableEntity;
import com.company.ecommerce.domain.order.OrderItem;
import jakarta.persistence.*;
import lombok.*;
import java.math.BigDecimal;
import java.util.HashSet;
import java.util.Set;
/*
* @Entity: Marks this class as a JPA entity.
* Every entity must have a no-arg constructor (Lombok @NoArgsConstructor handles this).
*
* @Table: Maps to a specific database table.
* Without @Table, Hibernate uses the class name as the table name.
* Always use @Table to be explicit and to define indexes.
*
* uniqueConstraints: Define unique constraints at the table level.
* These are used both for DDL generation and for validation.
*
* indexes: Define database indexes. Critical for query performance.
* Hibernate WILL create these if ddl-auto=create/update.
* With Flyway (recommended), define them in your SQL migration scripts.
*/
@Entity
@Table(
name = "products",
uniqueConstraints = {
@UniqueConstraint(name = "uk_products_sku", columnNames = {"sku"})
},
indexes = {
@Index(name = "idx_products_category_id", columnList = "category_id"),
@Index(name = "idx_products_status", columnList = "status"),
@Index(name = "idx_products_price", columnList = "price"),
@Index(name = "idx_products_status_price", columnList = "status, price")
}
)
/*
* @Getter, @Setter: Lombok generates all getters and setters.
*
* @NoArgsConstructor: JPA REQUIRES a no-arg constructor to instantiate entities
* when loading from the database. Without this, you get InstantiationException.
*
* @AllArgsConstructor: Useful for test data builders.
*
* IMPORTANT: Do NOT use @Data on entities.
* @Data generates equals/hashCode based on ALL fields, including relationships.
* This causes StackOverflowError for bidirectional relationships and
* incorrect behavior when entities are used in Sets or as Map keys.
* Always write equals/hashCode manually or use @EqualsAndHashCode(of = "id").
*/
@Getter
@Setter
@NoArgsConstructor
@AllArgsConstructor
@Builder
@ToString(exclude = {"categories", "orderItems"}) // Exclude collections to prevent lazy loading
public class Product extends AuditableEntity {
@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
@Column(name = "id")
private Long id;
@Column(name = "sku", nullable = false, unique = true, length = 50)
private String sku;
@Column(name = "name", nullable = false, length = 200)
private String name;
/*
* columnDefinition = "TEXT": Maps to MySQL TEXT type.
* Without this, Hibernate maps String to VARCHAR(255) by default.
* TEXT in MySQL can hold up to 65,535 characters.
* For very large content, use MEDIUMTEXT (16MB) or LONGTEXT (4GB).
*/
@Column(name = "description", columnDefinition = "TEXT")
private String description;
/*
* precision: Total number of digits (including decimal places)
* scale: Number of digits after the decimal point
* NUMERIC(10,2) in MySQL can store values up to 99,999,999.99
* BigDecimal is the ONLY correct type for money. Never use double or float.
*/
@Column(name = "price", nullable = false, precision = 10, scale = 2)
private BigDecimal price;
@Column(name = "stock_quantity", nullable = false)
@Builder.Default
private Integer stockQuantity = 0;
/*
* @Enumerated(EnumType.STRING): Store enum as VARCHAR, not as integer ordinal.
* NEVER use EnumType.ORDINAL in production.
* With ORDINAL, if you reorder or insert a new value in the middle of the enum,
* all existing database values become wrong silently.
* With STRING, the enum name is stored. Reordering has zero impact.
*/
@Enumerated(EnumType.STRING)
@Column(name = "status", nullable = false, length = 20)
@Builder.Default
private ProductStatus status = ProductStatus.ACTIVE;
// Relationships are defined in Part 3 and Part 4
@ManyToMany(fetch = FetchType.LAZY)
@JoinTable(
name = "product_categories",
joinColumns = @JoinColumn(name = "product_id"),
inverseJoinColumns = @JoinColumn(name = "category_id")
)
@Builder.Default
private Set<Category> categories = new HashSet<>();
/*
* Custom equals and hashCode based only on business key (sku).
* This is correct behavior:
* - Two products with the same SKU are the same product
* - Does NOT depend on id (which may be null before persist)
* - Works correctly in Sets and as HashMap keys
*/
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (!(o instanceof Product product)) return false;
return sku != null && sku.equals(product.sku);
}
@Override
public int hashCode() {
return getClass().hashCode();
/*
* Why getClass().hashCode() and not Objects.hashCode(sku)?
* The hash code must remain consistent before and after persist().
* Before persist, sku could be null. Using a constant hash code
* from the class ensures correct Set behavior throughout the
* entity lifecycle. The cost is more hash collisions, which is
* an acceptable trade-off for correctness.
*/
}
}3. Column Mappings and Data Types
MySQL Type Mappings Reference
| Java Type | MySQL Type | JPA Annotation | Notes |
|---|---|---|---|
| String | VARCHAR(255) | @Column(length=255) | Default length is 255 |
| String | TEXT | @Column(columnDefinition="TEXT") | Up to 65K chars |
| String | MEDIUMTEXT | @Column(columnDefinition="MEDIUMTEXT") | Up to 16MB |
| Long | BIGINT | - | Auto-detected |
| Integer | INT | - | Auto-detected |
| BigDecimal | DECIMAL(p,s) | @Column(precision=10, scale=2) | Always use for money |
| Boolean | TINYINT(1) or BIT | - | MySQL uses TINYINT |
| LocalDate | DATE | - | Date only |
| LocalDateTime | DATETIME(6) | @Column(columnDefinition="DATETIME(6)") | 6 = microseconds |
| Instant | DATETIME(6) | @Column(columnDefinition="DATETIME(6)") | UTC recommended |
| byte[] | BLOB | @Lob | Binary data |
| UUID | CHAR(36) or BINARY(16) | See UUID section | |
| Enum | VARCHAR | @Enumerated(EnumType.STRING) | Always use STRING |
Complete Entity with All Common Column Types
@Entity
@Table(name = "orders")
@Getter
@Setter
@NoArgsConstructor
public class Order extends AuditableEntity {
@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
private Long id;
// String with explicit max length
@Column(name = "order_number", nullable = false, unique = true, length = 30)
private String orderNumber;
// Money - always BigDecimal
@Column(name = "subtotal", nullable = false, precision = 12, scale = 2)
private BigDecimal subtotal;
@Column(name = "tax_amount", nullable = false, precision = 10, scale = 2)
private BigDecimal taxAmount;
@Column(name = "total_amount", nullable = false, precision = 12, scale = 2)
private BigDecimal totalAmount;
// Enum stored as string
@Enumerated(EnumType.STRING)
@Column(name = "status", nullable = false, length = 30)
private OrderStatus status;
// Timestamp - store in UTC, let the application handle timezone conversion
@Column(name = "placed_at", nullable = false)
private Instant placedAt;
@Column(name = "shipped_at") // nullable - not yet shipped
private Instant shippedAt;
@Column(name = "delivered_at")
private Instant deliveredAt;
// JSON stored as TEXT (alternatively use a JSON column type in MySQL 5.7+)
@Column(name = "shipping_address_snapshot", columnDefinition = "TEXT")
private String shippingAddressSnapshot; // Serialized JSON of address at order time
// Optimistic locking - covered in Part 5
@Version
@Column(name = "version", nullable = false)
private Long version;
// Computed field - not stored in DB
@Transient
private BigDecimal discountedTotal;
}4. Primary Key Strategies
Choosing the right primary key strategy is one of the most consequential decisions in your data model. It affects performance, scalability, and distributed system compatibility.
Strategy 1: IDENTITY (MySQL AUTO_INCREMENT)
@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
private Long id;How it works: MySQL auto-increments the value with each INSERT. Spring/Hibernate reads the generated value after insertion using LAST_INSERT_ID().
MySQL DDL:
id BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEYPros:
- Simple, native MySQL support
- Integer type - small storage (8 bytes for BIGINT)
- Clustered index is sequential - excellent insert performance (pages fill in order)
- Easy to read in logs and debug
Cons:
- ID is only known AFTER insert - Hibernate CANNOT batch inserts with IDENTITY strategy
- Sequential IDs can be a security concern (enumeration attacks: just increment id by 1)
- Not suitable for distributed systems or data merging across databases
- Cannot pre-generate IDs for use before insertion
Best for: Single-database applications, most CRUD services, internal admin tools
Strategy 2: SEQUENCE (Not native to MySQL)
@Id
@GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "product_seq")
@SequenceGenerator(name = "product_seq", sequenceName = "product_sequence",
allocationSize = 50)
private Long id;MySQL 8.0 does not support native sequences the way PostgreSQL does. Hibernate simulates sequences using a table, which introduces overhead. For MySQL, IDENTITY is strongly preferred over SEQUENCE.
Note: If you use PostgreSQL or Oracle, SEQUENCE with allocationSize > 1 is highly efficient because Hibernate pre-allocates a range of IDs in memory.
Strategy 3: UUID (Globally Unique, Distributed-Safe)
@Id
@GeneratedValue(strategy = GenerationType.UUID) // Spring Boot 3.x / Hibernate 6.x
@Column(name = "id", updatable = false, nullable = false, columnDefinition = "CHAR(36)")
private UUID id;How it works: Hibernate generates a random UUID (Version 4) before the INSERT. The ID is available immediately, before the database call.
MySQL DDL:
id CHAR(36) NOT NULL,
PRIMARY KEY (id)Pros:
- Globally unique across all databases, data centers, and time
- ID is available before insert - enables Hibernate batch inserts
- Safe for distributed systems and data merging
- Prevents enumeration attacks
Cons:
- 36 bytes (as CHAR) vs 8 bytes (BIGINT) - larger storage and indexes
- Random UUIDs cause B-tree index fragmentation (random inserts, not sequential)
- MySQL's clustered primary key becomes fragmented over time - this is a real performance issue at scale
- Harder to read and debug in logs
Performance Fix for UUID in MySQL: Use BINARY(16)
@Id
@Column(name = "id", columnDefinition = "BINARY(16)",
updatable = false, nullable = false)
private UUID id;
@PrePersist
protected void generateId() {
if (this.id == null) {
this.id = UUID.randomUUID();
}
}-- MySQL DDL
id BINARY(16) NOT NULL,
PRIMARY KEY (id)
-- Stores as 16 bytes instead of 36 bytes
-- Still random, still fragments - use UUID v7 or ULID for better performanceStrategy 4: TABLE (Deprecated - Avoid)
@Id
@GeneratedValue(strategy = GenerationType.TABLE)
private Long id;Hibernate uses a dedicated database table to track and increment ID values. This requires a SELECT + UPDATE for every single ID generation, creating a bottleneck. Never use this strategy. It is effectively deprecated.
ID Strategy Comparison Table
| Strategy | Storage | Batch Insert | Distributed Safe | Sequential | Security |
|---|---|---|---|---|---|
| IDENTITY (AUTO_INCREMENT) | 8 bytes | No | No | Yes | Enumerable |
| UUID v4 (CHAR/BINARY) | 16-36 bytes | Yes | Yes | No | Safe |
| UUID v7 | 16-36 bytes | Yes | Yes | Yes | Safe |
| ULID | 16 bytes | Yes | Yes | Yes | Safe |
| Snowflake | 8 bytes | Yes | Yes | Yes | Partially |
| TABLE | 8 bytes | No | No | Yes | Enumerable |
5. UUID as a Primary Key
UUID is the right choice when your application needs to generate IDs independently of the database (for distributed systems, event sourcing, offline operations).
UUID v4 (Random) - Standard Approach
package com.company.ecommerce.domain.user;
import jakarta.persistence.*;
import lombok.*;
import java.util.UUID;
@Entity
@Table(name = "users")
@Getter
@Setter
@NoArgsConstructor
public class User extends AuditableEntity {
@Id
@GeneratedValue(strategy = GenerationType.UUID)
@Column(name = "id", updatable = false, nullable = false, columnDefinition = "CHAR(36)")
private UUID id;
@Column(name = "email", nullable = false, unique = true, length = 255)
private String email;
@Column(name = "username", nullable = false, unique = true, length = 100)
private String username;
// Always store hashed passwords - never plaintext
@Column(name = "password_hash", nullable = false, length = 255)
private String passwordHash;
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (!(o instanceof User user)) return false;
return id != null && id.equals(user.id);
}
@Override
public int hashCode() {
return getClass().hashCode();
}
}UUID v7 (Time-Ordered) - Better for MySQL
UUID v7 is time-ordered. The first 48 bits are a Unix timestamp in milliseconds, making UUIDs generated in sequence have sequential leading bytes. This dramatically reduces B-tree index fragmentation in MySQL.
// UUID v7 is available in Java through third-party libraries
// or you can implement it manually
import java.time.Instant;
import java.util.UUID;
public class UuidV7Generator {
public static UUID generate() {
long timestamp = Instant.now().toEpochMilli();
// High 32 bits: top 32 bits of timestamp
long msHigh = (timestamp >> 16) & 0xFFFFFFFFL;
// Bits 32-47: lower 16 bits of timestamp
long msLow = timestamp & 0xFFFFL;
// Version 7 bits
long ver = 0x7000L;
long random1 = (long)(Math.random() * 0x0FFFL);
long random2 = (long)(Math.random() * Long.MAX_VALUE);
long mostSigBits = (msHigh << 32) | (msLow << 16) | ver | random1;
long leastSigBits = (0x8000000000000000L) | random2;
return new UUID(mostSigBits, leastSigBits);
}
}6. Custom ID Generators
ULID (Universally Unique Lexicographically Sortable Identifier)
ULID is a modern alternative to UUID that is:
- Lexicographically sortable (time-ordered)
- URL-safe (no special characters)
- 128 bits encoded as 26 character string (Crockford Base32)
- Monotonically increasing within the same millisecond
<!-- pom.xml -->
<dependency>
<groupId>com.github.f4b6a3</groupId>
<artifactId>ulid-creator</artifactId>
<version>5.2.2</version>
</dependency>import com.github.f4b6a3.ulid.UlidCreator;
import jakarta.persistence.*;
@Entity
@Table(name = "events")
@Getter
@Setter
@NoArgsConstructor
public class DomainEvent {
@Id
@Column(name = "id", columnDefinition = "CHAR(26)",
updatable = false, nullable = false)
private String id;
@PrePersist
protected void generateId() {
if (this.id == null) {
// Generates monotonically increasing ULID
this.id = UlidCreator.getMonotonicUlid().toString();
}
}
@Column(name = "event_type", nullable = false, length = 100)
private String eventType;
@Column(name = "aggregate_id", nullable = false)
private String aggregateId;
@Column(name = "payload", columnDefinition = "JSON")
private String payload;
}Snowflake ID (Twitter-Style Distributed Integer ID)
Snowflake IDs are 64-bit integers composed of:
- 41 bits: timestamp (milliseconds since epoch)
- 10 bits: machine/datacenter ID
- 12 bits: sequence number within the same millisecond
Result: Sequential, globally unique, integer-sized, 8 bytes in MySQL.
@Component
public class SnowflakeIdGenerator {
private static final long EPOCH = 1700000000000L; // Nov 2023 custom epoch
private static final long MACHINE_ID_BITS = 10L;
private static final long SEQUENCE_BITS = 12L;
private static final long MAX_MACHINE_ID = ~(-1L << MACHINE_ID_BITS); // 1023
private static final long MAX_SEQUENCE = ~(-1L << SEQUENCE_BITS); // 4095
private final long machineId;
private long lastTimestamp = -1L;
private long sequence = 0L;
public SnowflakeIdGenerator(@Value("${app.snowflake.machine-id:1}") long machineId) {
if (machineId > MAX_MACHINE_ID || machineId < 0) {
throw new IllegalArgumentException("Machine ID must be between 0 and " + MAX_MACHINE_ID);
}
this.machineId = machineId;
}
public synchronized long nextId() {
long timestamp = System.currentTimeMillis();
if (timestamp < lastTimestamp) {
throw new IllegalStateException("Clock moved backwards. Refusing to generate ID.");
}
if (timestamp == lastTimestamp) {
sequence = (sequence + 1) & MAX_SEQUENCE;
if (sequence == 0) {
// Sequence exhausted - wait for next millisecond
while ((timestamp = System.currentTimeMillis()) <= lastTimestamp) {
Thread.onSpinWait();
}
}
} else {
sequence = 0L;
}
lastTimestamp = timestamp;
return ((timestamp - EPOCH) << (MACHINE_ID_BITS + SEQUENCE_BITS))
| (machineId << SEQUENCE_BITS)
| sequence;
}
}7. Composite Primary Keys
Use composite keys when no single column uniquely identifies a row but a combination of columns does.
Approach 1: @EmbeddedId (Recommended)
The composite key is an embedded value object.
package com.company.ecommerce.domain.order;
import jakarta.persistence.Column;
import jakarta.persistence.Embeddable;
import lombok.AllArgsConstructor;
import lombok.EqualsAndHashCode;
import lombok.Getter;
import lombok.NoArgsConstructor;
import java.io.Serializable;
/*
* @Embeddable: This class can be embedded inside an entity.
* Serializable: REQUIRED for composite keys used in second-level cache
* and for Hibernate internals.
* equals() and hashCode(): REQUIRED for correct behavior in persistence context.
* Hibernate uses these to identify entities in its cache.
*/
@Embeddable
@Getter
@NoArgsConstructor
@AllArgsConstructor
@EqualsAndHashCode // Generates equals/hashCode based on ALL fields - correct for IDs
public class OrderItemId implements Serializable {
@Column(name = "order_id")
private Long orderId;
@Column(name = "product_id")
private Long productId;
}@Entity
@Table(name = "order_items")
@Getter
@Setter
@NoArgsConstructor
public class OrderItem {
@EmbeddedId
private OrderItemId id;
/*
* @MapsId("orderId"): Maps this relationship to the "orderId" field
* in the @EmbeddedId. The foreign key column order_id serves dual purpose:
* part of the composite PK AND the FK to orders table.
*/
@ManyToOne(fetch = FetchType.LAZY)
@MapsId("orderId")
@JoinColumn(name = "order_id")
private Order order;
@ManyToOne(fetch = FetchType.LAZY)
@MapsId("productId")
@JoinColumn(name = "product_id")
private Product product;
@Column(name = "quantity", nullable = false)
private Integer quantity;
@Column(name = "unit_price", nullable = false, precision = 10, scale = 2)
private BigDecimal unitPrice;
@Column(name = "discount", precision = 5, scale = 2)
@Builder.Default
private BigDecimal discount = BigDecimal.ZERO;
// Constructor for easier creation
public OrderItem(Order order, Product product, Integer quantity, BigDecimal unitPrice) {
this.id = new OrderItemId(order.getId(), product.getId());
this.order = order;
this.product = product;
this.quantity = quantity;
this.unitPrice = unitPrice;
}
}Approach 2: @IdClass
The composite key fields are duplicated in both the key class and the entity.
// Key class
import java.io.Serializable;
import lombok.AllArgsConstructor;
import lombok.EqualsAndHashCode;
import lombok.NoArgsConstructor;
@NoArgsConstructor
@AllArgsConstructor
@EqualsAndHashCode
public class UserRoleId implements Serializable {
private Long userId; // Matches field name in UserRole entity
private Long roleId; // Matches field name in UserRole entity
}// Entity with @IdClass
@Entity
@Table(name = "user_roles")
@IdClass(UserRoleId.class)
@Getter
@Setter
@NoArgsConstructor
public class UserRole {
@Id
@ManyToOne(fetch = FetchType.LAZY)
@JoinColumn(name = "user_id", nullable = false)
private User user; // Field name "user" maps to "userId" in UserRoleId? No!
/*
* IMPORTANT: When using @IdClass with @ManyToOne,
* the field names in the ID class must match the relationship field names.
* This is why @EmbeddedId with @MapsId is usually cleaner.
*/
@Id
@Column(name = "user_id", insertable = false, updatable = false)
private Long userId;
@Id
@ManyToOne(fetch = FetchType.LAZY)
@JoinColumn(name = "role_id", nullable = false)
private Role role;
@Id
@Column(name = "role_id", insertable = false, updatable = false)
private Long roleId;
@Column(name = "assigned_at", nullable = false)
private Instant assignedAt;
@Column(name = "assigned_by", length = 100)
private String assignedBy;
}When to Use @EmbeddedId vs @IdClass
| Criteria | @EmbeddedId | @IdClass |
|---|---|---|
| Code clarity | Better - key is a proper value object | Fields duplicated in two places |
| JPQL queries | WHERE e.id.orderId = :id | WHERE e.orderId = :id |
| @MapsId support | Full support | Clunky |
| Overall preference | Recommended | Legacy / existing schemas |
8. Natural Keys vs Surrogate Keys
Surrogate Key (Recommended)
A surrogate key has no business meaning. It exists solely as a database identifier.
@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
private Long id; // Means nothing in the real worldWhy surrogate keys win:
- Business keys change (email can be updated, product codes can be reassigned)
- Joins are faster on a single integer than on multi-column natural keys
- Consistent behavior across all entities
Natural Key
A natural key comes from the data itself.
// Example: Using ISO country code as PK
@Entity
@Table(name = "countries")
public class Country {
@Id
@Column(name = "code", length = 3)
private String code; // "USA", "GBR", "IND" - natural, stable, meaningful
@Column(name = "name", nullable = false, length = 100)
private String name;
}Natural keys work well when:
- The key value is truly immutable (ISO codes, currency codes)
- The key is naturally short (3-letter codes)
- It prevents orphan data (natural FK enforces existence)
Hybrid approach (best practice): Use a surrogate key as the PK for JPA and indexing, but define a unique constraint on the natural key for business integrity.
@Entity
@Table(
name = "products",
uniqueConstraints = {
@UniqueConstraint(name = "uk_products_sku", columnNames = "sku")
}
)
public class Product {
@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
private Long id; // Surrogate key - PK for JPA
@Column(name = "sku", nullable = false, unique = true, length = 50)
private String sku; // Natural key - business identifier, unique constraint
}9. Enum Mapping
Basic Enum Mapping
// The Enum
public enum OrderStatus {
PENDING,
CONFIRMED,
PROCESSING,
SHIPPED,
DELIVERED,
CANCELLED,
REFUNDED
}
// In Entity
@Enumerated(EnumType.STRING) // Store "PENDING", "CONFIRMED", etc.
@Column(name = "status", nullable = false, length = 30)
private OrderStatus status;NEVER Use EnumType.ORDINAL
// BAD - ordinal stores the position (0, 1, 2, 3...)
@Enumerated(EnumType.ORDINAL) // DANGER
private OrderStatus status;
// If you later insert AWAITING_PAYMENT between PENDING and CONFIRMED:
// public enum OrderStatus { PENDING, AWAITING_PAYMENT, CONFIRMED, ... }
// All existing "CONFIRMED" records (stored as 1) now map to AWAITING_PAYMENT!
// This silently corrupts your data with no database error.Enum with Display Name and Descriptions
public enum ProductStatus {
ACTIVE("Active", "Product is available for purchase"),
INACTIVE("Inactive", "Product is hidden from catalog"),
OUT_OF_STOCK("Out of Stock", "Product is temporarily unavailable"),
DISCONTINUED("Discontinued", "Product is permanently removed");
private final String displayName;
private final String description;
ProductStatus(String displayName, String description) {
this.displayName = displayName;
this.description = description;
}
public String getDisplayName() { return displayName; }
public String getDescription() { return description; }
}10. Database Design and Normalization
Understanding normalization is essential for designing a schema that Spring Data JPA can work with efficiently.
First Normal Form (1NF)
Every column contains atomic (indivisible) values. No repeating groups.
-- VIOLATES 1NF: phone_numbers is a multi-value field
CREATE TABLE contacts (
id BIGINT PRIMARY KEY,
name VARCHAR(200),
phone_numbers VARCHAR(500) -- "555-1234,555-5678,555-9012" - WRONG
);
-- SATISFIES 1NF: One row per phone number
CREATE TABLE contacts (
id BIGINT PRIMARY KEY,
name VARCHAR(200)
);
CREATE TABLE contact_phones (
id BIGINT PRIMARY KEY AUTO_INCREMENT,
contact_id BIGINT NOT NULL,
phone_number VARCHAR(20) NOT NULL,
phone_type ENUM('MOBILE', 'HOME', 'WORK'),
FOREIGN KEY (contact_id) REFERENCES contacts(id)
);Second Normal Form (2NF)
Meets 1NF. Every non-key column depends on the WHOLE primary key (eliminates partial dependencies in composite keys).
-- VIOLATES 2NF: product_name depends only on product_id, not on (order_id, product_id)
CREATE TABLE order_items (
order_id BIGINT,
product_id BIGINT,
product_name VARCHAR(200), -- Partial dependency - WRONG
quantity INT,
unit_price DECIMAL(10,2),
PRIMARY KEY (order_id, product_id)
);
-- SATISFIES 2NF: product_name belongs in products table
CREATE TABLE order_items (
order_id BIGINT,
product_id BIGINT,
quantity INT,
unit_price DECIMAL(10,2), -- snapshot price at time of order - OK
PRIMARY KEY (order_id, product_id)
);Third Normal Form (3NF)
Meets 2NF. Every non-key column depends directly on the primary key (eliminates transitive dependencies).
-- VIOLATES 3NF: city and postal_code depend on zip_code, not on order_id
CREATE TABLE orders (
id BIGINT PRIMARY KEY,
customer_id BIGINT,
zip_code VARCHAR(10),
city VARCHAR(100), -- Depends on zip_code, not id - WRONG
state VARCHAR(50) -- Depends on zip_code, not id - WRONG
);
-- SATISFIES 3NF: Address data normalized into its own table
CREATE TABLE orders (
id BIGINT PRIMARY KEY,
customer_id BIGINT,
shipping_address_id BIGINT, -- FK to addresses table
FOREIGN KEY (shipping_address_id) REFERENCES addresses(id)
);When to Denormalize
In production, sometimes controlled denormalization is correct:
Order item price snapshot: When a user places an order, you record the price at that moment. Even if the product price changes later, the order history is preserved. This is intentional denormalization.
@Column(name = "unit_price", nullable = false, precision = 10, scale = 2)
private BigDecimal unitPrice; // Snapshot of product.price at order time - intentional11. Index Strategies
Indexes are the most impactful performance tool in MySQL. A missing index on a WHERE clause column can turn a 10ms query into a 10 second query.
MySQL Index Types
| Index Type | Use Case | MySQL Syntax |
|---|---|---|
| B-Tree (default) | Equality, range, ORDER BY | INDEX |
| Unique | Enforce uniqueness + fast lookup | UNIQUE INDEX |
| Full-Text | Text search (LIKE '%word%') | FULLTEXT INDEX |
| Clustered | Primary Key index (automatic) | PRIMARY KEY |
| Covering | All query columns in the index | INDEX (col1, col2, col3) |
Index Strategy in JPA Entity
@Entity
@Table(
name = "orders",
indexes = {
// Single column indexes
@Index(name = "idx_orders_customer_id", columnList = "customer_id"),
@Index(name = "idx_orders_status", columnList = "status"),
// Composite index for common query: WHERE status = ? ORDER BY placed_at DESC
@Index(name = "idx_orders_status_placed_at", columnList = "status, placed_at"),
// Composite index for: WHERE customer_id = ? AND status = ?
@Index(name = "idx_orders_customer_status", columnList = "customer_id, status"),
// Unique index for order number
@Index(name = "uk_orders_order_number", columnList = "order_number", unique = true)
}
)
public class Order { ... }Index Design Rules
Rule 1: Index columns used in WHERE clauses
-- This query needs an index on status
SELECT * FROM orders WHERE status = 'PENDING';Rule 2: The leftmost prefix rule for composite indexes
A composite index on (customer_id, status, placed_at) supports:
- WHERE customer_id = ?
- WHERE customer_id = ? AND status = ?
- WHERE customer_id = ? AND status = ? AND placed_at > ?
But NOT:
- WHERE status = ? (cannot use index without leading customer_id)
Rule 3: Covering index for read-heavy queries
-- If this query runs millions of times per day:
SELECT order_number, status, total_amount FROM orders WHERE customer_id = ?
-- Create a covering index that contains all needed columns:
CREATE INDEX idx_orders_customer_covering
ON orders (customer_id, order_number, status, total_amount);
-- MySQL can satisfy this query entirely from the index without touching the data pagesRule 4: Foreign key columns always need indexes
MySQL does not automatically create indexes on foreign key columns. Always add them.
-- FK column without index causes full table scan when querying related records
ALTER TABLE order_items ADD INDEX idx_order_items_order_id (order_id);
ALTER TABLE order_items ADD INDEX idx_order_items_product_id (product_id);12. Flyway Database Migrations
Flyway is the industry standard for managing MySQL schema changes in Spring Boot applications. It tracks which scripts have been applied and ensures all environments have identical schema.
Flyway Concepts
- Versioned migrations (V prefix): Applied once in version order. Never modified after applied.
- Repeatable migrations (R prefix): Re-applied when their checksum changes. Used for views, stored procedures.
- Undo migrations (U prefix): Reverse a versioned migration. Requires Flyway Pro.
Naming Convention
V{version}__{description}.sql
V1__create_users_table.sql # First migration
V2__create_products_table.sql # Second migration
V3__create_orders_table.sql # Third migration
V4__add_index_to_products.sql # Additive change
V5__add_user_profile_table.sql # New table
V6__add_product_sku_column.sql # Add column
V1 - Create Users Table
-- V1__create_users_table.sql
-- Character set utf8mb4 supports all Unicode including emoji
-- Collation utf8mb4_unicode_ci provides correct case-insensitive sorting
CREATE TABLE users (
id BIGINT NOT NULL AUTO_INCREMENT,
email VARCHAR(255) NOT NULL,
username VARCHAR(100) NOT NULL,
password_hash VARCHAR(255) NOT NULL,
status ENUM('ACTIVE', 'INACTIVE', 'SUSPENDED') NOT NULL DEFAULT 'ACTIVE',
created_at DATETIME(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6),
updated_at DATETIME(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6)
ON UPDATE CURRENT_TIMESTAMP(6),
created_by VARCHAR(100),
updated_by VARCHAR(100),
version BIGINT NOT NULL DEFAULT 0,
PRIMARY KEY (id),
UNIQUE KEY uk_users_email (email),
UNIQUE KEY uk_users_username (username),
INDEX idx_users_status (status)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;V2 - Create User Profiles Table
-- V2__create_user_profiles_table.sql
CREATE TABLE user_profiles (
id BIGINT NOT NULL AUTO_INCREMENT,
user_id BIGINT NOT NULL,
first_name VARCHAR(100),
last_name VARCHAR(100),
bio TEXT,
avatar_url VARCHAR(500),
date_of_birth DATE,
phone_number VARCHAR(20),
created_at DATETIME(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6),
updated_at DATETIME(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6)
ON UPDATE CURRENT_TIMESTAMP(6),
PRIMARY KEY (id),
UNIQUE KEY uk_user_profiles_user_id (user_id),
CONSTRAINT fk_user_profiles_user_id
FOREIGN KEY (user_id) REFERENCES users (id) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;V3 - Create Products and Categories
-- V3__create_products_categories.sql
CREATE TABLE categories (
id BIGINT NOT NULL AUTO_INCREMENT,
name VARCHAR(100) NOT NULL,
slug VARCHAR(100) NOT NULL,
description TEXT,
parent_id BIGINT,
created_at DATETIME(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6),
updated_at DATETIME(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6)
ON UPDATE CURRENT_TIMESTAMP(6),
PRIMARY KEY (id),
UNIQUE KEY uk_categories_slug (slug),
CONSTRAINT fk_categories_parent_id
FOREIGN KEY (parent_id) REFERENCES categories (id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
CREATE TABLE products (
id BIGINT NOT NULL AUTO_INCREMENT,
sku VARCHAR(50) NOT NULL,
name VARCHAR(200) NOT NULL,
description TEXT,
price DECIMAL(10,2) NOT NULL,
stock_quantity INT NOT NULL DEFAULT 0,
status ENUM('ACTIVE','INACTIVE','OUT_OF_STOCK','DISCONTINUED')
NOT NULL DEFAULT 'ACTIVE',
created_at DATETIME(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6),
updated_at DATETIME(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6)
ON UPDATE CURRENT_TIMESTAMP(6),
created_by VARCHAR(100),
updated_by VARCHAR(100),
version BIGINT NOT NULL DEFAULT 0,
PRIMARY KEY (id),
UNIQUE KEY uk_products_sku (sku),
INDEX idx_products_status (status),
INDEX idx_products_price (price),
INDEX idx_products_status_price (status, price)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
CREATE TABLE product_categories (
product_id BIGINT NOT NULL,
category_id BIGINT NOT NULL,
PRIMARY KEY (product_id, category_id),
INDEX idx_product_categories_category_id (category_id),
CONSTRAINT fk_product_categories_product_id
FOREIGN KEY (product_id) REFERENCES products (id) ON DELETE CASCADE,
CONSTRAINT fk_product_categories_category_id
FOREIGN KEY (category_id) REFERENCES categories (id) ON DELETE CASCADE
) ENGINE=InnoDB;V4 - Create Orders
-- V4__create_orders_table.sql
CREATE TABLE orders (
id BIGINT NOT NULL AUTO_INCREMENT,
order_number VARCHAR(30) NOT NULL,
customer_id BIGINT NOT NULL,
status ENUM('PENDING','CONFIRMED','PROCESSING','SHIPPED',
'DELIVERED','CANCELLED','REFUNDED')
NOT NULL DEFAULT 'PENDING',
subtotal DECIMAL(12,2) NOT NULL,
tax_amount DECIMAL(10,2) NOT NULL DEFAULT 0.00,
total_amount DECIMAL(12,2) NOT NULL,
placed_at DATETIME(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6),
shipped_at DATETIME(6),
delivered_at DATETIME(6),
shipping_address_id BIGINT,
created_at DATETIME(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6),
updated_at DATETIME(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6)
ON UPDATE CURRENT_TIMESTAMP(6),
version BIGINT NOT NULL DEFAULT 0,
PRIMARY KEY (id),
UNIQUE KEY uk_orders_order_number (order_number),
INDEX idx_orders_customer_id (customer_id),
INDEX idx_orders_status (status),
INDEX idx_orders_customer_status (customer_id, status),
INDEX idx_orders_placed_at (placed_at),
CONSTRAINT fk_orders_customer_id
FOREIGN KEY (customer_id) REFERENCES users (id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
CREATE TABLE order_items (
order_id BIGINT NOT NULL,
product_id BIGINT NOT NULL,
quantity INT NOT NULL,
unit_price DECIMAL(10,2) NOT NULL,
discount DECIMAL(5,2) NOT NULL DEFAULT 0.00,
PRIMARY KEY (order_id, product_id),
INDEX idx_order_items_product_id (product_id),
CONSTRAINT fk_order_items_order_id
FOREIGN KEY (order_id) REFERENCES orders (id) ON DELETE CASCADE,
CONSTRAINT fk_order_items_product_id
FOREIGN KEY (product_id) REFERENCES products (id)
) ENGINE=InnoDB;Flyway Configuration in application.yml
spring:
flyway:
enabled: true
locations: classpath:db/migration
baseline-on-migrate: false # Only set true for FIRST migration on an existing DB
validate-on-migrate: true # Validates script checksums before applying
out-of-order: false # Never allow out-of-order in production
table: flyway_schema_history # Default table Flyway uses to track state
connect-retries: 3 # Retry DB connection 3 times before failing
placeholders:
schema: ecommerce # Can reference ${schema} in SQL scriptsFlyway Best Practices
Never modify an applied migration script. Flyway checksums each script. Modifying it will cause FlywayException: Validate failed: Migration checksum mismatch and prevent your application from starting.
Always test migrations on a copy of production data before applying. Schema changes like adding a NOT NULL column without a default will fail if the table has existing rows.
Use descriptive migration names. V12__add_index_to_orders_placed_at.sql is infinitely more useful than V12__fix.sql.
13. The Base Auditable Entity
All business entities should track who created them, when, who last modified them, and when.
package com.company.ecommerce.common.audit;
import jakarta.persistence.*;
import lombok.Getter;
import lombok.Setter;
import org.springframework.data.annotation.CreatedBy;
import org.springframework.data.annotation.CreatedDate;
import org.springframework.data.annotation.LastModifiedBy;
import org.springframework.data.annotation.LastModifiedDate;
import org.springframework.data.jpa.domain.support.AuditingEntityListener;
import java.time.Instant;
/*
* @MappedSuperclass: This class is NOT a table itself.
* Its fields are mapped into the tables of all subclasses.
* Every entity that extends AuditableEntity gets these 4 columns automatically.
*
* @EntityListeners(AuditingEntityListener.class): Activates Spring Data JPA's
* automatic population of @CreatedDate, @LastModifiedDate, @CreatedBy, @LastModifiedBy.
* Requires @EnableJpaAuditing on the main class or a @Configuration class.
*/
@MappedSuperclass
@EntityListeners(AuditingEntityListener.class)
@Getter
@Setter
public abstract class AuditableEntity {
/*
* @CreatedDate: Automatically populated with current timestamp on INSERT.
* updatable = false: Never changed after initial insert.
*/
@CreatedDate
@Column(name = "created_at", nullable = false, updatable = false)
private Instant createdAt;
/*
* @LastModifiedDate: Automatically updated on every UPDATE.
*/
@LastModifiedDate
@Column(name = "updated_at", nullable = false)
private Instant updatedAt;
/*
* @CreatedBy: Populated with the current user's identifier.
* The value comes from AuditorAware<String> bean.
*/
@CreatedBy
@Column(name = "created_by", length = 100, updatable = false)
private String createdBy;
/*
* @LastModifiedBy: Updated with the current user on every UPDATE.
*/
@LastModifiedBy
@Column(name = "updated_by", length = 100)
private String updatedBy;
}package com.company.ecommerce.common.audit;
import org.springframework.data.domain.AuditorAware;
import org.springframework.security.core.Authentication;
import org.springframework.security.core.context.SecurityContextHolder;
import org.springframework.stereotype.Component;
import java.util.Optional;
/*
* Spring Data JPA calls getCurrentAuditor() before every INSERT and UPDATE.
* This implementation returns the username of the currently authenticated user.
* For system operations (scheduled jobs, migrations), it returns "SYSTEM".
*/
@Component("auditorAware")
public class AuditorAwareImpl implements AuditorAware<String> {
@Override
public Optional<String> getCurrentAuditor() {
Authentication auth = SecurityContextHolder.getContext().getAuthentication();
if (auth == null || !auth.isAuthenticated()
|| "anonymousUser".equals(auth.getPrincipal())) {
return Optional.of("SYSTEM");
}
return Optional.of(auth.getName());
}
}14. Tips and Best Practices
Tip 1: Always Use BIGINT for IDs, Not INT
MySQL INT supports up to 2,147,483,647 rows. A busy production system can exhaust INT in years. Use BIGINT (up to 9.2 quintillion). The storage difference is only 4 bytes per row.
Tip 2: Always Include Version Column for Optimistic Locking
Add version BIGINT NOT NULL DEFAULT 0 to every table that will have concurrent writes. This enables Hibernate's optimistic locking out of the box without significant overhead.
Tip 3: Use DATETIME(6) Not TIMESTAMP for MySQL
DATETIME:
- Range: 1000-01-01 to 9999-12-31
- Not affected by timezone changes
- Stores what you give it
TIMESTAMP:
- Range: 1970-2038 (the year 2038 problem)
- Converts to UTC on insert, back to session timezone on read
- Affected by MySQL timezone configuration
Use DATETIME(6) for application-managed timestamps (Java manages the value, stored as-is).
Tip 4: Use ENGINE=InnoDB and utf8mb4
InnoDB supports transactions, foreign keys, and row-level locking. MyISAM does not support transactions. Always use InnoDB.
utf8mb4 is the correct character set for modern applications. The older utf8 in MySQL is actually UTF-8 with 3-byte limit, which cannot store emoji or some rare Unicode characters.
Tip 5: Store Prices in Minor Units for High-Precision Finance
// Option A: BigDecimal (standard for most applications)
@Column(name = "price", precision = 10, scale = 2)
private BigDecimal price; // $49.99 stored as 49.99
// Option B: Minor units (cents) for financial systems
@Column(name = "price_cents")
private Long priceCents; // $49.99 stored as 4999
// Eliminates all floating point issues, easy arithmetic
// Used by Stripe and many financial APIs15. Anti-Patterns and Pitfalls
Pitfall 1: Using @Data on JPA Entities
// WRONG - @Data generates equals/hashCode including all fields and relationships
@Entity
@Data // NEVER use this on JPA entities
public class Order {
private Long id;
@OneToMany
private List<OrderItem> items; // Included in equals/hashCode -> StackOverflow!
}Use @Getter, @Setter, @NoArgsConstructor separately. Write equals/hashCode manually or use @EqualsAndHashCode(of = "id").
Pitfall 2: Not Having equals/hashCode in Composite Key Classes
// WRONG - missing equals/hashCode on @Embeddable
@Embeddable
public class OrderItemId implements Serializable {
private Long orderId;
private Long productId;
// No equals/hashCode -> Hibernate cannot correctly identify entities in cache
// findById() may return wrong results
}
// CORRECT
@Embeddable
@EqualsAndHashCode // Lombok generates correct equals/hashCode for all fields
public class OrderItemId implements Serializable {
private Long orderId;
private Long productId;
}Pitfall 3: Missing Flyway Migration After Changing Entity
// You added a new field to Product entity
@Column(name = "weight_grams")
private Integer weightGrams;
// But forgot to create a Flyway migration to add the column
// Result with ddl-auto=validate: Application fails to start with
// "Schema-validation: missing column [weight_grams] in table [products]"
// This is actually a feature - it forces you to write the migration.Pitfall 4: Using String for Money
// WRONG
private String price = "49.99"; // Parsing errors, no arithmetic, useless for sorting
// ALSO WRONG
private double price = 49.99; // Floating point imprecision: 49.99 + 0.01 != 50.00
// CORRECT
private BigDecimal price = new BigDecimal("49.99"); // Always use String constructor for BigDecimalPitfall 5: Not Indexing Foreign Key Columns
MySQL does not automatically create an index on foreign key columns. Without an index, any query joining from child to parent (or querying children by parent ID) does a full table scan.
-- Common mistake: FK without index
ALTER TABLE order_items
ADD CONSTRAINT fk_order_items_order FOREIGN KEY (order_id) REFERENCES orders(id);
-- No INDEX on order_id!
-- Every call to order.getItems() or WHERE order_id = ? does a full table scan
-- Correct:
ALTER TABLE order_items
ADD INDEX idx_order_items_order_id (order_id);Continue to Part 3: Relationships - One-To-One and One-To-Many