Good schema design is the foundation of a reliable application. A well-designed schema makes queries simple, prevents data inconsistencies, and scales as your application grows.
The Design Process
Follow these steps when designing a schema:
- Identify the entities (things you need to track)
- Define the attributes (properties) of each entity
- Determine the relationships between entities
- Choose primary keys and add foreign keys
- Apply normalization rules
- Add indexes for common query patterns
Relationship Types in Practice
One-to-One
Each row in table A relates to exactly one row in table B. Use when you want to split a table for performance or security reasons.
CREATE TABLE users (
id INTEGER PRIMARY KEY AUTOINCREMENT,
email TEXT UNIQUE NOT NULL,
password_hash TEXT NOT NULL
);
CREATE TABLE profiles (
id INTEGER PRIMARY KEY AUTOINCREMENT,
user_id INTEGER UNIQUE NOT NULL,
display_name TEXT,
bio TEXT,
avatar_url TEXT,
FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE
);The UNIQUE constraint on user_id ensures the one-to-one relationship.
One-to-Many
The most common relationship. One row in table A relates to many rows in table B.
CREATE TABLE authors (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT NOT NULL
);
CREATE TABLE books (
id INTEGER PRIMARY KEY AUTOINCREMENT,
title TEXT NOT NULL,
author_id INTEGER NOT NULL,
published_year INTEGER,
FOREIGN KEY (author_id) REFERENCES authors(id)
);One author writes many books. Each book has exactly one author.
Many-to-Many
Requires a junction table to connect the two entities:
CREATE TABLE students (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT NOT NULL
);
CREATE TABLE courses (
id INTEGER PRIMARY KEY AUTOINCREMENT,
title TEXT NOT NULL
);
CREATE TABLE enrollments (
student_id INTEGER NOT NULL,
course_id INTEGER NOT NULL,
enrolled_at DATETIME DEFAULT CURRENT_TIMESTAMP,
grade TEXT,
PRIMARY KEY (student_id, course_id),
FOREIGN KEY (student_id) REFERENCES students(id) ON DELETE CASCADE,
FOREIGN KEY (course_id) REFERENCES courses(id) ON DELETE CASCADE
);The junction table can hold additional data about the relationship (like enrolled_at and grade).
JOIN Queries
JOINs combine data from related tables:
INNER JOIN
Returns rows that have matches in both tables:
SELECT
b.title,
a.name AS author_name,
b.published_year
FROM books b
INNER JOIN authors a ON b.author_id = a.id
ORDER BY b.published_year DESC;LEFT JOIN
Returns all rows from the left table, plus matched rows from the right:
-- All authors, even those with no books
SELECT
a.name,
COUNT(b.id) AS book_count
FROM authors a
LEFT JOIN books b ON a.id = b.author_id
GROUP BY a.id, a.name;Multiple JOINs
-- Students with their courses
SELECT
s.name AS student,
c.title AS course,
e.grade
FROM enrollments e
JOIN students s ON e.student_id = s.id
JOIN courses c ON e.course_id = c.id
ORDER BY s.name, c.title;JOIN Comparison
| JOIN Type | Returns |
|---|---|
| INNER JOIN | Only rows with matches in both tables |
| LEFT JOIN | All rows from left table + matched right rows |
| RIGHT JOIN | All rows from right table + matched left rows |
| FULL JOIN | All rows from both tables (matched and unmatched) |
Cascade Actions
Define what happens when a referenced row is deleted or updated:
CREATE TABLE comments (
id INTEGER PRIMARY KEY AUTOINCREMENT,
post_id INTEGER NOT NULL,
body TEXT NOT NULL,
FOREIGN KEY (post_id) REFERENCES posts(id)
ON DELETE CASCADE
ON UPDATE CASCADE
);| Action | Behavior |
|---|---|
CASCADE | Delete/update child rows when parent changes |
SET NULL | Set foreign key to NULL when parent is deleted |
RESTRICT | Prevent deletion of parent if children exist |
SET DEFAULT | Set foreign key to its default value |
Normalization Levels
First Normal Form (1NF)
Every column holds a single atomic value. No repeating groups.
Violation:
| id | name | phone_numbers |
|----|-------|------------------------|
| 1 | Alice | 555-0100, 555-0101 |Fixed:
| id | name | phone_number |
|----|-------|--------------|
| 1 | Alice | 555-0100 |
| 1 | Alice | 555-0101 |Or better, use a separate phone_numbers table.
Second Normal Form (2NF)
Must be in 1NF, and every non-key column depends on the entire primary key.
Third Normal Form (3NF)
Must be in 2NF, and no non-key column depends on another non-key column.
Violation:
| order_id | customer_id | customer_name | customer_email |customer_name and customer_email depend on customer_id, not on order_id. Move them to a customers table.
When to Denormalize
Sometimes breaking normalization rules improves performance:
- Read-heavy applications — duplicate data to avoid expensive JOINs
- Reporting dashboards — pre-compute aggregates into summary tables
- Caching layers — store computed values to reduce query load
Always start normalized and denormalize only when you have a measured performance problem.
Practical Exercise
Design an e-commerce database:
CREATE TABLE customers (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT NOT NULL,
email TEXT UNIQUE NOT NULL,
created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE categories (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT UNIQUE NOT NULL,
parent_id INTEGER,
FOREIGN KEY (parent_id) REFERENCES categories(id)
);
CREATE TABLE products (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT NOT NULL,
description TEXT,
price REAL NOT NULL CHECK (price > 0),
stock INTEGER NOT NULL DEFAULT 0,
category_id INTEGER,
FOREIGN KEY (category_id) REFERENCES categories(id)
);
CREATE TABLE orders (
id INTEGER PRIMARY KEY AUTOINCREMENT,
customer_id INTEGER NOT NULL,
status TEXT NOT NULL DEFAULT 'pending'
CHECK (status IN ('pending', 'processing', 'shipped', 'delivered', 'cancelled')),
total REAL NOT NULL,
ordered_at DATETIME DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (customer_id) REFERENCES customers(id)
);
CREATE TABLE order_items (
id INTEGER PRIMARY KEY AUTOINCREMENT,
order_id INTEGER NOT NULL,
product_id INTEGER NOT NULL,
quantity INTEGER NOT NULL CHECK (quantity > 0),
unit_price REAL NOT NULL,
FOREIGN KEY (order_id) REFERENCES orders(id) ON DELETE CASCADE,
FOREIGN KEY (product_id) REFERENCES products(id)
);
-- Useful indexes
CREATE INDEX idx_orders_customer ON orders(customer_id);
CREATE INDEX idx_order_items_order ON order_items(order_id);
CREATE INDEX idx_products_category ON products(category_id);Query the schema:
-- Customer order history with totals
SELECT
c.name AS customer,
COUNT(DISTINCT o.id) AS order_count,
SUM(o.total) AS lifetime_value
FROM customers c
JOIN orders o ON c.id = o.customer_id
WHERE o.status != 'cancelled'
GROUP BY c.id
ORDER BY lifetime_value DESC;Key Takeaways
- Identify entities, attributes, and relationships before writing any SQL.
- Use junction tables for many-to-many relationships — they can hold additional metadata.
- JOINs combine related tables; INNER JOIN returns only matches, LEFT JOIN returns all from the left side.
- Cascade actions define behavior when parent rows are deleted or updated.
- Start with a normalized schema and denormalize only when performance requires it.