Skip to main content
Database Design·Lesson 3 of 5

Schema Design

Good schema design is the foundation of a reliable application. A well-designed schema makes queries simple, prevents data inconsistencies, and scales as your application grows.

The Design Process

Follow these steps when designing a schema:

  1. Identify the entities (things you need to track)
  2. Define the attributes (properties) of each entity
  3. Determine the relationships between entities
  4. Choose primary keys and add foreign keys
  5. Apply normalization rules
  6. Add indexes for common query patterns

Relationship Types in Practice

One-to-One

Each row in table A relates to exactly one row in table B. Use when you want to split a table for performance or security reasons.

CREATE TABLE users (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  email TEXT UNIQUE NOT NULL,
  password_hash TEXT NOT NULL
);

CREATE TABLE profiles (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  user_id INTEGER UNIQUE NOT NULL,
  display_name TEXT,
  bio TEXT,
  avatar_url TEXT,
  FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE
);

The UNIQUE constraint on user_id ensures the one-to-one relationship.

One-to-Many

The most common relationship. One row in table A relates to many rows in table B.

CREATE TABLE authors (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  name TEXT NOT NULL
);

CREATE TABLE books (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  title TEXT NOT NULL,
  author_id INTEGER NOT NULL,
  published_year INTEGER,
  FOREIGN KEY (author_id) REFERENCES authors(id)
);

One author writes many books. Each book has exactly one author.

Many-to-Many

Requires a junction table to connect the two entities:

CREATE TABLE students (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  name TEXT NOT NULL
);

CREATE TABLE courses (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  title TEXT NOT NULL
);

CREATE TABLE enrollments (
  student_id INTEGER NOT NULL,
  course_id INTEGER NOT NULL,
  enrolled_at DATETIME DEFAULT CURRENT_TIMESTAMP,
  grade TEXT,
  PRIMARY KEY (student_id, course_id),
  FOREIGN KEY (student_id) REFERENCES students(id) ON DELETE CASCADE,
  FOREIGN KEY (course_id) REFERENCES courses(id) ON DELETE CASCADE
);

The junction table can hold additional data about the relationship (like enrolled_at and grade).

JOIN Queries

JOINs combine data from related tables:

INNER JOIN

Returns rows that have matches in both tables:

SELECT
  b.title,
  a.name AS author_name,
  b.published_year
FROM books b
INNER JOIN authors a ON b.author_id = a.id
ORDER BY b.published_year DESC;

LEFT JOIN

Returns all rows from the left table, plus matched rows from the right:

-- All authors, even those with no books
SELECT
  a.name,
  COUNT(b.id) AS book_count
FROM authors a
LEFT JOIN books b ON a.id = b.author_id
GROUP BY a.id, a.name;

Multiple JOINs

-- Students with their courses
SELECT
  s.name AS student,
  c.title AS course,
  e.grade
FROM enrollments e
JOIN students s ON e.student_id = s.id
JOIN courses c ON e.course_id = c.id
ORDER BY s.name, c.title;

JOIN Comparison

JOIN TypeReturns
INNER JOINOnly rows with matches in both tables
LEFT JOINAll rows from left table + matched right rows
RIGHT JOINAll rows from right table + matched left rows
FULL JOINAll rows from both tables (matched and unmatched)

Cascade Actions

Define what happens when a referenced row is deleted or updated:

CREATE TABLE comments (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  post_id INTEGER NOT NULL,
  body TEXT NOT NULL,
  FOREIGN KEY (post_id) REFERENCES posts(id)
    ON DELETE CASCADE
    ON UPDATE CASCADE
);
ActionBehavior
CASCADEDelete/update child rows when parent changes
SET NULLSet foreign key to NULL when parent is deleted
RESTRICTPrevent deletion of parent if children exist
SET DEFAULTSet foreign key to its default value

Normalization Levels

First Normal Form (1NF)

Every column holds a single atomic value. No repeating groups.

Violation:

| id | name  | phone_numbers          |
|----|-------|------------------------|
| 1  | Alice | 555-0100, 555-0101     |

Fixed:

| id | name  | phone_number |
|----|-------|--------------|
| 1  | Alice | 555-0100     |
| 1  | Alice | 555-0101     |

Or better, use a separate phone_numbers table.

Second Normal Form (2NF)

Must be in 1NF, and every non-key column depends on the entire primary key.

Third Normal Form (3NF)

Must be in 2NF, and no non-key column depends on another non-key column.

Violation:

| order_id | customer_id | customer_name | customer_email |

customer_name and customer_email depend on customer_id, not on order_id. Move them to a customers table.

When to Denormalize

Sometimes breaking normalization rules improves performance:

  • Read-heavy applications — duplicate data to avoid expensive JOINs
  • Reporting dashboards — pre-compute aggregates into summary tables
  • Caching layers — store computed values to reduce query load

Always start normalized and denormalize only when you have a measured performance problem.

Practical Exercise

Design an e-commerce database:

CREATE TABLE customers (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  name TEXT NOT NULL,
  email TEXT UNIQUE NOT NULL,
  created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE categories (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  name TEXT UNIQUE NOT NULL,
  parent_id INTEGER,
  FOREIGN KEY (parent_id) REFERENCES categories(id)
);

CREATE TABLE products (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  name TEXT NOT NULL,
  description TEXT,
  price REAL NOT NULL CHECK (price > 0),
  stock INTEGER NOT NULL DEFAULT 0,
  category_id INTEGER,
  FOREIGN KEY (category_id) REFERENCES categories(id)
);

CREATE TABLE orders (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  customer_id INTEGER NOT NULL,
  status TEXT NOT NULL DEFAULT 'pending'
    CHECK (status IN ('pending', 'processing', 'shipped', 'delivered', 'cancelled')),
  total REAL NOT NULL,
  ordered_at DATETIME DEFAULT CURRENT_TIMESTAMP,
  FOREIGN KEY (customer_id) REFERENCES customers(id)
);

CREATE TABLE order_items (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  order_id INTEGER NOT NULL,
  product_id INTEGER NOT NULL,
  quantity INTEGER NOT NULL CHECK (quantity > 0),
  unit_price REAL NOT NULL,
  FOREIGN KEY (order_id) REFERENCES orders(id) ON DELETE CASCADE,
  FOREIGN KEY (product_id) REFERENCES products(id)
);

-- Useful indexes
CREATE INDEX idx_orders_customer ON orders(customer_id);
CREATE INDEX idx_order_items_order ON order_items(order_id);
CREATE INDEX idx_products_category ON products(category_id);

Query the schema:

-- Customer order history with totals
SELECT
  c.name AS customer,
  COUNT(DISTINCT o.id) AS order_count,
  SUM(o.total) AS lifetime_value
FROM customers c
JOIN orders o ON c.id = o.customer_id
WHERE o.status != 'cancelled'
GROUP BY c.id
ORDER BY lifetime_value DESC;

Key Takeaways

  • Identify entities, attributes, and relationships before writing any SQL.
  • Use junction tables for many-to-many relationships — they can hold additional metadata.
  • JOINs combine related tables; INNER JOIN returns only matches, LEFT JOIN returns all from the left side.
  • Cascade actions define behavior when parent rows are deleted or updated.
  • Start with a normalized schema and denormalize only when performance requires it.