Databases are the backbone of almost every application. They store, organize, and retrieve data efficiently. Before writing any queries, you need to understand the fundamental concepts that databases are built on.
What is a Database?
A database is an organized collection of data that can be easily accessed, managed, and updated. A Database Management System (DBMS) is the software that interacts with the database on your behalf.
Application → DBMS → Database (files on disk)The DBMS handles:
- Storage — how data is physically written to disk
- Retrieval — finding data quickly using indexes
- Concurrency — multiple users reading and writing at the same time
- Integrity — ensuring data follows rules (constraints)
- Security — controlling who can access what
SQL vs NoSQL
The two main categories of databases differ in how they structure data.
SQL (Relational) Databases
Data is organized into tables with rows and columns. Relationships between tables are defined through foreign keys.
┌──────────────────────────────┐
│ users │
├────────┬─────────┬───────────┤
│ id │ name │ email │
├────────┼─────────┼───────────┤
│ 1 │ Alice │ a@mail.co │
│ 2 │ Bob │ b@mail.co │
└────────┴─────────┴───────────┘Examples: PostgreSQL, MySQL, SQLite, SQL Server.
NoSQL (Non-Relational) Databases
Data is stored in flexible formats like documents (JSON), key-value pairs, graphs, or wide columns.
{
"_id": "user_1",
"name": "Alice",
"email": "a@mail.co",
"orders": [
{ "product": "Laptop", "price": 999 },
{ "product": "Mouse", "price": 29 }
]
}Examples: MongoDB, Redis, DynamoDB, Cassandra.
Comparison
| Feature | SQL | NoSQL |
|---|---|---|
| Structure | Fixed schema (tables) | Flexible schema (documents) |
| Relationships | Foreign keys, JOINs | Embedded or referenced |
| Query language | SQL | Varies by database |
| Scaling | Vertical (bigger server) | Horizontal (more servers) |
| ACID compliance | Yes (strong guarantees) | Varies (often eventual) |
| Best for | Complex queries, reporting | Rapid iteration, big data |
ACID Properties
ACID is a set of properties that guarantee reliable database transactions:
Atomicity
A transaction is all-or-nothing. Either every operation succeeds, or none of them do.
Transfer $100 from Account A to Account B:
1. Deduct $100 from A ✓
2. Add $100 to B ✗ (fails)
→ Both operations roll back. A keeps $100.Consistency
The database always moves from one valid state to another. Constraints (like "balance cannot be negative") are always enforced.
Isolation
Concurrent transactions do not interfere with each other. Each transaction sees a consistent snapshot of the data.
Durability
Once a transaction is committed, it survives system crashes. The data is written to permanent storage.
Data Modeling Basics
Data modeling is the process of deciding what data to store and how to structure it.
Entities and Attributes
An entity is a thing you want to track (user, order, product). Attributes are the properties of that entity.
Entity: User
Attributes: id, name, email, created_at
Entity: Order
Attributes: id, user_id, total, status, ordered_atRelationships
Entities relate to each other in three ways:
One-to-One — Each user has one profile.
users.id ← profiles.user_idOne-to-Many — One user has many orders.
users.id ← orders.user_idMany-to-Many — Students enroll in many courses; courses have many students.
students ← enrollments → coursesMany-to-many relationships require a junction table (also called a join table or bridge table).
Primary Keys and Foreign Keys
Primary Key
A unique identifier for each row in a table. Every table must have one.
CREATE TABLE users (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT NOT NULL,
email TEXT UNIQUE NOT NULL
);Foreign Key
A column that references the primary key of another table, creating a relationship.
CREATE TABLE orders (
id INTEGER PRIMARY KEY AUTOINCREMENT,
user_id INTEGER NOT NULL,
total REAL NOT NULL,
FOREIGN KEY (user_id) REFERENCES users(id)
);Indexes
An index is a data structure that speeds up data retrieval. Think of it like a book's index — instead of reading every page, you jump directly to the relevant one.
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_orders_user_id ON orders(user_id);When to Add Indexes
| Add an index when... | Avoid indexing when... |
|---|---|
| Column is used in WHERE clauses | Table has very few rows |
| Column is used in JOIN conditions | Column has low cardinality (few unique values) |
| Column is used in ORDER BY | Table has heavy write operations |
Indexes speed up reads but slow down writes because the index must be updated on every insert, update, or delete.
Normalization
Normalization reduces data redundancy by organizing data into related tables.
Before Normalization (Denormalized)
orders:
| order_id | customer_name | customer_email | product | price |
|----------|---------------|----------------|---------|-------|
| 1 | Alice | a@mail.co | Laptop | 999 |
| 2 | Alice | a@mail.co | Mouse | 29 |Alice's name and email are duplicated. If her email changes, you must update every row.
After Normalization
customers:
| id | name | email |
|----|-------|----------|
| 1 | Alice | a@mail.co|
orders:
| id | customer_id | product | price |
|----|-------------|---------|-------|
| 1 | 1 | Laptop | 999 |
| 2 | 1 | Mouse | 29 |Now Alice's data lives in one place. Update it once, and all orders reflect the change.
Practical Exercise
Design a database schema for a blog platform:
-- Users who write posts
CREATE TABLE authors (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT NOT NULL,
email TEXT UNIQUE NOT NULL,
bio TEXT,
created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);
-- Blog posts
CREATE TABLE posts (
id INTEGER PRIMARY KEY AUTOINCREMENT,
author_id INTEGER NOT NULL,
title TEXT NOT NULL,
slug TEXT UNIQUE NOT NULL,
content TEXT NOT NULL,
published_at DATETIME,
FOREIGN KEY (author_id) REFERENCES authors(id)
);
-- Tags for categorizing posts
CREATE TABLE tags (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT UNIQUE NOT NULL
);
-- Many-to-many: posts can have many tags, tags can have many posts
CREATE TABLE post_tags (
post_id INTEGER NOT NULL,
tag_id INTEGER NOT NULL,
PRIMARY KEY (post_id, tag_id),
FOREIGN KEY (post_id) REFERENCES posts(id),
FOREIGN KEY (tag_id) REFERENCES tags(id)
);Key Takeaways
- SQL databases use fixed schemas and are ideal for structured data with complex relationships.
- NoSQL databases offer flexible schemas and are great for rapid prototyping and horizontal scaling.
- ACID properties ensure database transactions are reliable and consistent.
- Primary keys uniquely identify rows; foreign keys create relationships between tables.
- Normalization eliminates data duplication but may require JOINs to reassemble related data.