Lesson 2 of 6

Relational Databases — Thinking in Spreadsheets

Estimated time: 2–2.5 hours

What You Will Learn

Understand what a relational database is and how it organizes data into tables, rows, and columns
Learn what primary keys and foreign keys are and why every table needs a unique identifier
Recognize the three types of relationships between tables: one-to-one, one-to-many, and many-to-many
Write essential SQL commands: SELECT, INSERT, UPDATE, DELETE, and JOIN
Design a real-world database schema for an e-commerce application from scratch
Understand ACID compliance and why it matters for data you cannot afford to lose
Know when a relational database is the right choice for your project — and when it might not be

1. Multiple Spreadsheets That Talk to Each Other

Imagine you are running a small online bookstore. You open up a spreadsheet application like Google Sheets or Excel to keep track of your business. At first, you create a single spreadsheet with everything in one place: the customer's name, their email, the book they ordered, the book's price, the order date, and the shipping address. It works fine for your first few orders.

But then something happens. A customer named Maria orders three different books. Now you have three rows in your spreadsheet, and each one repeats Maria's name, email, and shipping address. Another customer, James, updates his email address. You scroll through the spreadsheet looking for every row that mentions James so you can update them all. You miss one. Now you have two different email addresses for the same person, and you do not know which one is correct.

This is the problem that relational databases were invented to solve. Instead of putting everything into one giant spreadsheet, you split your data into multiple spreadsheets (called tables) that are connected to each other. You would have one table for customers, one for books, and one for orders. Maria's contact information lives in exactly one place — the customers table. When she orders three books, the orders table simply points back to her customer record. If James changes his email, you update it in one place, and every order he has ever made automatically reflects the change.

That is the core idea behind a relational database: organized data spread across connected tables, with each piece of information stored exactly once. If you have ever used a spreadsheet, you already understand the basics. A relational database is just a collection of spreadsheets that know how to talk to each other.

What Makes a Database “Relational”?

The word “relational” comes from the mathematical concept of relations — but you can think of it simply as relationships between tables. A relational database stores data in structured tables and uses special columns to create links (relationships) between those tables. This design eliminates duplicate data and keeps everything consistent.

In Lesson 14, you got your first taste of databases when you added SQLite to your Spring Boot application. You created a single table to store contact messages, and Spring handled the SQL behind the scenes. In Lesson 17, you learned about the different data types that databases use to store information. Now it is time to go deeper. In this lesson, you will learn how professional databases are designed — with multiple tables that work together — and you will write SQL by hand so you truly understand what is happening under the hood.

Let us start by looking at the building blocks of every relational database.

2. Tables, Rows, and Columns — The Building Blocks

Every relational database is built from three simple components: tables, rows, and columns. If you understand spreadsheets, you already understand these concepts. Let us make sure the vocabulary is crystal clear.

A table is like a single sheet in a spreadsheet workbook. Each table holds one type of thing. You might have a customers table, a products table, and an orders table. Each table has a name, and that name should describe what kind of data it holds. By convention, table names are lowercase and plural — customers instead of Customer, order_items instead of OrderItem.

A column (also called a field) defines one specific piece of information that the table stores. In a customers table, you might have columns for id, first_name, last_name, email, and created_at. Every column has a data type that defines what kind of data it can hold — remember the data types from Lesson 17? An INTEGER column holds whole numbers, a TEXT or VARCHAR column holds strings, a DATE column holds dates, and so on.

A row (also called a record) is one entry in the table. Each row represents one specific customer, one specific product, or one specific order. Here is an example of a customers table with three rows:

id	first_name	last_name	email	created_at
1	Maria	Garcia	maria@email.com	2025-01-15
2	James	Wilson	james@email.com	2025-02-03
3	Aisha	Patel	aisha@email.com	2025-02-20

Primary Keys — Every Row Needs an ID

Look at that first column: id. This is the primary key. A primary key is a column (or sometimes a combination of columns) that uniquely identifies every row in a table. No two rows can ever have the same primary key value. Think of it like a Social Security number or a student ID — it is a guaranteed way to refer to one specific record and only that record.

In most tables, the primary key is a column called id that contains an auto-incrementing integer. The database assigns these automatically: the first row gets 1, the second gets 2, the third gets 3, and so on. You never have to worry about choosing these numbers yourself.

Why Primary Keys Matter

You might wonder why we cannot just use a customer's name or email as the identifier. The problem is that names are not unique (there could be two “James Wilson” customers) and emails can change. A primary key must be unique and permanent. An auto-incrementing integer is perfect: it is always unique, it never changes, and it costs almost nothing to store. When other tables need to reference a specific customer, they store the customer's id rather than repeating the customer's name and email. This is the foundation of how relational databases connect tables together.

Here is what the SQL to create that customers table looks like:

CREATE TABLE customers (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    first_name  TEXT NOT NULL,
    last_name   TEXT NOT NULL,
    email       TEXT NOT NULL UNIQUE,
    created_at  DATE DEFAULT CURRENT_DATE
);

Let us break down what each part means:

CREATE TABLE customers — creates a new table called customers
id INTEGER PRIMARY KEY AUTOINCREMENT — the id column is a whole number that serves as the primary key and automatically increases by 1 for each new row
first_name TEXT NOT NULL — the first_name column holds text and cannot be left empty (NOT NULL means a value is required)
email TEXT NOT NULL UNIQUE — the email must be provided and no two customers can have the same email address
created_at DATE DEFAULT CURRENT_DATE — the date column will automatically use today's date if you do not provide one

Do not worry about memorizing this syntax right now. The important thing is understanding the concepts: tables hold data, columns define the structure, rows are the actual records, and every table has a primary key that uniquely identifies each row. With those building blocks in place, we can start connecting tables together.

3. Relationships — Connecting Tables Together

The real power of a relational database comes from relationships — the connections between tables. When you split your data across multiple tables (instead of cramming everything into one giant table), you need a way to link them back together. That is where foreign keys come in.

A foreign key is a column in one table that stores the primary key value from another table. It is the bridge between tables. It says, “this row is connected to that row over there.” Let us look at the three types of relationships you will encounter.

3a. One-to-Many (The Most Common Relationship)

A one-to-many relationship means that one record in Table A can be connected to many records in Table B, but each record in Table B is connected to only one record in Table A. This is by far the most common type of relationship you will see in real databases.

The classic example is customers and orders. One customer can place many orders, but each order belongs to exactly one customer. Here is how that looks:

customers table:

id	first_name	last_name	email
1	Maria	Garcia	maria@email.com
2	James	Wilson	james@email.com

orders table:

id	customer_id	order_date	total
1	1	2025-03-01	29.99
2	1	2025-03-15	54.50
3	2	2025-03-10	12.99

See that customer_id column in the orders table? That is the foreign key. It stores the id of the customer who placed the order. Orders 1 and 2 both have customer_id = 1, which means they both belong to Maria. Order 3 has customer_id = 2, so it belongs to James. Maria has two orders. James has one. That is one-to-many in action.

The foreign key always goes on the “many” side of the relationship. Since one customer can have many orders, the customer_id foreign key goes in the orders table — not the other way around.

Here is the SQL to create the orders table with its foreign key:

CREATE TABLE orders (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    customer_id INTEGER NOT NULL,
    order_date  DATE DEFAULT CURRENT_DATE,
    total       DECIMAL(10, 2) NOT NULL,
    FOREIGN KEY (customer_id) REFERENCES customers(id)
);

That last line is the key: FOREIGN KEY (customer_id) REFERENCES customers(id). This tells the database, “the customer_id column in this table must contain a value that exists in the id column of the customers table.” If you try to create an order with customer_id = 999 and there is no customer with id = 999, the database will reject it. This is called referential integrity — the database enforces the rule that every foreign key must point to a real record.

One-to-Many Examples in the Real World

One author writes many blog posts
One department has many employees
One category contains many products
One user sends many messages
One teacher teaches many classes

In every case, the foreign key goes on the “many” side: the blog post has an author_id, the employee has a department_id, the product has a category_id.

3b. One-to-One

A one-to-one relationship means that each record in Table A is connected to exactly one record in Table B, and vice versa. This is less common, but you will see it when you want to split a table into two pieces — usually because one part of the data is accessed frequently and the other part is rarely needed.

The classic example is users and profiles. Every user has one profile, and every profile belongs to one user. You might split them apart because the users table holds login information (which you check on every page load) while the profiles table holds detailed biographical information (which you only display on the profile page).

users table:

id	username	email	password_hash
1	maria_g	maria@email.com	$2b$12$abc...
2	james_w	james@email.com	$2b$12$xyz...

profiles table:

id	user_id	bio	avatar_url	date_of_birth
1	1	Coffee enthusiast and coder.	/avatars/maria.jpg	1995-06-15
2	2	Full-stack developer in Lansing.	/avatars/james.jpg	1990-11-02

The user_id column in the profiles table is a foreign key pointing to the users table, and it has a UNIQUE constraint — meaning each user can only have one profile. That is what makes it one-to-one instead of one-to-many.

CREATE TABLE profiles (
    id            INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id       INTEGER NOT NULL UNIQUE,
    bio           TEXT,
    avatar_url    TEXT,
    date_of_birth DATE,
    FOREIGN KEY (user_id) REFERENCES users(id)
);

Notice the UNIQUE keyword on user_id. Without it, the same user could appear multiple times in the profiles table, turning the relationship into one-to-many. The UNIQUE constraint is what enforces the one-to-one rule.

3c. Many-to-Many

A many-to-many relationship means that records in Table A can be connected to many records in Table B, and records in Table B can also be connected to many records in Table A. Both sides can have multiple connections.

The classic example is students and classes. One student can enroll in many classes. And one class can have many students enrolled. Neither side is limited to just one connection.

Here is the challenge: you cannot represent a many-to-many relationship with a simple foreign key. If you put a class_id in the students table, each student could only be in one class. If you put a student_id in the classes table, each class could only have one student. Neither option works.

The solution is a join table (also called a junction table or linking table). This is a third table that sits between the other two and contains foreign keys to both of them. Each row in the join table represents one connection between a student and a class.

students table:

id	name	email
1	Maria Garcia	maria@email.com
2	James Wilson	james@email.com
3	Aisha Patel	aisha@email.com

classes table:

id	class_name	instructor
1	Intro to HTML	Prof. Chen
2	Database Design	Prof. Adams
3	Java Programming	Prof. Rivera

student_classes table (the join table):

id	student_id	class_id	enrolled_at
1	1	1	2025-01-10
2	1	2	2025-01-10
3	2	2	2025-01-12
4	2	3	2025-01-12
5	3	1	2025-01-15
6	3	2	2025-01-15
7	3	3	2025-01-15

Let us read this join table. Rows 1 and 2 show that Maria (student_id 1) is enrolled in Intro to HTML (class_id 1) and Database Design (class_id 2). Rows 3 and 4 show that James (student_id 2) is in Database Design and Java Programming. Rows 5, 6, and 7 show that Aisha (student_id 3) is in all three classes. Notice how Database Design (class_id 2) appears three times — because three students are enrolled in it. And Aisha appears three times because she is in three classes. That is many-to-many.

Here is the SQL to create the join table:

CREATE TABLE student_classes (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    student_id  INTEGER NOT NULL,
    class_id    INTEGER NOT NULL,
    enrolled_at DATE DEFAULT CURRENT_DATE,
    FOREIGN KEY (student_id) REFERENCES students(id),
    FOREIGN KEY (class_id) REFERENCES classes(id)
);

The join table has two foreign keys: one pointing to students and one pointing to classes. Each row in the join table creates a single link between one student and one class. To add Aisha to a new class, you just add a new row to the join table. To remove a student from a class, you delete the corresponding row. The students and classes tables remain untouched.

Relationship Summary

Here is a quick reference table comparing all three relationship types:

Relationship	Example	How It Works	Foreign Key Location
One-to-Many	Customer → Orders	One customer has many orders	On the “many” side (orders table)
One-to-One	User → Profile	One user has one profile	On either side, with UNIQUE constraint
Many-to-Many	Students ↔ Classes	Many students in many classes	In a separate join table with two foreign keys

If you can identify which type of relationship exists between two pieces of data, you are well on your way to designing good database schemas. This is one of the most important skills in software development, and it gets easier with practice.

4. SQL Deep Dive — Talking to Your Database

SQL (Structured Query Language, often pronounced “sequel”) is the language you use to communicate with a relational database. Every relational database in the world — SQLite, MySQL, PostgreSQL, Oracle, SQL Server — understands SQL. The syntax varies slightly between them, but the core commands are the same everywhere. Learning SQL is one of the highest-value skills you can develop, because it is used by software developers, data analysts, business intelligence professionals, and many other roles.

In Lesson 14, Spring Boot generated SQL for you automatically. That was convenient, but now it is time to understand what those generated commands actually look like. You need to know SQL so you can debug problems, write custom queries, and truly understand what your application is doing with your data.

Let us walk through the five most important SQL commands. We will use our bookstore example with a customers table and an orders table.

4a. SELECT — Reading Data

SELECT is the command you will use the most. It retrieves data from a table. Think of it as asking the database a question: “Show me this information from that table.”

The most basic SELECT retrieves all columns from a table:

SELECT * FROM customers;

The * means “all columns.” This would return every column and every row in the customers table. While SELECT * is handy for quick exploration, in real applications you should name the specific columns you need:

SELECT first_name, last_name, email
FROM customers;

This returns only the three columns you asked for. It is faster (the database does not have to fetch columns you do not need) and clearer (anyone reading your code knows exactly what data you are working with).

Filtering with WHERE

Most of the time, you do not want all the rows — you want specific ones. The WHERE clause lets you filter results based on conditions:

-- Find a specific customer by email
SELECT * FROM customers
WHERE email = 'maria@email.com';

-- Find all orders over $30
SELECT * FROM orders
WHERE total > 30.00;

-- Find orders placed in March 2025
SELECT * FROM orders
WHERE order_date >= '2025-03-01'
  AND order_date <= '2025-03-31';

You can combine conditions with AND (both must be true) and OR (either can be true). Lines starting with -- are comments — the database ignores them, but they help humans understand what the query does.

Sorting with ORDER BY

By default, the database returns rows in no particular order. You can sort the results using ORDER BY:

-- Newest orders first
SELECT * FROM orders
ORDER BY order_date DESC;

-- Customers alphabetically by last name
SELECT * FROM customers
ORDER BY last_name ASC;

ASC means ascending (A to Z, smallest to largest, oldest to newest). DESC means descending (Z to A, largest to smallest, newest to oldest). If you do not specify, the default is ASC.

Limiting Results with LIMIT

If your table has millions of rows and you only want the first few, use LIMIT:

-- Get the 5 most recent orders
SELECT * FROM orders
ORDER BY order_date DESC
LIMIT 5;

-- Get the top 10 highest-value orders
SELECT * FROM orders
ORDER BY total DESC
LIMIT 10;

Combining WHERE, ORDER BY, and LIMIT together gives you powerful control over exactly what data you get back. Here is a complete example that uses all three:

-- Get Maria's 3 most expensive orders
SELECT id, order_date, total
FROM orders
WHERE customer_id = 1
ORDER BY total DESC
LIMIT 3;

Read it like English: “Select the id, order date, and total from orders where the customer id is 1, ordered by total from highest to lowest, and give me only the top 3.” That is the beauty of SQL — once you know the keywords, queries almost read like sentences.

4b. INSERT — Adding New Data

INSERT adds a new row to a table. You specify the table name, the columns you want to fill in, and the values for each column:

INSERT INTO customers (first_name, last_name, email)
VALUES ('Kenji', 'Tanaka', 'kenji@email.com');

Notice that we listed the column names explicitly: (first_name, last_name, email). We did not include id because it auto-increments, and we did not include created_at because it has a default value. The database handles those automatically.

Always list your column names explicitly in INSERT statements. You could write INSERT INTO customers VALUES (...) without naming the columns, but that is fragile — if someone adds a new column to the table later, your insert statement would break. Being explicit protects you.

You can insert multiple rows at once:

INSERT INTO customers (first_name, last_name, email)
VALUES
    ('Kenji', 'Tanaka', 'kenji@email.com'),
    ('Sofia', 'Martinez', 'sofia@email.com'),
    ('David', 'Kim', 'david@email.com');

4c. UPDATE — Changing Existing Data

UPDATE modifies existing rows. You specify which table to update, what to change, and which rows to change:

-- Update James's email address
UPDATE customers
SET email = 'james.wilson@newemail.com'
WHERE id = 2;

You can update multiple columns at once:

-- Update a customer's name and email
UPDATE customers
SET first_name = 'Jim',
    email = 'jim.wilson@email.com'
WHERE id = 2;

DANGER: Never run UPDATE without WHERE!

If you forget the WHERE clause, the database will update every single row in the table. This statement would set every customer's email to the same address:

-- DO NOT DO THIS! No WHERE clause!
UPDATE customers
SET email = 'oops@email.com';

This would overwrite every customer's email instantly. There is no “undo” button. This is one of the most common and devastating mistakes that database beginners (and sometimes even professionals) make. Always double-check that your UPDATE has a WHERE clause before you run it.

4d. DELETE — Removing Data

DELETE removes rows from a table. Like UPDATE, you use a WHERE clause to specify which rows to delete:

-- Delete a specific order
DELETE FROM orders
WHERE id = 3;

-- Delete all orders from a specific customer
DELETE FROM orders
WHERE customer_id = 2;

DANGER: Never run DELETE without WHERE!

Just like UPDATE, forgetting the WHERE clause is catastrophic. This statement would delete every single row in the orders table:

-- DO NOT DO THIS! No WHERE clause!
DELETE FROM orders;

All of your orders — gone in an instant. No confirmation dialog, no recycle bin, no undo. The data is simply erased. Always include a WHERE clause with DELETE. As a safety practice, many developers will first run a SELECT with the same WHERE clause to preview which rows will be affected before running the DELETE.

The Safety Check Pattern

Before running an UPDATE or DELETE, preview the affected rows first by running a SELECT with the same WHERE clause:

-- Step 1: Preview what will be deleted
SELECT * FROM orders WHERE customer_id = 2;

-- Step 2: If the results look correct, delete them
DELETE FROM orders WHERE customer_id = 2;

This two-step approach takes an extra few seconds but can save you from accidentally destroying data. Professional developers do this routinely, especially on production databases.

4e. JOIN — Combining Tables

Now we arrive at one of the most powerful features of SQL: the JOIN. A JOIN lets you combine rows from two or more tables based on a related column between them. This is how you bring your connected tables back together to answer questions that span multiple tables.

Let us say you want to see a list of orders that includes the customer's name — not just their customer_id number. The order information is in the orders table, and the customer's name is in the customers table. A JOIN brings them together:

SELECT
    customers.first_name,
    customers.last_name,
    orders.order_date,
    orders.total
FROM orders
JOIN customers ON orders.customer_id = customers.id;

Let us break this down piece by piece:

SELECT customers.first_name, customers.last_name, orders.order_date, orders.total — We specify which columns we want, using table_name.column_name format to be clear about which table each column comes from.
FROM orders — We start from the orders table.
JOIN customers — We join it with the customers table.
ON orders.customer_id = customers.id — We specify the connection: the customer_id in orders matches the id in customers.

The result would look like this:

first_name	last_name	order_date	total
Maria	Garcia	2025-03-01	29.99
Maria	Garcia	2025-03-15	54.50
James	Wilson	2025-03-10	12.99

The database matched each order's customer_id with the corresponding customer's id and combined the information into a single result set. This is the magic of relational databases: data is stored separately for efficiency, but you can combine it on the fly whenever you need to.

You can add WHERE, ORDER BY, and LIMIT to a JOIN query just like any other SELECT:

-- Get Maria's orders, sorted by date
SELECT
    customers.first_name,
    orders.order_date,
    orders.total
FROM orders
JOIN customers ON orders.customer_id = customers.id
WHERE customers.first_name = 'Maria'
ORDER BY orders.order_date DESC;

JOINs can also connect more than two tables. If you had an order_items table that linked orders to products, you could join all three tables together to see which customer bought which product on what date. We will do exactly that in the schema design exercise coming up.

SQL Command Summary

Command	Purpose	Example
`SELECT`	Read data from a table	`SELECT * FROM customers;`
`INSERT`	Add new rows to a table	`INSERT INTO customers (name) VALUES ('Maria');`
`UPDATE`	Change existing rows	`UPDATE customers SET email = '...' WHERE id = 1;`
`DELETE`	Remove rows from a table	`DELETE FROM customers WHERE id = 1;`
`JOIN`	Combine related tables	`SELECT * FROM orders JOIN customers ON ...;`

5. Schema Design Exercise — Building an E-Commerce Database

Now it is time to put everything together. You are going to design a database schema for a small e-commerce store — the kind of thing you might build as a real project. A schema is the blueprint for your database: it defines which tables exist, what columns each table has, and how the tables are related to each other.

Our e-commerce store needs to track four things: customers, products, orders, and the specific items in each order. Let us think through each one.

Step 1: Identify Your Tables

Start by asking, “What are the main things (nouns) in my system?” For an e-commerce store, the answer is:

Customers — the people who buy things
Products — the things being sold
Orders — a record of each purchase transaction
Order Items — the specific products in each order (because one order can contain multiple products)

Step 2: Define Columns for Each Table

For each table, ask “What information do I need to store about this thing?” Here is what we will use:

CREATE TABLE customers (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    first_name  TEXT NOT NULL,
    last_name   TEXT NOT NULL,
    email       TEXT NOT NULL UNIQUE,
    address     TEXT,
    city        TEXT,
    state       TEXT,
    zip_code    TEXT,
    created_at  DATE DEFAULT CURRENT_DATE
);

CREATE TABLE products (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    name        TEXT NOT NULL,
    description TEXT,
    price       DECIMAL(10, 2) NOT NULL,
    stock       INTEGER NOT NULL DEFAULT 0,
    category    TEXT,
    created_at  DATE DEFAULT CURRENT_DATE
);

CREATE TABLE orders (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    customer_id INTEGER NOT NULL,
    order_date  DATE DEFAULT CURRENT_DATE,
    status      TEXT DEFAULT 'pending',
    total       DECIMAL(10, 2) NOT NULL,
    FOREIGN KEY (customer_id) REFERENCES customers(id)
);

CREATE TABLE order_items (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    order_id    INTEGER NOT NULL,
    product_id  INTEGER NOT NULL,
    quantity    INTEGER NOT NULL DEFAULT 1,
    unit_price  DECIMAL(10, 2) NOT NULL,
    FOREIGN KEY (order_id) REFERENCES orders(id),
    FOREIGN KEY (product_id) REFERENCES products(id)
);

Step 3: Identify the Relationships

Now let us map out how these tables connect:

customers → orders: One-to-Many. One customer can place many orders. The customer_id foreign key in the orders table creates this link.
orders → order_items: One-to-Many. One order can contain many items. The order_id foreign key in the order_items table creates this link.
products → order_items: One-to-Many. One product can appear in many order items (different customers buying the same product). The product_id foreign key in the order_items table creates this link.
orders ↔ products: Many-to-Many (through order_items). One order can include many products, and one product can be in many orders. The order_items table serves as the join table between them.

Why Do We Need order_items?

You might wonder why we do not just put the product information directly in the orders table. The reason is that one order can contain multiple products. If Maria buys a book, a mug, and a t-shirt in a single order, we need three rows in order_items — one for each product. The orders table stores the overall order information (who placed it, when, the total), and the order_items table stores the details of each item in the order. This also lets us record the quantity (Maria might buy 2 mugs) and the price at the time of purchase (because prices can change over time).

Step 4: Fill In Some Sample Data

Let us populate our schema with some test data so you can see how it all works together:

-- Add some customers
INSERT INTO customers (first_name, last_name, email, city, state)
VALUES
    ('Maria', 'Garcia', 'maria@email.com', 'Lansing', 'MI'),
    ('James', 'Wilson', 'james@email.com', 'East Lansing', 'MI'),
    ('Aisha', 'Patel', 'aisha@email.com', 'Lansing', 'MI');

-- Add some products
INSERT INTO products (name, description, price, stock, category)
VALUES
    ('Learn SQL Book', 'A beginner-friendly guide to SQL', 24.99, 50, 'Books'),
    ('Coders Farm Mug', 'A ceramic mug for coffee lovers', 12.99, 100, 'Accessories'),
    ('JavaScript T-Shirt', 'A comfy tee for JS enthusiasts', 19.99, 75, 'Apparel'),
    ('USB-C Hub', '7-port hub for your laptop', 34.99, 30, 'Electronics');

-- Create an order for Maria
INSERT INTO orders (customer_id, order_date, status, total)
VALUES (1, '2025-03-01', 'shipped', 57.97);

-- Add items to Maria's order
INSERT INTO order_items (order_id, product_id, quantity, unit_price)
VALUES
    (1, 1, 1, 24.99),
    (1, 2, 1, 12.99),
    (1, 3, 1, 19.99);

-- Create an order for James
INSERT INTO orders (customer_id, order_date, status, total)
VALUES (2, '2025-03-10', 'pending', 69.98);

-- Add items to James's order
INSERT INTO order_items (order_id, product_id, quantity, unit_price)
VALUES
    (2, 4, 1, 34.99),
    (2, 4, 1, 34.99);

Maria's order (order 1) contains the SQL book, the mug, and the t-shirt. James's order (order 2) contains two USB-C hubs. Each order item row links back to both its order and its product.

Step 5: Query Across Tables

Now here is where the schema design pays off. You can write queries that pull information from multiple tables to answer real business questions:

-- What did Maria order? Show product names and prices.
SELECT
    customers.first_name,
    customers.last_name,
    products.name AS product_name,
    order_items.quantity,
    order_items.unit_price
FROM order_items
JOIN orders ON order_items.order_id = orders.id
JOIN customers ON orders.customer_id = customers.id
JOIN products ON order_items.product_id = products.id
WHERE customers.id = 1;

This query joins four tables together in one statement. It starts with order_items, joins to orders (to find out which order), joins to customers (to find out who placed the order), and joins to products (to get the product name). The result would look like this:

first_name	last_name	product_name	quantity	unit_price
Maria	Garcia	Learn SQL Book	1	24.99
Maria	Garcia	Coders Farm Mug	1	12.99
Maria	Garcia	JavaScript T-Shirt	1	19.99

From separated tables to a clear, human-readable report — that is the power of SQL joins and good schema design. Every piece of data is stored once, in the right table, and you combine them however you need.

-- Which products have been ordered the most?
SELECT
    products.name,
    SUM(order_items.quantity) AS total_sold
FROM order_items
JOIN products ON order_items.product_id = products.id
GROUP BY products.id
ORDER BY total_sold DESC;

This query introduces SUM() and GROUP BY, which let you aggregate data. SUM(order_items.quantity) adds up all the quantities for each product, and GROUP BY products.id groups the results by product. The result tells you which products are your best sellers. We will not dive deep into aggregation in this lesson, but know that SQL can do much more than simple lookups — it can crunch numbers and generate reports too.

6. When to Choose a Relational Database

Relational databases are not the only option. In the next lesson, you will learn about non-relational (NoSQL) databases, which take a completely different approach. So when should you choose a relational database? Here are the situations where they shine:

Structured Data with Clear Relationships

If your data naturally fits into tables with well-defined columns, a relational database is a great choice. E-commerce stores, banking systems, school enrollment systems, hospital records, employee directories — all of these have structured data with clear relationships between entities. When you can draw a diagram showing how your tables connect with foreign keys, you are in relational database territory.

Data Integrity Is Critical

Relational databases enforce rules about your data. Foreign keys ensure you cannot create an order for a customer that does not exist. NOT NULL constraints ensure required fields are always filled in. UNIQUE constraints prevent duplicate entries. If you are building a system where data accuracy matters — financial transactions, medical records, legal documents — these built-in guardrails are invaluable.

Complex Queries Are Needed

If you need to ask complex questions that combine data from multiple sources, SQL and relational databases are incredibly powerful. Queries like “Show me all customers in Michigan who placed more than three orders last month, sorted by total spending” are natural to express in SQL. Other types of databases can handle simple lookups, but they struggle with multi-table analysis.

ACID Compliance Is Required

This is a big one. ACID is a set of guarantees that relational databases provide about your data. If your application handles money, inventory, or any situation where partial operations would be disastrous, you need ACID compliance. We will explain ACID in detail in the next section.

Relational Database Strengths

Structured data — Data that fits neatly into tables with defined columns
Relationships matter — Entities are connected to each other (customers have orders, orders have items)
Complex queries — You need to combine, filter, aggregate, and analyze data across multiple tables
Data integrity — Constraints and foreign keys keep your data consistent and valid
ACID compliance — Transactions guarantee that operations either fully succeed or fully fail

Popular relational databases you will encounter in the real world include MySQL, PostgreSQL, SQLite, Microsoft SQL Server, and Oracle. They all use SQL and share the same foundational concepts you learned in this lesson. The differences between them are mostly about performance, scalability, and extra features — but the core ideas of tables, relationships, and SQL are universal.

7. ACID — Transactions That Cannot Be Half-Done

Imagine you are transferring $500 from your checking account to your savings account using your bank's website. Behind the scenes, the database needs to do two things:

Subtract $500 from your checking account balance
Add $500 to your savings account balance

Now imagine the power goes out after step 1 but before step 2. Your checking account lost $500, but your savings account never received it. The money has vanished. This would be catastrophic — and it is exactly the kind of problem that ACID prevents.

ACID stands for four properties that relational databases guarantee for every transaction:

A — Atomicity

Atomicity means “all or nothing.” A transaction is treated as a single, indivisible unit. Either all of its operations succeed, or none of them do. In the bank transfer example, if the second operation fails, the database automatically undoes the first operation too. Your money never disappears. The word “atomic” comes from the Greek word for “indivisible” — the idea is that you cannot split a transaction into parts.

C — Consistency

Consistency means that a transaction can only bring the database from one valid state to another valid state. If your database has a rule that account balances cannot be negative, and a transaction would make a balance go below zero, the entire transaction is rejected. The database never ends up in a state that violates its own rules.

I — Isolation

Isolation means that transactions running at the same time do not interfere with each other. If two people are both transferring money at the exact same moment, each transaction sees the database as if it were the only one running. One transaction cannot read partially completed data from another. This prevents bizarre situations where you see an account balance in the middle of being updated.

D — Durability

Durability means that once a transaction is confirmed as complete (committed), the data is permanently saved. Even if the power goes out, the server crashes, or a meteorite hits the data center one second after the transaction completes, the data will still be there when the system recovers. The database writes the data to disk before telling you “done.”

ACID in One Sentence

ACID guarantees that database transactions either happen completely and permanently or do not happen at all — there is no in-between state where your data is half-updated, inconsistent, or lost.

Here is what a transaction looks like in SQL:

-- Transfer $500 from checking to savings
BEGIN TRANSACTION;

UPDATE accounts
SET balance = balance - 500
WHERE account_type = 'checking' AND customer_id = 1;

UPDATE accounts
SET balance = balance + 500
WHERE account_type = 'savings' AND customer_id = 1;

COMMIT;

BEGIN TRANSACTION starts the transaction. The two UPDATE statements happen inside it. COMMIT tells the database, “I am done, save everything permanently.” If anything goes wrong between BEGIN and COMMIT, the database automatically rolls back all the changes. No money disappears.

If you realize something is wrong during a transaction and want to cancel it, you can use ROLLBACK instead of COMMIT:

BEGIN TRANSACTION;

UPDATE accounts
SET balance = balance - 500
WHERE account_type = 'checking' AND customer_id = 1;

-- Oops, wrong amount! Cancel everything.
ROLLBACK;

The ROLLBACK undoes all changes made since the BEGIN TRANSACTION. The checking account balance is restored to what it was before.

ACID compliance is one of the biggest reasons why relational databases have dominated the software industry for over 40 years. When you are dealing with money, inventory counts, user accounts, medical data, or anything where incorrect data has real consequences, ACID gives you confidence that your database will not leave things in a broken state.

8. Putting It All Together

Let us take a moment to step back and see the complete picture. You now understand the fundamental building blocks of relational databases:

Tables organize data into structured rows and columns, with each table representing one type of entity.
Primary keys give each row a unique identity so it can be referenced from other tables.
Foreign keys create relationships between tables, linking data together without duplication.
SQL is the universal language for reading, writing, updating, and deleting data.
JOINs reunite data from connected tables into a single result set.
Schema design is the art of deciding which tables to create, what columns they need, and how they relate to each other.
ACID guarantees that your transactions are reliable and your data is safe.

These concepts apply to every relational database you will ever use: MySQL, PostgreSQL, SQLite, SQL Server, Oracle — they all work this way. The SQL syntax may have small variations, but the ideas are identical. When you learn relational database concepts once, you know them for every relational database.

In the next lesson, we will explore the other side of the database world: non-relational databases. These databases throw away tables, rows, and SQL in favor of completely different data models. But do not worry — everything you learned in this lesson will help you understand why non-relational databases exist and when they are a better choice.

You are building a strong foundation. Keep going — you are doing great work.

Knowledge Check

1. In a one-to-many relationship between customers and orders, where does the foreign key go?

In the customers table, as a column called order_id In the orders table, as a column called customer_id In a separate join table that links customers and orders Foreign keys are not needed for one-to-many relationships

Correct! In a one-to-many relationship, the foreign key always goes on the “many” side. Since one customer can have many orders, the orders table gets a customer_id column that points back to the customers table. Each order stores the ID of the customer who placed it. A join table is only needed for many-to-many relationships.

2. What happens if you run DELETE FROM orders; without a WHERE clause?

The database will ask you to confirm before deleting anything It deletes only the first row in the orders table It deletes every single row in the orders table immediately It returns an error because a WHERE clause is required

That is right — and it is one of the most important safety lessons in SQL. Without a WHERE clause, DELETE removes all rows from the table instantly. There is no confirmation prompt, no undo button, and no recycle bin. The same danger applies to UPDATE without WHERE, which would modify every row. Always double-check your WHERE clause before running DELETE or UPDATE statements, especially on production databases.

3. What does the “A” in ACID stand for, and what does it guarantee?

Authorization — only logged-in users can access the database Atomicity — a transaction either fully completes or fully rolls back, with no partial state Availability — the database is always online and accessible Aggregation — the database can calculate sums and averages across tables

Exactly! Atomicity means “all or nothing.” When a transaction contains multiple operations (like subtracting from one account and adding to another), either all of them succeed together or none of them happen. If anything goes wrong partway through, the database automatically undoes all the changes that were already made. This prevents the nightmare scenario of data being left in a half-updated, inconsistent state.

Lesson Summary

Outstanding work — you just covered a massive amount of ground! Here is a recap of everything you learned in this lesson:

A relational database is a collection of structured tables that are connected to each other through relationships — like multiple spreadsheets that talk to each other.
Every table has columns (the fields of data), rows (individual records), and a primary key (a unique identifier for each row).
Foreign keys are columns that store the primary key from another table, creating links between tables.
There are three types of relationships: one-to-many (the most common), one-to-one, and many-to-many (which requires a join table).
The five essential SQL commands are SELECT (read), INSERT (create), UPDATE (modify), DELETE (remove), and JOIN (combine tables).
UPDATE and DELETE without a WHERE clause affect every row in the table — always include WHERE.
Good schema design means identifying your entities, defining their columns, and mapping the relationships between them.
ACID (Atomicity, Consistency, Isolation, Durability) guarantees that transactions are reliable — they either succeed completely or fail completely, with no half-done states.
Relational databases are the best choice when you have structured data, need data integrity, require complex queries, or cannot tolerate incomplete transactions.

In the next lesson, we will explore non-relational databases (also called NoSQL databases) — a completely different approach to storing data. You will learn about documents, key-value stores, and when it makes sense to trade the structure of relational databases for the flexibility of NoSQL. See you there!

Finished this lesson?

← Previous: Data & Data Types Next: Non-Relational Databases →