Lesson 3 of 6

Non-Relational Databases — Thinking in Documents

Estimated time: 2–2.5 hours

What You Will Learn

  • Understand what NoSQL means and why it exists alongside traditional relational databases
  • Learn how document databases like MongoDB store data as flexible, JSON-like documents instead of rigid tables
  • See how a real-world data model looks different in SQL versus MongoDB — and why that matters
  • Discover key-value stores like Redis and why they are used for caching, sessions, and real-time data
  • Get a brief overview of other NoSQL types: column-family, graph, and search databases
  • Build a practical decision framework for choosing between SQL and NoSQL in real projects

In Lesson 18, you learned how relational databases organize data into neat rows and columns — like a well-structured spreadsheet. You learned about tables, primary keys, foreign keys, and how SQL lets you ask powerful questions about your data. That approach works beautifully for a lot of the data in the world. But not all of it.

Think about the data that describes you as a person. You have a name and an email address — those fit nicely into columns. But you also have a list of skills that could be any length. You have work experiences, each with its own job title, company name, start date, and description. You might have five social media profiles or none at all. You have a list of hobbies that your friend does not have, and your friend speaks three languages while you speak one.

Trying to squeeze all of that into a rigid table with fixed columns gets awkward fast. You end up creating extra tables, writing complex JOIN queries, and dealing with a lot of empty columns for data that some people have and others do not. It works, but it can feel like trying to organize a messy closet using only identical shoeboxes.

What if you could just describe each person as they are — with all their unique, nested, variable data — and store that whole description as a single unit? That is exactly the idea behind non-relational databases, and that is what this lesson is about.

1. What is NoSQL?

NoSQL stands for "Not Only SQL." It does not mean "no SQL at all" or "SQL is bad." It simply means there are other ways to store and organize data beyond the traditional table-and-row model. NoSQL databases come in several flavors, and each one is designed to solve a specific kind of problem particularly well.

NoSQL = Different Tradeoffs, Not Better or Worse

Relational databases and NoSQL databases are not competitors — they are different tools for different jobs. A hammer is not better than a screwdriver. It depends on whether you are working with nails or screws. Many real-world applications use both types of databases at the same time, each handling the kind of data it is best suited for.

Here is the core difference. In a relational database (like MySQL, PostgreSQL, or SQLite), you must define your data structure before you store any data. You create a table, specify exactly what columns it has and what types of data each column holds, and every row must follow that exact structure. This is called a schema, and it is strict. If you decide later that users need a "middle_name" column, you have to alter the entire table.

In most NoSQL databases, the structure is flexible. Each record can have its own shape. One user document might have a "middle_name" field and another might not. One product might have a "color" field while another has a "size" field. The database does not care. It stores whatever you give it.

This flexibility is both the greatest strength and the greatest risk of NoSQL. It makes development faster because you do not have to plan every detail of your schema in advance. But it also means your application code has to be more careful about handling data that might look different from one record to the next.
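
To make that last point concrete, here is a minimal Python sketch of application code that tolerates records with different shapes. The two user records, the "Lee" middle name, and the describe function are made up for illustration; plain dictionaries stand in for documents that a real database driver would return.

```python
# Two hypothetical user records with different shapes, as a document
# database would happily store them side by side.
users = [
    {"name": "Marcus Johnson", "middle_name": "Lee", "skills": ["JavaScript", "React"]},
    {"name": "Alice Chen"},  # no middle_name and no skills field at all
]


def describe(user: dict) -> str:
    """Summarize a user without assuming the optional fields exist."""
    middle = user.get("middle_name")   # None when the field is absent
    skills = user.get("skills", [])    # fall back to an empty list
    label = user["name"] if middle is None else f"{user['name']} (middle name: {middle})"
    return f"{label}, {len(skills)} skill(s)"


for user in users:
    print(describe(user))
```

Using .get() with a default is one simple way to be careful; larger projects often add a validation layer so that unexpected shapes are caught in one place.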

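The major categories of NoSQL databases are:

  • Document databases (for example, MongoDB): store each record as a flexible, JSON-like document
  • Key-value stores (for example, Redis): look up a value by its unique key
  • Column-family databases (for example, Cassandra and HBase): handle very large tables spread across many servers
  • Graph databases (for example, Neo4j): store things and the relationships between them
  • Search engines (for example, Elasticsearch): run fast full-text searches over large amounts of text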

We will spend the most time on document databases and key-value stores because they are by far the most common in everyday web development. But we will also give you a taste of the other types so you know they exist and when they shine.

2. Document Databases — MongoDB

The most popular type of NoSQL database is the document database, and the most popular document database is MongoDB. Instead of storing data in tables with rows and columns, MongoDB stores data as documents. Each document is a self-contained unit of data that looks a lot like JSON — the same format you have already seen in JavaScript and in API responses.

Note: If you have worked through the earlier lessons in this track, you already know JSON from working with APIs and configuration files. MongoDB documents use a format called BSON (Binary JSON), which is essentially JSON with a few extra data types. For our purposes, you can think of MongoDB documents as JSON.

Let us start with a simple example. Imagine you are building a community platform for Lansing-area coders. You need to store information about each user. In a relational database, you would create a users table with columns for name, email, and so on. In MongoDB, each user is stored as a document in a collection (which is roughly equivalent to a table).

Here is what a simple user document looks like:

{
  "_id": "64f1a2b3c8e4d5f6a7b8c9d0",
  "name": "Marcus Johnson",
  "email": "marcus@example.com",
  "joinedDate": "2024-03-15",
  "role": "member"
}

So far, that looks a lot like a single row in a SQL table — just written in a different format. The _id field is like a primary key. MongoDB generates it automatically. Nothing too exciting yet.

But here is where documents get interesting. Unlike a SQL row, a document can contain nested objects and arrays. This means you can store complex, hierarchical data inside a single record. Let us look at a more realistic user document:

{
  "_id": "64f1a2b3c8e4d5f6a7b8c9d0",
  "name": "Marcus Johnson",
  "email": "marcus@example.com",
  "joinedDate": "2024-03-15",
  "role": "member",
  "skills": ["JavaScript", "HTML", "CSS", "React", "Node.js"],
  "address": {
    "city": "Lansing",
    "state": "MI",
    "zip": "48912"
  },
  "experience": [
    {
      "title": "Junior Web Developer",
      "company": "Lansing Web Co.",
      "startDate": "2024-06-01",
      "endDate": null,
      "current": true,
      "description": "Building responsive websites for local businesses"
    },
    {
      "title": "Intern",
      "company": "TechStart Michigan",
      "startDate": "2024-01-15",
      "endDate": "2024-05-30",
      "current": false,
      "description": "Assisted with front-end development and testing"
    }
  ],
  "socialLinks": {
    "github": "https://github.com/marcusjohnson",
    "linkedin": "https://linkedin.com/in/marcusjohnson"
  }
}

Look at how much information is packed into that single document. Marcus has an array of skills (which could be any length), a nested address object, an array of experience objects (each with its own fields), and a social links object. All of this lives together in one place. When your application needs to display Marcus's profile page, it makes one query to the database and gets everything it needs.
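
As a rough illustration of what "gets everything it needs" means for your code, the Python sketch below parses a trimmed copy of the document above and reads the nested pieces directly. It uses only the standard json module and no real database; the variable names are our own.

```python
import json

# A trimmed copy of the profile document above, as the JSON text a
# database driver or API might hand back.
raw = """
{
  "name": "Marcus Johnson",
  "skills": ["JavaScript", "HTML", "CSS", "React", "Node.js"],
  "address": {"city": "Lansing", "state": "MI", "zip": "48912"},
  "experience": [
    {"title": "Junior Web Developer", "company": "Lansing Web Co.", "current": true},
    {"title": "Intern", "company": "TechStart Michigan", "current": false}
  ]
}
"""

profile = json.loads(raw)

# Nested objects and arrays are reachable without any further lookups.
city = profile["address"]["city"]
skill_count = len(profile["skills"])
current_jobs = [job["title"] for job in profile["experience"] if job["current"]]

print(f"{profile['name']} lives in {city} and lists {skill_count} skills")
print("Current role(s):", ", ".join(current_jobs))
```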

The SQL Comparison: Three Tables and JOINs

Now let us think about how you would store this same data in a relational database. Remember from Lesson 18 that in a traditional relational design each column holds a single simple value: you would not put an array or a nested object inside a single cell. So you would need to split this data across multiple tables:

Table 1: users

id | name | email | joined_date | role | city | state | zip | github | linkedin
1 | Marcus Johnson | marcus@example.com | 2024-03-15 | member | Lansing | MI | 48912 | https://github.com/marcusjohnson | https://linkedin.com/in/marcusjohnson

Table 2: user_skills

id | user_id | skill
1 | 1 | JavaScript
2 | 1 | HTML
3 | 1 | CSS
4 | 1 | React
5 | 1 | Node.js

Table 3: user_experience

id | user_id | title | company | start_date | end_date | current | description
1 | 1 | Junior Web Developer | Lansing Web Co. | 2024-06-01 | NULL | true | Building responsive websites for local businesses
2 | 1 | Intern | TechStart Michigan | 2024-01-15 | 2024-05-30 | false | Assisted with front-end development and testing

To get all of Marcus's data, your SQL query would need to JOIN all three tables together:

SELECT u.*, s.skill, e.title, e.company, e.start_date, e.end_date
FROM users u
LEFT JOIN user_skills s ON u.id = s.user_id
LEFT JOIN user_experience e ON u.id = e.user_id
WHERE u.id = 1;

That is three tables, two JOINs, and a query that returns ten rows for Marcus (one for each combination of his five skills and two experiences) that your application code then has to reassemble into a single user object. It works perfectly fine, but it is more complex than the MongoDB approach for this particular use case.
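
If you would like to see the row multiplication for yourself, here is a small sketch using Python's built-in sqlite3 module with an in-memory database. The tables are trimmed to a few columns to keep the example short, and the reassembly step is just one possible approach, not the only way to do it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT);
    CREATE TABLE user_skills (id INTEGER PRIMARY KEY, user_id INTEGER, skill TEXT);
    CREATE TABLE user_experience (id INTEGER PRIMARY KEY, user_id INTEGER,
                                  title TEXT, company TEXT);
    INSERT INTO users VALUES (1, 'Marcus Johnson', 'marcus@example.com');
    INSERT INTO user_skills (user_id, skill) VALUES
        (1, 'JavaScript'), (1, 'HTML'), (1, 'CSS'), (1, 'React'), (1, 'Node.js');
    INSERT INTO user_experience (user_id, title, company) VALUES
        (1, 'Junior Web Developer', 'Lansing Web Co.'),
        (1, 'Intern', 'TechStart Michigan');
""")

rows = conn.execute("""
    SELECT u.name, s.skill, e.title, e.company
    FROM users u
    LEFT JOIN user_skills s ON u.id = s.user_id
    LEFT JOIN user_experience e ON u.id = e.user_id
    WHERE u.id = 1
""").fetchall()

# Reassemble the flat rows into one nested object, dropping the duplicates
# the JOIN produced. dict.fromkeys keeps the first-seen order.
user = {
    "name": rows[0][0],
    "skills": list(dict.fromkeys(r[1] for r in rows if r[1] is not None)),
    "experience": [
        {"title": title, "company": company}
        for title, company in dict.fromkeys(
            (r[2], r[3]) for r in rows if r[2] is not None
        )
    ],
}

print(len(rows), "rows came back from the JOIN")
print(user)
```

In practice many teams avoid the multiplication by running one query per table, or by letting an ORM do the reassembly for them.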

Tip: Neither approach is "wrong." The SQL approach gives you strong data integrity and powerful querying across relationships. The MongoDB approach gives you simpler reads and more natural data modeling for nested, variable data. The best choice depends on your specific situation.

Advantages of Document Databases

Disadvantages of Document Databases

When Documents Shine

Document databases work best when your data is naturally hierarchical (like a user profile with nested details), when different records can have different shapes (like products in a catalog where a shirt has "size" and a laptop has "RAM"), and when your application usually reads and writes entire objects at once rather than individual fields across many records.

3. Key-Value Stores — Redis

If document databases are like flexible filing cabinets, then key-value stores are like a dictionary or a phonebook. You look something up by its name (the key), and you get back its definition (the value). That is it. No tables, no documents, no schema — just keys and values.

The most popular key-value store is Redis, and it is everywhere. If you have ever used a website that loads instantly, displays real-time data, or remembers that you are logged in, there is a good chance Redis is working behind the scenes.

Here is the mental model. Imagine a massive dictionary where every entry has a unique name and a value:

"session:abc123"       →  "{ userId: 42, role: 'admin', loginTime: '2024-09-15T10:30:00' }"
"cache:user:42"        →  "{ name: 'Marcus Johnson', email: 'marcus@example.com' }"
"leaderboard:weekly"   →  "[{ name: 'Alice', score: 2850 }, { name: 'Bob', score: 2340 }]"
"rate-limit:192.168.1" →  "47"

You give Redis a key, and it gives you back the value. You can set a key, get a key, delete a key, and set a key to automatically expire after a certain amount of time. That simplicity is what makes Redis incredibly fast.
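
The four operations described above (set, get, delete, expire) can be sketched in a few lines of Python. To be clear, this is a toy model and not how Redis is implemented; the TinyKV class and its method names are invented for this example, and the clock is passed in so that expiry can be demonstrated without actually waiting.

```python
import time
from typing import Callable, Optional


class TinyKV:
    """A toy in-memory key-value store with optional expiry (not Redis itself)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._data = {}       # key -> (value, expiry time or None)
        self._clock = clock

    def set(self, key: str, value: str, ttl_seconds: Optional[float] = None) -> None:
        expires_at = None if ttl_seconds is None else self._clock() + ttl_seconds
        self._data[key] = (value, expires_at)

    def get(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[key]   # lazily drop expired keys when they are read
            return None
        return value

    def delete(self, key: str) -> bool:
        return self._data.pop(key, None) is not None


now = [0.0]                                  # a fake clock we can move by hand
store = TinyKV(clock=lambda: now[0])
store.set("session:abc123", "userId=42", ttl_seconds=3600)
print(store.get("session:abc123"))           # still present
now[0] += 3601                               # pretend a bit over an hour passed
print(store.get("session:abc123"))           # expired, so nothing comes back
```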

Note: Redis keeps its data in memory (RAM) and serves requests from there rather than from disk. That is what makes it so fast: reading from RAM is roughly 100,000 times faster than reading from a spinning hard drive. Redis can also save copies of its data to disk so that it survives a restart. The tradeoff is that RAM is limited and expensive. Redis is not meant to be your primary database for all data. It is a specialized tool for data that needs to be accessed extremely quickly.

Common Uses for Redis

Session storage: When you log into a website, the server creates a session — a small record that says "this person is logged in and here is who they are." Storing sessions in Redis means the server can verify your login status in microseconds instead of milliseconds. At scale, that difference matters enormously.

Caching: Imagine your application needs to display a user's profile, which requires querying the database, joining three tables, and formatting the result. That might take 200 milliseconds. If you store the result in Redis after the first query, every subsequent request for that same profile takes less than 1 millisecond. This is called caching — keeping a copy of expensive-to-compute data in a fast location.

Leaderboards and counters: Redis has built-in support for sorted sets, which makes it perfect for leaderboards, view counters, like counts, and any data that needs to be incremented or ranked in real time.

Rate limiting: If you want to prevent a user from making more than 100 API requests per minute, you can use Redis to count their requests. Set a key with their IP address, increment it with each request, and set it to expire after 60 seconds. Simple and fast.
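
The rate-limiting recipe just described (count per client, reset when the counter expires) might look like the sketch below. It uses a plain Python dictionary in place of Redis, and the FixedWindowLimiter name, the example IP address, and the injectable clock are assumptions made so the example can run on its own.

```python
import time
from typing import Callable


class FixedWindowLimiter:
    """Allow at most `limit` requests per client in each `window` seconds."""

    def __init__(self, limit: int = 100, window: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window = window
        self._clock = clock
        self._counters = {}   # client id -> (count, expiry time)

    def allow(self, client_id: str) -> bool:
        now = self._clock()
        count, expires_at = self._counters.get(client_id, (0, None))
        if expires_at is None or now >= expires_at:
            # The old counter has "expired": start a fresh window.
            count, expires_at = 0, now + self.window
        count += 1            # the equivalent of incrementing the key
        self._counters[client_id] = (count, expires_at)
        return count <= self.limit


fake_now = [0.0]
limiter = FixedWindowLimiter(limit=3, window=60.0, clock=lambda: fake_now[0])
print([limiter.allow("203.0.113.7") for _ in range(4)])   # the fourth is rejected
fake_now[0] += 60
print(limiter.allow("203.0.113.7"))                       # a new window has started
```

A fixed window is the simplest variant; it can let through a short burst around the window boundary, which is why some systems use sliding windows instead.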

Why Redis Is So Fast: O(1) Lookup

In computer science, O(1) means "constant time": the operation takes about the same amount of time regardless of how much data you have. Whether Redis has 100 keys or 100 million keys, looking up a single key takes roughly the same amount of time on average. This is because Redis uses a data structure called a hash table (the same structure behind JavaScript objects and Python dictionaries). Combined with storing everything in RAM, this makes Redis one of the fastest data stores in existence.
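
To show why a hash table does not need to scan every key, here is a deliberately tiny Python version. The ToyHashTable class is our own illustration and is not taken from Redis; real hash tables also grow their bucket array as data is added, which is what keeps each bucket short and lookups close to constant time.

```python
class ToyHashTable:
    """A fixed-size hash table with chaining, to show how a key finds its bucket."""

    def __init__(self, bucket_count: int = 8) -> None:
        self._buckets = [[] for _ in range(bucket_count)]

    def _bucket_for(self, key: str) -> list:
        # hash() turns the key into an integer; the remainder picks a bucket.
        return self._buckets[hash(key) % len(self._buckets)]

    def set(self, key: str, value: object) -> None:
        bucket = self._bucket_for(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)   # overwrite an existing key
                return
        bucket.append((key, value))

    def get(self, key: str, default: object = None) -> object:
        # Only one bucket is searched, no matter how many keys exist overall.
        for existing_key, value in self._bucket_for(key):
            if existing_key == key:
                return value
        return default


table = ToyHashTable()
table.set("user:42:name", "Marcus Johnson")
print(table.get("user:42:name"))
```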

Redis in Practice

Redis commands are refreshingly simple. Here are the most common ones:

SET user:42:name "Marcus Johnson"     -- Store a value
GET user:42:name                      -- Retrieve it: "Marcus Johnson"
DEL user:42:name                      -- Delete it

SET session:abc123 "{ userId: 42 }"   -- Store session data
EXPIRE session:abc123 3600            -- Auto-delete after 1 hour (3600 seconds)

INCR page:home:views                  -- Increment a counter by 1
GET page:home:views                   -- "1847"

Notice how there are no tables, no schemas, no JOINs. Just set a key, get a key. This simplicity is what makes Redis so powerful for the specific problems it solves. You would never try to build an entire application on Redis alone — it is not designed for complex queries or relationships. But as a companion to your primary database (whether SQL or MongoDB), it is invaluable.

Tip: A common architecture in the real world is to use a relational database (like PostgreSQL) as the primary data store for all your important data, and Redis as a caching and session layer that sits in front of it. Your app checks Redis first. If the data is there, it returns instantly. If not, it queries the main database, stores the result in Redis for next time, and then returns it. This pattern is called cache-aside and it dramatically improves performance.
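
The cache-aside steps can be sketched in Python as follows. A plain dictionary plays the role of Redis and a stand-in function plays the role of the main database; make_cached_loader, slow_db_lookup, and the key format are hypothetical names chosen for this example.

```python
from typing import Callable, Dict


def make_cached_loader(load_from_db: Callable[[int], dict],
                       cache: Dict[str, dict]) -> Callable[[int], dict]:
    """Wrap a database lookup with a check-the-cache-first step."""

    def get_user(user_id: int) -> dict:
        key = f"cache:user:{user_id}"
        if key in cache:                  # 1. check the cache first
            return cache[key]
        user = load_from_db(user_id)      # 2. on a miss, query the main database
        cache[key] = user                 # 3. store the result for next time
        return user

    return get_user


db_calls = []


def slow_db_lookup(user_id: int) -> dict:
    db_calls.append(user_id)              # record every trip to the "database"
    return {"id": user_id, "name": "Marcus Johnson"}


cache = {}
get_user = make_cached_loader(slow_db_lookup, cache)
get_user(42)
get_user(42)
print("database was queried", len(db_calls), "time(s)")
```

A real cache would also give entries an expiry time or clear them when the underlying data changes, so that stale copies do not linger.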

4. Other Types of NoSQL Databases

Document databases and key-value stores are the NoSQL types you will encounter most often in everyday web development. But there are other specialized types that solve problems those two cannot. Let us take a brief tour so you know they exist and can recognize when they might be the right tool.

Column-Family Databases (Cassandra, HBase)

Imagine you have a table with billions of rows and hundreds of columns, but any given query only needs a few of those columns. In a traditional row-based database, the system generally has to read entire rows even if you only want two columns out of fifty. Column-family databases organize things differently: related columns are grouped into families that are stored together, and each row can have its own set of columns, so reading just the group you need is very efficient.
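
One very simplified way to picture the difference is with Python dictionaries. The sensor records, the "info" and "readings" family names, and the split_into_families function are all invented for this sketch; real systems such as Cassandra and HBase add partitioning across servers and their own on-disk formats on top of this idea.

```python
# Row-oriented layout: each record keeps all of its fields together.
rows = {
    "sensor-1": {"location": "roof", "model": "T100", "temp": 21.5, "humidity": 40},
    "sensor-2": {"location": "lab", "model": "T200", "temp": 19.0, "humidity": 55},
}

# Which (hypothetical) family each column belongs to.
family_of = {"location": "info", "model": "info", "temp": "readings", "humidity": "readings"}


def split_into_families(rows: dict, family_of: dict) -> dict:
    """Regroup row-oriented records so each family of columns is stored together."""
    families = {}
    for row_key, record in rows.items():
        for column, value in record.items():
            family = family_of[column]
            families.setdefault(family, {}).setdefault(row_key, {})[column] = value
    return families


families = split_into_families(rows, family_of)

# Averaging the temperature now touches only the "readings" family;
# the location and model data are never looked at.
temps = [cols["temp"] for cols in families["readings"].values()]
print(sum(temps) / len(temps))
```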

Apache Cassandra is the most well-known column-family database. It was originally built by Facebook to power their inbox search feature, and it is now used by companies like Netflix, Uber, and Apple to handle datasets with billions of rows spread across hundreds of servers around the world. Cassandra can handle millions of writes per second and is designed to never go down, even if entire data centers fail.

You probably will not need Cassandra for your early projects. It shines at a scale most applications never reach. But if you ever work for a company that deals with massive amounts of time-series data (like server logs, sensor readings, or financial transactions), you will likely encounter it.

Graph Databases (Neo4j)

Some data is all about relationships. Think about a social network: Alice follows Bob, Bob follows Carol, Carol and Alice are both members of the "Lansing Coders" group, and Alice recommended Carol for a job at the same company where Bob works. The interesting questions are not about individual people — they are about the connections: "Who are friends of my friends?" "What is the shortest path between two people?" "Which users have the most influence?"

You can model this in a relational database, but the SQL queries get complex and slow very quickly as the number of relationships grows. Graph databases like Neo4j are built specifically for this. They store data as nodes (people, places, things) and edges (relationships between them), and they can traverse millions of connections in milliseconds.
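
To get a feel for what "traversing connections" means, here is a small Python sketch of two of the questions above on a tiny follow graph. Alice, Bob, and Carol come from the example; Dave is an extra, made-up person added so the path has somewhere to go. A graph database does this kind of traversal for you, at far larger scale and with its own query language.

```python
from collections import deque
from typing import Dict, List, Optional, Set

# Nodes are people; each list holds the people that person follows (edges).
follows: Dict[str, List[str]] = {
    "Alice": ["Bob"],
    "Bob": ["Carol"],
    "Carol": ["Dave"],   # Dave is hypothetical, added for the example
    "Dave": [],
}


def friends_of_friends(graph: Dict[str, List[str]], person: str) -> Set[str]:
    """People exactly two hops away who are not already direct connections."""
    direct = set(graph.get(person, []))
    second: Set[str] = set()
    for friend in direct:
        second.update(graph.get(friend, []))
    return second - direct - {person}


def shortest_path(graph: Dict[str, List[str]], start: str, goal: str) -> Optional[List[str]]:
    """Breadth-first search: the first path that reaches the goal is a shortest one."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None   # no route between the two people


print(friends_of_friends(follows, "Alice"))
print(shortest_path(follows, "Alice", "Dave"))
```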

Graph databases are used for:

Search Engines (Elasticsearch)

Elasticsearch is technically a search engine, but it is often grouped with NoSQL databases because it stores and queries data in a non-relational way. Its superpower is full-text search — the ability to search through millions of text documents and return relevant results in milliseconds, ranked by how well they match your query.

When you type a search query on an e-commerce site and it finds products even when you misspell words, or when a news site lets you search through years of articles by keywords and phrases, that is usually Elasticsearch (or a similar search engine) at work. It handles fuzzy matching, relevance ranking, faceted search (filtering by category, price range, etc.), and autocomplete suggestions.
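
Under the hood, search engines of this kind rely on an inverted index: a map from each word to the documents that contain it. The Python sketch below builds a tiny one and uses the standard difflib module as a simple stand-in for fuzzy matching. The product list and the build_index and search functions are invented for illustration, and the scoring (count of matching words) is far cruder than real relevance ranking.

```python
import difflib
from collections import defaultdict
from typing import Dict, List, Set

products = {
    1: "wireless bluetooth headphones",
    2: "wired gaming keyboard",
    3: "bluetooth speaker",
}


def build_index(docs: Dict[int, str]) -> Dict[str, Set[int]]:
    """Map every word to the set of document ids that contain it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index: Dict[str, Set[int]], query: str) -> List[int]:
    """Return document ids, best match first, tolerating small typos."""
    scores: Dict[int, int] = defaultdict(int)
    for word in query.lower().split():
        # Pick the closest indexed word, if any is similar enough.
        matches = difflib.get_close_matches(word, list(index), n=1, cutoff=0.8)
        for match in matches:
            for doc_id in index[match]:
                scores[doc_id] += 1
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


index = build_index(products)
print(search(index, "bluetoth headphones"))   # note the misspelling
```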

Like Redis, Elasticsearch is typically used alongside a primary database, not as a replacement. Your main data lives in PostgreSQL or MongoDB, and a copy is indexed in Elasticsearch specifically for searching.

Important: You do not need to memorize all these database types right now. The goal is simply to know they exist. When you encounter a problem in the real world, like needing to search through millions of text records or model complex relationships, you will remember that there are specialized databases designed for exactly that. Then you can learn the specific one you need.

5. SQL vs NoSQL — How to Choose

One of the most common questions new developers ask is: "Should I use SQL or NoSQL?" The honest answer is: it depends. But "it depends" is not very helpful, so let us build a practical decision framework you can actually use.

The Comparison at a Glance

Factor | SQL (Relational) | NoSQL (Non-Relational)
Data structure | Fixed schema: all rows have the same columns | Flexible: each record can have different fields
Relationships | Excellent: JOINs connect data across tables | Limited: data is usually self-contained in documents
Consistency | Strong: ACID transactions guarantee data integrity | Varies: some sacrifice consistency for speed and scale
Scaling | Vertical (bigger server); harder to distribute | Horizontal (more servers); designed for distribution
Query language | SQL: standardized, powerful, well-known | Database-specific APIs and query languages
Schema changes | Requires migrations; careful planning needed | Flexible: add fields on the fly
Best for | Structured data with clear relationships | Variable data, rapid development, massive scale
Examples | MySQL, PostgreSQL, SQLite, Oracle | MongoDB, Redis, Cassandra, Neo4j, Elasticsearch

Choose SQL When:

Choose NoSQL When:

The Practical Answer for Most Beginners

Tip: If you are building your first real project and are unsure which to pick, start with a relational database like PostgreSQL or MySQL. Here is why: relational databases can handle the vast majority of applications perfectly well. They have been battle-tested for decades. The SQL skills you learn are universal and transferable. And if you later discover that a specific part of your application would benefit from MongoDB or Redis, you can add that alongside your relational database. Starting with NoSQL and later realizing you need relational features is a harder problem to solve.

Many successful applications, and many successful companies, run primarily on relational databases. Instagram, for example, kept PostgreSQL as its main data store while it grew to tens of millions of users. The key is to understand both approaches, know when each one shines, and make an informed decision rather than following hype.

Knowledge Check

1. What does NoSQL stand for, and what does it mean?

Answer: NoSQL stands for "Not Only SQL." It does not reject SQL; it simply acknowledges that relational databases are not the only way to store and organize data. Different data problems call for different solutions.

2. You are building a user profile system where each user can have a different number of skills, work experiences, and social media links. Which database approach would most naturally model this data?

Answer: A document database handles variable, nested data naturally. Each user document can contain arrays of skills and sub-documents for experiences, all in one place, without needing multiple tables or JOINs. While a relational database can also model this (using multiple tables), the document approach is the most natural fit for this kind of hierarchical, variable data.

3. Why is Redis so fast compared to traditional databases?

Answer: Redis achieves its speed through two key design decisions: storing everything in RAM (which is roughly 100,000 times faster than a spinning hard drive) and using hash table lookups that take constant time on average regardless of how much data is stored. This makes it ideal for caching, sessions, and any data that needs to be accessed in microseconds.

Lesson Summary

In this lesson, you expanded your understanding of databases beyond the relational model you learned in Lesson 18. Here is what we covered:

In the next lesson, we will take a step back in time and explore legacy databases — the older systems that many businesses still rely on today. Understanding these systems is not just a history lesson; it is practical career knowledge that can set you apart in the job market, especially here in Michigan where many companies still run on legacy technology.
