Skip to main content

Command Palette

Search for a command to run...

πŸ“ Database Normalization Explained Like You're 5

Published
β€’2 min read
S

Building AI systems and writing about how they actually work. Master of AI @ University of Technology Sydney. Previously B.Tech CS with focus on IoT. I believe the best way to learn is to explain. That's why I'm documenting tech concepts with simple analogies (@sreekarreddy.com). AWS Certified β€’ Azure AI Certified β€’ Neo4j Professional β€’ Google Data Analytics When not coding: exploring Sydney, working on side projects, and teaching tech to anyone who'll listen.

Organizing data to reduce redundancy

Day 125 of 149

πŸ‘‰ Full deep-dive with code examples


The Address Book Analogy

Bad address book:

  • "John Smith, 123 Main St, Sydney NSW 2000"
  • "John Smith, 123 Main St, Sydney NSW 2000" (repeated!)
  • What if John moves? Update every entry!

Good address book:

  • Contact list: John Smith β†’ Contact ID: 1
  • Address list: Contact ID: 1 β†’ 123 Main St, Sydney
  • One place to update!

Normalization organizes data to avoid repetition!


The Problems It Solves

Data repetition:

  • Same information in many places
  • Wastes storage space
  • Easy to have mismatches

Update anomalies:

  • Change one place, forget another
  • Data becomes inconsistent

Deletion anomalies:

  • Delete one thing, accidentally lose other info

How It Works

Split data into related tables:

Before (unnormalized):

| OrderID | Customer | CustomerEmail  | Product |
| 1       | Alice    | alice@mail.com | Laptop  |
| 2       | Alice    | alice@mail.com | Mouse   |

Alice's email repeated!

After (normalized):

Customers: | CustomerID | Name  | Email           |
           | 1          | Alice | alice@mail.com  |

Orders:    | OrderID | CustomerID | Product |
| 1 | 1 | Laptop |
| 2 | 1 | Mouse  |

Email stored once, linked by ID.


Benefits

  • No redundancy β†’ Data stored once
  • Consistency β†’ One source of truth
  • Easier updates β†’ Change in one place
  • Less storage β†’ No duplicate data

The Trade-off

More tables = more joins (slower queries sometimes).

Balance: Normalize for correctness, denormalize for performance when needed.


In One Sentence

Database Normalization organizes data into tables that minimize repetition, ensuring data is stored once and stays consistent.


πŸ”— Enjoying these? Follow for daily ELI5 explanations!

Making complex tech concepts simple, one day at a time.

More from this blog

esreekarreddy

132 posts