Database Normalization

Overview

Normalization is the process of splitting data into well-structured tables so each fact is stored in exactly one place. It removes duplication and prevents anomalies where updating one copy of a value leaves others stale. The common beginner goal is to reach third normal form (3NF), where every non-key column depends only on the primary key.

Syntax / Usage

Instead of repeating related details in every row, move them into their own table and link with a foreign key.

-- Unnormalized: customer name repeats on every order
-- orders(id, customer_name, customer_city, total)

-- Normalized: customer details live in one table
CREATE TABLE customers (
  id   INTEGER PRIMARY KEY,
  name TEXT NOT NULL,
  city TEXT
);

CREATE TABLE orders (
  id          INTEGER PRIMARY KEY,
  customer_id INTEGER NOT NULL REFERENCES customers(id),
  total       NUMERIC(10, 2) NOT NULL
);

Examples

Because the customer name is stored once, a single update fixes it everywhere:

UPDATE customers SET city = 'Berlin' WHERE id = 42;

Splitting a repeating group into its own table (first normal form):

-- One row per phone number instead of phone1, phone2, phone3
CREATE TABLE customer_phones (
  customer_id INTEGER REFERENCES customers(id),
  phone       TEXT NOT NULL
);

Common Mistakes

Repeating the same descriptive text across many rows instead of referencing one table
Storing multiple values in a single column, breaking first normal form
Over-normalizing tiny lookup values so every query needs extra joins
Confusing normalization (correctness) with performance tuning, which sometimes calls for controlled denormalization

Overview

Syntax / Usage

Examples

Common Mistakes

See Also