Schema Design Patterns: Normalization, Denormalization, Naming Conventions

Schema design is the foundation of application performance and maintainability. Good schemas are intuitive, performant, and resilient to change. This article covers normalization trade-offs, denormalization strategies, and practical naming conventions.
Normalization
Normalization eliminates data redundancy through a series of normal forms. Most production schemas aim for Third Normal Form (3NF).
First Normal Form (1NF)
Each column contains atomic values. No repeating groups or arrays.
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- Violates 1NF: multi-valued column
CREATE TABLE orders (
id INTEGER PRIMARY KEY,
product_ids TEXT -- '1,2,3' as comma-separated values
);
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- 1NF compliant: separate rows per product
CREATE TABLE order_items (
order_id INTEGER REFERENCES orders(id),
product_id INTEGER REFERENCES products(id),
quantity INTEGER NOT NULL,
PRIMARY KEY (order_id, product_id)
);
Second Normal Form (2NF)
1NF + every non-key column depends on the whole primary key (relevant for composite keys):
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- Violates 2NF: product_name depends only on product_id, not on order_id
CREATE TABLE order_items (
order_id INTEGER,
product_id INTEGER,
product_name TEXT, -- depends on product_id only
quantity INTEGER,
PRIMARY KEY (order_id, product_id)
);
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- 2NF compliant: product_name moved to products table
CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
Third Normal Form (3NF)
2NF + no transitive dependencies (non-key columns depend only on the primary key):
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- Violates 3NF: category_name depends on category_id, not on order_id
CREATE TABLE products (
id INTEGER PRIMARY KEY,
name TEXT,
category_id INTEGER,
category_name TEXT -- transitively depends on category_id
);
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- 3NF compliant
CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, category_id INTEGER);
Denormalization
Denormalization intentionally adds redundancy for performance. Use it judiciously.
Read-Optimized Denormalization
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- Normalized form
SELECT COUNT(*) FROM orders WHERE user_id = 42;
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- Denormalized: add order_count to users table
ALTER TABLE users ADD COLUMN order_count INTEGER DEFAULT 0;
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- Keep it in sync (via trigger or application logic)
UPDATE users SET order_count = order_count + 1 WHERE id = NEW.user_id;
Pre-Joined Tables
For read-heavy reporting, pre-join frequently accessed data:
CREATE TABLE order_summaries (
order_id INTEGER PRIMARY KEY,
user_email TEXT,
total NUMERIC,
item_count INTEGER,
first_item_name TEXT,
created_at TIMESTAMPTZ
);
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- Refresh periodically
INSERT INTO order_summaries
SELECT o.id, u.email, o.total, COUNT(oi.id),
MIN(p.name), o.created_at
FROM orders o
JOIN users u ON u.id = o.user_id
JOIN order_items oi ON oi.order_id = o.id
JOIN products p ON p.id = oi.product_id
GROUP BY o.id, u.email, o.total, o.created_at
ON CONFLICT (order_id) DO NOTHING;
Naming Conventions
Consistent naming conventions reduce cognitive load and make schemas self-documenting.
Tables
-
Plural nouns :
users,orders,products,order_items -
Join tables : Combine table names:
user_roles,product_categories -
Avoid reserved words : Never name a table
user,order, orgroupwithout quoting
Columns
-
Primary keys :
id(singular, generic) -
Foreign keys :
{table}_id(e.g.,user_id,product_id) -
Timestamps :
created_at,updated_at,deleted_at -
Boolean flags :
is_active,has_billing,email_verified
Indexes
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- Naming convention: idx_{table}_{column(s)}
CREATE INDEX idx_users_email ON users (email);
CREATE INDEX idx_orders_user_date ON orders (user_id, created_at);
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- Unique indexes: uq_{table}_{column(s)}
CREATE UNIQUE INDEX uq_users_email ON users (email);
Timestamp Handling
Timestamps are a common source of bugs in schema design.
Always Use TIMESTAMPTZ
CREATE TABLE events (
id BIGSERIAL PRIMARY KEY,
occurred_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
TIMESTAMPTZ stores values in UTC internally and converts to the session timezone on display. Using TIMESTAMP without timezone invites timezone-related bugs.
Automatic Updated At
CREATE OR REPLACE FUNCTION trigger_set_updated_at()
RETURNS TRIGGER AS $$
BEGIN
NEW.updated_at = NOW();
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER set_updated_at
BEFORE UPDATE ON users
FOR EACH ROW
EXECUTE FUNCTION trigger_set_updated_at();
Soft Deletes
ALTER TABLE users ADD COLUMN deleted_at TIMESTAMPTZ;
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- Queries must always filter:
SELECT * FROM users WHERE deleted_at IS NULL;
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- Create a partial index for active users
CREATE INDEX idx_users_active ON users (email) WHERE deleted_at IS NULL;
Schema Anti-Patterns
Entity-Value-Attribute (EAV)
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- Anti-pattern: EAV makes queries painful
CREATE TABLE product_attributes (
product_id INTEGER,
attribute_name TEXT,
attribute_value TEXT,
PRIMARY KEY (product_id, attribute_name)
);
EAV requires complex pivoting for every query. Use JSONB or add columns instead.
Polymorphic Associations
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- Anti-pattern: polymorphic foreign key
CREATE TABLE comments (
id BIGSERIAL PRIMARY KEY,
commentable_type TEXT, -- 'Post' or 'Video'
commentable_id INTEGER -- No foreign key constraint!
);
Use separate join tables or inheritance patterns instead.
Oversized VARCHAR
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- Bad: arbitrary limit
CREATE TABLE users (name VARCHAR(10));
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- Better: use TEXT with application-level validation
CREATE TABLE users (name TEXT);
PostgreSQL treats VARCHAR(n) and TEXT identically internally, but arbitrary limits cause unnecessary application bugs.
Practical Schema Checklist
-
[ ] Primary key on every table (
BIGSERIALorUUID) -
[ ] Foreign keys with indexes on referencing columns
-
[ ]
created_atandupdated_attimestamps -
[ ]
NOT NULLon columns that should never be null -
[ ] Check constraints for domain validation
-
[ ] Unique constraints for business-unique columns
-
[ ] Consistent naming (plurals,
_idsuffix for FKs) -
[ ]
TIMESTAMPTZeverywhere (neverTIMESTAMP) -
[ ] Indexes match query patterns (verified with EXPLAIN)
-
[ ] Appropriate normalization level (3NF typically, denormalize knowingly)
Schema design is a long-term investment. A well-designed schema reduces bugs, makes queries faster, and makes the codebase easier for new developers to understand. Invest the time upfront; the payoff compounds over years of maintenance.
Enjoy this article? Share your thoughts, questions, or experiences in the comments below — your insights help other readers too.
Join the discussion ↓