Partitioning vs Sharding

Partitioning vs Sharding

Partitioning splits a table within a single database. Sharding splits data across multiple database servers.

Table Partitioning

Divide a large table into smaller physical pieces within one database:

\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- Range partitioning

CREATE TABLE orders (

id BIGSERIAL, order_date DATE, total DECIMAL(10,2)

) PARTITION BY RANGE (order_date);

CREATE TABLE orders_2026_01 PARTITION OF orders

FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');

\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- List partitioning

CREATE TABLE events (

id BIGSERIAL, event_type TEXT

) PARTITION BY LIST (event_type);

CREATE TABLE events_pageview PARTITION OF events

FOR VALUES IN ('pageview');

\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\-- Hash partitioning

CREATE TABLE sessions (

session_id UUID, user_id INT

) PARTITION BY HASH (session_id);

CREATE TABLE sessions_0 PARTITION OF sessions

FOR VALUES WITH (MODULUS 4, REMAINDER 0);

Partitioning benefits: partition pruning (skip irrelevant partitions), faster maintenance, efficient bulk deletes.

Database Sharding

Distribute data across multiple database servers:

class ShardRouter:

def init(self, shards):

self.shards = shards

def get_shard(self, key):

shard_id = hash(key) % len(self.shards)

return self.shards[shard_id]

Sharding benefits: horizontal scalability for writes, distributes load across servers.

Key Differences

| Aspect | Partitioning | Sharding | |--------|-------------|----------| | Scope | Within one DB | Across servers | | Complexity | Low | High | | Cross-partition queries | Possible | Difficult | | Cross-shard joins | Easy | Very hard | | Scaling | Limited | Near-unlimited |

When to Use

  • Partitioning: Tables > 100GB, time-series data, easy data lifecycle management

  • Sharding: Write throughput exceeds single server, dataset too large for one server

Conclusion

Start with partitioning before considering sharding. Partitioning solves many problems with less complexity. Only shard when a single database is insufficient, and use tools like Vitess or Citus to manage the complexity.