Database Scalability

Scalability Options

Database scalability options range from simple to complex. Start with the simplest approach and evolve.

Vertical Scaling

Upgrade to a larger server with more CPU, RAM, and storage.

# AWS RDS instance upgrade
resource "aws_db_instance" "main" {
  instance_class    = "db.r6g.8xlarge" # 32 vCPU, 256GB RAM
  allocated_storage = 5000             # 5TB SSD
}

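Simple, since the application does not change, but limited: the largest available instance sets a hard ceiling on capacity, and the biggest instances are expensive.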

Read Replicas

Offload read traffic to replicas:

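import random

class DatabaseRouter:
    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = replicas

    def get_conn(self, write=False):
        # Writes must go to the primary; reads can use any replica
        if write:
            return self.primary
        return random.choice(self.replicas)

# Route reads to replicas, writes to primary
# (primary and replicas are existing database connections)
db_router = DatabaseRouter(primary, replicas)
db_router.get_conn(write=True).execute("INSERT INTO ...")
results = db_router.get_conn(write=False).execute("SELECT ...")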

Effective for read-heavy workloads. It does not help with write scaling, because every write still goes to the primary.
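Replicas are usually updated asynchronously, so a read sent to a replica immediately after a write may not see that write yet. One common mitigation is to send reads to the primary for a short window after each write. The sketch below is one way this might look; the class name StickyRouter, the pin_seconds parameter, and the injectable clock are assumptions for illustration, and it tracks the last write per router rather than per user session, which a real application would likely want.

```python
import random
import time


class StickyRouter:
    """Routes reads to replicas, except shortly after a write."""

    def __init__(self, primary, replicas, pin_seconds=2.0, clock=time.monotonic):
        self.primary = primary
        self.replicas = replicas
        self.pin_seconds = pin_seconds  # assumed upper bound on replica lag
        self.clock = clock              # injectable so the behaviour can be tested
        self._last_write = None

    def get_conn(self, write=False):
        now = self.clock()
        if write:
            self._last_write = now
            return self.primary
        # Recent write: replicas may not have it yet, so read from the primary
        if self._last_write is not None and now - self._last_write < self.pin_seconds:
            return self.primary
        return random.choice(self.replicas)
```

The value of pin_seconds should be chosen from measured replication lag; setting it too high sends most reads back to the primary and defeats the purpose of the replicas.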

Caching

Reduce database load with in-memory caching:

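import json

def get_user(user_id):
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        # Cache hit: decode the stored JSON so both paths return the same type
        return json.loads(cached)
    # Cache miss: read from the database, then cache the result for one hour
    user = db.query("SELECT * FROM users WHERE id = %s", user_id)
    cache.setex(key, 3600, json.dumps(user))
    return user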
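The read path above leaves the write path open: with a one-hour TTL, readers can see a stale user for up to an hour unless updates also invalidate the cached key. The following is a minimal sketch of that invalidation step. TTLCache is a hypothetical in-memory stand-in for a Redis-like cache, users_table is a dict standing in for the database, and update_user_name is an invented helper; none of these come from the original code.

```python
import json
import time


class TTLCache:
    """Hypothetical in-memory stand-in for a Redis-like cache."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired entries behave like misses
            return None
        return value

    def setex(self, key, ttl_seconds, value):
        self._data[key] = (value, self._clock() + ttl_seconds)

    def delete(self, key):
        self._data.pop(key, None)


cache = TTLCache()
users_table = {1: {"id": 1, "name": "Ada"}}  # stand-in for the database


def get_user(user_id):
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)
    user = users_table.get(user_id)
    if user is not None:
        cache.setex(key, 3600, json.dumps(user))
    return user


def update_user_name(user_id, name):
    users_table[user_id]["name"] = name  # write to the database first
    cache.delete(f"user:{user_id}")      # then drop the stale cache entry
```

Deleting the key rather than rewriting it keeps the write path simple: the next read repopulates the cache from the database.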

Horizontal Scaling (Sharding)

Distribute data across multiple database servers:

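import hashlib

class ShardManager:
    def __init__(self, shards):
        self.shards = shards

    def get_shard(self, customer_id):
        # Built-in hash() is salted per process for str keys; use a stable digest instead
        digest = hashlib.sha256(str(customer_id).encode("utf-8")).digest()
        return self.shards[int.from_bytes(digest[:8], "big") % len(self.shards)]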

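Sharding is the most complex option: queries and transactions that span shards, and rebalancing data when shards are added, all need to be handled. Where possible, use tools such as Vitess, Citus, or CockroachDB rather than building it yourself.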
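One concrete source of that complexity is resharding. With modulo-based placement, changing the number of shards changes the target shard for most keys, so most rows have to be moved. The sketch below measures this for made-up customer IDs; stable_hash, shard_index, and moved_fraction are illustrative names, and SHA-256 is just one possible choice of stable hash.

```python
import hashlib


def stable_hash(key):
    """Hash that is identical across processes and machines."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def shard_index(key, num_shards):
    return stable_hash(key) % num_shards


def moved_fraction(keys, old_count, new_count):
    """Fraction of keys whose shard changes when the shard count changes."""
    moved = sum(
        1 for k in keys
        if shard_index(k, old_count) != shard_index(k, new_count)
    )
    return moved / len(keys)


customer_ids = range(10_000)  # made-up IDs for illustration
fraction = moved_fraction(customer_ids, old_count=4, new_count=5)
print(f"{fraction:.0%} of keys move when going from 4 to 5 shards")
```

With a uniform hash, a key keeps its shard only when its hash gives the same remainder modulo 4 and modulo 5, which happens for about one key in five, so roughly four in five keys move. Avoiding this kind of mass migration is part of what the tools named above handle.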

Scaling Decision Tree

Is DB overloaded?
├── Read-heavy? → Add read replicas
├── Write-heavy?
│   ├── Can you cache? → Add Redis/Memcached
│   └── Cache insufficient? → Shard
└── Both? → Scale vertically first, then shard
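The tree can also be written as a small function, which makes the branch order explicit. This is only a sketch that mirrors the tree as drawn: the function name, the boolean parameters, and the "no change needed" fallback for a database that is not overloaded are assumptions.

```python
def recommend_strategy(read_heavy, write_heavy, cacheable=False):
    """Return the next scaling step suggested by the decision tree."""
    if read_heavy and write_heavy:
        return "scale vertically first, then shard"
    if read_heavy:
        return "add read replicas"
    if write_heavy:
        # The tree tries caching before sharding on this branch
        if cacheable:
            return "add a cache (Redis/Memcached)"
        return "shard"
    return "no change needed"


print(recommend_strategy(read_heavy=True, write_heavy=False))
```

The "both" case is checked first so that a workload heavy in reads and writes is not misclassified by the single-sided branches.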

Conclusion

Scale vertically first (simple). Add read replicas for read-heavy loads. Add caching for repeated queries. Shard only when necessary. Measure where the bottleneck is before choosing a strategy. Most applications never need sharding.