Scaling Databases¶

When to Scale `[B]`¶

Signs your database needs scaling attention:

CPU consistently > 70%
Connection pool exhausted (connection wait times increasing)
Query latency p99 degrading over time
Replication lag growing
Disk I/O saturation
Lock contention increasing

Before scaling hardware, exhaust software options: 1. Optimize slow queries (Performance Tuning) 2. Add appropriate indexes 3. Tune connection pooling (Fundamentals: Connection Pooling) 4. Cache frequently-read data 5. Then scale

Read Replicas `[I]`¶

Offload read traffic from the primary.

                    ┌─ Read Replica 1 ─── Analytics queries
Primary ─── WAL ──►─┤
(writes)            └─ Read Replica 2 ─── Application reads

When to use: - Read-heavy workloads (reporting, analytics) - Application can tolerate some replication lag - Separate analytics from OLTP queries

When NOT to use: - Reads that must see writes immediately (use primary for these) - If replication lag is regularly > your SLO tolerance

Routing Read/Write Traffic¶

# SQLAlchemy with read/write splitting
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

write_engine = create_engine("postgresql://primary.db/mydb")
read_engine = create_engine("postgresql://replica.db/mydb")

WriteSession = sessionmaker(bind=write_engine)
ReadSession = sessionmaker(bind=read_engine)

# Or use a proxy: ProxySQL (MySQL), PgBouncer pools, Pgpool-II

Aurora Read Replicas¶

Aurora supports up to 15 read replicas with shared storage (< 10ms lag typical):

resource "aws_rds_cluster_instance" "read_replica" {
  count              = 2
  cluster_identifier = aws_rds_cluster.main.id
  instance_class     = "db.r6g.xlarge"
  engine             = "aurora-postgresql"
}

Connection Pooling at Scale `[I]`¶

Connection limits:

PostgreSQL: max_connections = 200 (default)
Each connection: ~5-10MB RAM
100 connections = 500MB-1GB RAM

With many app instances, connection pooling is essential:

50 app pods × 10 connections each = 500 connections → exceeds max_connections

Solution:
50 app pods → PgBouncer (10 connections to PgBouncer each = 500 app connections)
PgBouncer → 20 connections → PostgreSQL

→ See Fundamentals: Connection Pooling

Vertical Scaling (Scale Up) `[I]`¶

Increasing instance size:

db.t3.medium (2 vCPU, 4GB) → db.r6g.2xlarge (8 vCPU, 64GB)

With RDS:

aws rds modify-db-instance \
  --db-instance-identifier prod-db \
  --db-instance-class db.r6g.2xlarge \
  --apply-immediately  # or during next maintenance window

Downtime: Standard RDS requires a brief failover (~60 seconds for Multi-AZ). Aurora is usually online.

Right-sizing: - CPU: target 60-70% average, headroom for spikes - RAM: buffer pool should fit working dataset - IOPS: check CloudWatch ReadIOPS and WriteIOPS

Caching Layer `[I]`¶

Redis/Memcached in front of the database for frequently-read, rarely-changing data.

Cache-Aside Pattern¶

def get_user(user_id: int) -> dict:
    cache_key = f"user:{user_id}"

    # Try cache first
    cached = redis.get(cache_key)
    if cached:
        return json.loads(cached)

    # Miss: read from DB
    user = db.query("SELECT * FROM users WHERE id = %s", user_id)

    # Store in cache with TTL
    redis.setex(cache_key, 3600, json.dumps(user))  # 1 hour TTL

    return user

def update_user(user_id: int, data: dict):
    db.execute("UPDATE users SET ... WHERE id = %s", user_id)
    redis.delete(f"user:{user_id}")  # Invalidate cache

Write-Through Cache¶

def update_user(user_id: int, data: dict):
    db.execute("UPDATE users SET ... WHERE id = %s", user_id)
    redis.setex(f"user:{user_id}", 3600, json.dumps(data))  # Update cache

What to Cache¶

Good candidates	Bad candidates
User profiles	Financial balances
Product catalog	Inventory counts
Configuration	Orders in progress
Reference data	Anything requiring strong consistency

Sharding `[A]`¶

Horizontal partitioning: split data across multiple databases.

Shard 1: user_id 1-1,000,000       → db-shard-01
Shard 2: user_id 1,000,001-2,000,000 → db-shard-02
Shard 3: user_id 2,000,001+        → db-shard-03

Sharding strategies: - Range-based — by ID or date range (simple, but hot shards possible) - Hash-based — shard = hash(user_id) % num_shards (even distribution, harder to rebalance) - Directory-based — lookup table maps key → shard (flexible, extra hop)

Sharding complexity: - Cross-shard queries are expensive or impossible - No cross-shard transactions - Rebalancing when adding shards is hard - Requires application-level shard routing

Use sharding only when other options are exhausted. Most companies never need it.

PostgreSQL Partitioning (Alternative to Sharding)¶

Partition a single large table for performance (all data in one DB):

-- Range partitioning by date
CREATE TABLE orders (
    id BIGSERIAL,
    created_at TIMESTAMP NOT NULL,
    customer_id INTEGER,
    total DECIMAL(10,2)
) PARTITION BY RANGE (created_at);

CREATE TABLE orders_2024_01 PARTITION OF orders
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');

CREATE TABLE orders_2024_02 PARTITION OF orders
    FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');

-- Queries with partition pruning only scan relevant partitions
SELECT * FROM orders
WHERE created_at BETWEEN '2024-01-15' AND '2024-01-31';
-- → only scans orders_2024_01

Distributed Databases `[A]`¶

When PostgreSQL/MySQL can't scale further:

Database	Type	Strength
CockroachDB	Distributed SQL	PostgreSQL-compatible, true ACID
YugabyteDB	Distributed SQL	PostgreSQL-compatible, multi-region
Cassandra	Wide-column NoSQL	Extreme write scale, eventual consistency
DynamoDB	Key-value/document	Serverless scale, AWS-native
Vitess	MySQL proxy	YouTube/Slack approach to sharding MySQL

Migration to distributed DB is a major multi-year effort. Exhaust vertical and read-scaling options first.

Scaling Decision Framework `[I]`¶

Problem: DB is slow/overloaded
    │
    ├── Is it a specific query? → Performance tuning, indexes
    │
    ├── Is it read-heavy? → Read replicas + caching
    │
    ├── Is it connection pressure? → Better connection pooling
    │
    ├── Is it write-heavy? → Vertical scale, batch writes, async processing
    │
    ├── Is the dataset just huge? → Partitioning, archiving old data
    │
    └── All of the above exhausted? → Sharding or distributed DB

Real-World Database Scaling `[A]`¶

From awesome-scalability and howtheysre:

Company	Scale	Approach
Instagram	1B+ users	PostgreSQL sharded by user_id, no NoSQL, Django ORM
GitHub	Billions of git objects	MySQL with Vitess-style sharding, extensive caching
Shopify	Black Friday peaks	MySQL + PlanetScale (Vitess), aggressive connection pooling
Notion	Explosive growth	PostgreSQL, vertical scaling + read replicas first
Discord	Message history	Cassandra for messages (append-only workload, no updates)
Stripe	Financial transactions	PostgreSQL, strong ACID, sharding by merchant

What Instagram teaches us: - Stayed on PostgreSQL to 1B users - Sharding was the hard part, not the choice of database - Having to add NoSQL would have added complexity without clarity

What Discord teaches us: - Choose data model based on access patterns, not hype - Cassandra is excellent for append-only, time-series, high-write workloads - Don't use Cassandra for relational data with complex queries

Performance Tuning — optimize before scaling
Fundamentals: Connection Pooling
Backup & Recovery — replicas and HA
SRE: Scalability — system-level scaling patterns
Platform: Cloud Infrastructure — RDS, Aurora, ElastiCache
awesome-scalability — real-world scaling stories
data-engineer-handbook — data engineering at scale

Scaling Databases¶

When to Scale [B]¶

Read Replicas [I]¶

Routing Read/Write Traffic¶

Aurora Read Replicas¶

Connection Pooling at Scale [I]¶

Vertical Scaling (Scale Up) [I]¶

Caching Layer [I]¶