Aurora Serverless v2: when it makes sense
Autoscaling Postgres/MySQL that scales to fractional capacity, and when it’s the wrong call.
Aurora Serverless v2 gets pitched as "the database that scales to zero so you stop paying for idle." That sentence is wrong in an important way, and believing it led me to expect a near-zero bill that never materialized. v2 is a genuinely good product; it just solves a different problem than the marketing implied. After running it for a staging fleet and one production service, here's when I actually reach for it.
What v2 changed from v1
Serverless v1 paused entirely when idle and resumed cold. That was great for "scale to zero" and terrible in practice: resuming could take tens of seconds, and a v1 cluster couldn't mix with provisioned instances. v2 took a different path: it scales capacity in fine-grained increments while staying online, measured in Aurora Capacity Units (ACUs). Each ACU is roughly 2 GiB of memory plus the associated CPU. Capacity adjusts in 0.5-ACU steps, quickly and without dropping connections.
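As a rough sizing aid, the ACU arithmetic above can be sketched in a few lines. This is a minimal sketch that assumes the approximate 2 GiB per ACU figure and the 0.5-ACU step from the text; the helper name `acus_for_memory` is made up for illustration, and the result is a starting point for a capacity range, not a guarantee of how much memory the engine actually gives your queries.

```python
import math

GIB_PER_ACU = 2.0  # approximate figure from the text: ~2 GiB of memory per ACU
ACU_STEP = 0.5     # v2 adjusts capacity in 0.5-ACU increments


def acus_for_memory(memory_gib: float) -> float:
    """Smallest capacity on the 0.5-ACU grid that covers memory_gib (hypothetical helper)."""
    if memory_gib <= 0:
        raise ValueError("memory_gib must be positive")
    raw_acus = memory_gib / GIB_PER_ACU
    steps = math.ceil(raw_acus / ACU_STEP)  # round up to the next 0.5-ACU step
    return steps * ACU_STEP


print(acus_for_memory(9))    # 4.5
print(acus_for_memory(0.3))  # 0.5
print(acus_for_memory(32))   # 16.0
```

A 9 GiB working set lands on 4.5 ACU, which is the kind of number you might use as a first guess for a floor or ceiling.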
The catch that surprises everyone: until recently the floor was 0.5 ACU, so it did not scale to zero and you paid for at least 0.5 ACU around the clock. (A scale-to-zero / auto-pause capability exists in newer configurations, but it's opt-in, a paused cluster has to resume before it can serve the next connection, and storage is still billed, so it behaves differently from "free when idle.") Plan around a non-zero floor unless you've explicitly enabled and tested pausing.
v2's superpower isn't "free when idle." It's "follows a variable load smoothly without you guessing instance sizes or doing manual scaling." Buy it for variability, not for zero.
When it makes sense
- Spiky or unpredictable load: traffic that's 0.5 ACU at night and 16 ACU at lunch. v2 rides that curve so you don't over-provision for the peak.
- Many small databases: dev/test/staging fleets, or SaaS per-tenant databases where most sit idle most of the time.
- New workloads with unknown sizing: let it find the right capacity before you commit to provisioned + Reserved.
- Mixed clusters: v2 readers alongside provisioned writers, which v1 couldn't do.
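To make "rides the curve" concrete, here is a small sketch comparing the ACU-hours a spiky day consumes with what sizing for the peak would reserve. The 24-hour profile is invented for illustration (it is not measured data), and the function names are hypothetical.

```python
def curve_acu_hours(hourly_acus):
    """ACU-hours consumed when capacity follows the load hour by hour."""
    return sum(hourly_acus)


def peak_acu_hours(hourly_acus):
    """ACU-hours reserved when capacity is fixed at the peak for every hour."""
    return max(hourly_acus) * len(hourly_acus)


# Hypothetical day: 10 quiet night hours, 11 moderate hours, 3 lunch-rush hours.
profile = [0.5] * 10 + [4.0] * 11 + [16.0] * 3

curve = curve_acu_hours(profile)
peak = peak_acu_hours(profile)
print(curve)                 # 97.0
print(peak)                  # 384.0
print(f"{curve / peak:.0%}")  # 25%
```

In this made-up profile the curve uses about a quarter of the capacity-hours that peak sizing would hold in reserve. The comparison is in ACU-hours only; it ignores the per-ACU price premium discussed in the next section.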
When provisioned wins
If your load is steady, provisioned instances are cheaper, full stop. Per-ACU pricing carries a premium over an equivalently sized provisioned instance, and a flat workload pays that premium 24/7 for elasticity it never uses. A predictable production database running at a stable 8 ACU-equivalent is almost always cheaper on a provisioned db.r6g instance with a Reserved Instance commitment.
| Workload shape | Better choice | Why |
|---|---|---|
| Spiky / unpredictable | Serverless v2 | Pay for the curve, not the peak |
| Steady, predictable | Provisioned + RI | No elasticity premium |
| Mostly idle dev fleet | Serverless v2 | Low floor beats idle provisioned |
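The table boils down to a break-even question: at what average capacity does the per-ACU premium cancel out the savings from not paying for the peak? The sketch below works that out. Both prices are placeholders chosen for round arithmetic, not real AWS prices; substitute the current per-ACU-hour and instance-hour rates for your region, and remember the provisioned instance has to be sized for your peak.

```python
def break_even_avg_acus(provisioned_usd_per_hour, usd_per_acu_hour):
    """Average ACUs at which Serverless v2 compute costs the same as the instance."""
    return provisioned_usd_per_hour / usd_per_acu_hour


def cheaper_option(avg_acus, provisioned_usd_per_hour, usd_per_acu_hour):
    """Compare hourly compute cost only (storage and I/O are excluded)."""
    serverless_usd_per_hour = avg_acus * usd_per_acu_hour
    if serverless_usd_per_hour < provisioned_usd_per_hour:
        return "serverless-v2"
    return "provisioned"


# Placeholder prices, NOT real AWS rates.
USD_PER_ACU_HOUR = 0.125
PROVISIONED_USD_PER_HOUR = 0.5  # hypothetical instance sized for the peak

print(break_even_avg_acus(PROVISIONED_USD_PER_HOUR, USD_PER_ACU_HOUR))  # 4.0
print(cheaper_option(8.0, PROVISIONED_USD_PER_HOUR, USD_PER_ACU_HOUR))  # provisioned
print(cheaper_option(2.0, PROVISIONED_USD_PER_HOUR, USD_PER_ACU_HOUR))  # serverless-v2
```

With these placeholder numbers, a workload averaging under 4 ACU favors Serverless v2 and anything steadier favors provisioned. A Reserved Instance discount lowers the provisioned hourly price and pushes the break-even point down further.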
Provisioning it
You set a min and max ACU range: the floor is your idle cost, the ceiling is your safety cap. Choose the floor by how fast you need to absorb a spike, because a higher floor keeps more cache warm and scales up faster:
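resource "aws_rds_cluster" "app" {
  cluster_identifier = "app-aurora"
  engine             = "aurora-postgresql"
  engine_mode        = "provisioned" # v2 uses provisioned mode + serverlessv2 scaling

  # A new cluster needs master credentials. The username is a placeholder;
  # letting RDS manage the password in Secrets Manager is one option, not the only one.
  master_username             = "app_admin"
  manage_master_user_password = true

  serverlessv2_scaling_configuration {
    min_capacity = 0.5 # idle floor (your baseline cost)
    max_capacity = 16  # safety ceiling for peaks
  }
}

resource "aws_rds_cluster_instance" "app" {
  cluster_identifier = aws_rds_cluster.app.id
  instance_class     = "db.serverless" # marks this instance as Serverless v2
  engine             = aws_rds_cluster.app.engine
}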
Set max_capacity high enough to survive a real spike but low enough to be a meaningful cost guardrail. It's the line between graceful scaling and a surprise bill.
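Before settling on a range, it helps to turn the floor and ceiling into monthly dollar bounds. The sketch below does that for the 0.5 to 16 ACU range used above. The per-ACU-hour price is a placeholder, not a real AWS rate, the 730-hour month is a common billing approximation, and the figures cover compute only (storage and I/O are billed separately).

```python
HOURS_PER_MONTH = 730  # approximation: 8760 hours per year / 12 months


def monthly_cost_bounds(min_acus, max_acus, usd_per_acu_hour):
    """Return (floor, ceiling) monthly compute cost for a min/max ACU range."""
    if not 0 <= min_acus <= max_acus:
        raise ValueError("expected 0 <= min_acus <= max_acus")
    floor = min_acus * usd_per_acu_hour * HOURS_PER_MONTH
    ceiling = max_acus * usd_per_acu_hour * HOURS_PER_MONTH
    return floor, ceiling


# Placeholder price, NOT a real AWS rate.
USD_PER_ACU_HOUR = 0.125

print(monthly_cost_bounds(0.5, 16, USD_PER_ACU_HOUR))  # (45.625, 1460.0)
```

The first number is what you pay if the cluster never leaves the floor; the second is the worst case if it sits at max_capacity all month. If the second number would be an unpleasant surprise, the ceiling is too high to act as a guardrail.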
Watch the scaling behavior
After launch, watch the ServerlessDatabaseCapacity and ACUUtilization CloudWatch metrics for a week. If capacity sits pinned at your floor all day, the workload is steady and you're paying the elasticity premium for nothing: move to provisioned, unless the floor is already so low that no provisioned instance undercuts it (the idle dev-fleet case). If it's constantly slamming the ceiling, you're throttling yourself: raise max_capacity.
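The week-long check can be reduced to a simple rule over capacity samples. The sketch below assumes you have already exported hourly ServerlessDatabaseCapacity datapoints into a list (for example with CloudWatch's get_metric_statistics, which is left out here). The function name, the return labels, and both thresholds are hypothetical choices for illustration, not AWS guidance.

```python
def classify_capacity(samples, min_acus, max_acus,
                      floor_share=0.9, ceiling_share=0.2):
    """Classify a series of capacity samples against the configured ACU range.

    floor_share and ceiling_share are illustrative thresholds: the fraction of
    samples at the floor or ceiling that counts as "pinned".
    """
    if not samples:
        raise ValueError("need at least one capacity sample")
    n = len(samples)
    at_floor = sum(1 for s in samples if s <= min_acus) / n
    at_ceiling = sum(1 for s in samples if s >= max_acus) / n
    if at_ceiling >= ceiling_share:
        return "raise-max"             # frequently capped: likely throttled
    if at_floor >= floor_share:
        return "consider-provisioned"  # flat at the floor: elasticity unused
    return "keep-serverless"           # capacity is actually moving


# Hypothetical weeks of hourly samples (168 points each).
flat = [0.5] * 168
capped = [16.0] * 60 + [8.0] * 108
spiky = [0.5] * 80 + [4.0] * 70 + [12.0] * 18

print(classify_capacity(flat, 0.5, 16))    # consider-provisioned
print(classify_capacity(capped, 0.5, 16))  # raise-max
print(classify_capacity(spiky, 0.5, 16))   # keep-serverless
```

The ceiling threshold is deliberately lower than the floor threshold: spending even a fifth of the week capped is a stronger signal than spending most of it idle.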
Takeaways
- v2 scales smoothly online in 0.5-ACU steps; its value is tracking variable load, not being free when idle.
- Expect a non-zero floor (historically 0.5 ACU) unless you explicitly enable and test the newer pause behavior.
- Choose it for spiky, unpredictable, or many-small-database workloads; choose provisioned + Reserved for steady load.
- Validate with ACUUtilization after launch: pinned at the floor means switch to provisioned; pinned at the ceiling means raise the max.