Data transfer is the line item nobody budgets for and everybody overpays for. On one account I inherited, "EC2-Other" and "Data Transfer" together were 22% of the bill, and almost none of it was the obvious "to the internet" egress people worry about. It was traffic crossing Availability Zones internally, all day, every day.

Here is the mental model I use now: every byte that crosses an AZ boundary, a Region boundary, or leaves AWS has a price, and the prices differ by an order of magnitude. Knowing which boundary your bytes cross is half the battle.

Know the boundaries and their prices

The rough hierarchy of data transfer cost per GB, cheapest to most expensive:

  • Same AZ, private IP: free. Keep chatty services in the same AZ when you can.
  • Cross-AZ (in-region): about $0.01/GB charged in each direction, so effectively ~$0.02/GB when both ends are in your account. This is the silent killer for replicated databases and service meshes.
  • Inter-region: varies, commonly $0.02/GB and up depending on the pair.
  • Region to internet (egress): tiered, roughly $0.09/GB for the first 10 TB.
  • Through a NAT Gateway: a $0.045/GB processing surcharge on top of whatever the destination transfer costs.
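As a rough sketch of how these rates combine, the Python below hard-codes the per-GB figures quoted in the list and adds the NAT surcharge when traffic goes through a NAT Gateway. The function name and boundary keys are made up for illustration, and real prices vary by Region and volume tier, so treat the output as an estimate only.

```python
# Per-GB rates (USD) as quoted in this article; real prices vary by Region and tier.
RATES_PER_GB = {
    "same_az": 0.00,
    "cross_az_one_way": 0.01,
    "inter_region": 0.02,
    "internet_egress": 0.09,
}
NAT_PROCESSING_PER_GB = 0.045


def monthly_transfer_cost(gb: float, boundary: str, via_nat: bool = False) -> float:
    """Estimate the cost of moving `gb` gigabytes across one boundary."""
    if boundary not in RATES_PER_GB:
        raise ValueError(f"unknown boundary: {boundary!r}")
    cost = gb * RATES_PER_GB[boundary]
    if via_nat:
        # NAT processing is charged on top of the destination transfer rate.
        cost += gb * NAT_PROCESSING_PER_GB
    return cost


if __name__ == "__main__":
    # Compare what 10 TB (10,000 GB) costs across each boundary.
    for name in RATES_PER_GB:
        print(f"{name:>18}: ${monthly_transfer_cost(10_000, name):,.2f}")
```

Passing via_nat=True shows how quickly the surcharge dominates: the NAT fee alone is 4.5 times the one-way cross-AZ rate.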

The NAT Gateway trap

The single most common waste I find is private-subnet traffic to S3, DynamoDB, ECR, or other AWS services routed through a NAT Gateway. You pay the $0.045/GB NAT processing fee for traffic that could have gone over a free Gateway VPC endpoint (S3 and DynamoDB) or a cheaper interface endpoint (ECR and most other services). On a data pipeline pulling terabytes from S3, that fee alone was $3,100/month.

Gateway endpoints for S3 and DynamoDB are free. Create them, associate them with the route tables of your private subnets, and that traffic bypasses the NAT entirely.

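# Gateway endpoint for S3: no hourly or per-GB charge.
# The Region in service_name must match the VPC's Region.
resource "aws_vpc_endpoint" "s3" {
  vpc_id            = aws_vpc.main.id
  service_name      = "com.amazonaws.us-east-1.s3"
  vpc_endpoint_type = "Gateway"
  # Associating a route table adds the S3 route to it automatically.
  route_table_ids   = [aws_route_table.private.id]
}

# Gateway endpoint for DynamoDB, attached to the same private route table.
resource "aws_vpc_endpoint" "dynamodb" {
  vpc_id            = aws_vpc.main.id
  service_name      = "com.amazonaws.us-east-1.dynamodb"
  vpc_endpoint_type = "Gateway"
  route_table_ids   = [aws_route_table.private.id]
}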

Interface endpoints (for ECR, Secrets Manager, etc.) are not free: they charge per hour plus per GB. They are still cheaper than NAT processing for high-volume services, and they keep traffic off the public path.
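To judge when "high-volume" starts, a break-even estimate helps. The sketch below assumes interface endpoint pricing of $0.01 per AZ-hour and $0.01/GB (my assumption for a typical US Region, so check current pricing) against the $0.045/GB NAT fee quoted above. It also assumes the NAT Gateway stays in place for other traffic, so its hourly charge is left out. The function name is hypothetical.

```python
HOURS_PER_MONTH = 730  # average hours in a month


def interface_endpoint_break_even_gb(
    az_count: int,
    nat_per_gb: float = 0.045,            # NAT processing fee quoted in this article
    endpoint_per_gb: float = 0.01,        # assumed interface endpoint data rate
    endpoint_per_az_hour: float = 0.01,   # assumed interface endpoint hourly rate
) -> float:
    """Monthly GB above which one interface endpoint beats NAT processing."""
    fixed_monthly = az_count * endpoint_per_az_hour * HOURS_PER_MONTH
    saving_per_gb = nat_per_gb - endpoint_per_gb
    if saving_per_gb <= 0:
        raise ValueError("endpoint per-GB rate must be below the NAT rate")
    return fixed_monthly / saving_per_gb


if __name__ == "__main__":
    gb = interface_endpoint_break_even_gb(az_count=3)
    print(f"Break-even for a 3-AZ endpoint: {gb:,.0f} GB/month")
```

Under those assumed rates, an endpoint deployed in three AZs pays for itself once the service moves a little over 600 GB a month. Below that, leaving the traffic on the NAT is cheaper.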

Stop paying to cross AZs

Cross-AZ traffic is the cost that hides inside "best practice." A Kafka cluster, an OpenSearch cluster, or a chatty microservice mesh spread across three AZs replicates and load-balances across those boundaries constantly. A few tactics:

  1. Use topology-aware routing so a client prefers a same-AZ replica before crossing zones. Many service meshes support this, as does Kubernetes Topology Aware Routing, which works from the topology.kubernetes.io/zone node label.
  2. Keep producer and consumer of high-volume streams in the same AZ where availability requirements allow.
  3. For read-heavy databases, place read replicas so the bulk of reads stay local.

Cross-AZ transfer is billed in both directions. A sustained 1 GB/s replication stream across zones moves about 31.5 million GB a year, which at an effective $0.02/GB is roughly $630,000/year in transfer alone, before you store a single byte.
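The arithmetic behind an annual cross-AZ estimate is simple enough to script. This sketch assumes a constant stream and the $0.01/GB per-direction rate quoted earlier, billed on both the sending and receiving side. The function name is made up for illustration.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000


def annual_cross_az_cost(
    gb_per_second: float,
    rate_per_gb_each_direction: float = 0.01,  # rate quoted in this article
    billed_directions: int = 2,                # charged on egress and on ingress
) -> float:
    """Yearly transfer cost (USD) of a constant cross-AZ stream."""
    gb_per_year = gb_per_second * SECONDS_PER_YEAR
    return gb_per_year * rate_per_gb_each_direction * billed_directions


if __name__ == "__main__":
    for rate in (0.01, 0.1, 1.0):
        print(f"{rate:>5} GB/s -> ${annual_cross_az_cost(rate):,.0f}/year")
```

Even a modest 100 MB/s (0.1 GB/s) stream comes to about $63,000/year under these assumptions.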

Put CloudFront in front of egress

For internet-facing egress, serving through CloudFront is usually cheaper than serving directly from S3 or an ALB, because origin-to-CloudFront transfer is free and CloudFront's per-GB egress is discounted at scale. You also get caching, which cuts origin requests. For a static-asset-heavy site, moving from direct S3 egress to CloudFront cut my egress bill by about 40% even before cache hits were counted.

Find it before you fix it

Turn on the Cost and Usage Report and filter on usage types containing DataTransfer, Bytes, and NatGateway-Bytes. VPC Flow Logs let you attribute cross-AZ traffic to specific source/destination pairs so you know which service to move.
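As a minimal sketch of that filter, the Python below sums cost per usage type from CUR rows that have already been loaded as dictionaries (for example with csv.DictReader). It assumes the CSV-style column names lineItem/UsageType and lineItem/UnblendedCost; a Parquet or Athena export names columns differently, so adjust accordingly. The function name is hypothetical.

```python
from collections import defaultdict

# "NatGateway-Bytes" also matches "Bytes"; both markers are kept for readability.
TRANSFER_MARKERS = ("DataTransfer", "Bytes")


def transfer_cost_by_usage_type(rows):
    """Sum unblended cost per transfer-related usage type, largest first."""
    totals = defaultdict(float)
    for row in rows:
        usage_type = row.get("lineItem/UsageType", "")
        if any(marker in usage_type for marker in TRANSFER_MARKERS):
            totals[usage_type] += float(row.get("lineItem/UnblendedCost") or 0)
    # Sort descending by cost so the biggest offender comes first.
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))


if __name__ == "__main__":
    # Made-up sample rows, for illustration only.
    sample = [
        {"lineItem/UsageType": "NatGateway-Bytes", "lineItem/UnblendedCost": "45.00"},
        {"lineItem/UsageType": "DataTransfer-Regional-Bytes", "lineItem/UnblendedCost": "10.00"},
        {"lineItem/UsageType": "BoxUsage:m5.large", "lineItem/UnblendedCost": "99.00"},
    ]
    for usage_type, cost in transfer_cost_by_usage_type(sample).items():
        print(f"{usage_type}: ${cost:.2f}")
```

Sorting by cost puts the usage type to investigate first at the top of the output, which is usually where Flow Logs analysis should start.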

Takeaways

  • Add free S3 and DynamoDB Gateway endpoints so that traffic skips the $0.045/GB NAT fee.
  • Cross-AZ transfer is billed both ways; use topology-aware routing to keep chatty traffic same-AZ.
  • Serve internet egress through CloudFront to cut per-GB cost and offload origins via caching.
  • Use the CUR and VPC Flow Logs to attribute transfer cost before changing anything.