Kinesis vs MSK for streaming on AWS, The Cloud Ledger

When my team needed to move clickstream events into a real-time pipeline, the architecture review turned into a religious war: half the room wanted Kinesis Data Streams because "it's serverless," the other half wanted MSK because "it's just Kafka and we know Kafka." Both camps were right and both were arguing about the wrong thing.

The real question isn't which is technically better; both move millions of records per second reliably. It's which one matches your team's operational appetite and your ecosystem constraints. Here's how I break the decision down without the dogma.

The fundamental trade: managed-for-you vs control

Kinesis Data Streams is an AWS-native, fully managed streaming primitive. MSK is managed Apache Kafka: AWS runs the brokers and the ZooKeeper or KRaft metadata layer, but it's real Kafka with the full protocol, ecosystem, and tuning surface.

Dimension	Kinesis Data Streams	Amazon MSK
Protocol	AWS-proprietary API	Native Kafka protocol
Scaling unit	Shard (or on-demand)	Partitions across brokers
Throughput per unit	Per shard: 1 MB/s or 1,000 records/s in, 2 MB/s out	Per broker: limited by broker instance size
Retention	24h default, up to 365 days	Configurable, effectively unlimited with tiered storage
Ops burden	Near zero	You own broker sizing, partitions, rebalancing
Ecosystem	AWS services (Lambda, Firehose, Flink)	Entire Kafka ecosystem (Connect, Streams, Schema Registry)

Choose Kinesis when you want streaming to be a feature you consume. Choose MSK when streaming is a core competency your team owns, and when you need the Kafka ecosystem you already depend on.
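To make the per-shard limits in the table concrete, here is a minimal sketch of the capacity arithmetic you would do in provisioned mode. The function name and the workload numbers are mine, purely for illustration; the limits are the ones from the table.

```python
import math

# Per-shard limits for provisioned mode, as listed in the table above.
WRITE_MB_PER_S = 1.0
WRITE_RECORDS_PER_S = 1000
READ_MB_PER_S = 2.0  # shared by all consumers unless enhanced fan-out is used


def shards_needed(in_mb_s: float, in_records_s: float, out_mb_s: float) -> int:
    """Return the smallest shard count that satisfies all three limits."""
    return max(
        1,
        math.ceil(in_mb_s / WRITE_MB_PER_S),
        math.ceil(in_records_s / WRITE_RECORDS_PER_S),
        math.ceil(out_mb_s / READ_MB_PER_S),
    )


# Hypothetical workload: 6 MB/s in, 20,000 small records/s, and two
# consumers that each read the full stream (12 MB/s out in total).
print(shards_needed(in_mb_s=6, in_records_s=20_000, out_mb_s=12))  # 20
```

Note that the record-count limit, not the byte limit, drives the answer here. Small, chatty events are exactly the case where shard math surprises people, and it is the math on-demand mode lets you skip.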

Where Kinesis wins

For greenfield AWS-native pipelines, Kinesis is the path of least resistance. On-demand mode removes the single most annoying thing about Kinesis, capacity planning, by autoscaling shards based on observed throughput. You write a producer and forget about brokers entirely:

import json

import boto3

kinesis = boto3.client("kinesis")

# Serialize the event and send it as bytes, the type the Data blob expects.
event = {"user": "u123", "event": "add_to_cart", "ts": 1718900000}

kinesis.put_record(
    StreamName="clickstream",
    Data=json.dumps(event).encode("utf-8"),
    PartitionKey="u123",   # determines shard; pick a high-cardinality key
)
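Why the high-cardinality key matters: Kinesis hashes the partition key with MD5 into a 128-bit hash key, and each shard owns a range of that space. The sketch below approximates the mapping under the assumption that the shards split the hash space evenly (true for a freshly created stream, not necessarily after resharding). The function name is hypothetical and the exact range boundaries may differ slightly from what the service uses.

```python
import hashlib
from collections import Counter

HASH_SPACE = 2 ** 128  # MD5 yields a 128-bit value


def shard_index_for_key(partition_key: str, shard_count: int) -> int:
    """Map a partition key to a shard index, assuming evenly split hash ranges."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")
    return hash_key * shard_count // HASH_SPACE


# A single hot key always lands on the same shard...
hot = Counter(shard_index_for_key("tenant-1", 4) for _ in range(1000))
# ...while many distinct keys spread across the shards.
spread = Counter(shard_index_for_key(f"u{i}", 4) for i in range(1000))
print("hot key shards used:", len(hot))
print("distinct keys per shard:", dict(sorted(spread.items())))
```

If most of your traffic shares one key, one shard takes all of it and hits its 1 MB/s write limit no matter how many shards the stream has.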

The integrations are the real draw: a Lambda function can consume a stream with zero infrastructure, Amazon Data Firehose (formerly Kinesis Data Firehose) can land records in S3, Redshift, or OpenSearch with no code, and Managed Service for Apache Flink handles stateful processing. If your consumers are AWS services, Kinesis is almost always less work.
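For a sense of how little code the Lambda side needs, here is a minimal handler sketch. Lambda delivers Kinesis records in batches with the payload base64-encoded under `record["kinesis"]["data"]`; the counting logic and the sample events are assumptions for illustration, matching the clickstream shape used in the producer above.

```python
import base64
import json


def handler(event, context):
    """Decode each Kinesis record in the batch and count events by type."""
    counts = {}
    for record in event["Records"]:
        payload = base64.b64decode(record["kinesis"]["data"])
        click = json.loads(payload)
        counts[click["event"]] = counts.get(click["event"], 0) + 1
    # The return value is only useful for local testing; a real handler
    # would write the results somewhere.
    return counts


def fake_record(body: dict) -> dict:
    """Build a record shaped like the ones Lambda passes in (local testing only)."""
    data = base64.b64encode(json.dumps(body).encode("utf-8")).decode("ascii")
    return {"kinesis": {"data": data, "partitionKey": body.get("user", "")}}


sample = {"Records": [
    fake_record({"user": "u123", "event": "add_to_cart"}),
    fake_record({"user": "u456", "event": "page_view"}),
    fake_record({"user": "u123", "event": "add_to_cart"}),
]}
print(handler(sample, None))  # {'add_to_cart': 2, 'page_view': 1}
```

Polling, batching, checkpointing, and retries are handled by the event source mapping, which is the part you would otherwise build yourself.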

Where MSK wins

MSK earns its keep when you're already invested in Kafka. If you have Kafka Connect connectors, Kafka Streams apps, a Confluent Schema Registry, or consumers written against the Kafka client, MSK lets you lift-and-shift without rewriting any of it. A few specific pulls:

Protocol compatibility: existing Kafka producers and consumers work unchanged, including non-AWS systems.
Consumer groups: Kafka's mature consumer-group model is richer than Kinesis's enhanced fan-out for complex multi-consumer topologies.
Ordering and partitions: fine-grained partition control when you need precise ordering guarantees.
MSK Serverless: if the ops burden is the only thing holding you back, MSK Serverless removes broker sizing while keeping the Kafka API.
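The core idea behind consumer groups is that each partition of a topic is owned by exactly one member of the group, and ownership is redistributed when members join or leave. The sketch below is a simplified, pure-Python illustration of range-style assignment for a single topic. It is not Kafka client code, and the consumer names are made up.

```python
from typing import Dict, List


def range_assign(partitions: int, consumers: List[str]) -> Dict[str, List[int]]:
    """Split partitions 0..partitions-1 into contiguous ranges, one per consumer."""
    members = sorted(consumers)
    per_member, extra = divmod(partitions, len(members))
    assignment = {}
    for i, member in enumerate(members):
        # The first `extra` members each take one additional partition.
        start = per_member * i + min(i, extra)
        length = per_member + (1 if i < extra else 0)
        assignment[member] = list(range(start, start + length))
    return assignment


# Two consumers share six partitions...
print(range_assign(6, ["c1", "c2"]))        # {'c1': [0, 1, 2], 'c2': [3, 4, 5]}
# ...and a third one joining triggers a rebalance.
print(range_assign(6, ["c1", "c2", "c3"]))  # {'c1': [0, 1], 'c2': [2, 3], 'c3': [4, 5]}
```

Because each partition has a single owner within a group, ordering holds per partition, and a second group can read the same topic independently with its own offsets.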

The cost shape is different

This catches people out. Kinesis in provisioned mode bills per shard-hour plus per million PUT payload units (a record is counted in 25 KB chunks), while on-demand mode bills per stream-hour plus per GB written and read. Either way, cost tracks throughput in clean increments and can be very cheap at low volume. MSK (provisioned) bills per broker-hour plus storage, so you pay for the cluster whether or not data flows through it. A 3-broker kafka.m5.large cluster runs, and bills, continuously regardless of traffic.

Rough rule: at low and bursty volumes, Kinesis on-demand is usually cheaper because it scales toward zero. At high, sustained throughput, a well-packed MSK cluster often wins on cost-per-GB, but only if you're keeping the brokers busy.
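A back-of-the-envelope model makes the different cost shapes visible. This is a sketch under loud assumptions: every price below is a PLACEHOLDER I made up, not a real AWS rate, and I treat a PUT payload unit as 25 × 1024 bytes. Plug in current numbers from the pricing pages for your region before drawing any conclusion.

```python
import math

HOURS_PER_MONTH = 730
PUT_UNIT_BYTES = 25 * 1024  # assumption: one PUT payload unit covers up to 25 KB


def kinesis_provisioned_monthly(shards, records, record_bytes,
                                shard_hour_price, price_per_million_units):
    """Shard-hours plus PUT payload units for one month."""
    units = math.ceil(record_bytes / PUT_UNIT_BYTES) * records
    return (shards * HOURS_PER_MONTH * shard_hour_price
            + units / 1_000_000 * price_per_million_units)


def msk_provisioned_monthly(brokers, storage_gb,
                            broker_hour_price, storage_gb_month_price):
    """Broker-hours plus storage for one month; traffic does not appear at all."""
    return (brokers * HOURS_PER_MONTH * broker_hour_price
            + storage_gb * storage_gb_month_price)


# PLACEHOLDER prices, for illustration only.
kinesis = kinesis_provisioned_monthly(
    shards=2, records=100_000_000, record_bytes=1024,
    shard_hour_price=0.02, price_per_million_units=0.02)
msk = msk_provisioned_monthly(
    brokers=3, storage_gb=500,
    broker_hour_price=0.20, storage_gb_month_price=0.10)
print(f"Kinesis: {kinesis:.2f}  MSK: {msk:.2f}")
```

The useful part is the shape rather than the totals: record volume is an input to the Kinesis function and is absent from the MSK one, which is why idle brokers are pure cost.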

How I actually decide

Already running Kafka / need the Kafka ecosystem? MSK (Serverless if you want to drop the ops).
Greenfield, AWS-native consumers, small team? Kinesis Data Streams on-demand.
Need to land data in S3/Redshift with no code? Kinesis + Firehose, no contest.
Sustained multi-GB/s with a dedicated platform team? Provisioned MSK for cost control.

Takeaways

The decision is about operational appetite and ecosystem fit, not raw capability; both scale to millions of records per second.
Kinesis wins for AWS-native, low-ops, integration-heavy pipelines; on-demand mode removes shard planning.
MSK wins when you're already on Kafka or need its protocol, connectors, and consumer-group model; MSK Serverless splits the difference.
Cost shapes differ: Kinesis scales with throughput toward zero; MSK provisioned charges for the cluster continuously, so keep brokers busy.