The archive

All posts

2026

Building a production RAG system on AWS, start to finish
Machine Learning · 11 min read

Ingestion, retrieval, generation, and evaluation: a complete reference architecture.

Saurabh Kumar · 19 Jun 2026
Progressive delivery on AWS with feature flags
DevOps · 8 min read

Canaries, gradual rollouts, and instant rollback: decoupling deploy from release.

Saurabh Kumar · 5 Jun 2026
Aurora DSQL: a first look for builders
AWS Services · 7 min read

A serverless, distributed SQL database: what it is and where it fits today.

Saurabh Kumar · 21 May 2026
The cost of over-provisioning, measured
Cost Optimization · 6 min read

A back-of-envelope model for what idle headroom really costs you per year.

Saurabh Kumar · 7 May 2026
Scaling inference with SageMaker async endpoints
Machine Learning · 8 min read

Queue-backed inference for large payloads and bursty traffic, scaling to zero between bursts.

Saurabh Kumar · 22 Apr 2026
Observability on AWS: metrics, logs, and traces together
AWS · 8 min read

Wiring CloudWatch, X-Ray, and OpenTelemetry into one coherent view.

Saurabh Kumar · 8 Apr 2026
Zero-trust networking on AWS with VPC Lattice
Security · 8 min read

Service-to-service auth and access policies without managing a mesh.

Saurabh Kumar · 20 Mar 2026
Killing idle resources automatically with EventBridge
Cost Optimization · 6 min read

Schedule-driven cleanup of dev environments and orphaned resources, on autopilot.

Saurabh Kumar · 6 Mar 2026
Choosing a message bus on AWS in 2026
AWS Services · 8 min read

SQS, SNS, EventBridge, Kinesis, MSK: an updated map of which fits which job.

Saurabh Kumar · 20 Feb 2026
Agentic workflows on Bedrock: patterns and pitfalls
Machine Learning · 9 min read

Multi-step agents that call tools: what works, what fails, and how to keep costs sane.

Saurabh Kumar · 6 Feb 2026
Well-Architected: the parts that matter for startups
AWS · 8 min read

The Framework is huge. Here’s the subset that earns its keep when you’re small.

Saurabh Kumar · 23 Jan 2026
Your 2026 AWS cost optimization playbook
Cost Optimization · 10 min read

An updated, prioritized list of cost levers, highest impact first.

Saurabh Kumar · 9 Jan 2026

2025

Putting an ML model into production: a checklist
Machine Learning · 8 min read

Everything between “it works in the notebook” and a model serving real traffic.

Saurabh Kumar · 19 Dec 2025
Athena and the lakehouse: querying S3 like a database
AWS Services · 8 min read

Serverless SQL over your data lake, plus the partitioning and formats that keep it fast and cheap.

Saurabh Kumar · 5 Dec 2025
AWS re:Invent 2025: my recap and the launches that matter
AWS re:Invent · 6 min read

Field notes from Las Vegas: the keynotes, the standout announcements, and what I am taking back to production.

Saurabh Kumar · 2 Dec 2025
Durable workflows with Step Functions and Lambda
Serverless · 8 min read

Long-running, fault-tolerant processes without managing servers or your own queue.

Saurabh Kumar · 20 Nov 2025
Reducing data transfer costs across AWS
Cost Optimization · 7 min read

Cross-AZ, cross-region, and egress charges: the silent line items, mapped and tamed.

Saurabh Kumar · 6 Nov 2025
Cost-aware model selection with Bedrock
Machine Learning · 7 min read

Routing easy requests to small models and hard ones to large models: quality at a fraction of the cost.

Saurabh Kumar · 21 Oct 2025
Caching layers: ElastiCache patterns that scale
AWS · 7 min read

Cache-aside, write-through, and the failure modes that bite under load.

Saurabh Kumar · 7 Oct 2025
GuardDuty and Security Hub: what to actually act on
Security · 7 min read

Cutting through the findings firehose to the alerts that matter.

Saurabh Kumar · 22 Sep 2025
Auditing your AWS bill: a repeatable monthly routine
Cost Optimization · 7 min read

A 30-minute monthly ritual that catches waste before it compounds.

Saurabh Kumar · 8 Sep 2025
Bedrock Agents and Amazon Q: a builder’s first look
Machine Learning · 8 min read

Tool use, action groups, and where managed agents fit versus rolling your own.

Saurabh Kumar · 20 Aug 2025
Evaluating LLM outputs at scale on AWS
Machine Learning · 9 min read

Building an evaluation harness so you can ship prompt and model changes with confidence.

Saurabh Kumar · 6 Aug 2025
Designing idempotent event-driven systems
AWS · 8 min read

At-least-once delivery means duplicates. Patterns to make handlers safe to retry.

Saurabh Kumar · 22 Jul 2025
S3 Intelligent-Tiering: when it saves and when it doesn’t
Cost Optimization · 6 min read

Automatic tiering sounds free. Here’s the math on when it actually pays off.

Saurabh Kumar · 8 Jul 2025
Building a feature pipeline with Glue and SageMaker
Machine Learning · 8 min read

From raw data in S3 to model-ready features, orchestrated and repeatable.

Saurabh Kumar · 20 Jun 2025
GitHub Actions to AWS with OIDC, no long-lived keys
DevOps · 7 min read

Federated, short-lived credentials for CI. Delete those access keys for good.

Saurabh Kumar · 6 Jun 2025
Kinesis vs MSK for streaming on AWS
AWS Services · 8 min read

Managed Kafka or Kinesis Data Streams? Throughput, ops burden, and cost compared.

Saurabh Kumar · 21 May 2025
Compute Savings Plans: a practical buying strategy
Cost Optimization · 7 min read

How much to commit, for how long, and how to avoid over-committing into a corner.

Saurabh Kumar · 7 May 2025
Real-time inference endpoints that don’t break the bank
Machine Learning · 8 min read

Autoscaling, multi-model endpoints, and serverless inference: paying for what you use.

Saurabh Kumar · 22 Apr 2025
Multi-region architectures: what you really need
AWS · 9 min read

Active-active, active-passive, or just backups? Matching resilience to actual requirements.

Saurabh Kumar · 8 Apr 2025
Rotating secrets with AWS Secrets Manager
Security · 7 min read

Automatic rotation for database credentials and API keys, with zero downtime.

Saurabh Kumar · 21 Mar 2025
FinOps for engineers: making cost a first-class metric
Cost Optimization · 7 min read

Bringing cost into the engineering loop without turning everyone into an accountant.

Saurabh Kumar · 7 Mar 2025
Picking compute: ECS, EKS, or Fargate
AWS Services · 8 min read

A decision framework for container compute on AWS that doesn’t cargo-cult big-company stacks.

Saurabh Kumar · 20 Feb 2025
Bedrock vs self-hosting LLMs: a cost breakdown
Machine Learning · 9 min read

Per-token pricing vs GPU instances, and where the crossover point actually sits.

Saurabh Kumar · 6 Feb 2025
Building an internal developer platform on AWS
DevOps · 9 min read

Golden paths, self-service, and the AWS building blocks that make a platform team possible.

Saurabh Kumar · 23 Jan 2025
The 2025 guide to AWS cost optimization
Cost Optimization · 10 min read

A structured walk through every major lever, from compute commitments to storage tiering.

Saurabh Kumar · 9 Jan 2025

2024

Monitoring ML models in production with SageMaker
Machine Learning · 8 min read

Data drift, model drift, and the metrics that tell you a model has quietly gone stale.

Saurabh Kumar · 19 Dec 2024
Choosing between RDS, Aurora, and DynamoDB
AWS Services · 8 min read

A decision framework for picking a database on AWS, by access pattern, not hype.

Saurabh Kumar · 6 Dec 2024
API Gateway vs ALB for serverless APIs
Serverless · 7 min read

Two front doors for Lambda. Cost, features, and latency compared.

Saurabh Kumar · 21 Nov 2024
Forecasting AWS spend with Cost Explorer and budgets
Cost Optimization · 6 min read

Turn last month’s surprise into next month’s forecast, with alerts before you blow the budget.

Saurabh Kumar · 7 Nov 2024
RAG on AWS: Bedrock Knowledge Bases end to end
Machine Learning · 10 min read

Ingest documents, chunk, embed, and query: a working RAG setup with managed pieces.

Saurabh Kumar · 22 Oct 2024
EventBridge Pipes: connecting services without glue code
AWS Services · 7 min read

Point-to-point integrations with filtering and enrichment, minus the Lambda plumbing.

Saurabh Kumar · 8 Oct 2024
Securing S3 buckets: a checklist that actually helps
Security · 7 min read

Block Public Access, policies, encryption, and the misconfigurations that cause breaches.

Saurabh Kumar · 20 Sep 2024
Graviton: the easiest 20% cost cut you’re not taking
Cost Optimization · 6 min read

ARM-based instances are cheaper and faster for most workloads. Migrating is easier than you think.

Saurabh Kumar · 6 Sep 2024
Fine-tuning foundation models on Amazon Bedrock
Machine Learning · 9 min read

When fine-tuning beats prompting, and how the Bedrock workflow actually looks.

Saurabh Kumar · 21 Aug 2024
Step Functions for orchestrating long-running jobs
AWS Services · 8 min read

Coordinate retries, timeouts, and human approval without writing your own state machine.

Saurabh Kumar · 7 Aug 2024
CloudFront caching strategies for dynamic sites
AWS · 7 min read

Cache keys, TTLs, and invalidation: squeezing hit rates out of not-quite-static content.

Saurabh Kumar · 23 Jul 2024
A tagging strategy that makes cost allocation possible
Cost Optimization · 6 min read

Without consistent tags, your cost reports are fiction. A tagging policy that sticks.

Saurabh Kumar · 8 Jul 2024
Vector search on AWS with OpenSearch
Machine Learning · 8 min read

Storing and querying embeddings for semantic search and RAG, without a new database.

Saurabh Kumar · 20 Jun 2024
Managing Terraform state with S3 and DynamoDB locking
DevOps · 7 min read

Remote state, locking, and the layout that prevents “who applied what” incidents.

Saurabh Kumar · 6 Jun 2024
Aurora Serverless v2: when it makes sense
AWS Services · 7 min read

Autoscaling Postgres/MySQL that scales to fractional capacity, and when it’s the wrong call.

Saurabh Kumar · 22 May 2024
Spot Instances in production: where they pay off
Cost Optimization · 8 min read

Up to 90% off, if your workload tolerates interruption. Patterns that make it safe.

Saurabh Kumar · 8 May 2024
Serving models cheaply with Lambda container images
Machine Learning · 7 min read

Package a model in a container, serve it from Lambda, pay only when it runs.

Saurabh Kumar · 23 Apr 2024
VPC design patterns for small teams
AWS · 8 min read

Subnets, routing, and endpoints: a pragmatic VPC layout you won’t outgrow next quarter.

Saurabh Kumar · 9 Apr 2024
Lambda cold starts: measuring what actually matters
Serverless · 8 min read

Cold starts are real but often misunderstood. How to measure and when to care.

Saurabh Kumar · 22 Mar 2024
Cutting CloudWatch costs: logs, metrics, and retention
Cost Optimization · 6 min read

Log ingestion and custom metrics add up fast. A checklist to trim the bill.

Saurabh Kumar · 7 Mar 2024
Feature stores on AWS: do you actually need one?
Machine Learning · 7 min read

SageMaker Feature Store solves a real problem, but only past a certain scale.

Saurabh Kumar · 21 Feb 2024
Understanding IAM policy evaluation logic
Security · 8 min read

Allow, deny, boundaries, SCPs: the order AWS evaluates them in, and why your policy isn’t working.

Saurabh Kumar · 6 Feb 2024
DynamoDB single-table design for people who hate it
AWS Services · 10 min read

A gentler path into single-table modeling, with access patterns front and center.

Saurabh Kumar · 24 Jan 2024
The hidden cost of NAT Gateways (and how to cut it)
Cost Optimization · 6 min read

NAT Gateways quietly bleed money on data processing. Here’s how to find and fix it.

Saurabh Kumar · 9 Jan 2024

2023