Agentic workflows on Bedrock: patterns and pitfalls
Multi-step agents that call tools: what works, what fails, and how to keep costs sane.
An "agent" is a model that can call tools in a loop until it decides it is done. The first one I built looked magical in the demo and then, in production, called the same search tool nine times in a row, burned through my token budget, and confidently returned a wrong answer. Agentic systems are powerful and they fail in ways that ordinary request/response code does not.
On Amazon Bedrock you can build these with the managed Agents for Bedrock feature or roll your own loop with the Converse API and tool use. I have shipped both. Here are the patterns that hold up and the pitfalls that bite.
The core loop is small; the danger is in the loop
Stripped down, an agent does three things: it sends the conversation plus tool definitions to the model; if the model asks to use a tool, it runs the tool and feeds the result back; and it repeats until the model answers. The Bedrock Converse API standardizes this across model families.
import boto3

brt = boto3.client("bedrock-runtime")

MAX_STEPS = 8  # illustrative cap; tune it per task

tools = [{
    "toolSpec": {
        "name": "get_order_status",
        "description": "Look up the status of an order by ID.",
        "inputSchema": {"json": {
            "type": "object",
            "properties": {"order_id": {"type": "string"}},
            "required": ["order_id"],
        }},
    }
}]

messages = [{"role": "user", "content": [{"text": "Where is order A-4471?"}]}]

for _ in range(MAX_STEPS):  # hard cap, never an unbounded while-loop
    resp = brt.converse(
        modelId="anthropic.claude-3-5-sonnet-20241022-v2:0",
        messages=messages,
        toolConfig={"tools": tools},
    )
    out = resp["output"]["message"]
    messages.append(out)  # keep the assistant turn in the history
    if resp["stopReason"] != "tool_use":
        break  # the model answered instead of asking for a tool

    # Collect every tool result from this turn into ONE user message:
    # roles must alternate, so one message per result breaks on multi-tool turns.
    results = []
    for block in out["content"]:
        if "toolUse" in block:
            tu = block["toolUse"]
            result = run_tool(tu["name"], tu["input"])  # your code; returns a dict
            results.append({
                "toolResult": {"toolUseId": tu["toolUseId"],
                               "content": [{"json": result}]}
            })
    messages.append({"role": "user", "content": results})
Notice the for _ in range(MAX_STEPS). That bound is not optional. An unbounded loop is how you get a $400 runaway overnight.
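A step cap limits the number of calls, not how much each call costs. Here is a minimal sketch of a per-session token budget to pair with it. It assumes you feed it the usage dict that Converse returns on each response (with inputTokens and outputTokens counts). The SessionBudget and BudgetExceeded names and the 20,000-token ceiling are illustrative, not part of any SDK.

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when a session spends more tokens than it was allotted."""


@dataclass
class SessionBudget:
    max_total_tokens: int  # assumed per-session ceiling; pick your own
    spent_tokens: int = 0

    def charge(self, usage: dict) -> None:
        """Add one response's token usage and stop the session if over budget."""
        self.spent_tokens += usage.get("inputTokens", 0) + usage.get("outputTokens", 0)
        if self.spent_tokens > self.max_total_tokens:
            raise BudgetExceeded(
                f"spent {self.spent_tokens} tokens, budget is {self.max_total_tokens}"
            )


# Inside the agent loop, right after each brt.converse(...) call:
#     budget.charge(resp["usage"])
budget = SessionBudget(max_total_tokens=20_000)
budget.charge({"inputTokens": 1_200, "outputTokens": 300})
```

Raising an exception rather than returning a flag means a forgotten check cannot silently let the loop continue.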
Patterns that work
- Narrow, well-described tools. The model picks tools from your descriptions, so a vague description causes wrong calls. Give each tool a tight job and a precise schema.
- Orchestrator plus workers. One agent plans and delegates to specialized sub-agents or plain functions. Easier to test and cheaper than one giant do-everything agent.
- Reflection only when it pays. A second "critique your answer" pass improves quality on hard tasks but roughly doubles the cost of that step. Reserve it for steps that need it.
- Human-in-the-loop gates on irreversible actions (refunds, deletes). The agent proposes; a person or a rule approves.
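The human-in-the-loop gate can be sketched as a thin wrapper around tool dispatch. This is one possible shape, not a prescribed one: the tool names, the IRREVERSIBLE set, and the approve callback are all hypothetical, and in a real system approve might open a ticket, ask a person, or apply a rule.

```python
from typing import Callable, Dict

# Hypothetical tool names; list whatever you cannot undo.
IRREVERSIBLE = {"issue_refund", "delete_account"}


def run_tool_gated(
    name: str,
    tool_input: dict,
    tools: Dict[str, Callable[[dict], dict]],
    approve: Callable[[str, dict], bool],
) -> dict:
    """Run a tool, but ask `approve` first when the action is irreversible."""
    if name not in tools:
        return {"error": f"unknown tool: {name}"}
    if name in IRREVERSIBLE and not approve(name, tool_input):
        # Report the refusal as a normal tool result so the model can explain it.
        return {"status": "denied", "reason": "approval required and not granted"}
    return tools[name](tool_input)


# Example: a rule that auto-approves only small refunds.
demo_tools = {
    "get_order_status": lambda args: {"order_id": args["order_id"], "status": "shipped"},
    "issue_refund": lambda args: {"refunded": args["amount"]},
}


def small_refunds_only(name: str, tool_input: dict) -> bool:
    return name == "issue_refund" and tool_input.get("amount", 0) < 50


small = run_tool_gated("issue_refund", {"amount": 20}, demo_tools, small_refunds_only)
large = run_tool_gated("issue_refund", {"amount": 900}, demo_tools, small_refunds_only)
```

Here small carries the refund result and large comes back as a denial. Read-only tools such as get_order_status never reach the approver.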
Pitfalls that bite
- Runaway loops and cost. Cap steps, cap tokens, and set a per-session budget. Log every tool call so a loop is visible.
- Tools with side effects on retry. The model may call a tool twice. Make tools idempotent or guard them with an idempotency key.
- Prompt injection through tool output. If a tool returns web content or user data, treat it as untrusted: it can contain instructions that try to hijack the agent. Never let tool output silently expand the agent's permissions.
- Latency stacking. Each loop step is a full model round trip. A five-step task is five sequential calls; users feel it. Stream tokens and show progress.
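For the side-effects-on-retry pitfall, one common guard is an idempotency key derived from the session, the tool name, and the arguments. The sketch below is a minimal in-memory version; IdempotentRunner and idempotency_key are illustrative names, and a production system would presumably keep the results in a durable store rather than a dict.

```python
import hashlib
import json
from typing import Callable, Dict


def idempotency_key(session_id: str, name: str, tool_input: dict) -> str:
    """Stable key for one logical tool call within a session."""
    payload = json.dumps([session_id, name, tool_input], sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class IdempotentRunner:
    """Runs each (session, tool, input) combination at most once."""

    def __init__(self, tools: Dict[str, Callable[[dict], dict]]):
        self._tools = tools
        self._results: Dict[str, dict] = {}  # in-memory only; assumption for the sketch

    def run(self, session_id: str, name: str, tool_input: dict) -> dict:
        key = idempotency_key(session_id, name, tool_input)
        if key not in self._results:
            # First time we see this call: perform the side effect and remember it.
            self._results[key] = self._tools[name](tool_input)
        # A repeated call gets the stored result back with no second side effect.
        return self._results[key]


refund_calls = []


def issue_refund(args: dict) -> dict:  # hypothetical side-effecting tool
    refund_calls.append(args)
    return {"refunded": args["amount"]}


runner = IdempotentRunner({"issue_refund": issue_refund})
first = runner.run("session-1", "issue_refund", {"order_id": "A-4471", "amount": 20})
again = runner.run("session-1", "issue_refund", {"amount": 20, "order_id": "A-4471"})
```

The second call passes the same arguments in a different key order and still maps to the same key, so the refund runs once. This suits tools with side effects; read-only lookups that need fresh data should bypass the cache.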
Treat an agent like an intern with API keys: capable, fast, and occasionally about to do something irreversible. Bound its budget, scope its permissions, and gate the actions you can't undo.
Managed Agents vs. rolling your own
Agents for Bedrock handles orchestration, memory, and Knowledge Base retrieval for you, which makes it a fast route to a working RAG-plus-tools agent. Rolling your own with Converse gives you full control over the loop, custom stopping logic, and easier local testing. I start managed for prototypes and move to a custom loop once I need precise control over cost, retries, and observability.
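For comparison with the hand-written loop, calling a managed agent is a single request whose answer arrives as a stream of chunks. A minimal sketch follows, assuming an agent and alias already exist. The ask_managed_agent helper is my own name, the IDs are placeholders, and the client would be created with boto3.client("bedrock-agent-runtime").

```python
import uuid
from typing import Optional


def ask_managed_agent(client, agent_id: str, agent_alias_id: str, text: str,
                      session_id: Optional[str] = None) -> str:
    """Send one turn to a managed agent and join the streamed answer chunks."""
    resp = client.invoke_agent(
        agentId=agent_id,
        agentAliasId=agent_alias_id,
        # Reuse a session ID across turns to keep conversation state.
        sessionId=session_id or str(uuid.uuid4()),
        inputText=text,
    )
    parts = []
    for event in resp["completion"]:  # event stream; not every event is a chunk
        chunk = event.get("chunk")
        if chunk:
            parts.append(chunk["bytes"].decode("utf-8"))
    return "".join(parts)


# Usage (placeholder IDs, requires AWS credentials):
#     import boto3
#     client = boto3.client("bedrock-agent-runtime")
#     print(ask_managed_agent(client, "AGENT_ID", "ALIAS_ID", "Where is order A-4471?"))
```

The orchestration loop, tool calls, and retrieval all happen on the service side, which is exactly the control you give up compared with the Converse loop.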
Takeaways
- The agent loop is simple; always bound it with a max-step cap, token limits, and a per-session budget.
- Use narrow tools with precise schemas and an orchestrator/worker split rather than one mega-agent.
- Make tools idempotent, treat tool output as untrusted (prompt injection), and gate irreversible actions.
- Start with managed Agents for Bedrock for speed; move to a custom Converse loop when you need control over cost and observability.