Agent Architectures
Tool-calling loops, planning, memory, and multi-agent patterns
Overview
An LLM agent is a loop where the model decides which tools to call, observes results, and continues until a goal is met. Architectures differ in how they plan and coordinate: the ReAct loop interleaves reasoning and actions; plan-and-execute separates a planning phase from execution; and multi-agent systems split work across specialized roles with an orchestrator. The hard parts are reliable tool schemas, loop termination, and state/memory management.
Syntax / Usage
The foundation is a tool-calling loop. You expose functions with JSON schemas, let the model choose one, execute it, feed the result back, and repeat until it returns a final message.
import json
from openai import OpenAI

client = OpenAI()

# One tool, described with a JSON schema the model can read
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def get_weather(city: str) -> str:
    return f"{city}: 18C, clear"  # canned result standing in for a real lookup

def run_agent(question: str, max_steps: int = 6) -> str:
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):  # bounded loop prevents runaway calls
        resp = client.chat.completions.create(
            model="gpt-4o", messages=messages, tools=TOOLS,
        )
        msg = resp.choices[0].message
        if not msg.tool_calls:
            return msg.content or ""  # no tool requested: this is the final answer
        messages.append(msg)  # keep the assistant turn that requested the tools
        for call in msg.tool_calls:
            args = json.loads(call.function.arguments)
            result = get_weather(**args)  # only one tool exists, so no dispatch by name
            messages.append({"role": "tool", "tool_call_id": call.id, "content": result})
    return "Stopped: step limit reached."
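With more than one tool, the loop has to dispatch on the tool name and cope with bad arguments. The following is a minimal sketch of that step, not part of any SDK: REGISTRY and execute_tool are hypothetical names, and the choice to return errors as plain strings (so they can be sent back as the tool result) is an assumption about how you want the agent to recover.

```python
import json
from typing import Callable

def get_weather(city: str) -> str:
    return f"{city}: 18C, clear"

# Hypothetical registry: tool name -> Python callable
REGISTRY: dict[str, Callable[..., str]] = {"get_weather": get_weather}

def execute_tool(name: str, raw_args: str) -> str:
    """Run one tool call and return a string; errors are returned, not raised."""
    fn = REGISTRY.get(name)
    if fn is None:
        return f"error: unknown tool {name!r}"
    try:
        args = json.loads(raw_args)
    except json.JSONDecodeError as exc:
        return f"error: arguments are not valid JSON ({exc.msg})"
    if not isinstance(args, dict):
        return "error: arguments must be a JSON object"
    try:
        return str(fn(**args))
    except Exception as exc:  # wrong argument names, tool failures, etc.
        return f"error: {type(exc).__name__}: {exc}"
```

Inside the loop, execute_tool(call.function.name, call.function.arguments) would take the place of the direct get_weather call, and its return value becomes the content of the "tool" message.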
Examples
Plan-and-execute keeps long tasks coherent. The model generates a checklist first and then works through it step by step, which reduces mid-task drift:
def plan(goal: str) -> list[str]:
    r = client.chat.completions.create(
        model="gpt-4o",
        # JSON mode returns an object, not a bare array, so ask for a "steps" key
        messages=[{"role": "system",
                   "content": 'Return a JSON object of the form {"steps": ["..."]}.'},
                  {"role": "user", "content": goal}],
        response_format={"type": "json_object"},
    )
    return json.loads(r.choices[0].message.content)["steps"]
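The plan function covers only the planning half. The execution half could be sketched as below, assuming each step is handed to some worker together with the results so far. execute_plan, run_step, and fake_run_step are hypothetical names; the stub stands in for a real executor such as run_agent so the example runs without an API call.

```python
from typing import Callable

def execute_plan(
    steps: list[str],
    run_step: Callable[[str, list[str]], str],
    max_steps: int = 10,
) -> list[str]:
    """Run steps in order, giving each one the results of the earlier steps."""
    results: list[str] = []
    for step in steps[:max_steps]:  # cap the plan length as a safety bound
        results.append(run_step(step, list(results)))  # pass a copy of prior results
    return results

# Stand-in executor (assumption): a real one would call the model or tools.
def fake_run_step(step: str, prior: list[str]) -> str:
    return f"done({step}) after {len(prior)} earlier step(s)"

outputs = execute_plan(["find data", "summarize"], fake_run_step)
```

Passing earlier results forward is one design choice among several; a stricter variant might re-plan after each step instead of trusting the original checklist.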
Multi-agent orchestration routes subtasks to specialists via a supervisor that decides who acts next—useful when one prompt can't hold every responsibility cleanly:
ROLES = {"researcher": "Gather facts with tools.",
         "coder": "Write code from findings.",
         "reviewer": "Critique code and request fixes."}

def supervisor_route(state: str) -> str:
    r = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "system",
                   "content": f"Pick next role from {list(ROLES)}. Reply with the role name only."},
                  {"role": "user", "content": state}],
    )
    choice = (r.choices[0].message.content or "").strip().lower()
    if choice not in ROLES:  # free-text replies can contain extra words; fail loudly
        raise ValueError(f"unexpected role: {choice!r}")
    return choice
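A router is only useful inside a bounded loop that hands the state to whichever role it picks. The sketch below shows one way that loop might look, with the model call replaced by an injected route function so it runs offline. KNOWN_ROLES, parse_route, run_team, and the "FINISH" sentinel are all hypothetical, and treating any unrecognized reply as "stop" is an assumption rather than an established convention.

```python
from typing import Callable, Optional

KNOWN_ROLES = {"researcher", "coder", "reviewer"}

def parse_route(reply: Optional[str], default: str = "FINISH") -> str:
    """Normalize a router reply to a known role, or fall back to the default."""
    choice = (reply or "").strip().strip(".\"'").lower()
    return choice if choice in KNOWN_ROLES else default

def run_team(
    task: str,
    route: Callable[[str], Optional[str]],
    workers: dict[str, Callable[[str], str]],
    max_turns: int = 8,
) -> str:
    """Let the router pick a worker each turn until it stops or the cap is hit."""
    state = task
    for _ in range(max_turns):  # hard cap so two roles cannot ping-pong forever
        role = parse_route(route(state))
        if role == "FINISH":
            break
        state = workers[role](state)  # each worker returns the updated state
    return state
```

Here route could wrap a call like supervisor_route, and each worker could be its own tool-calling loop with a role-specific system prompt.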
Common Mistakes
- Unbounded loops with no step limit, budget cap, or termination check
- Vague tool descriptions, so the model calls the wrong function or passes bad arguments
- Stuffing all history into context instead of summarizing/trimming memory
- No error handling when a tool fails; feed the error back as the tool result so the agent can recover
- Over-engineering multi-agent setups where one tool-calling loop suffices
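For the memory item in the list above, trimming can be as simple as keeping the system prompt plus a recent window. This is a rough sketch that assumes messages are stored as plain dicts with a "role" key (the loop shown earlier also appends SDK message objects, which would need converting first). trim_history and keep_last are hypothetical names, and a production setup would more likely summarize the dropped turns than discard them.

```python
def trim_history(messages: list[dict], keep_last: int = 6) -> list[dict]:
    """Keep system messages plus the most recent turns."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    tail = rest[-keep_last:] if keep_last > 0 else []
    # A "tool" message must follow the assistant turn that requested it,
    # so drop leading tool results whose request fell outside the window.
    while tail and tail[0]["role"] == "tool":
        tail = tail[1:]
    return system + tail
```

Calling trim_history(messages) before each model request keeps the context bounded without changing the rest of the loop.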
See Also
ai-agents large-language-models ai-rag-advanced