Agent Architectures

Overview

An LLM agent is a loop where the model decides which tools to call, observes results, and continues until a goal is met. Architectures differ in how they plan and coordinate: the ReAct loop interleaves reasoning and actions; plan-and-execute separates a planning phase from execution; and multi-agent systems split work across specialized roles with an orchestrator. The hard parts are reliable tool schemas, loop termination, and state/memory management.

Syntax / Usage

The foundation is a tool-calling loop. You expose functions with JSON schemas, let the model choose one, execute it, feed the result back, and repeat until it returns a final message.

import json
from openai import OpenAI

client = OpenAI()

TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def get_weather(city: str) -> str:
    return f"{city}: 18C, clear"

def run_agent(question: str, max_steps: int = 6) -> str:
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):  # bounded loop prevents runaway calls
        resp = client.chat.completions.create(
            model="gpt-4o", messages=messages, tools=TOOLS,
        )
        msg = resp.choices[0].message
        if not msg.tool_calls:
            return msg.content
        messages.append(msg)
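        for call in msg.tool_calls:
            try:
                args = json.loads(call.function.arguments)
                result = get_weather(**args)  # only one tool here, so no dispatch by name
            except (json.JSONDecodeError, TypeError) as exc:
                result = f"error: {exc}"  # feed the failure back so the model can retry
            messages.append({"role": "tool", "tool_call_id": call.id, "content": result})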
    return "Stopped: step limit reached."
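
With more than one tool, a hard-coded `get_weather(**args)` call no longer works. A minimal sketch of one way to handle that follows, assuming a hypothetical `TOOL_IMPLS` registry, a hypothetical second tool `get_time`, and a hypothetical helper `dispatch` (none of these appear in the original). It looks up the tool by name and returns failures as strings so they can go back to the model in the `tool` message:

```python
import json

def get_weather(city: str) -> str:
    return f"{city}: 18C, clear"

def get_time(city: str) -> str:  # hypothetical second tool, for illustration only
    return f"{city}: 12:00"

# Hypothetical registry: tool name -> Python callable.
TOOL_IMPLS = {"get_weather": get_weather, "get_time": get_time}

def dispatch(name: str, raw_args: str) -> str:
    """Run one tool call; return an error string instead of raising."""
    impl = TOOL_IMPLS.get(name)
    if impl is None:
        return f"error: unknown tool {name!r}"
    try:
        return impl(**json.loads(raw_args))
    except (json.JSONDecodeError, TypeError) as exc:
        # Malformed JSON or wrong argument names: report back to the model.
        return f"error: {exc}"
```

Inside the loop, `result = dispatch(call.function.name, call.function.arguments)` would take the place of the direct call.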

Examples

Plan-and-execute keeps long tasks coherent. The model generates a checklist first and then works through the steps in order, which reduces mid-task drift:

def plan(goal: str) -> list[str]:
    r = client.chat.completions.create(
        model="gpt-4o",
        # JSON mode returns an object, so ask for the steps under a "steps" key.
        messages=[{"role": "system",
                   "content": 'Return a JSON object of the form {"steps": ["..."]}.'},
                  {"role": "user", "content": goal}],
        response_format={"type": "json_object"},
    )
    return json.loads(r.choices[0].message.content)["steps"]
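
The execution half is not shown above. A rough sketch, assuming a hypothetical `execute_plan` helper and a caller-supplied `run_step` callable (for instance a tool-calling loop such as `run_agent`), runs the steps in order and passes earlier results forward so later steps can build on them:

```python
from typing import Callable

def execute_plan(steps: list[str], run_step: Callable[[str], str],
                 max_steps: int = 10) -> list[tuple[str, str]]:
    """Run each planned step in order and record (step, result) pairs."""
    results: list[tuple[str, str]] = []
    for step in steps[:max_steps]:  # cap the work even if the plan is long
        # Summarize what has been done so far for the executor.
        context = "\n".join(f"{s} -> {r}" for s, r in results)
        prompt = f"Done so far:\n{context}\n\nNow do: {step}" if context else step
        results.append((step, run_step(prompt)))
    return results
```

Passing `run_step` in as a parameter keeps the sketch independent of any particular model client; `execute_plan(plan(goal), run_agent)` is one way the two halves might be wired together.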

Multi-agent orchestration routes subtasks to specialists through a supervisor that decides who acts next. It is useful when one prompt can't hold every responsibility cleanly:

ROLES = {"researcher": "Gather facts with tools.",
         "coder": "Write code from findings.",
         "reviewer": "Critique code and request fixes."}

def supervisor_route(state: str) -> str:
    r = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "system", "content": f"Pick the next role from {list(ROLES)}. Reply with the role name only."},
                  {"role": "user", "content": state}],
    )
    return r.choices[0].message.content.strip()
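
A router on its own does nothing until something loops over it. The sketch below is one possible driver, assuming a hypothetical `orchestrate` function, a `route` callable such as `supervisor_route`, a `workers` mapping from role name to callable, and a `done` stop word (all assumptions, not part of the original). It rejects replies that are not role names and caps the number of turns:

```python
from typing import Callable

def orchestrate(task: str, route: Callable[[str], str],
                workers: dict[str, Callable[[str], str]],
                max_turns: int = 6) -> str:
    """Alternate between routing and working until 'done' or the turn cap."""
    state = task
    for _ in range(max_turns):
        choice = route(state).strip().lower()
        if choice == "done":  # hypothetical stop word, an assumption of this sketch
            break
        if choice not in workers:
            # The supervisor replied with something that is not a role name.
            state += f"\n[supervisor] invalid role {choice!r}; choose from {sorted(workers)}"
            continue
        state += f"\n[{choice}] {workers[choice](state)}"
    return state
```

Here the shared state is a plain string transcript, which is the simplest thing that works with a `str`-based router; a real system would likely use structured state.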

Common Mistakes

Unbounded loops with no step limit, budget cap, or termination check
Vague tool descriptions, so the model calls the wrong function or passes bad arguments
Stuffing all history into context instead of summarizing/trimming memory
No error handling when a tool fails; feed errors back so the agent can recover
Over-engineering multi-agent setups where one tool-calling loop suffices
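
For the context-stuffing item, a simple trimming pass is often a reasonable first step before summarization. The sketch below assumes a hypothetical `trim_history` helper and that every message is stored as a plain dict (an SDK message object would need converting first). It keeps the first message (the task) and the most recent ones, and drops `tool` results whose requesting assistant message was cut off, since a tool result must follow the assistant message that requested it:

```python
def trim_history(messages: list[dict], keep_last: int = 8) -> list[dict]:
    """Keep the first message (the task) plus the most recent messages."""
    if len(messages) <= keep_last + 1:
        return messages
    tail = messages[-keep_last:]
    # Drop leading tool results whose requesting assistant message was trimmed.
    while tail and tail[0].get("role") == "tool":
        tail = tail[1:]
    return [messages[0]] + tail
```

Calling this once per loop iteration, just before the model request, bounds prompt size at the cost of forgetting older steps.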
