Agents
An Agent wraps a model + instructions + (optional) tools, guardrails, memory, and cost controls into one callable object. You call await agent.run(task) and get back an AgentResult.
There are two agent surfaces:
| Import | Style | When to use |
|---|---|---|
from largestack import Agent |
Classic / keyword args | Tools as a list, named guardrails, fallback, callbacks |
from largestack.decorators import Agent |
Typed Agent[Deps, Out] |
Typed dependency injection, @agent.tool, @agent.output_validator |
Both run on the same engine. This page covers the classic Agent; the typed one is at the bottom.
Constructing an Agent
from largestack import Agent
agent = Agent(
name="helper",
instructions="You are a helpful assistant.",
llm="openai/gpt-4o-mini",
)
Constructor arguments
| Arg | Type | Default | Meaning |
|---|---|---|---|
name |
str |
required | Agent name (used in traces / audit / shared memory keys). |
instructions |
str |
"You are a helpful assistant." |
System prompt. |
llm |
str \| None |
config default | Provider model id, e.g. "openai/gpt-4o-mini", "anthropic/claude-...", or "auto". |
tools |
list \| None |
[] |
Tools to register (see Tools). |
guardrails |
see below | config default | False to disable, a list of names, or a GuardrailPipeline. See Guardrails. |
memory |
object | ConversationMemory() |
Conversation memory. See Memory. |
cost_budget |
float \| None |
config default | Hard per-run USD ceiling. Run raises BudgetExceededError if exceeded. |
max_turns |
int \| None |
config default | Max LLM↔tool turns before the engine forces a final answer. |
tool_permissions |
dict \| None |
None |
{"allow": [...]} / {"deny": [...]} static allow/deny. See Tools. |
tool_policy |
ToolAccessPolicy \| None |
None |
Runtime rate-limit + parameter validation. See Tools. |
shared_memory |
object | None |
A SharedMemorySpace for cross-agent data. |
retries |
int |
0 |
Extra attempts on failure (0 = one attempt total). |
fallback |
Agent \| None |
None |
Another agent to try if this one exhausts its retries. |
on_complete |
callable | None |
Called with the AgentResult on success (sync or async). |
on_error |
callable | None |
Called with the exception on each failed attempt. |
steering |
list \| None |
None |
Steering rules applied before tools / after the model. |
guardrails=False is an explicit opt-out — use it for tests, benchmarks, and trusted local runs.
The run loop
await agent.run(task) drives this loop (in largestack/_core/engine.py):
- Build messages — system instructions + prior memory + the user
task. - Input guardrails —
check_input(messages)(skipped if guardrails are off). - LLM call — through the gateway (or a
TestModel, if overridden). - Output guardrails —
check_output(response). - Tool loop — if the model requested tools, each is executed and the result is fed back; loop returns to step 2. Bounded by
max_turns,cost_budget, and loop detection. - Return — when the model answers with text (no tool calls), a final
AgentResultis produced and a trace row is written.
The whole run is wrapped in retries + fallback, and every run emits events + an audit/trace row.
import asyncio
from largestack import Agent
async def main():
agent = Agent(name="helper", llm="openai/gpt-4o-mini")
result = await agent.run("Summarize the plot of Hamlet in one line.")
print(result.content)
print(f"${result.total_cost:.6f}, {result.turns} turns, trace {result.trace_id}")
asyncio.run(main())
Not in an event loop? Use the sync wrapper:
from largestack import Agent
agent = Agent(name="helper", llm="openai/gpt-4o-mini")
result = agent.run_sync("Hello!") # raises RuntimeError if a loop is already running
AgentResult
run() returns an AgentResult (a Pydantic model from largestack.types):
| Field | Type | Meaning |
|---|---|---|
content |
str |
The final text answer. |
agent_name |
str |
Which agent produced it. |
total_cost |
float |
Per-run USD cost. |
total_tokens |
int |
Per-run token total (input + output). |
turns |
int |
Number of loop turns taken. |
trace_id |
str |
Trace id for this run. |
duration_ms |
float |
Wall-clock duration. |
tool_calls_made |
list[str] |
Tool names attempted this run. |
tool_calls_failed |
list[str] |
Tools attempted that errored (succeeded = made − failed). |
Passing
response_model=MySchematorun()returns a parsed Pydantic model instead of anAgentResult(provider-native structured output engages where available).
Testing with TestModel
agent.override(model=...) is a context manager that swaps in a TestModel / FunctionModel from largestack.testing. Inside the block the engine bypasses the gateway — no real API key and no network call — so it runs in CI.
import asyncio
from largestack import Agent
from largestack.testing import TestModel
async def main():
agent = Agent(name="helper", llm="openai/gpt-4o-mini", guardrails=False)
test_model = TestModel(custom_output_text="canned reply")
with agent.override(model=test_model):
result = await agent.run("anything")
assert result.content == "canned reply"
assert test_model.calls == 1
asyncio.run(main())
To assert that no real provider call leaks through a test path, wrap the call in block_model_requests() (raises ModelRequestsBlockedError on any real gateway call), or capture the message flow with capture_run_messages(). See Testing.
Typed Agent (Deps, Out)
largestack.decorators.Agent is a typed, PydanticAI-style surface: Agent[DepsT, OutputT] with dependency injection via RunContext, decorator-registered tools, and output validators.
import asyncio
from dataclasses import dataclass
from largestack.decorators import Agent, RunContext, ModelRetry
from largestack.testing import TestModel
@dataclass
class Deps:
user_id: str
agent = Agent[Deps, str](
"openai/gpt-4o-mini",
deps_type=Deps,
instructions="You are a support agent.",
)
@agent.tool
async def whoami(ctx: RunContext[Deps]) -> str:
"""Return the current user id."""
return ctx.deps.user_id
@agent.output_validator
def no_badword(ctx: RunContext[Deps], output: str) -> str:
if "badword" in output:
raise ModelRetry("Avoid bad words") # ask the LLM to retry with a hint
return output
async def main():
with agent.override(model=TestModel(custom_output_text="hello u1", call_tools=[])):
result = await agent.run("hi", deps=Deps(user_id="u1"))
print(result.output) # "hello u1"
print(result.usage) # {"input_tokens": ..., "output_tokens": ..., "cost": ...}
print(result.retry_count) # 0
asyncio.run(main())
RunContext[Deps] carries deps, usage, retry_count, messages, and model. The typed run() returns an AgentRunResult[Out] with .output, .usage, .cost, .trace_id, .retry_count, .tool_calls_made, and .tool_calls_failed.
| Decorator | First arg | Purpose |
|---|---|---|
@agent.tool |
RunContext[Deps] (optional) |
Tool with access to typed deps. |
@agent.tool_plain |
none | Tool with no context. |
@agent.output_validator |
RunContext, output |
Validate / transform output; raise ModelRetry(hint) to retry. |
@agent.instructions_func |
RunContext |
Append dynamic instructions per run. |
Constructor: Agent(model, *, deps_type=NoneType, output_type=str, instructions="", name="agent", max_retries=2, cost_budget=1.0, guardrails=None, retries=0). See Tools for tool details and Guardrails.