Guide: Cost Control

Per-Run Budget

agent = Agent(name="researcher", cost_budget=0.50)  # Max $0.50 per run

If the run exceeds this, BudgetExceededError is raised immediately.

Pre-Execution Prediction

from largestack._core.cost import CostTracker
tracker = CostTracker()
estimate = tracker.predict("gpt-4o-mini", input_tokens=5000)
print(f"Expected: ${estimate.expected:.4f} (range: ${estimate.low:.4f}–${estimate.high:.4f})")

Choose Cheaper Models

Model	Input/1M	Output/1M	Best For
deepseek-chat	$0.14	$0.28	General tasks
gpt-4o-mini	$0.05	$0.40	Simple classification
gpt-4o-mini	$0.15	$0.60	Balanced quality/cost
claude-sonnet-4-6	$3.00	$15.00	Complex reasoning

Smart Routing (auto-pick cheapest viable model)

# largestack.yaml
smart_routing: true

agent = Agent(name="auto", llm="auto")  # Thompson Sampling picks best model

Semantic Caching

semantic_cache: true  # Identical queries return cached response — $0 cost

Monitor with CLI

largestack cost              # Total costs by agent
largestack cost --period=today  # Today's costs