Guide: Cost Control
Per-Run Budget
agent = Agent(name="researcher", cost_budget=0.50) # Max $0.50 per run
If the run exceeds this, BudgetExceededError is raised immediately.
Pre-Execution Prediction
from largestack._core.cost import CostTracker
tracker = CostTracker()
estimate = tracker.predict("gpt-4o-mini", input_tokens=5000)
print(f"Expected: ${estimate.expected:.4f} (range: ${estimate.low:.4f}–${estimate.high:.4f})")
Choose Cheaper Models
| Model | Input/1M | Output/1M | Best For |
|---|---|---|---|
| deepseek-chat | $0.14 | $0.28 | General tasks |
| gpt-4o-mini | $0.05 | $0.40 | Simple classification |
| gpt-4o-mini | $0.15 | $0.60 | Balanced quality/cost |
| claude-sonnet-4-6 | $3.00 | $15.00 | Complex reasoning |
Smart Routing (auto-pick cheapest viable model)
# largestack.yaml
smart_routing: true
agent = Agent(name="auto", llm="auto") # Thompson Sampling picks best model
Semantic Caching
semantic_cache: true # Identical queries return cached response — $0 cost
Monitor with CLI
largestack cost # Total costs by agent
largestack cost --period=today # Today's costs