Largestack Architecture

This document explains how Largestack works internally in beginner-friendly but technically accurate terms.

1. Simple mental model

Largestack is a runtime for AI applications.

A normal application has:

User -> API -> Business Logic -> Database -> Response

A Largestack application has:

User -> Agent -> Tools/RAG/Memory/Workflow/Guardrails -> Observability -> Response

The agent is not just a chatbot. It is a controlled runtime component that can:

reason over a task,
call approved tools,
use memory,
retrieve documents,
follow workflow steps,
obey guardrails,
produce auditable outputs.

2. Main runtime blocks

flowchart LR
    User[User or App] --> SDK[SDK / CLI / API]
    SDK --> Agent[Agent Runtime]
    Agent --> Model[LLM Provider]
    Agent --> Tools[Tool Registry]
    Agent --> RAG[RAG Layer]
    Agent --> Memory[Memory Layer]
    Agent --> Workflow[Workflow Engine]
    Agent --> Guard[Guardrails]
    Agent --> Obs[Observability]
    Workflow --> State[State / Checkpoint]
    Guard --> Policy[Policies / Approvals]
    Obs --> Dash[Dashboard / Logs / Metrics]

3. Agent runtime

The agent runtime is the core execution layer. It handles:

model selection,
instructions,
user input,
tool schemas,
retries,
structured output,
execution result handling.

Typical usage:

from largestack import Agent

# Define an agent with a name, an LLM identifier, and task instructions.
agent = Agent(
    name="support-agent",
    llm="deepseek/deepseek-chat",
    instructions="Classify support tickets and suggest the next action."
)

4. Tool layer

Tools let the agent interact with code and systems.

Examples:

calculator,
file reader,
HTTP client,
database query,
browser/search adapter,
custom Python function,
approval-protected write action.

A production tool should define:

Control	Why it matters
Input schema	Prevents malformed calls
Timeout	Prevents stuck executions
Retry policy	Handles transient errors
Permission level	Blocks unsafe actions
Idempotency	Avoids duplicate side effects
Audit log	Tracks who/what called it
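
The controls in the table can be sketched in plain Python. This is a minimal illustration, not Largestack's tool API: `GuardedTool`, its fields, and the choice to treat `ConnectionError` as the only transient error are all assumptions made for the example. Timeouts are left out to keep the sketch short.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class GuardedTool:
    """Hypothetical wrapper for illustration; not Largestack's actual API."""
    name: str
    func: Callable[..., Any]
    input_schema: dict[str, type]          # argument name -> required type
    permission: str = "read"               # level a caller must hold
    max_retries: int = 2                   # extra attempts on transient errors
    audit_log: list[dict] = field(default_factory=list)
    results_by_key: dict[str, Any] = field(default_factory=dict)

    def _audit(self, caller: str, status: str) -> None:
        # Audit log: record who called which tool and how it ended.
        self.audit_log.append({"caller": caller, "tool": self.name, "status": status})

    def call(self, caller: str, granted: set[str], idempotency_key: str, **kwargs: Any) -> Any:
        # Permission level: block callers that lack the required grant.
        if self.permission not in granted:
            self._audit(caller, "denied")
            raise PermissionError(f"{caller} may not call {self.name}")

        # Input schema: reject unknown, missing, or wrongly typed arguments.
        if set(kwargs) != set(self.input_schema):
            raise TypeError(f"expected arguments {sorted(self.input_schema)}")
        for arg, expected in self.input_schema.items():
            if not isinstance(kwargs[arg], expected):
                raise TypeError(f"{arg} must be {expected.__name__}")

        # Idempotency: a repeated key returns the stored result
        # instead of running the side effect a second time.
        if idempotency_key in self.results_by_key:
            return self.results_by_key[idempotency_key]

        # Retry policy: retry only errors treated as transient in this sketch.
        for attempt in range(self.max_retries + 1):
            try:
                result = self.func(**kwargs)
                break
            except ConnectionError:
                if attempt == self.max_retries:
                    self._audit(caller, "failed")
                    raise

        self.results_by_key[idempotency_key] = result
        self._audit(caller, "ok")
        return result
```

Keeping the checks in one wrapper means every tool call passes through the same permission, schema, and audit path, whichever function it wraps.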

5. Workflow orchestration

Workflows control how tasks move from step to step.

Supported concepts include:

sequential steps,
parallel execution,
router decisions,
supervisor patterns,
graph workflows,
checkpoint and interrupt patterns,
subgraphs.

Example workflow mental model:

flowchart TD
    A[Receive ticket] --> B[Classify issue]
    B --> C{Urgent?}
    C -->|Yes| D[Escalate to senior agent]
    C -->|No| E[Generate normal response]
    D --> F[Audit + output]
    E --> F
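
The ticket flow above can be written as ordinary functions to show what a router decision does. This is a sketch only: the step functions, the keyword-based urgency check, and the `path` field are assumptions for illustration and do not represent Largestack's workflow engine.

```python
import re


def classify_issue(state: dict) -> dict:
    # Keyword check standing in for a model-based classifier (assumption).
    words = set(re.findall(r"[a-z]+", state["ticket"].lower()))
    state["urgent"] = bool(words & {"outage", "down", "urgent"})
    state["path"].append("classify_issue")
    return state


def escalate(state: dict) -> dict:
    state["action"] = "escalate_to_senior_agent"
    state["path"].append("escalate")
    return state


def normal_response(state: dict) -> dict:
    state["action"] = "generate_normal_response"
    state["path"].append("normal_response")
    return state


def audit_and_output(state: dict) -> dict:
    state["path"].append("audit_and_output")
    return state


def run_ticket_workflow(ticket: str) -> dict:
    # Shared state travels from step to step, recording the path taken.
    state = {"ticket": ticket, "path": ["receive_ticket"]}
    state = classify_issue(state)
    # Router decision: pick the next step from the classification result.
    branch = escalate if state["urgent"] else normal_response
    state = branch(state)
    # Both branches converge on the same final step.
    return audit_and_output(state)
```

Recording the path in the state is what makes a run auditable afterwards: the output shows not just the action taken but the route that led to it.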

6. RAG layer

RAG means retrieval-augmented generation. Instead of asking the model to guess, Largestack retrieves relevant documents and grounds the answer in them.

RAG pipeline:

flowchart LR
    Docs[Documents] --> Load[Loaders]
    Load --> Chunk[Chunking]
    Chunk --> Embed[Embeddings]
    Embed --> Store[Vector Store]
    Query[User Query] --> Retrieve[Retriever]
    Store --> Retrieve
    Retrieve --> Rerank[Reranker]
    Rerank --> Answer[Answer with citations]

Important behaviors:

cite sources when available,
refuse/no-answer when evidence is insufficient,
filter by tenant/project when needed,
support table/document retrieval patterns.
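
Three of these behaviors (citations, no-answer, tenant filtering) can be shown with a toy retriever. This is a sketch under stated assumptions: word overlap stands in for embeddings and a vector store, and `Chunk`, `retrieve`, `answer`, and the `min_score` threshold are hypothetical names, not part of Largestack.

```python
import re
from dataclasses import dataclass


@dataclass
class Chunk:
    source: str   # document the chunk came from, used as the citation
    tenant: str   # owner of the document
    text: str


def tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, chunks: list[Chunk], tenant: str,
             top_k: int = 2, min_score: float = 0.2) -> list[Chunk]:
    query_tokens = tokens(query)
    scored = []
    for chunk in chunks:
        # Tenant filter: never score documents owned by another tenant.
        if chunk.tenant != tenant or not query_tokens:
            continue
        # Toy relevance score: fraction of query words found in the chunk.
        score = len(query_tokens & tokens(chunk.text)) / len(query_tokens)
        if score >= min_score:
            scored.append((score, chunk))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:top_k]]


def answer(query: str, chunks: list[Chunk], tenant: str) -> dict:
    hits = retrieve(query, chunks, tenant)
    if not hits:
        # No-answer path: insufficient evidence, so do not guess.
        return {"answer": None, "citations": []}
    # A real system would pass the hits to the model; here the top chunk
    # is returned directly, together with the sources it came from.
    return {"answer": hits[0].text, "citations": [c.source for c in hits]}
```

The threshold is the piece that turns weak retrieval into an explicit refusal rather than a confident but ungrounded answer.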

7. Memory layer

Memory stores previous context or facts. Largestack has memory patterns for:

short-term buffer memory,
long-term memory,
vector memory,
shared memory,
isolated tenant/session memory.

Memory must be scoped carefully. Enterprise systems must not leak memory between users, tenants, or projects.
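
One way to picture isolated memory is a short-term buffer keyed by tenant and session, so a lookup can only ever see its own scope. The class below is an illustrative sketch; `ScopedMemory` and its methods are hypothetical and not Largestack's memory API.

```python
from collections import defaultdict, deque


class ScopedMemory:
    """Hypothetical short-term buffer, isolated per (tenant, session)."""

    def __init__(self, max_items: int = 3) -> None:
        # Each scope gets its own bounded buffer; old items fall off the front.
        self._buffers = defaultdict(lambda: deque(maxlen=max_items))

    def remember(self, tenant: str, session: str, item: str) -> None:
        self._buffers[(tenant, session)].append(item)

    def recall(self, tenant: str, session: str) -> list[str]:
        # Reads use .get so that looking up an unknown scope creates nothing
        # and returns an empty list instead of another scope's data.
        return list(self._buffers.get((tenant, session), ()))
```

Because the scope is part of the key rather than a filter applied afterwards, there is no code path that returns another tenant's items by accident.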

8. Guardrails and security

Guardrails are policy checks around the agent.

They help prevent:

prompt injection,
unsafe tool calls,
PII exposure,
hallucinated answers,
policy violations,
unapproved write actions.

Guardrails should be applied before, during, and after agent execution.

flowchart TD
    Input[User Input] --> Pre[Input Guardrails]
    Pre --> Agent[Agent Execution]
    Agent --> ToolCheck[Tool Policy]
    ToolCheck --> OutputCheck[Output Guardrails]
    OutputCheck --> Final[Final Response]
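
The stages in this diagram can be sketched as three small checks around an agent call. Everything here is an assumption for illustration: the injection marker list, the email pattern used as a stand-in for PII detection, the `APPROVED_TOOLS` set, and the convention that the agent returns a requested tool name plus a draft answer.

```python
import re
from typing import Callable


class GuardrailViolation(Exception):
    """Raised when a policy check blocks the run."""


INJECTION_MARKERS = ("ignore previous instructions",)   # toy list
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")  # toy PII detector
APPROVED_TOOLS = {"search_docs"}


def check_input(text: str) -> str:
    # Input guardrail: stop obvious prompt-injection attempts.
    if any(marker in text.lower() for marker in INJECTION_MARKERS):
        raise GuardrailViolation("possible prompt injection")
    return text


def check_tool(tool_name: str) -> None:
    # Tool policy: only approved tools may be called.
    if tool_name not in APPROVED_TOOLS:
        raise GuardrailViolation(f"tool not approved: {tool_name}")


def check_output(text: str) -> str:
    # Output guardrail: redact email addresses before responding.
    return EMAIL_PATTERN.sub("[redacted email]", text)


def run_guarded(user_input: str, agent: Callable[[str], tuple[str, str]]) -> str:
    safe_input = check_input(user_input)
    # The agent here returns (requested tool, draft answer). In a real
    # runtime the tool check would run before the tool executes.
    tool_name, draft = agent(safe_input)
    check_tool(tool_name)
    return check_output(draft)
```

Raising an exception on a violation, instead of returning a flag, makes it hard for a caller to ignore a failed check by accident.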

9. Observability

Observability lets developers reconstruct what happened during a run.

Largestack tracks:

execution traces,
cost data,
model calls,
tool calls,
RAG events,
guardrail decisions,
dashboard health.

Production systems need this because AI failures are often not simple exceptions. More often the cause is a wrong route, wrong retrieval, wrong tool, wrong policy, or wrong model behavior.
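
A trace that records each step as a structured event makes these silent failures visible. The sketch below is illustrative only: `Trace`, `span`, and the event fields such as `cost_usd` are hypothetical names, not Largestack's observability API.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass, field


@dataclass
class Trace:
    """Hypothetical per-run trace: an ordered list of step events."""
    run_id: str
    events: list[dict] = field(default_factory=list)

    @contextmanager
    def span(self, kind: str, name: str, **attrs):
        # kind is e.g. "model", "tool", "retrieval", or "guardrail".
        event = {"kind": kind, "name": name, "status": "ok", **attrs}
        start = time.perf_counter()
        try:
            # The caller can attach details (route, hit count) to the event.
            yield event
        except Exception as exc:
            event["status"] = "error"
            event["error"] = repr(exc)
            raise
        finally:
            # Always record the event, whether the step succeeded or failed.
            event["duration_ms"] = (time.perf_counter() - start) * 1000
            self.events.append(event)

    def total_cost(self) -> float:
        return sum(event.get("cost_usd", 0.0) for event in self.events)


trace = Trace(run_id="run-1")
with trace.span("model", "classify", cost_usd=0.002) as event:
    event["route"] = "billing"
with trace.span("retrieval", "knowledge-base") as event:
    event["hits"] = 0   # nothing raised, but zero hits explains a weak answer
```

Here neither step raises, yet the trace still shows which route was chosen and that retrieval returned nothing, which is the kind of evidence a wrong-answer investigation needs.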

10. Enterprise layer

Largestack includes enterprise-oriented modules for:

RBAC,
audit,
tenant scoping,
SSO/session foundations,
billing/payment scaffolds,
canary controls,
compliance scenarios.

These are strong framework foundations. However, an external audit, VAPT (vulnerability assessment and penetration testing), and enterprise certification remain separate requirements that must be met before making regulated production claims.

11. Deployment architecture

flowchart TD
    Client[Client / Browser / API User] --> API[FastAPI / Largestack App]
    API --> Runtime[Agent Runtime]
    Runtime --> Queue[Queue / Worker Layer - Roadmap/P0]
    Runtime --> DB[(Postgres / SQLite)]
    Runtime --> Vector[(Vector Store)]
    Runtime --> LLM[LLM Providers]
    Runtime --> Obs[Metrics / Traces]
    Obs --> Grafana[Grafana / Dashboard]
    API --> Docker[Docker]
    Docker --> K8s[Kubernetes / Helm]

Current deployment support includes Docker, Compose, Helm templates, dashboard health endpoints, and validation scripts. Evidence from a real cluster install should be captured before making enterprise deployment claims.