Stop paying twice when your LLM script crashes. Stop paying $90 when Claude Code runs away. StateLoom wraps your Python loop or your agent CLI in a stateful session. Durable resume for scripts, hard budget caps for CLIs, full session traces for both. Two lines of code. Runs on your laptop or in your VPC. No framework, no SaaS.
Every existing tool solves part of the problem. None of them solves all of it.
LangSmith, Langfuse, Helicone
Trace what your agents did after the fact.
Read-only — can’t intervene mid-run, can’t enforce budgets, can’t recover from crashes.
Portkey, LiteLLM, Cloudflare
Control individual LLM calls in the request path.
Stateless — budgets are per-key, not per-run. A crashed agent re-pays for every completed step on restart.
Temporal, DBOS, LangGraph
Checkpoint and resume workflows.
Requires rewriting your code as workflows, activities, or a StateGraph. No help for agent CLIs you didn’t write.
StateLoom sits in the request path like a gateway, but groups calls into durable sessions. Crashed scripts resume from the last completed step for $0. Budget caps and kill switches scope to the whole run, not individual calls. Works with your existing loop: no workflow DSL, no framework adoption.
Two integration paths. Scripts get durable resume. CLIs get budget caps and guardrails. Both get full session traces.
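import anthropic
import stateloom

stateloom.init()
claude = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# `dataset` and `save` are your own code: any iterable of rows, any persistence function.
with stateloom.session("eval-run-42", durable=True, budget=50.00):
    for row in dataset:  # 10,000 rows
        result = claude.messages.create(
            model="claude-opus-4-5",
            max_tokens=1024,  # required by the Anthropic Messages API
            messages=[{"role": "user", "content": row.prompt}],
        )
        save(row.id, result)

# Crashed at row 8,312? Re-run with the same session ID.
# Rows 1–8,311 served from cache. $0 spent. Resumes at 8,312.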
$ pip install stateloom && stateloom start
$ export ANTHROPIC_BASE_URL=http://localhost:4782
$ claude "refactor the auth module"
# → Capped at $2. PII scanned. Full session trace at localhost:4782.
Works with every major provider & SDK
Open source for individual developers. Enterprise features for teams at scale.
Crash on step 47, resume at step 47: steps 1–46 are served from their checkpoints. Every LLM response is checkpointed with its request hash. Detects non-deterministic call order, rejects concurrent calls, and optionally buffers streams for crash safety.
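To make the idea of checkpointing each response with its request hash concrete, here is a minimal sketch. It is not StateLoom's implementation: `CheckpointStore`, `replay_or_call`, and `fake_llm` are hypothetical names, the store lives in memory rather than on disk, and the hash is assumed to be a SHA-256 over canonical JSON.

```python
import hashlib
import json


def request_hash(request):
    """Stable SHA-256 over a canonical JSON encoding of the request."""
    canonical = json.dumps(request, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class CheckpointStore:
    """Hypothetical in-memory store; a durable one would persist each step."""

    def __init__(self):
        self.steps = []  # (request_hash, response) pairs in call order

    def replay_or_call(self, step, request, call):
        key = request_hash(request)
        if step < len(self.steps):
            saved_key, saved_response = self.steps[step]
            if saved_key != key:
                # The script issued a different request at this position
                # than it did on the checkpointed run.
                raise RuntimeError(f"step {step}: request differs from checkpoint")
            return saved_response, True  # served from checkpoint, no provider call
        response = call(request)
        self.steps.append((key, response))
        return response, False


calls_made = []


def fake_llm(request):
    """Stand-in for a paid provider call; records every real invocation."""
    calls_made.append(request["prompt"])
    return "echo: " + request["prompt"]


store = CheckpointStore()
prompts = ["a", "b", "c"]

# First run: pretend the process dies after completing two steps.
for step, prompt in enumerate(prompts[:2]):
    store.replay_or_call(step, {"prompt": prompt}, fake_llm)

# Second run: the same loop from the top. Steps 0 and 1 are replayed.
results = [
    store.replay_or_call(step, {"prompt": prompt}, fake_llm)
    for step, prompt in enumerate(prompts)
]
print(results)
print(calls_made)
```

In this sketch the second run reaches the provider only for the third prompt, and a changed request at an already-checkpointed position raises instead of silently returning a stale response.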
Hard dollar caps per session, per agent, or per team. Budget enforcer fires before the call, not after. Subscription billing detection skips enforcement for flat-rate plans.
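A pre-call budget check can be sketched as follows. This is an illustration under stated assumptions, not StateLoom's enforcer: `SessionBudget`, `guarded_call`, and `BudgetExceeded` are hypothetical names, and the sketch assumes a cost estimate is available before each call.

```python
class BudgetExceeded(Exception):
    """Raised when the next call would push the session past its cap."""


class SessionBudget:
    def __init__(self, cap_usd):
        self.cap_usd = cap_usd
        self.spent_usd = 0.0

    def guarded_call(self, estimated_cost_usd, call):
        # Enforce before the call: refuse if the estimate would exceed the cap.
        if self.spent_usd + estimated_cost_usd > self.cap_usd:
            raise BudgetExceeded(
                f"spent {self.spent_usd:.2f}, next call ~{estimated_cost_usd:.2f}, "
                f"cap {self.cap_usd:.2f}"
            )
        result, actual_cost_usd = call()
        self.spent_usd += actual_cost_usd  # book the real cost afterwards
        return result


def fake_call():
    """Stand-in for a provider call; returns (result, actual cost in USD)."""
    return "ok", 0.75


budget = SessionBudget(cap_usd=2.00)
outcomes = []
for _ in range(3):
    try:
        outcomes.append(budget.guarded_call(0.75, fake_call))
    except BudgetExceeded:
        outcomes.append("blocked")

print(outcomes, budget.spent_usd)
```

The third call is refused without reaching the provider, so the session never spends more than the cap.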
Every LLM call belongs to a session. Waterfall timeline, cumulative cost, named checkpoints, call-by-call token breakdown. Not scattered traces — one grouped view per task.
Replay any session deterministically from its cached responses. Step through each LLM call with the exact request and response. See where your agent went wrong without re-running it.
Global emergency stop or granular rules by model, provider, agent, or environment. Blast radius auto-pauses agents on consecutive failures. Takes effect mid-session.
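Auto-pausing an agent after consecutive failures can be pictured with a small counter-based guard. The sketch below is an assumption about the general technique, not StateLoom's code: `BlastRadiusGuard`, its threshold of 3, and the agent names are all hypothetical.

```python
class BlastRadiusGuard:
    """Pauses an agent once it fails N times in a row (hypothetical sketch)."""

    def __init__(self, max_consecutive_failures=3):
        self.max_consecutive_failures = max_consecutive_failures
        self.failures = {}   # agent name -> current consecutive failure count
        self.paused = set()  # agents blocked from making further calls

    def record(self, agent, ok):
        if ok:
            self.failures[agent] = 0  # any success resets the streak
            return
        self.failures[agent] = self.failures.get(agent, 0) + 1
        if self.failures[agent] >= self.max_consecutive_failures:
            self.paused.add(agent)

    def allow(self, agent):
        # Checked before every call, so a pause takes effect mid-session.
        return agent not in self.paused


guard = BlastRadiusGuard(max_consecutive_failures=3)

# Two failures, a success (streak resets), then three failures in a row.
for ok in [False, False, True, False, False, False]:
    guard.record("agent-a", ok)
guard.record("agent-b", False)

print(guard.allow("agent-a"), guard.allow("agent-b"))
```

Only the agent with three failures in a row is paused; other agents keep running.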
Exact-match and semantic similarity caching. Sentence-transformer embeddings with FAISS or Redis vector search. Skips error responses automatically.
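A semantic cache lookup reduces to a nearest-neighbor search with a similarity threshold. The sketch below uses hand-written two-dimensional vectors and a linear scan in place of sentence-transformer embeddings and FAISS or Redis. `SemanticCache` and the 0.95 threshold are hypothetical choices for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SemanticCache:
    def __init__(self, threshold=0.95):
        self.threshold = threshold
        self.entries = []  # (embedding, response) pairs

    def put(self, embedding, response, is_error=False):
        if is_error:
            return  # never cache error responses
        self.entries.append((embedding, response))

    def get(self, embedding):
        best_response, best_score = None, self.threshold
        for stored, response in self.entries:
            score = cosine(embedding, stored)
            if score >= best_score:
                best_response, best_score = response, score
        return best_response


cache = SemanticCache(threshold=0.95)
cache.put([1.0, 0.0], "cached answer")
cache.put([0.0, 1.0], "rate limited", is_error=True)

hit = cache.get([0.99, 0.05])   # nearly the same direction as [1, 0]
miss = cache.get([0.0, 1.0])    # only matched the skipped error entry
print(hit, miss)
```

A production cache would swap the linear scan for a vector index, but the threshold check and the error-skipping rule stay the same.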
Set a dollar limit before your agent starts. StateLoom kills the session when it hits the cap. No more $90 surprises from Claude Code or Codex.
Real-time waterfall view of every call your CLI agent makes. Per-call cost, token counts, model used, latency. Runs on localhost.
Full session history in a local SQLite database. Query what your agent did, when, and how much it cost. No cloud account required.
One localhost endpoint for OpenAI, Anthropic, and Gemini APIs. Native protocol support — Claude CLI speaks Anthropic, Codex speaks OpenAI Responses API. No format translation needed.
Emergency stop from the dashboard. Block a specific model, provider, or all traffic. Takes effect on the next call, not the next session.
Watch the dashboard react to a live Claude Code session. Budget enforcement, session traces, and kill switch — all in real time.
Run as a sidecar next to your app, or as a centralized gateway for your team. Single binary, no external dependencies. Your LLM traffic stays in your network either way.
All session data stays in your infrastructure. SQLite for single-node, Postgres for teams. PII detection and redaction run locally — nothing phones home.
Per-session, per-agent, per-team cost breakdown with context-tiered pricing. Detects API vs subscription billing automatically. Exportable audit trail.
Granular rules by model, provider, agent version, or environment. Circuit breaker with tier-based failover. Blast radius auto-containment on repeated failures.
Tamper-evident SHA-256 event chain. Every LLM call, budget decision, PII detection, and guardrail trigger is persisted locally with full request/response context. GDPR purge engine for Right to Be Forgotten.
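A tamper-evident event chain generally works by hashing each event together with the hash of the previous one, so editing any past record invalidates everything after it. The following is a minimal sketch of that general technique. The event fields, `append_event`, and `verify` are hypothetical and do not reflect StateLoom's actual storage format.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first event


def _digest(prev_hash, payload):
    body = json.dumps(payload, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_event(chain, payload):
    """Append an event whose hash covers its payload and the previous hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append(
        {"prev": prev_hash, "payload": payload, "hash": _digest(prev_hash, payload)}
    )


def verify(chain):
    """Recompute every link; any edited payload or broken link fails."""
    prev_hash = GENESIS
    for event in chain:
        if event["prev"] != prev_hash:
            return False
        if event["hash"] != _digest(prev_hash, event["payload"]):
            return False
        prev_hash = event["hash"]
    return True


chain = []
append_event(chain, {"type": "llm_call", "cost_usd": 0.12})
append_event(chain, {"type": "budget_decision", "allowed": True})
append_event(chain, {"type": "pii_detection", "found": False})

intact = verify(chain)
chain[1]["payload"]["allowed"] = False  # simulate after-the-fact tampering
tampered = verify(chain)
print(intact, tampered)
```

Verification passes on the untouched chain and fails as soon as one stored payload is altered.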
Scoped API keys per team or integration. Per-key rate limits, model allowlists, and agent restrictions. Revoke without rotating upstream provider keys.
See how StateLoom can secure and optimize your AI infrastructure. We'll walk you through a live demo tailored to your use case.