BurnLens vs LiteLLM
A simpler LLM cost tracking alternative · Updated May 2026
TL;DR
LiteLLM and BurnLens both sit between your app and AI providers, but they solve different problems. LiteLLM is a gateway — it normalizes every provider into one OpenAI-compatible API, with YAML config, model routing, and request rewriting. BurnLens is a FinOps proxy — it forwards your requests unmodified and only watches cost. If you already use the OpenAI, Anthropic, and Google SDKs directly and just want to see and cap spend, BurnLens is the simpler choice.
Feature comparison
| BurnLens | LiteLLM | |
|---|---|---|
| Primary purpose | Cost tracking + budgets | Provider normalization gateway |
| Config required | None — one env var | YAML / Python config |
| Payload modification | None — transparent passthrough | Rewrites requests into OpenAI format |
| Proxy overhead target | < 20ms | ~40-100ms with router |
| Hard caps before upstream call | Yes — HTTP 429 at limit | Hosted tier only |
| Multi-provider | OpenAI, Anthropic, Google (Azure / Bedrock / Groq / Mistral / Together on v0.2 / v0.3 roadmap) | 100+ providers |
| Streaming passthrough (SSE chunks unbuffered) | Yes | Yes, with re-serialization |
| Local SQLite, no external DB | Yes | Requires Postgres for spend tracking |
| Per-customer attribution via headers | Yes — X-BurnLens-Tag-* | Yes — virtual keys |
| Free self-hosted | Unlimited | Unlimited (OSS tier) |
When BurnLens is the right choice
1. You don't want a gateway.Your code already uses the OpenAI SDK for OpenAI and the Anthropic SDK for Claude. You don't want to rewrite call sites to a unified completion()function. BurnLens lets you keep your existing code and just observe cost.
2. Latency matters. BurnLens does not parse, normalize, or re-serialize request bodies. It forwards bytes directly and reads the usage field from the response only. Measured overhead stays under 20ms on the critical path.
3. You need hard caps without operational overhead. Set a daily dollar cap per API key in burnlens.yaml or via CLI. At 100% of cap, the proxy returns HTTP 429 before the request reaches the provider. No Postgres, no hosted control plane, no extra service to monitor.
4. Prompts must not leave your machine. Compliance teams reject any architecture that routes prompts through a third party. BurnLens runs on localhost:8420 and stores in local SQLite by default; cloud sync is opt-in and only ships anonymized token counts.
When LiteLLM is the right choice
If you need to swap models across 100+ providers at runtime, do prompt-level fallbacks, or unify your codebase around one completion()call — LiteLLM's router is built for that and BurnLens is not. The two tools also compose: run BurnLens on localhost:8420as the egress proxy, and point LiteLLM's provider base URLs at it. You get LiteLLM's routing logic with BurnLens's cost enforcement.
Migration path: LiteLLM → BurnLens
If you only used LiteLLM for cost tracking and not for routing, the migration is three commands:
pip install burnlens
burnlens start
# Point your existing SDK back at the provider's URL, via BurnLens:
export OPENAI_BASE_URL=http://localhost:8420/proxy/openai/v1
export ANTHROPIC_BASE_URL=http://localhost:8420/proxy/anthropicYou can deprecate the LiteLLM YAML and Postgres deployment. Tag attribution moves from LiteLLM virtual keys to BurnLens X-BurnLens-Tag-Feature / -Team / -Customer headers.
Get started
Start the free trial · Star on GitHub · Compare to Helicone · Back to homepage