BURNLENSDashboard

BurnLens vs LiteLLM

A simpler LLM cost tracking alternative · Updated May 2026

TL;DR

LiteLLM and BurnLens both sit between your app and AI providers, but they solve different problems. LiteLLM is a gateway — it normalizes every provider into one OpenAI-compatible API, with YAML config, model routing, and request rewriting. BurnLens is a FinOps proxy — it forwards your requests unmodified and only watches cost. If you already use the OpenAI, Anthropic, and Google SDKs directly and just want to see and cap spend, BurnLens is the simpler choice.

Feature comparison

BurnLensLiteLLM
Primary purposeCost tracking + budgetsProvider normalization gateway
Config requiredNone — one env varYAML / Python config
Payload modificationNone — transparent passthroughRewrites requests into OpenAI format
Proxy overhead target< 20ms~40-100ms with router
Hard caps before upstream callYes — HTTP 429 at limitHosted tier only
Multi-providerOpenAI, Anthropic, Google (Azure / Bedrock / Groq / Mistral / Together on v0.2 / v0.3 roadmap)100+ providers
Streaming passthrough (SSE chunks unbuffered)YesYes, with re-serialization
Local SQLite, no external DBYesRequires Postgres for spend tracking
Per-customer attribution via headersYes — X-BurnLens-Tag-*Yes — virtual keys
Free self-hostedUnlimitedUnlimited (OSS tier)

When BurnLens is the right choice

1. You don't want a gateway.Your code already uses the OpenAI SDK for OpenAI and the Anthropic SDK for Claude. You don't want to rewrite call sites to a unified completion()function. BurnLens lets you keep your existing code and just observe cost.

2. Latency matters. BurnLens does not parse, normalize, or re-serialize request bodies. It forwards bytes directly and reads the usage field from the response only. Measured overhead stays under 20ms on the critical path.

3. You need hard caps without operational overhead. Set a daily dollar cap per API key in burnlens.yaml or via CLI. At 100% of cap, the proxy returns HTTP 429 before the request reaches the provider. No Postgres, no hosted control plane, no extra service to monitor.

4. Prompts must not leave your machine. Compliance teams reject any architecture that routes prompts through a third party. BurnLens runs on localhost:8420 and stores in local SQLite by default; cloud sync is opt-in and only ships anonymized token counts.

When LiteLLM is the right choice

If you need to swap models across 100+ providers at runtime, do prompt-level fallbacks, or unify your codebase around one completion()call — LiteLLM's router is built for that and BurnLens is not. The two tools also compose: run BurnLens on localhost:8420as the egress proxy, and point LiteLLM's provider base URLs at it. You get LiteLLM's routing logic with BurnLens's cost enforcement.

Migration path: LiteLLM → BurnLens

If you only used LiteLLM for cost tracking and not for routing, the migration is three commands:

pip install burnlens
burnlens start
# Point your existing SDK back at the provider's URL, via BurnLens:
export OPENAI_BASE_URL=http://localhost:8420/proxy/openai/v1
export ANTHROPIC_BASE_URL=http://localhost:8420/proxy/anthropic

You can deprecate the LiteLLM YAML and Postgres deployment. Tag attribution moves from LiteLLM virtual keys to BurnLens X-BurnLens-Tag-Feature / -Team / -Customer headers.

Get started

Start the free trial · Star on GitHub · Compare to Helicone · Back to homepage