Open source · MIT License · v0.1.1

Stop guessing what your
LLM API calls cost

BurnLens is a local proxy that tracks every AI API call — per feature, team, and customer. One install. Zero config. Nothing leaves your machine.

View on GitHub · See how it works
[Screenshot: BurnLens dashboard]
terminal
# install
pip install burnlens
 
# start the proxy + dashboard
burnlens start
 
# dashboard at http://127.0.0.1:8420/ui
3 providers supported
0 data leaves your machine
1 command to start
MIT open source license
// how it works

Three steps to full visibility

No SDK changes. No code rewrites. Just route your existing API calls through the proxy.

01 —

Install & start

pip install burnlens then burnlens start. Proxy runs on port 8420. Takes 30 seconds.

02 —
🔀

Route your traffic

Set OPENAI_BASE_URL to the proxy. Add X-BurnLens-Tag-Feature headers to your requests. Done.

03 —
📊

See everything

Open the dashboard. Cost by model, feature, team, customer. Waste alerts. Per-call token counts and latency.

// features

Everything you need to
control LLM spend

Built for engineers and teams who are serious about understanding their AI costs.

💰

Cost per feature

Tag requests with X-BurnLens-Tag-Feature and see exactly which part of your product is spending what.

👥

Team & customer budgets

Set monthly spend limits per team and per customer. Get warnings at 80%. Automatic 429 when a customer exceeds their cap.
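The thresholds above can be sketched in a few lines — our own illustration of the cap logic described here, not BurnLens internals: warn at 80% of the cap, block (the proxy's 429) once spend reaches it.

```python
# Illustration of the budget thresholds described above — not BurnLens
# internals. Warn at 80% of the cap; block once the cap is reached,
# which is where the proxy would start answering 429.
def budget_status(spent_usd: float, cap_usd: float) -> str:
    """Return 'ok', 'warn' (>= 80% of cap), or 'blocked' (cap reached)."""
    if spent_usd >= cap_usd:
        return "blocked"  # proxy responds 429 from here on
    if spent_usd >= 0.8 * cap_usd:
        return "warn"     # 80% warning threshold
    return "ok"
```

For example, `budget_status(85.0, 100.0)` lands in the warning band, while `budget_status(100.0, 100.0)` is blocked.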

🚨

Waste alerts

Detect duplicate requests, context bloat, model overkill, and uncached system prompts. Dollar estimates included.
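One of these checks, duplicate detection, comes down to a simple idea — hash each request's model and messages and flag exact repeats. A minimal sketch of that idea (our own illustration, not the BurnLens implementation):

```python
import hashlib
import json

def find_duplicates(requests: list[dict]) -> list[int]:
    """Return indices of requests whose (model, messages) exactly
    repeat an earlier request — prime candidates for caching."""
    seen: set[str] = set()
    dupes: list[int] = []
    for i, req in enumerate(requests):
        # Canonical JSON (sorted keys) so equal payloads hash equally
        key = hashlib.sha256(
            json.dumps(
                {"model": req["model"], "messages": req["messages"]},
                sort_keys=True,
            ).encode()
        ).hexdigest()
        if key in seen:
            dupes.append(i)
        seen.add(key)
    return dupes
```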

🤖

Model recommendations

BurnLens analyzes your usage patterns and recommends cheaper models where the data supports it. Projected savings shown.

📤

Export & reports

burnlens export to CSV. burnlens report for weekly summaries. Email delivery supported.
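The CSV export drops straight into standard tooling. A quick slice with the standard library — note that the column names `feature` and `cost_usd` are assumptions for illustration; check the header of your actual export:

```python
import csv
from collections import defaultdict

def cost_by_feature(csv_path: str) -> dict[str, float]:
    """Sum spend per feature tag. The 'feature' and 'cost_usd' column
    names are hypothetical — match them to your real export header."""
    totals: dict[str, float] = defaultdict(float)
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            totals[row["feature"]] += float(row["cost_usd"])
    return dict(totals)
```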

🔒

100% local

Everything stored in a local SQLite file. No accounts. No SaaS. No prompt content ever leaves your machine.
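Because the store is plain SQLite, you can query it directly with the standard library. The table and column names below are hypothetical — inspect the real schema first (e.g. `.schema` in the `sqlite3` shell):

```python
import sqlite3

def top_models_by_cost(db_path: str, limit: int = 5) -> list[tuple]:
    """Rank models by total spend. The `calls` table and `cost_usd`
    column are assumptions — check your file's actual schema."""
    con = sqlite3.connect(db_path)
    try:
        return con.execute(
            "SELECT model, SUM(cost_usd) AS total "
            "FROM calls GROUP BY model ORDER BY total DESC LIMIT ?",
            (limit,),
        ).fetchall()
    finally:
        con.close()
```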

// providers

Works with your stack

OpenAI, Anthropic, and Google supported out of the box. More coming.

OpenAI — GPT-4o, GPT-4o-mini, o1, o3
Anthropic — Claude 3.5, Claude Haiku
Google — Gemini 1.5, Gemini 2.0
# Set env var once — all OpenAI calls go through BurnLens
export OPENAI_BASE_URL=http://127.0.0.1:8420/proxy/openai

# Tag your requests
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_BASE_URL from the environment

response = client.chat.completions.create(
  model="gpt-4o-mini",
  messages=[...],
  extra_headers={
    "X-BurnLens-Tag-Feature": "chat",
    "X-BurnLens-Tag-Team": "backend",
  }
)
# Set env var once
export ANTHROPIC_BASE_URL=http://127.0.0.1:8420/proxy/anthropic

# Tag your requests
from anthropic import Anthropic

client = Anthropic()  # picks up ANTHROPIC_BASE_URL from the environment

response = client.messages.create(
  model="claude-haiku-4-5-20251001",
  max_tokens=1024,
  messages=[...],
  extra_headers={
    "X-BurnLens-Tag-Feature": "summarize",
    "X-BurnLens-Tag-Customer": "acme-corp",
  }
)
# Use the BurnLens patch helper for Google SDK
import burnlens.patch
burnlens.patch.patch_google()

import google.generativeai as genai
model = genai.GenerativeModel("gemini-2.0-flash")
response = model.generate_content("Hello")

Free forever.
Your data stays yours.

BurnLens is MIT licensed and always will be. No accounts, no usage limits, no data collection. Run it locally and own everything.

MIT License
Local SQLite storage
No accounts required
No prompt data collected
Full source on GitHub
Star on GitHub · View on PyPI