Open source · MIT License · v0.1.1

Stop guessing what your
LLM API calls cost

BurnLens is a local proxy that tracks every AI API call — per feature, team, and customer. One install. Zero config. Nothing leaves your machine.

View on GitHub · See how it works
[Screenshot: BurnLens dashboard]
terminal
# install
pip install burnlens
 
# start the proxy + dashboard
burnlens start
 
# dashboard at http://127.0.0.1:8420/ui
3 providers supported
0 data leaves your machine
1 command to start
MIT open source license
// how it works

Three steps to full visibility

No SDK changes. No code rewrites. Just route your existing API calls through the proxy.

01 —

Install & start

pip install burnlens then burnlens start. Proxy runs on port 8420. Takes 30 seconds.

02 —
🔀

Route your traffic

Set OPENAI_BASE_URL to the proxy. Add X-BurnLens-Tag-Feature headers to your requests. Done.

03 —
📊

See everything

Open the dashboard. Cost by model, feature, team, customer. Waste alerts. Per-call token counts and latency.

// features

Everything you need to
control LLM spend

Built for engineers and teams who are serious about understanding their AI costs.

💰

Cost per feature

Tag requests with X-BurnLens-Tag-Feature and see exactly which part of your product is spending what.

👥

Team & customer budgets

Set monthly spend limits per team and per customer. Get warnings at 80%. Automatic 429 when a customer exceeds their cap.
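The thresholds above can be sketched in a few lines — our own illustration of the cap logic described here, not BurnLens internals: warn at 80% of the cap, block (the proxy's 429) once spend reaches it.

```python
# Illustration of the budget thresholds described above — not BurnLens
# internals. Warn at 80% of the cap; block once the cap is reached,
# which is where the proxy would start answering 429.
def budget_status(spent_usd: float, cap_usd: float) -> str:
    """Return 'ok', 'warn' (>= 80% of cap), or 'blocked' (cap reached)."""
    if spent_usd >= cap_usd:
        return "blocked"  # proxy responds 429 from here on
    if spent_usd >= 0.8 * cap_usd:
        return "warn"     # 80% warning threshold
    return "ok"
```

For example, `budget_status(85.0, 100.0)` lands in the warning band, while `budget_status(100.0, 100.0)` is blocked.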

🚨

Waste alerts

Detect duplicate requests, context bloat, model overkill, and uncached system prompts. Dollar estimates included.
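One of these checks, duplicate detection, comes down to a simple idea — hash each request's model and messages and flag exact repeats. A minimal sketch of that idea (our own illustration, not the BurnLens implementation):

```python
import hashlib
import json

def find_duplicates(requests: list[dict]) -> list[int]:
    """Return indices of requests whose (model, messages) exactly
    repeat an earlier request — prime candidates for caching."""
    seen: set[str] = set()
    dupes: list[int] = []
    for i, req in enumerate(requests):
        # Canonical JSON (sorted keys) so equal payloads hash equally
        key = hashlib.sha256(
            json.dumps(
                {"model": req["model"], "messages": req["messages"]},
                sort_keys=True,
            ).encode()
        ).hexdigest()
        if key in seen:
            dupes.append(i)
        seen.add(key)
    return dupes
```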

🤖

Model recommendations

BurnLens analyzes your usage patterns and recommends cheaper models where the data supports it. Projected savings shown.

📤

Export & reports

burnlens export to CSV. burnlens report for weekly summaries. Email delivery supported.
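The CSV export drops straight into standard tooling. A quick slice with the standard library — note that the column names `feature` and `cost_usd` are assumptions for illustration; check the header of your actual export:

```python
import csv
from collections import defaultdict

def cost_by_feature(csv_path: str) -> dict[str, float]:
    """Sum spend per feature tag. The 'feature' and 'cost_usd' column
    names are hypothetical — match them to your real export header."""
    totals: dict[str, float] = defaultdict(float)
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            totals[row["feature"]] += float(row["cost_usd"])
    return dict(totals)
```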

🔒

100% local

Everything stored in a local SQLite file. No accounts. No SaaS. No prompt content ever leaves your machine.
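Because the store is plain SQLite, you can query it directly with the standard library. The table and column names below are hypothetical — inspect the real schema first (e.g. `.schema` in the `sqlite3` shell):

```python
import sqlite3

def top_models_by_cost(db_path: str, limit: int = 5) -> list[tuple]:
    """Rank models by total spend. The `calls` table and `cost_usd`
    column are assumptions — check your file's actual schema."""
    con = sqlite3.connect(db_path)
    try:
        return con.execute(
            "SELECT model, SUM(cost_usd) AS total "
            "FROM calls GROUP BY model ORDER BY total DESC LIMIT ?",
            (limit,),
        ).fetchall()
    finally:
        con.close()
```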

// providers

Works with your stack

OpenAI, Anthropic, and Google supported out of the box. More coming.

OpenAI — GPT-4o, GPT-4o-mini, o1, o3
Anthropic — Claude 3.5, Claude Haiku
Google — Gemini 1.5, Gemini 2.0
# Set env var once — all OpenAI calls go through BurnLens
export OPENAI_BASE_URL=http://127.0.0.1:8420/proxy/openai

# Tag your requests
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_BASE_URL from the environment

response = client.chat.completions.create(
  model="gpt-4o-mini",
  messages=[...],
  extra_headers={
    "X-BurnLens-Tag-Feature": "chat",
    "X-BurnLens-Tag-Team": "backend",
  }
)
# Set env var once
export ANTHROPIC_BASE_URL=http://127.0.0.1:8420/proxy/anthropic

# Tag your requests
from anthropic import Anthropic

client = Anthropic()  # picks up ANTHROPIC_BASE_URL from the environment

response = client.messages.create(
  model="claude-haiku-4-5-20251001",
  max_tokens=1024,
  messages=[...],
  extra_headers={
    "X-BurnLens-Tag-Feature": "summarize",
    "X-BurnLens-Tag-Customer": "acme-corp",
  }
)
# Use the BurnLens patch helper for Google SDK
import burnlens.patch
burnlens.patch.patch_google()

import google.generativeai as genai
model = genai.GenerativeModel("gemini-2.0-flash")
response = model.generate_content("Hello")

Free forever.
Your data stays yours.

BurnLens is MIT licensed and always will be. No accounts, no usage limits, no data collection. Run it locally and own everything.

MIT License
Local SQLite storage
No accounts required
No prompt data collected
Full source on GitHub
Star on GitHub · View on PyPI