TokenMix Research Lab · 2026-06-05

LiteLLM Logger 2026: Callbacks, Spend Logs, Cost Tracking

Last Updated: 2026-06-05
Author: TokenMix Research Lab
Data verified: 2026-06-05, against the LiteLLM official docs for custom callbacks, callbacks, proxy logging, spend tracking, request tags, custom pricing, Datadog, and OpenTelemetry, plus the GitHub README

"LiteLLM logger" is not a single switch. Use callbacks for events, spend logs for cost, request tags for attribution, and external observability tools for traces.

LiteLLM's custom callback docs define log_pre_api_call, log_post_api_call, log_success_event, log_failure_event, plus async success and failure hooks, and show that response_cost, cache_hit, model, messages, and metadata are available in callback kwargs (LiteLLM custom callbacks). LiteLLM's spend tracking docs say the proxy automatically tracks spend for all known models and can track spend for keys, users, and teams across 100+ LLMs (LiteLLM spend tracking). Request tags appear in request_tags inside LiteLLM_SpendLogs, and can be set from config, x-litellm-tags, request body tags, key metadata, team metadata, or configured custom headers (Request tags). The GitHub README describes LiteLLM as an OpenAI-format gateway for 100+ providers with cost tracking, guardrails, load balancing, logging, virtual keys, spend tracking, and an admin dashboard (LiteLLM GitHub).

Quick Verdict

| Claim | Status | Source |
|---|---|---|
| LiteLLM supports custom callback classes | Confirmed | Custom callbacks |
| LiteLLM supports input_callback, success_callback, and failure_callback | Confirmed | Custom callbacks |
| response_cost is available in callback kwargs | Confirmed | Custom callbacks |
| LiteLLM Proxy can track spend by keys, users, and teams | Confirmed | Spend tracking |
| Request tags appear in LiteLLM_SpendLogs | Confirmed | Request tags |
| Tags can be passed by x-litellm-tags header | Confirmed | Request tags |
| LiteLLM automatically tracks spend for all known models | Confirmed | Spend tracking |
| Custom pricing can override default model costs | Confirmed | Custom pricing |
| LiteLLM logger alone is a full observability platform | False | LiteLLM integrates with external tools; it is not a replacement for every trace backend |
| Blocking network I/O inside callback functions is safe | False | LiteLLM docs warn to use async hooks for I/O and avoid blocking |
| Streaming cost logs are always identical to provider bills | Likely false | Docs recommend debugging cost discrepancies by aligning time ranges and token categories |
| Teams will use request tags more for chargeback in 2026 | Speculation | LiteLLM documents tags, but adoption trend is inferred |

Logger Surfaces

| Surface | What it captures | Best for | Requires proxy DB? | Status |
|---|---|---|---|---|
| Python SDK callbacks | Per-call input, success, failure, cost, metadata | App-level debugging | No | Confirmed |
| CustomLogger class | Structured sync/async event hooks | Production callback control | No | Confirmed |
| Proxy logging | Gateway-level requests and responses | Central team observability | Usually yes for durable logs | Confirmed |
| Spend logs | Spend, model, request tags, user/key/team attribution | Budgeting and chargeback | Yes | Confirmed |
| Request tags | Environment, team, customer, job, region labels | FinOps slices | Yes for spend logs | Confirmed |
| External integrations | Datadog, Langfuse, MLflow, Helicone, OTel, Slack, Sentry | Long-term observability | Depends on integration | Confirmed |

For gateway-level model routing, compare this with the article "TokenMix vs OpenRouter vs Portkey vs LiteLLM". For a broader architecture view, see "AI API Gateway 2026".

Callback Hooks

| Hook | Mode | Fires when | Use case | Status |
|---|---|---|---|---|
| log_pre_api_call | sync class | Before provider call | Redact, inspect, trace start | Confirmed |
| log_post_api_call | sync class | After provider call | Add response metadata | Confirmed |
| log_success_event | sync class | Successful call | Cost log, latency log | Confirmed |
| log_failure_event | sync class | Failed call | Error counter, alert | Confirmed |
| async_log_success_event | async class | Async success | Non-blocking trace export | Confirmed |
| async_log_failure_event | async class | Async failure | Non-blocking error export | Confirmed |
| async_pre_call_hook | proxy-only | Before proxy sends request | Policy or request mutation | Confirmed |
| async_post_call_success_hook | proxy-only | After proxy success | Response/header mutation | Confirmed |

LiteLLM docs state the kwargs payload can include model, messages, response_cost, cache_hit, and litellm_params.metadata. That is enough to build a basic internal logger without parsing provider-specific response formats.

Spend Logs and Tags

| Attribution method | How to set it | Where it lands | Priority note | Status |
|---|---|---|---|---|
| Config tags | litellm_params.tags in config.yaml | request_tags | Automatic per deployment | Confirmed |
| Header tags | x-litellm-tags: team-api,production | request_tags | Dynamic per call | Confirmed |
| Request body tags | "tags": ["team-api"] | request_tags | Body takes precedence over header | Confirmed |
| Metadata tags | "metadata": {"tags": [...]} | request_tags | Also supported | Confirmed |
| Key metadata | Tags on generated key | request_tags | Default by API key | Confirmed |
| Team metadata | Tags on team | request_tags | Default by team | Confirmed |
| Custom headers | extra_spend_tag_headers | request_tags | Advanced tracking | Confirmed |

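Cost calculation 1: If your app spends $3,000/month through LiteLLM and calls carrying x-litellm-tags: eval account for 40% of that spend, the tag represents $1,200/month of attributable usage. Note that a 40% share of calls equals a 40% share of spend only when cost per call is uniform, so attribute by dollars rather than by call count. Without tags, the same spend becomes an undifferentiated provider bill.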
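The difference between share of calls and share of spend is easy to see on a few rows. The sketch below sums spend per tag over hypothetical records that borrow the `spend` and `request_tags` field names from the spend logs; the row values are invented for illustration. In this sample the `eval` tag is on 40% of calls but carries only 15% of the dollars.

```python
from collections import defaultdict


def spend_by_tag(rows):
    """Sum spend per tag; a row with several tags counts toward each of them."""
    totals = defaultdict(float)
    for row in rows:
        for tag in row.get("request_tags") or []:
            totals[tag] += row["spend"]
    return dict(totals)


# Hypothetical rows shaped like spend-log records (spend in USD).
rows = [
    {"spend": 0.50, "request_tags": ["eval"]},
    {"spend": 0.25, "request_tags": ["eval", "staging"]},
    {"spend": 2.00, "request_tags": ["production"]},
    {"spend": 1.25, "request_tags": ["production"]},
    {"spend": 1.00, "request_tags": []},
]

totals = spend_by_tag(rows)
total_spend = sum(r["spend"] for r in rows)
eval_calls = sum(1 for r in rows if "eval" in r["request_tags"])

eval_call_share = eval_calls / len(rows)         # share of calls
eval_spend_share = totals["eval"] / total_spend  # share of dollars
print(f"eval: {eval_call_share:.0%} of calls, {eval_spend_share:.0%} of spend")
```

Because a multi-tag row is counted once per tag, per-tag totals can add up to more than total spend; untagged rows appear in no slice at all.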

Setup Examples

Python callback:

import litellm
from litellm import completion

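def track_cost_callback(kwargs, completion_response, start_time, end_time):
    # Function-style success callback: receives the call kwargs, the provider
    # response, and the call's start and end times.
    litellm_params = kwargs.get("litellm_params") or {}
    print({
        "model": kwargs.get("model"),
        "cost": kwargs.get("response_cost") or 0,  # falls back to 0 if missing or None
        "cache_hit": kwargs.get("cache_hit", False),
        "metadata": litellm_params.get("metadata") or {},
    })

# Register the callback so it runs after each successful completion.
litellm.success_callback = [track_cost_callback]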

response = completion(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)

cURL with request tags:

curl -X POST 'http://0.0.0.0:4000/chat/completions' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -H 'x-litellm-tags: team-api,production,us-east-1' \
  -d '{
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
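The same tagged request can be built from Python. This is a minimal sketch using only the standard library; `build_tagged_request` is a hypothetical helper, and the URL and key are the placeholders from the cURL example. It assumes tags are joined with commas in the header, as in that example, and therefore rejects tags that contain a comma.

```python
import json
import urllib.request


def build_tagged_request(base_url, api_key, tags, payload):
    """Build (but do not send) a chat completion request carrying x-litellm-tags."""
    for tag in tags:
        if "," in tag:
            raise ValueError(f"tag {tag!r} contains a comma, the header separator")
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            "x-litellm-tags": ",".join(tags),
        },
        method="POST",
    )


req = build_tagged_request(
    "http://0.0.0.0:4000",
    "sk-1234",
    ["team-api", "production", "us-east-1"],
    {"model": "gpt-4", "messages": [{"role": "user", "content": "Hello"}]},
)
# Sending it would be: urllib.request.urlopen(req)
```

Building the header from a list in one place keeps tag spelling consistent across callers, which matters once the tags feed chargeback reports.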

Config tags:

model_list:
  - model_name: gpt-4-prod
    litellm_params:
      model: azure/gpt-4-prod
      api_key: os.environ/AZURE_PROD_API_KEY
      api_base: https://prod.openai.azure.com/
      tags: ["production", "azure", "team-platform"]

Cost Tracking Matrix

| Need | LiteLLM feature | Why it works | Caveat | Status |
|---|---|---|---|---|
| Per-call cost | response_cost in callback kwargs | Lightweight app-level cost logging | Depends on model cost map accuracy | Confirmed |
| Per-key spend | Proxy spend tracking | Tracks virtual key usage | Needs DB and key setup | Confirmed |
| Per-user spend | user field / key metadata | Useful for chargeback | User identifiers must be stable | Confirmed |
| Per-team spend | Teams and metadata | Org-level reporting | Needs team model | Confirmed |
| Per-environment spend | Request tags | prod, staging, eval slices | Tags must be enforced | Confirmed |
| Custom model cost | Custom pricing | Works for private/self-hosted models | Must keep prices current | Confirmed |
| Cost discrepancy debug | Spend tracking workflow | Align ranges and token categories | Provider bills can differ by cache/tier metadata | Confirmed |
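To sanity-check a per-call cost against a price you maintain yourself, the basic arithmetic is tokens times per-token price. The sketch below is a simplified model, not LiteLLM's cost calculator: `call_cost` and `cost_drift` are hypothetical helpers, the prices are invented, and cached-token or tiered pricing is deliberately ignored.

```python
def call_cost(prompt_tokens, completion_tokens,
              input_cost_per_token, output_cost_per_token):
    """Expected USD cost of one call under flat per-token pricing."""
    return (prompt_tokens * input_cost_per_token
            + completion_tokens * output_cost_per_token)


def cost_drift(reported, expected):
    """Relative gap between a reported cost and the expected cost."""
    if expected == 0:
        return 0.0 if reported == 0 else float("inf")
    return abs(reported - expected) / expected


# Hypothetical prices for a self-hosted model, in USD per token.
price = {"input_cost_per_token": 0.000001, "output_cost_per_token": 0.000002}

expected = call_cost(1200, 300, **price)             # 1200 in, 300 out
drift = cost_drift(reported=0.0020, expected=expected)
print(f"expected ${expected:.4f}, drift {drift:.1%}")
```

A drift that stays near zero suggests the cost map matches your prices; a persistent gap usually points at stale pricing or a token category the flat model leaves out.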

Cost calculation 2: A custom logger that writes one 1 KB JSON line per call produces about 1 GB of raw logs per 1M calls before compression. At 10M calls/month, that is about 10 GB of raw log volume. LiteLLM does not price this; your log backend does.
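The log volume estimate is a single multiplication, shown here as a small helper so other record sizes can be plugged in. `raw_log_gb` is a hypothetical name, and the sketch assumes decimal units (1 KB = 1,000 bytes, 1 GB = 10^9 bytes).

```python
BYTES_PER_GB = 1_000_000_000  # decimal gigabyte


def raw_log_gb(calls, bytes_per_call=1_000):
    """Uncompressed log volume in GB for a given number of calls."""
    return calls * bytes_per_call / BYTES_PER_GB


per_million = raw_log_gb(1_000_000)   # one 1 KB line per call
per_month = raw_log_gb(10_000_000)    # 10M calls/month
print(per_million, per_month)
```

Storing full prompts and responses instead of metadata can push `bytes_per_call` far above 1 KB, so measure a sample of real records before sizing the backend.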

Observability Integrations

| Integration type | Examples in LiteLLM docs | Best for | Status |
|---|---|---|---|
| LLM observability | Langfuse, Helicone, Lunary, MLflow | Prompt/response trace and eval | Confirmed |
| APM / metrics | Datadog, OpenTelemetry, Sentry, Slack | Operations alerting | Confirmed |
| Data warehouse / billing | Spend logs, tags, custom pricing | FinOps and chargeback | Confirmed |
| Raw request logging | Raw request/response logging docs | Deep debugging | Confirmed |
| Scrubbing | Scrub logged data docs | Privacy and compliance | Confirmed |

Do not log full prompts by default in regulated products. LiteLLM makes logging easy; that does not make every payload safe to store.
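One way to keep prompts out of storage while still being able to correlate identical inputs is to log a digest and a length instead of the text. The sketch below is an application-side illustration, not LiteLLM's built-in scrubbing; `redact_messages` and the output field names are hypothetical.

```python
import hashlib


def redact_messages(messages):
    """Replace each message's text with a SHA-256 digest and a character count."""
    redacted = []
    for msg in messages:
        content = msg.get("content")
        # Non-string content (for example multimodal parts) is stringified first.
        text = content if isinstance(content, str) else str(content)
        redacted.append({
            "role": msg.get("role"),
            "content_sha256": hashlib.sha256(text.encode("utf-8")).hexdigest(),
            "content_chars": len(text),
        })
    return redacted


safe = redact_messages([{"role": "user", "content": "Hello"}])
print(safe)
```

An unsalted hash of a short or predictable prompt can still be guessed by brute force, so treat this as a reduction in exposure rather than anonymization.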

Production Checklist

| Step | Action | Pass condition |
|---|---|---|
| 1 | Decide logging goal | Debug, spend, trace, compliance, or chargeback |
| 2 | Choose hook surface | SDK callbacks or proxy spend logs |
| 3 | Add request IDs | Every call can be traced across systems |
| 4 | Add tags | Team, env, customer, job, model route |
| 5 | Avoid blocking callbacks | Use async for external I/O |
| 6 | Redact secrets and PII | No API keys, credentials, or sensitive prompts stored by default |
| 7 | Validate cost map | Known models match provider pricing |
| 8 | Test streaming | Streamed calls include complete cost/usage expectations |
| 9 | Alert on failures | Failure callback or APM integration fires |
| 10 | Reconcile monthly | LiteLLM spend vs provider invoice |
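Step 10 can be automated as a comparison of per-provider totals. The sketch below assumes you have already exported LiteLLM spend and invoice amounts for the same time range into two dictionaries; `reconcile`, the 2% tolerance, and the sample figures are all hypothetical.

```python
def reconcile(litellm_spend, invoice_spend, tolerance=0.02):
    """Compare per-provider totals and flag relative gaps above the tolerance."""
    report = {}
    for provider in sorted(set(litellm_spend) | set(invoice_spend)):
        logged = litellm_spend.get(provider, 0.0)
        billed = invoice_spend.get(provider, 0.0)
        gap = logged - billed
        if billed:
            rel = abs(gap) / billed
        else:
            # Nothing billed: any logged spend is an unexplained gap.
            rel = float("inf") if logged else 0.0
        report[provider] = {
            "logged": logged,
            "billed": billed,
            "gap": gap,
            "flag": rel > tolerance,
        }
    return report


# Invented monthly totals in USD for the same billing period.
report = reconcile(
    {"openai": 1980.0, "anthropic": 900.0},
    {"openai": 2000.0, "anthropic": 1000.0},
)
for provider, row in report.items():
    print(provider, row)
```

In this sample the OpenAI gap is 1% and passes, while the Anthropic gap is 10% and is flagged for a closer look at time ranges and token categories.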

Risks and Caveats

| Risk | What breaks | Mitigation | Status |
|---|---|---|---|
| Callback not called | No logs | Register correct hook and use correct mode | Confirmed |
| Blocking callback | Latency spikes | Use async hooks for I/O | Confirmed |
| Prompt leakage | Sensitive data in logs | Scrub or hash payloads | Likely |
| Cost mismatch | Spend logs differ from provider invoice | Sync pricing data and compare token categories | Confirmed |
| Tag drift | Chargeback slices become unreliable | Enforce tags in gateway | Likely |
| Streaming edge cases | Usage may be harder to reconcile | Test include_usage and provider behavior | Likely |
| Custom pricing stale | Budget caps wrong | Update model costs regularly | Confirmed |
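Tag drift is easier to prevent when every request is checked against a required tag set before it is sent or accepted. The sketch below shows one possible check; the `env:` and `team:` prefix convention and the `missing_tag_prefixes` helper are assumptions for illustration, not a LiteLLM feature.

```python
REQUIRED_PREFIXES = ("env:", "team:")  # hypothetical tagging convention


def missing_tag_prefixes(tags, required=REQUIRED_PREFIXES):
    """Return the required prefixes that no tag in the request satisfies."""
    return [p for p in required if not any(t.startswith(p) for t in tags)]


ok = missing_tag_prefixes(["env:prod", "team:api", "job:nightly"])
bad = missing_tag_prefixes(["env:prod"])
print(ok, bad)
```

The same function could be called from client code before a request goes out, or from gateway-side policy code that rejects requests when the returned list is not empty.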

Final Recommendation

Use LiteLLM callbacks for per-call events, SpendLogs for budget reporting, and request tags for chargeback. In production, do not block inside callbacks, do not store raw prompts by default, and reconcile LiteLLM spend against provider invoices every month.

FAQ

What is LiteLLM logger?

LiteLLM logger usually means the callback and proxy logging surfaces used to capture LLM calls, costs, failures, tags, and metadata. It is not one single feature.

How do I log LiteLLM costs?

Use response_cost in callback kwargs for simple app-level logging, or LiteLLM Proxy spend tracking for durable key/user/team spend logs.

What are LiteLLM spend logs?

Spend logs are proxy-side records that can include model, spend, request ID, and request tags. They are used for cost tracking, budgets, and chargeback.

How do LiteLLM request tags work?

Tags can come from config, x-litellm-tags, request body tags, metadata, key metadata, team metadata, or configured custom headers. They appear in request_tags.

Can LiteLLM log to Datadog?

Yes, LiteLLM documents Datadog as an observability integration. Use it for operations metrics and traces, not as a replacement for spend logs.

Should I log prompts and responses?

Only if your privacy model allows it. For most production systems, log metadata, model, cost, latency, request ID, and redacted snippets by default.

Why does LiteLLM spend not match my provider invoice?

Cost mismatches can come from time range alignment, token category differences, cache pricing, provider tier metadata, or stale model pricing maps. LiteLLM docs recommend a discrepancy debugging workflow.

Do I need LiteLLM Proxy for callbacks?

No. Python SDK callbacks work without proxy. Proxy-only hooks and durable spend logs require the proxy path and usually a database.
