Real-Time Cost Dashboard: Monitor Spend & Usage Metrics

A real-time cost dashboard surfaces LLM spending across features, users, and time windows, enabling rapid detection of cost anomalies and trend analysis. Without visibility, runaway loops or unexpected traffic spikes can burn thousands of dollars before anyone notices. With a dashboard, an engineer sees "support_qa jumped 300% this hour" right away and can investigate within minutes: Is there a bug in the prompting logic? Is a bot attack flooding the support endpoint with queries?

A cost dashboard aggregates cost logs (from Article 2) into queryable time-series data, computes spend by dimension (feature, user, model, region), and surfaces top-N insights: "Top 5 cost drivers this week," "Cost per request trend over 30 days," "Projected monthly spend if current burn rate continues." Building a full dashboard takes about one week, and the payback can be immediate: catching a single five-figure mistake in the first month more than covers the build cost.

Architecture of a Real-Time Cost Dashboard

A cost dashboard pipeline has four components:

  1. Cost event ingestion: Raw log events (from instrumentation in Articles 2–3) stream into a data warehouse or time-series DB.
  2. Aggregation: Batch or stream-processing jobs compute spend by feature, user, model, hour.
  3. Metrics store: Pre-aggregated metrics stored in a metrics database (Prometheus, InfluxDB, or data warehouse) for fast querying.
  4. Visualization: Dashboard frontend (Grafana, Looker, custom web UI) queries metrics and renders charts.

Here is a conceptual pipeline:

App logs cost events
→ Pub/Sub (Kafka, Cloud Pub/Sub)
→ Stream processor (Flink, Dataflow)
→ BigQuery / Snowflake (warehouse)
→ Grafana / Looker (dashboard)

For a simpler setup (small teams, low volume):

App logs to PostgreSQL
→ cron job (hourly) aggregates and stores in Metrics table
→ Flask web app queries Metrics, renders dashboard

The simpler setup is easier to build and maintain; the complex setup scales to billions of events/day. Start simple; scale up only if needed.
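The hourly aggregation step of the simpler setup can be sketched in plain Python. This is a minimal illustration, assuming each cost event is a dict with `timestamp`, `feature_name`, and `cost_usd` keys (the field names used by the queries later in this article). The `aggregate_hourly` helper is hypothetical; in practice the same rollup would usually run as a SQL `GROUP BY` inside the cron job.

```python
from collections import defaultdict
from datetime import datetime


def aggregate_hourly(events):
    """Roll raw cost events up into (hour, feature) buckets.

    Each event is assumed to be a dict with `timestamp` (datetime),
    `feature_name` (str), and `cost_usd` (float) keys.
    """
    buckets = defaultdict(lambda: {"total_cost": 0.0, "request_count": 0})
    for event in events:
        # Truncate the timestamp to the start of its hour.
        hour = event["timestamp"].replace(minute=0, second=0, microsecond=0)
        bucket = buckets[(hour, event["feature_name"])]
        bucket["total_cost"] += event["cost_usd"]
        bucket["request_count"] += 1
    return dict(buckets)


# Illustrative sample events (made-up values).
events = [
    {"timestamp": datetime(2024, 1, 1, 9, 5), "feature_name": "support_qa", "cost_usd": 0.02},
    {"timestamp": datetime(2024, 1, 1, 9, 40), "feature_name": "support_qa", "cost_usd": 0.03},
    {"timestamp": datetime(2024, 1, 1, 10, 1), "feature_name": "chat", "cost_usd": 0.01},
]
hourly = aggregate_hourly(events)
```

Each bucket corresponds to one row of the Metrics table, so the dashboard reads a few rows per hour instead of scanning every raw event.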

Building a Lightweight Cost Dashboard with PostgreSQL + Flask

Here is a minimal dashboard that can be built in one afternoon. It assumes a cost_events table with feature_name, timestamp, and cost_usd columns:

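import os

import flask
import psycopg2

app = flask.Flask(__name__)
# Read the connection string from the environment when available.
# COST_DB_URL is an illustrative variable name; the fallback is for local development only.
db_url = os.environ.get("COST_DB_URL", "postgresql://user:password@localhost/cost_db")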

def query_cost_by_feature(days: int = 30) -> list[dict]:
    """Query daily cost per feature over the last N days."""
    conn = psycopg2.connect(db_url)
    try:
        cur = conn.cursor()
        # Multiply the bound integer by a one-day interval rather than
        # placing the placeholder inside a quoted SQL literal.
        cur.execute("""
            SELECT
                feature_name,
                DATE(timestamp) AS date,
                SUM(cost_usd) AS total_cost,
                COUNT(*) AS request_count,
                AVG(cost_usd) AS avg_cost_per_request
            FROM cost_events
            WHERE timestamp > NOW() - %s * INTERVAL '1 day'
            GROUP BY feature_name, date
            ORDER BY date DESC, total_cost DESC
        """, (days,))
        rows = cur.fetchall()
    finally:
        # Release the connection even if the query fails.
        conn.close()

    return [
        {
            "feature": row[0],
            "date": str(row[1]),
            "total_cost": float(row[2]),
            "request_count": int(row[3]),
            "avg_cost": float(row[4]),
        }
        for row in rows
    ]

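def query_spend_forecast(days_elapsed=None) -> dict:
    """Forecast monthly spend from the burn rate so far.

    If days_elapsed is not given, it defaults to the current day of the month.
    """
    conn = psycopg2.connect(db_url)
    try:
        cur = conn.cursor()
        # Month-to-date spend, today's day of month, and the length of this month
        cur.execute("""
            SELECT
                COALESCE(SUM(cost_usd), 0),
                EXTRACT(DAY FROM NOW()),
                EXTRACT(DAY FROM DATE_TRUNC('month', NOW())
                                 + INTERVAL '1 month' - INTERVAL '1 day')
            FROM cost_events
            WHERE DATE_TRUNC('month', timestamp) = DATE_TRUNC('month', NOW())
        """)
        spend, day_of_month, days_in_month = cur.fetchone()
    finally:
        conn.close()

    # The driver may return Decimal values; convert so the result is JSON-friendly.
    spend_so_far = float(spend)
    if days_elapsed is None:
        # Counts today as a full day, so the burn rate reads slightly low early in the day.
        days_elapsed = int(day_of_month)
    daily_burn = spend_so_far / max(days_elapsed, 1)
    forecasted_monthly = daily_burn * int(days_in_month)

    return {
        "spend_so_far": round(spend_so_far, 2),
        "days_elapsed": days_elapsed,
        "daily_burn_rate": round(daily_burn, 2),
        "forecasted_monthly": round(forecasted_monthly, 2),
    }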

def query_cost_anomalies(threshold: float = 0.5) -> list[dict]:
    """Detect features whose cost today deviates from the prior 7-day
    average by more than `threshold` (0.5 = 50%), in either direction."""
    conn = psycopg2.connect(db_url)
    try:
        cur = conn.cursor()
        cur.execute("""
            WITH daily_costs AS (
                SELECT
                    feature_name,
                    DATE(timestamp) AS date,
                    SUM(cost_usd) AS daily_cost
                FROM cost_events
                -- Start at midnight 7 days ago so the baseline has only full days
                WHERE timestamp >= CURRENT_DATE - INTERVAL '7 days'
                GROUP BY feature_name, date
            ),
            weekly_avg AS (
                SELECT
                    feature_name,
                    AVG(daily_cost) AS avg_cost,
                    STDDEV(daily_cost) AS stddev_cost
                FROM daily_costs
                WHERE date < CURRENT_DATE
                GROUP BY feature_name
            )
            SELECT
                dc.feature_name,
                dc.date,
                dc.daily_cost,
                wa.avg_cost,
                (dc.daily_cost - wa.avg_cost) / NULLIF(wa.avg_cost, 0) AS pct_change
            FROM daily_costs dc
            JOIN weekly_avg wa ON dc.feature_name = wa.feature_name
            WHERE dc.date = CURRENT_DATE
              AND ABS((dc.daily_cost - wa.avg_cost) / NULLIF(wa.avg_cost, 0)) > %s
            ORDER BY pct_change DESC
        """, (threshold,))
        rows = cur.fetchall()
    finally:
        conn.close()

    return [
        {
            "feature": row[0],
            "date": str(row[1]),
            "today_cost": float(row[2]),
            "week_avg": float(row[3]),
            "pct_change": float(row[4]) * 100,  # fraction -> percent
        }
        for row in rows
    ]

@app.route('/')
def dashboard():
    """Render main dashboard."""
    cost_by_feature = query_cost_by_feature(days=30)
    forecast = query_spend_forecast()
    anomalies = query_cost_anomalies()

    return flask.render_template(
        'dashboard.html',
        cost_by_feature=cost_by_feature,
        forecast=forecast,
        anomalies=anomalies,
    )

@app.route('/api/cost-by-feature')
def api_cost_by_feature():
    """JSON API for frontend charting."""
    return flask.jsonify(query_cost_by_feature(days=30))

@app.route('/api/forecast')
def api_forecast():
    """JSON API for spend forecast."""
    return flask.jsonify(query_spend_forecast())

@app.route('/api/anomalies')
def api_anomalies():
    """JSON API for cost anomalies."""
    return flask.jsonify(query_cost_anomalies())

if __name__ == '__main__':
    # debug=True enables the interactive debugger and auto-reload; local development only.
    app.run(debug=True, port=5000)

Pair this with a simple HTML template, saved as templates/dashboard.html so Flask can find it (using Chart.js or similar for charting):

<!DOCTYPE html>
<html>
<head>
  <title>LLM Cost Dashboard</title>
  <script src="https://cdn.jsdelivr.net/npm/chart.js"></script>
  <style>
    body { font-family: sans-serif; margin: 20px; }
    .metric { display: inline-block; margin: 10px; padding: 10px; border: 1px solid #ccc; }
    .anomaly { background-color: #ffcccc; padding: 5px; margin: 5px; }
    #costChart { max-width: 600px; }
  </style>
</head>
<body>
  <h1>LLM Cost Dashboard</h1>

  <!-- Forecast metrics -->
  <div class="metrics">
    <div class="metric">
      <h3>Spend So Far</h3>
      <p>${{ forecast.spend_so_far }}</p>
    </div>
    <div class="metric">
      <h3>Daily Burn Rate</h3>
      <p>${{ forecast.daily_burn_rate }}/day</p>
    </div>
    <div class="metric">
      <h3>Forecasted Monthly</h3>
      <p>${{ forecast.forecasted_monthly }} (if trend continues)</p>
    </div>
  </div>

  <!-- Anomalies -->
  <h2>Cost Anomalies (Today)</h2>
  {% if anomalies %}
    {% for anomaly in anomalies %}
    <div class="anomaly">
      <strong>{{ anomaly.feature }}</strong>:
      ${{ anomaly.today_cost }} ({{ anomaly.pct_change | round(1) }}% vs. 7-day avg)
    </div>
    {% endfor %}
  {% else %}
    <p>No anomalies detected.</p>
  {% endif %}

  <!-- Cost by feature chart -->
  <h2>Cost by Feature (Last 30 Days)</h2>
  <canvas id="costChart"></canvas>

  <script>
    // Fetch cost data and render chart
    fetch('/api/cost-by-feature')
      .then(r => r.json())
      .then(data => {
        // Sum the daily rows into one total per feature
        const byFeature = {};
        data.forEach(row => {
          if (!byFeature[row.feature]) byFeature[row.feature] = 0;
          byFeature[row.feature] += row.total_cost;
        });

        const ctx = document.getElementById('costChart').getContext('2d');
        new Chart(ctx, {
          type: 'bar',
          data: {
            labels: Object.keys(byFeature),
            datasets: [{
              label: 'Cost (USD)',
              data: Object.values(byFeature),
              backgroundColor: 'rgba(75, 192, 192, 0.7)',
            }],
          },
          options: { responsive: true },
        });
      });
  </script>
</body>
</html>

This dashboard takes one afternoon to build and provides immediately actionable insights. You get:

  • Real-time forecast: "We're on pace for $12,500 this month" (vs budget of $10,000).
  • Anomalies: "Chat costs jumped 200% today. Why? (check for bug or traffic spike)."
  • Trend analysis: "Feature X was $1,000 last week, $2,000 this week. Need optimization."
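The burn-rate arithmetic behind the forecast can be checked without a database. The sketch below is a hypothetical pure-Python helper (`forecast_monthly_spend` is not part of the dashboard code above) that assumes a simple linear projection and uses the real length of the current month.

```python
import calendar
from datetime import date


def forecast_monthly_spend(spend_so_far, today):
    """Linear burn-rate forecast: spend so far / days elapsed * days in month."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    days_elapsed = max(today.day, 1)  # counts today as a full day
    daily_burn = spend_so_far / days_elapsed
    return round(daily_burn * days_in_month, 2)


# Illustrative numbers: $6,250 spent by June 15 projects to $12,500 for the 30-day month.
projected = forecast_monthly_spend(6250.0, date(2024, 6, 15))
```

A linear projection is the simplest possible model. It overreacts to spikes early in the month, so treat the number as a prompt to investigate rather than a precise estimate.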

Alerting: Programmatic Cost Monitoring

Combine dashboards with programmatic alerts. Alert when:

  • Any feature's cost exceeds 150% of daily average.
  • Daily total spend exceeds monthly budget / 30.
  • A user/project hits 80% of monthly quota.
  • Forecast shows we'll exceed monthly budget by >10%.

import requests

MONTHLY_BUDGET_USD = 10_000
FORECAST_ALERT_RATIO = 1.10  # alert when the forecast exceeds budget by more than 10%
SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/YOUR/WEBHOOK/URL"

def send_slack_message(text: str):
    """Send alert to Slack via an incoming webhook."""
    response = requests.post(SLACK_WEBHOOK_URL, json={"text": text}, timeout=10)
    response.raise_for_status()  # surface failed deliveries instead of ignoring them

def check_and_alert():
    """Run hourly: check for alert conditions."""
    forecast = query_spend_forecast()
    anomalies = query_cost_anomalies()

    alerts = []

    # Check forecast against budget plus a 10% margin
    if forecast["forecasted_monthly"] > MONTHLY_BUDGET_USD * FORECAST_ALERT_RATIO:
        alerts.append(
            f"FORECAST: Projected monthly spend ${forecast['forecasted_monthly']} "
            f"exceeds the ${MONTHLY_BUDGET_USD} budget by more than 10%."
        )

    # Check anomalies
    if anomalies:
        alerts.append(f"ANOMALIES: {len(anomalies)} features with >50% cost change today.")
        for anom in anomalies:
            alerts.append(f" - {anom['feature']}: {anom['pct_change']:+.0f}% vs. 7-day average")

    # Send alerts
    if alerts:
        message = "LLM Cost Alert:\n\n" + "\n".join(alerts)
        send_slack_message(message)
        print(f"Alert sent: {len(alerts)} lines")

Run this check every hour via cron or a scheduler. When anomalies are detected, Slack notifies the team immediately.
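The daily-budget and quota conditions from the list above are plain comparisons. The sketch below shows one way to express them, assuming month-to-date spend per user is already available as a dict. Both helper names and the sample numbers are hypothetical.

```python
def daily_budget_exceeded(daily_spend, monthly_budget):
    """True when today's spend is above an even 1/30 share of the monthly budget."""
    return daily_spend > monthly_budget / 30


def users_near_quota(spend_by_user, monthly_quota, fraction=0.8):
    """Return user IDs whose month-to-date spend has reached `fraction` of the quota."""
    return sorted(
        user
        for user, spend in spend_by_user.items()
        if spend >= fraction * monthly_quota
    )


# Illustrative values: a $10,000 budget allows about $333/day, so $400 trips the alert.
over = daily_budget_exceeded(daily_spend=400.0, monthly_budget=10_000)
flagged = users_near_quota({"alice": 85.0, "bob": 40.0}, monthly_quota=100.0)
```

Keeping the rules as small pure functions makes them easy to unit test separately from the database queries and the Slack call.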

Key Takeaways

  • A cost dashboard makes spend visible and actionable; catching and fixing a single large anomaly in the first month can recoup the build cost (about 1 week).
  • Aggregate cost logs by feature, user, model, and time window for multi-dimensional analysis.
  • Forecast monthly spend based on burn rate so far; alert if forecast exceeds budget.
  • Detect anomalies: features with >50% cost jumps vs recent average.
  • Combine dashboards with programmatic alerts that page on-call on critical issues.

Frequently Asked Questions

How often should I refresh dashboard data?

For the main dashboard (features, forecast, anomalies), refresh hourly. For detailed views (per-request costs, model breakdown), refresh every 4 hours. Avoid per-second refreshes: they are expensive, and rapid fluctuations are hard to reason about.
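One way to enforce a refresh interval is a small time-to-live cache in front of each query. The sketch below is an assumption about how that could look, not part of the dashboard code above. `TTLCache` is a hypothetical helper, and the injectable `clock` exists only to make the example testable.

```python
import time


class TTLCache:
    """Cache one computed value and recompute it only after `ttl_seconds`."""

    def __init__(self, compute, ttl_seconds, clock=time.monotonic):
        self._compute = compute
        self._ttl = ttl_seconds
        self._clock = clock
        self._value = None
        self._expires_at = None

    def get(self):
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            self._value = self._compute()
            self._expires_at = now + self._ttl
        return self._value


# Demonstration with a fake clock and a counter standing in for a slow query.
calls = {"count": 0}
clock = {"now": 0.0}


def expensive_query():
    calls["count"] += 1
    return calls["count"]


cache = TTLCache(expensive_query, ttl_seconds=3600, clock=lambda: clock["now"])
first = cache.get()    # computes the value
second = cache.get()   # served from the cache
clock["now"] = 3600.0
third = cache.get()    # TTL elapsed, so the value is recomputed
```

In the Flask app, a route could call `cache.get()` on an instance that wraps the underlying query function with `ttl_seconds=3600`, so the database is queried at most once per hour per view.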

What if my cost events have high latency (logged 2 hours after the request)?

Add a "processing latency" buffer to dashboard queries: only display data older than your worst-case logging delay (at least 2 hours in this example) so that every displayed window is complete. Real-time alerting should use a separate pipeline with lower latency (Kafka, Cloud Pub/Sub).
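The buffer amounts to a cutoff timestamp applied before display. A minimal sketch, assuming events carry a timezone-aware `timestamp` and that the ingestion delay is known and passed in as a parameter (both function names are hypothetical):

```python
from datetime import datetime, timedelta, timezone


def complete_data_cutoff(now, ingestion_delay):
    """Latest timestamp for which all events are assumed to have arrived."""
    return now - ingestion_delay


def filter_complete(events, now, ingestion_delay):
    """Keep only events old enough that late arrivals can no longer change them."""
    cutoff = complete_data_cutoff(now, ingestion_delay)
    return [event for event in events if event["timestamp"] <= cutoff]


# Illustrative values: with a 2-hour delay at 12:00 UTC, only data up to 10:00 is shown.
now = datetime(2024, 6, 1, 12, 0, tzinfo=timezone.utc)
events = [
    {"timestamp": datetime(2024, 6, 1, 9, 30, tzinfo=timezone.utc), "cost_usd": 0.02},
    {"timestamp": datetime(2024, 6, 1, 11, 15, tzinfo=timezone.utc), "cost_usd": 0.05},
]
visible = filter_complete(events, now, ingestion_delay=timedelta(hours=2))
```

In SQL the same cutoff becomes an extra condition on the `timestamp` column in each dashboard query.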

Should I track cost per token or per request?

Track both. Per-request is intuitive for business stakeholders ("We process 1,000 requests/day at $0.01 each = $10/day"). Per-token is intuitive for engineers ("We're using 10M tokens/month at $0.003 per 1K tokens = $30/month"). Both are useful for different audiences.
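Both unit costs fall out of the same three totals. The sketch below is a hypothetical helper that reports the token price per 1,000 tokens, which is an assumption about the preferred unit. The sample figures are illustrative.

```python
def unit_costs(total_cost_usd, request_count, token_count):
    """Derive per-request and per-1K-token cost from aggregate totals."""
    cost_per_request = total_cost_usd / request_count if request_count else 0.0
    cost_per_1k_tokens = total_cost_usd / token_count * 1000 if token_count else 0.0
    return {
        "cost_per_request": round(cost_per_request, 6),
        "cost_per_1k_tokens": round(cost_per_1k_tokens, 6),
    }


# Illustrative totals: $30 spent on 3,000 requests and 10M tokens.
metrics = unit_costs(total_cost_usd=30.0, request_count=3000, token_count=10_000_000)
```

A rising cost per request with a flat cost per 1K tokens points to longer prompts or responses rather than a price change.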

How do I handle multi-tenancy in the dashboard?

Add a tenant_id or org_id dimension to cost events and dashboard queries. Each tenant/org sees only their own costs. Use row-level security in your database to enforce isolation.
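At the application layer, tenant scoping means every query carries the tenant as a bound parameter. The sketch below assumes a `tenant_id` column on `cost_events` (not shown in the schema used earlier) and a hypothetical `tenant_cost_query` helper. It complements database row-level security rather than replacing it.

```python
def tenant_cost_query(tenant_id, days=30):
    """Return (sql, params) for per-feature spend scoped to one tenant.

    The tenant ID is passed as a bound parameter, never formatted into
    the SQL string, so it cannot be used for SQL injection.
    """
    sql = """
        SELECT feature_name, SUM(cost_usd) AS total_cost
        FROM cost_events
        WHERE tenant_id = %s
          AND timestamp > NOW() - %s * INTERVAL '1 day'
        GROUP BY feature_name
        ORDER BY total_cost DESC
    """
    return sql, (tenant_id, days)


# Illustrative call for a made-up tenant.
sql, params = tenant_cost_query("acme", days=7)
```

The returned pair can be passed straight to a psycopg2 cursor as `cur.execute(sql, params)`. The tenant ID should come from the authenticated session, not from a request parameter the caller controls.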

Can I integrate the dashboard with my billing system?

Yes. Export aggregated monthly costs (by project/feature) to your billing system (Stripe, custom), and use that for customer invoicing. The dashboard should use the same cost attribution as the billing system so the two can be reconciled in an audit.
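A simple interchange format for such an export is CSV. The sketch below assumes the monthly totals are already aggregated into `(project, feature, month, total)` tuples. The column names and the `export_monthly_costs` helper are hypothetical, and a real billing integration would follow that system's own import format or API.

```python
import csv
import io


def export_monthly_costs(rows):
    """Serialize (project, feature, month, total_cost_usd) rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["project", "feature", "month", "total_cost_usd"])
    for project, feature, month, total in rows:
        # Fix amounts to two decimals so both systems compare identical strings.
        writer.writerow([project, feature, month, f"{total:.2f}"])
    return buffer.getvalue()


# Illustrative row with made-up values.
csv_text = export_monthly_costs([("acme", "support_qa", "2024-06", 1234.5)])
```

Generating the export from the same aggregated table the dashboard reads keeps the two views consistent by construction.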
