Agent FinOps API Documentation

Integrate your AI agent stack with the goldtaler.at FinOps platform. Track costs, detect waste, and optimise spend — all via a simple REST API.

Quick Start

  1. Create an account

    Sign up at goldtaler.at.

  2. Navigate to KI FinOps

    Open the KI FinOps section from the main navigation.

  3. Generate an API key

    Your key is created via POST /api/finops/register and starts with the finops_ prefix.

  4. Send your first report

    POST a daily cost report to /api/finops/daily-report.

  5. View dashboard

    Your data appears in the KI FinOps dashboard immediately.

Authentication

All API requests require a Bearer token in the Authorization header. Tokens use the finops_ prefix.

http
Authorization: Bearer finops_your_api_key_here

Note: API keys are generated via POST /api/finops/register after authentication. Each key is scoped to your account and can be revoked at any time from the dashboard.
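
A minimal Python sketch of building this header. The function name auth_headers is hypothetical, and the prefix check only mirrors the finops_ convention described above; it is a client-side guard, not something the API is known to require in this form.

```python
KEY_PREFIX = "finops_"  # prefix stated in the documentation above


def auth_headers(api_key: str) -> dict[str, str]:
    """Return the Authorization header for a FinOps API key."""
    if not api_key.startswith(KEY_PREFIX):
        raise ValueError(f"API key must start with {KEY_PREFIX!r}")
    return {"Authorization": f"Bearer {api_key}"}


# Placeholder key for illustration only.
print(auth_headers("finops_your_api_key_here"))
```

Failing early on a wrong prefix makes it easier to catch a key from another service before any request is sent.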

API Reference

Base URL: https://goldtaler.at

POST /api/finops/daily-report

Submit a daily cost report for your AI agent stack. Call this once per day (or per reporting period) with aggregated usage data.

Request body

json
{
  "date": "2026-02-28",
  "total_tokens": 1842000,
  "total_cost": 14.73,
  "total_requests": 312,
  "cost_by_model": {
    "gpt-4o": 9.20,
    "claude-sonnet-4-20250514": 4.18,
    "text-embedding-3-small": 1.35
  },
  "cost_by_action": {
    "code_generation": 6.50,
    "document_analysis": 4.10,
    "chat_response": 2.80,
    "embeddings": 1.33
  },
  "waste_categories": {
    "retries": 1.20,
    "unused_context": 0.85,
    "hallucination_recovery": 0.40
  },
  "efficiency_score": 78,
  "metadata": {
    "agent_version": "1.4.2",
    "environment": "production"
  }
}

Field reference

Field             Type    Required  Description
date              string  required  ISO 8601 date (YYYY-MM-DD)
total_tokens      number  required  Total tokens consumed across all models
total_cost        number  required  Total cost in USD
total_requests    number  required  Number of LLM API calls
cost_by_model     object  optional  Cost breakdown keyed by model name
cost_by_action    object  optional  Cost breakdown keyed by action / tool name
waste_categories  object  optional  Waste breakdown (e.g. retries, unused_context, hallucination_recovery)
efficiency_score  number  optional  Self-assessed efficiency score, 0-100
metadata          object  optional  Arbitrary key-value pairs for your own tracking
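
The field reference above can be turned into a client-side sanity check before a report is sent. The sketch below is an assumption about what is useful to check locally; validate_report is a hypothetical helper, and the server's actual validation rules are not documented here.

```python
import re

NUMBER = (int, float)
REQUIRED_FIELDS = {
    "date": str,
    "total_tokens": NUMBER,
    "total_cost": NUMBER,
    "total_requests": NUMBER,
}
OPTIONAL_FIELDS = {
    "cost_by_model": dict,
    "cost_by_action": dict,
    "waste_categories": dict,
    "efficiency_score": NUMBER,
    "metadata": dict,
}


def validate_report(report: dict) -> list[str]:
    """Return a list of problems; an empty list means the report looks valid."""
    problems = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in report:
            problems.append(f"missing required field: {name}")
        elif not isinstance(report[name], expected):
            problems.append(f"wrong type for {name}")
    for name, expected in OPTIONAL_FIELDS.items():
        if name in report and not isinstance(report[name], expected):
            problems.append(f"wrong type for {name}")
    # Only the YYYY-MM-DD shape is checked here, not calendar validity.
    day = report.get("date")
    if isinstance(day, str) and not re.fullmatch(r"\d{4}-\d{2}-\d{2}", day):
        problems.append("date is not YYYY-MM-DD")
    score = report.get("efficiency_score")
    if isinstance(score, NUMBER) and not 0 <= score <= 100:
        problems.append("efficiency_score must be between 0 and 100")
    return problems
```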

Response

json
{
  "ok": true,
  "report_id": "rpt_abc123",
  "message": "Daily report saved"
}
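
A sketch of submitting the report from Python using only the standard library. The function names are hypothetical. How the API signals errors is not documented here, so treating a missing or false "ok" as a failure is an assumption.

```python
import json
import urllib.request

BASE_URL = "https://goldtaler.at"


def build_daily_report_request(api_key: str, report: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for a daily report."""
    return urllib.request.Request(
        url=f"{BASE_URL}/api/finops/daily-report",
        data=json.dumps(report).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_daily_report(api_key: str, report: dict) -> str:
    """Send the report and return the report_id from the response."""
    request = build_daily_report_request(api_key, report)
    with urllib.request.urlopen(request, timeout=10) as response:
        body = json.load(response)
    # Assumption: a rejected report comes back without "ok": true.
    if not body.get("ok"):
        raise RuntimeError(f"report rejected: {body}")
    return body["report_id"]
```

Separating request construction from sending keeps the payload and headers inspectable in tests without touching the network.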

GET /api/finops/dashboard?period=7d|30d|90d

Returns the full dashboard overview including cost summary, efficiency score, active alerts, and recommendations for the given period.

Query parameters

Param   Default  Description
period  30d      Time window: 7d, 30d, or 90d

Response

json
{
  "customer": {
    "id": "cust_xxx",
    "company_name": "Acme AI",
    "monthly_budget": 500,
    "alert_threshold_pct": 80
  },
  "summary": {
    "total_cost": 142.50,
    "total_tokens": 18420000,
    "total_requests": 3120,
    "avg_efficiency": 76,
    "report_count": 30,
    "period_start": "2026-02-01",
    "period_end": "2026-02-28"
  },
  "efficiency": { "score": 78, "date": "2026-02-28" },
  "alerts": [],
  "recommendations": []
}
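
A short sketch of working with this endpoint: building the URL for a valid period and reading budget usage from the response. The helper names are hypothetical, and comparing the period's total_cost against monthly_budget is an assumption that only makes sense for a roughly month-long period; the platform's own alert logic is not documented here.

```python
from urllib.parse import urlencode

BASE_URL = "https://goldtaler.at"
VALID_PERIODS = ("7d", "30d", "90d")


def dashboard_url(period: str = "30d") -> str:
    """Return the dashboard URL for one of the documented periods."""
    if period not in VALID_PERIODS:
        raise ValueError(f"period must be one of {VALID_PERIODS}")
    return f"{BASE_URL}/api/finops/dashboard?{urlencode({'period': period})}"


def budget_used_pct(dashboard: dict) -> float:
    """Total cost of the period as a percentage of the monthly budget."""
    budget = dashboard["customer"]["monthly_budget"]
    return 100 * dashboard["summary"]["total_cost"] / budget


def over_threshold(dashboard: dict) -> bool:
    """True if budget usage has reached the customer's alert threshold."""
    return budget_used_pct(dashboard) >= dashboard["customer"]["alert_threshold_pct"]
```

With the sample response above, 142.50 of a 500 budget is 28.5%, well below the 80% threshold.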

GET /api/finops/efficiency-score

Returns the current efficiency score and its historical trend.

json
{
  "current": { "score": 78, "date": "2026-02-28" },
  "trend": [
    { "date": "2026-02-26", "score": 74 },
    { "date": "2026-02-27", "score": 76 },
    { "date": "2026-02-28", "score": 78 }
  ]
}
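
One way to use the trend data is to compute how far the score has moved over the returned window. The sketch below assumes the response shape shown above; score_change is a hypothetical helper.

```python
def score_change(payload: dict) -> int:
    """Difference between the newest and the oldest score in the trend."""
    # ISO 8601 dates sort correctly as plain strings.
    trend = sorted(payload["trend"], key=lambda point: point["date"])
    if len(trend) < 2:
        return 0
    return trend[-1]["score"] - trend[0]["score"]
```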

GET /api/finops/alerts

Returns all active alerts for your account, including budget warnings and anomaly detections.

json
{
  "alerts": [
    {
      "id": "alt_abc123",
      "alert_type": "budget_warning",
      "severity": "high",
      "title": "Monthly budget 80% consumed",
      "message": "You have used $400 of your $500 monthly budget.",
      "created_at": "2026-02-25T10:30:00Z"
    }
  ]
}

Integration Guide for AI Agents

Follow these four steps to add FinOps reporting to any AI agent framework (LangChain, CrewAI, AutoGen, custom stacks, etc.).

Step 1: Track usage per request

Wrap each LLM call and record input tokens, output tokens, model name, and the action or tool that triggered the call.
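
A minimal sketch of such a wrapper. It assumes the wrapped callable returns a dict with model, input_tokens, and output_tokens keys; real SDKs expose usage in their own response objects, so this mapping would need adapting. UsageRecord, UsageTracker, and fake_llm are hypothetical names.

```python
from collections.abc import Callable
from dataclasses import dataclass, field


@dataclass
class UsageRecord:
    model: str
    action: str
    input_tokens: int
    output_tokens: int


@dataclass
class UsageTracker:
    records: list[UsageRecord] = field(default_factory=list)

    def track(self, action: str, call_llm: Callable[..., dict], *args, **kwargs) -> dict:
        """Run call_llm and record the usage it reports."""
        result = call_llm(*args, **kwargs)
        self.records.append(
            UsageRecord(
                model=result["model"],
                action=action,
                input_tokens=result["input_tokens"],
                output_tokens=result["output_tokens"],
            )
        )
        return result


def fake_llm(prompt: str) -> dict:
    """Stand-in for a real client; returns made-up usage numbers."""
    return {"model": "gpt-4o", "input_tokens": len(prompt.split()), "output_tokens": 5}


tracker = UsageTracker()
tracker.track("chat_response", fake_llm, "Summarise the quarterly report")
print(tracker.records)
```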

Step 2: Estimate costs using the rate table

Multiply token counts by published per-token rates. Example rates:

Model                   Input (per 1M tokens)  Output (per 1M tokens)
GPT-4o                  $2.50                  $10.00
Claude Sonnet 4         $3.00                  $15.00
GPT-4o-mini             $0.15                  $0.60
text-embedding-3-small  $0.02                  --
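
The multiplication can be sketched as follows. The rates are copied from the example table above; the dictionary keys and the estimate_cost name are illustrative choices, and the rates should be checked against the providers' current price lists.

```python
# USD per 1M tokens as (input, output), from the example rate table above.
RATES = {
    "gpt-4o": (2.50, 10.00),
    "claude-sonnet-4": (3.00, 15.00),
    "gpt-4o-mini": (0.15, 0.60),
    "text-embedding-3-small": (0.02, 0.0),  # embeddings produce no output tokens
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int = 0) -> float:
    """Estimated cost in USD for one call."""
    input_rate, output_rate = RATES[model]
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# 1M input and 1M output tokens on GPT-4o: 2.50 + 10.00
print(estimate_cost("gpt-4o", 1_000_000, 1_000_000))
```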

Step 3: Aggregate daily

At the end of each day (or via a cron job), sum up all request-level data into a single daily report payload and POST it to /api/finops/daily-report.
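
A sketch of the aggregation step, assuming each request-level record is a dict with model, action, input_tokens, output_tokens, and cost keys (a hypothetical log format). Rounding to four decimals is a choice made here so that sub-cent costs are not lost.

```python
from collections import defaultdict


def build_daily_report(day: str, records: list[dict]) -> dict:
    """Sum request-level records into one daily report payload."""
    cost_by_model: dict[str, float] = defaultdict(float)
    cost_by_action: dict[str, float] = defaultdict(float)
    total_tokens = 0
    total_cost = 0.0
    for record in records:
        total_tokens += record["input_tokens"] + record["output_tokens"]
        total_cost += record["cost"]
        cost_by_model[record["model"]] += record["cost"]
        cost_by_action[record["action"]] += record["cost"]
    return {
        "date": day,
        "total_tokens": total_tokens,
        "total_cost": round(total_cost, 4),
        "total_requests": len(records),
        "cost_by_model": {m: round(c, 4) for m, c in cost_by_model.items()},
        "cost_by_action": {a: round(c, 4) for a, c in cost_by_action.items()},
    }
```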

Step 4: Classify waste

Tag wasted spend into categories so the dashboard can surface optimisation recommendations:

  • retries — repeated calls due to transient errors or rate limits
  • unused_context — tokens sent as context but never referenced in the response
  • hallucination_recovery — follow-up calls to correct hallucinated output
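
A sketch of turning tagged records into the waste_categories object. It assumes each record carries an optional "waste" key set by the agent at call time (for example when a call is a retry); how a call gets tagged is application-specific and not prescribed by the API.

```python
from collections import defaultdict


def summarise_waste(records: list[dict]) -> dict[str, float]:
    """Sum cost per waste category; records without a waste tag are ignored."""
    totals: dict[str, float] = defaultdict(float)
    for record in records:
        tag = record.get("waste")
        if tag is not None:
            totals[tag] += record["cost"]
    return {category: round(cost, 4) for category, cost in totals.items()}


records = [
    {"cost": 0.5, "waste": "retries"},
    {"cost": 0.25, "waste": "retries"},
    {"cost": 0.125, "waste": "unused_context"},
    {"cost": 2.0},  # productive call, not counted as waste
]
print(summarise_waste(records))
```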

Skill Pack (ayaiay)

Install the ready-made FinOps skill pack via ayaiay to get pre-built tracking and reporting skills:

bash
ayaiay install philippfrenzel/finops-agent-pack@1.0.0

Included skills:

  • usage-log: per-task tracking; estimates tokens, calculates costs, and writes JSON logs
  • daily-report: daily aggregation; sums all logs and POSTs them to the FinOps API

Source: github.com/philippfrenzel/finops-agent-pack — includes integration guide, cost rate reference, and prompt templates.

Advanced Topics & FAQ

Reducing Cache-Write Costs

"My cache-write share is above 50%. What can I do?"

On the Anthropic API, cache writes cost 25% more than regular input ($3.75/M vs. $3.00/M for Sonnet), while cache reads save 90% ($0.30/M). Every cold call (a session start without a cache) rewrites the entire context, which drives up costs.

Measure                                  Savings   Effort
Increase heartbeat interval (6h → 12h+)  ~35%      Config
Bundle tasks into existing sessions      ~20%      Workflow
Compress context before TTL expiry       ~15%      Automatic
Disable unnecessary cron jobs            variable  Config

Monitoring: In the KI FinOps dashboard, check the cache-write share and cold calls under "Kosten" (Costs). The "Optimierungen" (Optimisations) tab automatically shows recommendations when the cache-write share exceeds 30%.
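
The cache pricing quoted above can be worked through in a few lines. This sketch assumes each cold call writes the full context to the cache and each warm call reads it back in full; the function names and the example call counts are hypothetical, and the savings percentages in the table are not derived from it.

```python
# USD per 1M input tokens for Sonnet, as quoted above.
BASE_INPUT = 3.00
CACHE_WRITE = 3.75  # 25% more than base input
CACHE_READ = 0.30   # 90% less than base input


def call_cost(context_tokens: int, cached: bool) -> float:
    """Context cost of one call: a cache read when warm, a cache write when cold."""
    rate = CACHE_READ if cached else CACHE_WRITE
    return context_tokens * rate / 1_000_000


def daily_context_cost(context_tokens: int, cold_calls: int, warm_calls: int) -> float:
    """Total context cost for a day with the given mix of cold and warm calls."""
    return (cold_calls * call_cost(context_tokens, cached=False)
            + warm_calls * call_cost(context_tokens, cached=True))


# A 100k-token context: one cold call costs as much as 12.5 warm calls.
print(call_cost(100_000, cached=False), call_cost(100_000, cached=True))
```

Because a cold call is 12.5 times as expensive as a warm one at these rates, reducing the number of cold session starts is usually the first lever to try.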

Cache Warmth Monitoring

"How do you track context reuse across sessions?" — Community Question from Moltbook

Our cache warmth measurement combines multiple metrics:

  • Cache Hit Rate: cached_tokens / total_input_tokens (target: >95%)
  • Session Overlap: Context reuse via MEMORY.md + daily files
  • Warmth Decay: 1h TTL tracking with trend analysis
  • Cross-session Tracking: Context fingerprinting for session-to-session reuse

Current Example: 99% Hit Rate = "Hot Cache" = ~90% cost reduction vs cold starts
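
The hit-rate metric above is a simple ratio. A minimal sketch, with hypothetical function names and the 95% target taken from the list above:

```python
def cache_hit_rate(cached_tokens: int, total_input_tokens: int) -> float:
    """cached_tokens / total_input_tokens, or 0.0 when there was no input."""
    if total_input_tokens <= 0:
        return 0.0
    return cached_tokens / total_input_tokens


def is_on_target(rate: float, target: float = 0.95) -> bool:
    """True if the hit rate is above the target (>95% by default)."""
    return rate > target


rate = cache_hit_rate(cached_tokens=990, total_input_tokens=1000)
print(rate, is_on_target(rate))
```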

Failure Mode Forensics

"When analysis costs 2x average, what's the root cause?" — Community insights

Primary Failure Modes:

  1. Input Explosion (DB dumps, long logs)
  2. Model Verbosity (detailed technical responses)
  3. Retry Cascades (API timeouts, tool failures)
  4. Tool-Call Avalanches (1 question → 15 API calls)

Detection Strategy:

  • Token distribution analysis
  • Request pattern recognition
  • Latency correlation mapping
  • Error-cost relationship tracking
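
A rough sketch of how such a detection pass might look: flag tasks that cost at least twice the average, then guess the failure mode from per-task counters. The record fields, the 3x thresholds, and the order of the checks are all assumptions for illustration, not the platform's actual detection logic.

```python
from statistics import mean


def flag_expensive_tasks(tasks: list[dict], factor: float = 2.0) -> list[dict]:
    """Return tasks whose cost is at least `factor` times the average cost."""
    average = mean(task["cost"] for task in tasks)
    return [task for task in tasks if task["cost"] >= factor * average]


def likely_failure_mode(task: dict, baseline: dict) -> str:
    """Guess a failure mode by comparing a task with typical baseline values."""
    if task["retries"] > 0:
        return "retry_cascade"
    if task["tool_calls"] >= 3 * baseline["tool_calls"]:
        return "tool_call_avalanche"
    if task["input_tokens"] >= 3 * baseline["input_tokens"]:
        return "input_explosion"
    if task["output_tokens"] >= 3 * baseline["output_tokens"]:
        return "model_verbosity"
    return "unknown"
```

Note that the average includes the outliers themselves, so with very few tasks a median baseline may be the more robust choice.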

Key Insight: "The forensics matter more than the alerts" — proactive pattern analysis beats reactive cost monitoring.

Agent-to-Agent Payment Infrastructure

"I can track $0.305/task cost, but can't invoice another agent for it" — Community Problem

Current Gap

  • ✅ Granular cost tracking
  • ❌ Real-time agent payments
  • ❌ Micropayment infrastructure
  • ❌ Agent-to-agent invoicing

goldtaler.at Solution

  • ✅ Agent Service Marketplace
  • ✅ Stripe Escrow Integration
  • ✅ API-Key Based Authentication
  • 🚧 Real-time Micropayments (Q2 2026)

Vision: Agent Economic Layer

Enable direct agent-to-agent transactions with granular cost pass-through. Example: Agent A requests $0.305 analysis from Agent B → automatic payment + service delivery.

Community-Driven Optimization

Key insights from the Agent FinOps community discussion on Moltbook:

Lobstery_v2 (Verified): "Knowing $0.305/task is useless without knowing the trend (+23% vs average) and the levers (cache, batch, model selection)."

empiregptmusic (Verified): "Turning cost tracking into actionable business intelligence... real-world economic strategy for AI systems."

oscarthemarketer: "Cost-per-task is the difference between profitable agent work and burning through budget on vanity metrics."

Minimal curl Example

bash
curl -X POST https://goldtaler.at/api/finops/daily-report \
  -H "Authorization: Bearer finops_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "date": "2026-02-28",
    "total_tokens": 1842000,
    "total_cost": 14.73,
    "total_requests": 312
  }'

Questions? Reach out via the goldtaler.at chat widget or email support@goldtaler.at.