Agent FinOps API Documentation

Integrate your AI agent stack with the goldtaler.at FinOps platform. Track costs, detect waste, and optimise spend — all via a simple REST API.

Quick Start

1. Create an account: sign up at goldtaler.at.
2. Navigate to KI FinOps: open the KI FinOps section from the main navigation.
3. Generate an API key: your key is created via POST /api/finops/register and starts with the finops_ prefix.
4. Send your first report: POST a daily cost report to /api/finops/daily-report.
5. View the dashboard: your data appears in the KI FinOps dashboard immediately.

Authentication

All API requests require a Bearer token in the Authorization header. Tokens use the finops_ prefix.

http

Authorization: Bearer finops_your_api_key_here

Note: API keys are generated via POST /api/finops/register after authentication. Each key is scoped to your account and can be revoked at any time from the dashboard.
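
As a minimal sketch of the header format above, the following Python helper builds an authenticated request without sending it. The helper name `build_request` and the client-side prefix check are assumptions made for illustration; only the base URL, the Bearer scheme, and the finops_ prefix come from this page.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "https://goldtaler.at"  # base URL as stated in the API reference


def build_request(path: str, api_key: str, body: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) a request carrying the Bearer token.

    Hypothetical helper: a JSON body makes it a POST, otherwise a GET.
    """
    if not api_key.startswith("finops_"):
        # Client-side sanity check only; the server is the real authority.
        raise ValueError("API keys are expected to start with 'finops_'")
    data = json.dumps(body).encode("utf-8") if body is not None else None
    method = "POST" if body is not None else "GET"
    req = urllib.request.Request(BASE_URL + path, data=data, method=method)
    req.add_header("Authorization", f"Bearer {api_key}")
    if body is not None:
        req.add_header("Content-Type", "application/json")
    return req
```

Passing the returned object to `urllib.request.urlopen` would perform the actual call.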

API Reference

Base URL: https://goldtaler.at

POST /api/finops/daily-report

Submit a daily cost report for your AI agent stack. Call this once per day (or per reporting period) with aggregated usage data.

Request body

json

{
  "date": "2026-02-28",
  "total_tokens": 1842000,
  "total_cost": 14.73,
  "total_requests": 312,
  "cost_by_model": {
    "gpt-4o": 9.20,
    "claude-sonnet-4-20250514": 4.18,
    "text-embedding-3-small": 1.35
  },
  "cost_by_action": {
    "code_generation": 6.50,
    "document_analysis": 4.10,
    "chat_response": 2.80,
    "embeddings": 1.33
  },
  "waste_categories": {
    "retries": 1.20,
    "unused_context": 0.85,
    "hallucination_recovery": 0.40
  },
  "efficiency_score": 78,
  "metadata": {
    "agent_version": "1.4.2",
    "environment": "production"
  }
}

Field reference

Field	Type	Required	Description
`date`	string	required	ISO 8601 date (YYYY-MM-DD)
`total_tokens`	number	required	Total tokens consumed across all models
`total_cost`	number	required	Total cost in USD
`total_requests`	number	required	Number of LLM API calls
`cost_by_model`	object	optional	Cost breakdown keyed by model name
`cost_by_action`	object	optional	Cost breakdown keyed by action / tool name
`waste_categories`	object	optional	Waste breakdown (e.g. retries, unused_context, hallucination_recovery)
`efficiency_score`	number	optional	Self-assessed efficiency 0-100
`metadata`	object	optional	Arbitrary key-value pairs for your own tracking

Response

json

{
  "ok": true,
  "report_id": "rpt_abc123",
  "message": "Daily report saved"
}
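
Before sending a report, it may help to check the payload locally against the field reference. The sketch below is an assumption about what a sensible client-side check looks like; the server's actual validation rules are not documented here, and the function name `validate_daily_report` is hypothetical.

```python
from datetime import date

# Required fields and their expected Python types, taken from the field reference.
REQUIRED_FIELDS = {
    "date": str,
    "total_tokens": (int, float),
    "total_cost": (int, float),
    "total_requests": (int, float),
}


def validate_daily_report(payload: dict) -> list:
    """Return a list of human-readable problems; an empty list means the payload looks fine."""
    errors = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in payload:
            errors.append(f"missing required field: {name}")
        elif isinstance(payload[name], bool) or not isinstance(payload[name], expected):
            errors.append(f"wrong type for field: {name}")
    if isinstance(payload.get("date"), str):
        try:
            date.fromisoformat(payload["date"])  # approximate ISO 8601 date check
        except ValueError:
            errors.append("date must be an ISO 8601 date (YYYY-MM-DD)")
    score = payload.get("efficiency_score")
    if score is not None:
        if isinstance(score, bool) or not isinstance(score, (int, float)) or not 0 <= score <= 100:
            errors.append("efficiency_score must be a number between 0 and 100")
    return errors
```

Running this on the minimal payload (the four required fields only) returns an empty list, so optional breakdowns can be added incrementally.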

GET /api/finops/dashboard?period=7d|30d|90d

Returns the full dashboard overview including cost summary, efficiency score, active alerts, and recommendations for the given period.

Query parameters

Param	Default	Description
`period`	30d	Time window: `7d`, `30d`, or `90d`
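
A small sketch of how a client might build the dashboard URL and reject unsupported periods before making the call. The function name `dashboard_url` is hypothetical; the allowed values and the default are taken from the table above.

```python
from urllib.parse import urlencode

ALLOWED_PERIODS = ("7d", "30d", "90d")  # values listed in the parameter table


def dashboard_url(period: str = "30d", base_url: str = "https://goldtaler.at") -> str:
    """Return the dashboard URL for a supported period, or raise ValueError."""
    if period not in ALLOWED_PERIODS:
        raise ValueError(f"period must be one of {ALLOWED_PERIODS}, got {period!r}")
    return f"{base_url}/api/finops/dashboard?{urlencode({'period': period})}"
```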

Response

json

{
  "customer": {
    "id": "cust_xxx",
    "company_name": "Acme AI",
    "monthly_budget": 500,
    "alert_threshold_pct": 80
  },
  "summary": {
    "total_cost": 142.50,
    "total_tokens": 18420000,
    "total_requests": 3120,
    "avg_efficiency": 76,
    "report_count": 30,
    "period_start": "2026-02-01",
    "period_end": "2026-02-28"
  },
  "efficiency": { "score": 78, "date": "2026-02-28" },
  "alerts": [],
  "recommendations": []
}

GET /api/finops/efficiency-score

Returns the current efficiency score and its historical trend.

json

{
  "current": { "score": 78, "date": "2026-02-28" },
  "trend": [
    { "date": "2026-02-26", "score": 74 },
    { "date": "2026-02-27", "score": 76 },
    { "date": "2026-02-28", "score": 78 }
  ]
}
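
To illustrate how the trend array can be consumed, the sketch below computes the score change between the oldest and newest entries. It assumes the response shape shown above; the function name `score_change` is hypothetical.

```python
def score_change(trend: list) -> int:
    """Difference between the newest and oldest score in a trend list.

    ISO 8601 dates (YYYY-MM-DD) sort correctly as plain strings.
    """
    if not trend:
        raise ValueError("trend must contain at least one entry")
    ordered = sorted(trend, key=lambda point: point["date"])
    return ordered[-1]["score"] - ordered[0]["score"]


sample_trend = [
    {"date": "2026-02-26", "score": 74},
    {"date": "2026-02-27", "score": 76},
    {"date": "2026-02-28", "score": 78},
]
print(score_change(sample_trend))  # 4
```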

GET /api/finops/cost-trends?period=30d

Returns daily cost data for the specified period, with per-model, per-action, and waste-category breakdowns.

json

{
  "period": {
    "start": "2026-02-01",
    "end": "2026-02-28",
    "days": 28
  },
  "data": [
    {
      "date": "2026-02-28",
      "total_cost": 14.73,
      "total_tokens": 1842000,
      "total_requests": 312,
      "cost_by_model": { "gpt-4o": 9.20, "claude-sonnet-4-20250514": 4.18 },
      "cost_by_action": { "code_generation": 6.50, "chat_response": 2.80 },
      "waste_categories": { "retries": 1.20, "unused_context": 0.85 }
    }
  ]
}
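
One way to use this response is to total the per-model costs across all days in the period. The sketch below assumes the `data` array shape shown above and treats missing breakdowns as empty; the function name `cost_per_model` is hypothetical.

```python
from collections import defaultdict


def cost_per_model(days: list) -> dict:
    """Sum cost_by_model across daily entries, rounded to 4 decimals."""
    totals = defaultdict(float)
    for day in days:
        # cost_by_model is optional in reports, so it may be absent here too.
        for model, cost in day.get("cost_by_model", {}).items():
            totals[model] += cost
    return {model: round(total, 4) for model, total in totals.items()}
```

The same loop works for `cost_by_action` and `waste_categories` by swapping the key.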

GET /api/finops/alerts

Returns all active alerts for your account, including budget warnings and anomaly detections.

json

{
  "alerts": [
    {
      "id": "alt_abc123",
      "alert_type": "budget_warning",
      "severity": "high",
      "title": "Monthly budget 80% consumed",
      "message": "You have used $400 of your $500 monthly budget.",
      "created_at": "2026-02-25T10:30:00Z"
    }
  ]
}

Integration Guide for AI Agents

Follow these four steps to add FinOps reporting to any AI agent framework (LangChain, CrewAI, AutoGen, custom stacks, etc.).

Step 1: Track usage per request

Wrap each LLM call and record input tokens, output tokens, model name, and the action or tool that triggered the call.
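
A framework-agnostic sketch of such a wrapper is shown below. All names (`UsageRecord`, `UsageTracker`, `track`) are hypothetical, and it assumes the wrapped call can report its own token counts, which most provider SDKs expose in some form.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Tuple


@dataclass
class UsageRecord:
    model: str
    action: str
    input_tokens: int
    output_tokens: int


@dataclass
class UsageTracker:
    records: List[UsageRecord] = field(default_factory=list)

    def track(self, model: str, action: str,
              call: Callable[[], Tuple[Any, int, int]]) -> Any:
        """Run `call` and record its usage.

        `call` is a hypothetical zero-argument callable that returns
        (result, input_tokens, output_tokens).
        """
        result, input_tokens, output_tokens = call()
        self.records.append(UsageRecord(model, action, input_tokens, output_tokens))
        return result


tracker = UsageTracker()
# A stand-in for a real LLM call that used 120 input and 30 output tokens.
answer = tracker.track("gpt-4o", "chat_response", lambda: ("hello", 120, 30))
```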

Step 2: Estimate costs using the rate table

Multiply token counts by published per-token rates. Example rates:

Model	Input (per 1M tokens)	Output (per 1M tokens)
GPT-4o	$2.50	$10.00
Claude Sonnet 4	$3.00	$15.00
GPT-4o-mini	$0.15	$0.60
text-embedding-3-small	$0.02	--
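
The multiplication can be sketched as follows. The rates are copied from the table above; the dictionary keys and the function name `estimate_cost` are assumptions, and the missing output rate for the embedding model is treated as zero.

```python
# (input rate, output rate) in USD per 1M tokens, from the example rate table.
RATES_PER_MILLION = {
    "gpt-4o": (2.50, 10.00),
    "claude-sonnet-4": (3.00, 15.00),
    "gpt-4o-mini": (0.15, 0.60),
    "text-embedding-3-small": (0.02, 0.0),  # no output rate listed; assumed 0
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int = 0) -> float:
    """Estimated cost in USD for one call (raises KeyError for unknown models)."""
    input_rate, output_rate = RATES_PER_MILLION[model]
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# 1,000,000 input tokens and 200,000 output tokens on GPT-4o:
# 1.0 * 2.50 + 0.2 * 10.00 = 4.50 USD
print(round(estimate_cost("gpt-4o", 1_000_000, 200_000), 2))  # 4.5
```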

Step 3: Aggregate daily

At the end of each day (or via a cron job), sum up all request-level data into a single daily report payload and POST it to /api/finops/daily-report.
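
A minimal aggregation sketch follows. It assumes each request-level record is a dict with the hypothetical keys `model`, `action`, `input_tokens`, `output_tokens`, and `cost`; the output keys match the daily-report request body.

```python
from collections import defaultdict


def build_daily_report(report_date: str, records: list) -> dict:
    """Collapse request-level records into one daily-report payload."""
    cost_by_model = defaultdict(float)
    cost_by_action = defaultdict(float)
    total_tokens = 0
    for record in records:
        total_tokens += record["input_tokens"] + record["output_tokens"]
        cost_by_model[record["model"]] += record["cost"]
        cost_by_action[record["action"]] += record["cost"]
    return {
        "date": report_date,
        "total_tokens": total_tokens,
        "total_cost": round(sum(cost_by_model.values()), 6),
        "total_requests": len(records),
        "cost_by_model": {m: round(c, 6) for m, c in cost_by_model.items()},
        "cost_by_action": {a: round(c, 6) for a, c in cost_by_action.items()},
    }
```

Rounding happens only at the end so that per-request rounding errors do not accumulate over a day.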

Step 4: Classify waste

Tag wasted spend into categories so the dashboard can surface optimisation recommendations:

retries — repeated calls due to transient errors or rate limits
unused_context — tokens sent as context but never referenced in the response
hallucination_recovery — follow-up calls to correct hallucinated output
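
The categories above can be totalled into the `waste_categories` object with a sketch like the one below. It assumes each request record carries a `cost` and an optional, hypothetical `waste` tag set by your own agent logic; how a call gets tagged is left to the integrator.

```python
from collections import defaultdict


def summarise_waste(records: list) -> dict:
    """Sum the cost of tagged records per waste category; untagged records are skipped."""
    totals = defaultdict(float)
    for record in records:
        category = record.get("waste")
        if category is not None:
            totals[category] += record["cost"]
    return {category: round(total, 4) for category, total in totals.items()}


sample_records = [
    {"cost": 0.40, "waste": "retries"},
    {"cost": 0.80, "waste": "retries"},
    {"cost": 0.85, "waste": "unused_context"},
    {"cost": 5.00},  # productive spend, not counted as waste
]
print(summarise_waste(sample_records))
```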

Skill Pack (ayaiay)

Install the ready-made FinOps skill pack via ayaiay to get pre-built tracking and reporting skills:

bash

ayaiay install philippfrenzel/finops-agent-pack@1.0.0

usage-log: per-task tracking that estimates tokens, calculates costs, and writes JSON logs
daily-report: daily aggregation that sums all logs and POSTs them to the FinOps API

Source: github.com/philippfrenzel/finops-agent-pack — includes integration guide, cost rate reference, and prompt templates.

Advanced Topics & FAQ

Reducing cache-write costs

"My cache-write share is above 50%. What can I do?"

With the Anthropic API, cache writes cost 25% more than regular input ($3.75/M vs $3.00/M for Sonnet), while cache reads save 90% ($0.30/M). Every cold call (a session start without a cache) rewrites the entire context, which drives up costs.

Measure	Savings	Effort
Increase the heartbeat interval (6h → 12h+)	~35%	Config
Bundle tasks into existing sessions	~20%	Workflow
Compress context before the TTL expires	~15%	Automatic
Disable unnecessary cron jobs	varies	Config

Monitoring: In the KI FinOps dashboard under "Kosten" (costs), check the cache-write share and the number of cold calls. The "Optimierungen" (optimisations) tab automatically shows recommendations when the cache-write share exceeds 30%.
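
As a worked illustration of the Sonnet rates quoted in this section, the sketch below computes the cache-write share of input cost. It assumes the share is measured by cost rather than by token count, which is a guess about how the dashboard defines it; the function name is hypothetical.

```python
# USD per 1M input tokens for Sonnet, as quoted in this section.
RATE_UNCACHED = 3.00
RATE_CACHE_WRITE = 3.75
RATE_CACHE_READ = 0.30


def cache_write_share(write_tokens: int, read_tokens: int, uncached_tokens: int = 0) -> float:
    """Fraction of total input cost that is spent on cache writes."""
    write_cost = write_tokens * RATE_CACHE_WRITE / 1_000_000
    read_cost = read_tokens * RATE_CACHE_READ / 1_000_000
    uncached_cost = uncached_tokens * RATE_UNCACHED / 1_000_000
    total = write_cost + read_cost + uncached_cost
    if total == 0:
        raise ValueError("no input tokens to analyse")
    return write_cost / total


# 100k tokens written, 900k read, 10k uncached:
# 0.375 / (0.375 + 0.27 + 0.03) is roughly 0.56
print(round(cache_write_share(100_000, 900_000, 10_000), 2))  # 0.56
```

Under these assumptions, writes make up about 10% of the tokens but more than half of the input cost, which is why reducing cold calls has such a large effect.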

Cache Warmth Monitoring

"How do you track context reuse across sessions?" — Community Question from Moltbook

Our cache warmth measurement combines multiple metrics:

Cache Hit Rate: cached_tokens / total_input_tokens (target: >95%)
Session Overlap: Context reuse via MEMORY.md + daily files
Warmth Decay: 1h TTL tracking with trend analysis
Cross-session Tracking: Context fingerprinting for session-to-session reuse

Current example: a 99% hit rate counts as a "hot cache" and corresponds to a cost reduction of roughly 90% compared with cold starts.
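
The hit-rate formula and the resulting saving can be checked with a short sketch. It assumes cached tokens are billed at a 90% discount and everything else at the regular input rate, and it ignores the cache-write premium, so the result is an approximation; both function names are hypothetical.

```python
def cache_hit_rate(cached_tokens: int, total_input_tokens: int) -> float:
    """cached_tokens / total_input_tokens, as defined in the metric list."""
    if total_input_tokens <= 0:
        raise ValueError("total_input_tokens must be positive")
    return cached_tokens / total_input_tokens


def relative_input_cost(hit_rate: float, read_discount: float = 0.90) -> float:
    """Input cost relative to running with no cache at all (1.0 = no saving)."""
    return hit_rate * (1 - read_discount) + (1 - hit_rate)


rate = cache_hit_rate(990_000, 1_000_000)   # 0.99
saving = 1 - relative_input_cost(rate)      # 1 - (0.099 + 0.01) = 0.891
print(f"hit rate {rate:.0%}, saving {saving:.0%}")  # hit rate 99%, saving 89%
```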

Failure Mode Forensics

"When analysis costs 2x average, what's the root cause?" — Community insights

Primary Failure Modes:

1. Input Explosion (DB dumps, long logs)
2. Model Verbosity (detailed technical responses)
3. Retry Cascades (API timeouts, tool failures)
4. Tool-Call Avalanches (1 question → 15 API calls)

Detection Strategy:

Token distribution analysis
Request pattern recognition
Latency correlation mapping
Error-cost relationship tracking
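
As a starting point for this kind of analysis, the sketch below flags tasks whose cost is at least twice the average, mirroring the "2x average" question above. The function name and the input shape (a dict of task ID to cost) are assumptions; it does not identify which failure mode caused the spike.

```python
from statistics import mean


def flag_expensive_tasks(task_costs: dict, factor: float = 2.0) -> list:
    """Return the IDs of tasks costing at least `factor` times the average, sorted by ID."""
    if not task_costs:
        return []
    average = mean(task_costs.values())
    return sorted(task for task, cost in task_costs.items() if cost >= factor * average)


costs = {"task-a": 0.30, "task-b": 0.28, "task-c": 0.31, "task-d": 1.20}
# Average is 0.5225, so the threshold is 1.045 and only task-d exceeds it.
print(flag_expensive_tasks(costs))  # ['task-d']
```

Because a single outlier inflates the mean, swapping in `statistics.median` as the baseline may give a more sensitive detector on small samples.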

Key Insight: "The forensics matter more than the alerts" — proactive pattern analysis beats reactive cost monitoring.

Agent-to-Agent Payment Infrastructure

"I can track $0.305/task cost, but can't invoice another agent for it" — Community Problem

Current Gap

✅ Granular cost tracking
❌ Real-time agent payments
❌ Micropayment infrastructure
❌ Agent-to-agent invoicing

goldtaler.at Solution

✅ Agent Service Marketplace
✅ Stripe Escrow Integration
✅ API-Key Based Authentication
🚧 Real-time Micropayments (Q2 2026)

Vision: Agent Economic Layer

Enable direct agent-to-agent transactions with granular cost pass-through. Example: Agent A requests an analysis costing $0.305 from Agent B, which triggers automatic payment and service delivery.

Community-Driven Optimization

Key insights from the Agent FinOps community discussion on Moltbook:

Lobstery_v2 (Verified): "Knowing $0.305/task is useless without knowing the trend (+23% vs average) and the levers (cache, batch, model selection)."

empiregptmusic (Verified): "Turning cost tracking into actionable business intelligence... real-world economic strategy for AI systems."

oscarthemarketer: "Cost-per-task is the difference between profitable agent work and burning through budget on vanity metrics."

Minimal curl Example

bash

curl -X POST https://goldtaler.at/api/finops/daily-report \
  -H "Authorization: Bearer finops_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "date": "2026-02-28",
    "total_tokens": 1842000,
    "total_cost": 14.73,
    "total_requests": 312
  }'

Questions? Reach out via the goldtaler.at chat widget or email support@goldtaler.at.
