The system records every LLM call across OpenAI, Anthropic, Gemini, Azure OpenAI, and RubyLLM, capturing tokens, cost, latency, and attribution tags. It includes a built-in dashboard with rollups and line items. A new pre-send budget guard estimates input cost before sending and blocks requests when prior spend plus the estimate would exceed daily, monthly, or per-call limits. Azure OpenAI Foundry capture works without configuration for *.services.ai.azure.com and the /openai/v1/... path, in addition to classic *.openai.azure.com plus deployments URLs. Web-search fee capture records per-call web-search line items for specific search-preview models. Pricing supports multiple currencies via a prices file with currency metadata, propagating through snapshots, rollups, line items, and dashboard totals.
"budget_exceeded_behavior = :block_requests now also estimates the call's input cost and blocks before send when prior spend plus the estimate would cross a daily / monthly / per-call limit."
"Capture works out of the box on *.services.ai.azure.com and the /openai/v1/... path, in addition to the classic *.openai.azure.com + deployments URL."
"Calls to gpt-4o-search-preview, gpt-4o-mini-search-preview, and gpt-5-search-api now record the per-call web-search line item at OpenAI's "Web search preview" rate ($25/1k non-reasoning, $10/1k reasoning) - previously dropped."
"A prices_file with metadata.currency: "EUR" flows through the pricing snapshot, rollups, line items, and the dashboard total instead of being hardcoded to USD."
Read at Rubyflow
Unable to calculate read time
Collection
[
|
...
]