GitHub Slashes Agent Workflow Token Spend up to 62% with Daily Audits and MCP Pruning

GitHub reduced token usage in the agentic workflows running in its repositories by pruning unused Model Context Protocol (MCP) tools, replacing MCP calls with GitHub CLI invocations, and adding daily audit and optimisation agents. Token usage is measured per run through a token-usage.jsonl artefact that records input, output, and cache tokens in a normalised format across Claude CLI, Copilot CLI, and Codex CLI. An Effective Tokens metric weights output tokens at 4× and cache reads at 0.1×, then applies a per-model multiplier so that runs on different model tiers can be compared. A 10% drop in Effective Tokens corresponds to a 10% cost reduction, regardless of the model in use. A Daily Token Usage Auditor aggregates consumption, flags anomalies, and identifies expensive jobs, while a Daily Token Optimiser proposes fixes such as removing unused MCP tools and using the gh CLI to fetch pull request diffs and file contents.
"GitHub routes every agent call through an API proxy and now writes a token-usage.jsonl artefact for each run that captures input, output and cache tokens in one normalised format across Claude CLI, Copilot CLI and Codex CLI. To compare across model tiers, the team uses an Effective Tokens (ET) metric that weights output tokens by 4× and cache reads by 0.1×, then applies a model multiplier (Haiku at 0.25×, Sonnet at 1.0×, Opus at 5.0×). A 10% drop in ET maps to a 10% cost reduction regardless of the model in use."
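The Effective Tokens weighting described above can be sketched in a few lines of Python. This is a minimal illustration, not GitHub's implementation: the `Usage` record and its field names are hypothetical, the input-token weight of 1.0 is an assumption (the article only states the output and cache-read weights), and cache writes are left out because no weight is given for them.

```python
from dataclasses import dataclass

# Weights quoted in the article. INPUT_WEIGHT = 1.0 is an assumption:
# the article does not state a weight for plain input tokens.
INPUT_WEIGHT = 1.0
OUTPUT_WEIGHT = 4.0
CACHE_READ_WEIGHT = 0.1

# Model multipliers quoted in the article.
MODEL_MULTIPLIER = {"haiku": 0.25, "sonnet": 1.0, "opus": 5.0}


@dataclass
class Usage:
    """Hypothetical per-run record; field names are illustrative."""
    model: str
    input_tokens: int
    output_tokens: int
    cache_read_tokens: int


def effective_tokens(u: Usage) -> float:
    """Weight each token class, then scale by the model multiplier."""
    weighted = (
        INPUT_WEIGHT * u.input_tokens
        + OUTPUT_WEIGHT * u.output_tokens
        + CACHE_READ_WEIGHT * u.cache_read_tokens
    )
    return MODEL_MULTIPLIER[u.model] * weighted


if __name__ == "__main__":
    run = Usage("sonnet", input_tokens=10_000, output_tokens=500,
                cache_read_tokens=40_000)
    print(effective_tokens(run))
```

Because the model multiplier is applied to the whole weighted sum, the same run costs 20 times more Effective Tokens on Opus than on Haiku under these figures, which is what lets a single number stand in for cost across tiers.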
"Two agentic workflows drive the optimisation loop. A Daily Token Usage Auditor aggregates consumption by workflow, flags anomalous runs and surfaces the most expensive jobs. When the auditor highlights a workflow, a Daily Token Optimiser reads the source and recent logs, opens a GitHub issue, and proposes specific fixes. Both agents themselves appear in the same daily reports."
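The auditor's three tasks (aggregate by workflow, flag anomalous runs, surface the most expensive jobs) might look roughly like the sketch below. The JSONL field names (`workflow`, `run_id`, and the token counts) and the anomaly rule (a run that exceeds a multiple of its workflow's median) are assumptions made for illustration; the article does not describe the artefact schema or how anomalies are detected.

```python
import json
from collections import defaultdict
from statistics import median


def parse_jsonl(text):
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def total_tokens(rec):
    """Raw token total for a run (hypothetical field names)."""
    return rec["input_tokens"] + rec["output_tokens"] + rec["cache_read_tokens"]


def audit(records, anomaly_factor=3.0, top_n=3):
    """Aggregate by workflow, flag outlier runs, rank workflows by spend."""
    by_workflow = defaultdict(list)
    for rec in records:
        by_workflow[rec["workflow"]].append(rec)

    totals = {
        wf: sum(total_tokens(r) for r in runs)
        for wf, runs in by_workflow.items()
    }

    # Assumed rule: a run is anomalous if it uses more than
    # anomaly_factor times the median of its own workflow.
    anomalies = []
    for wf, runs in by_workflow.items():
        typical = median(total_tokens(r) for r in runs)
        for r in runs:
            if total_tokens(r) > anomaly_factor * typical:
                anomalies.append((wf, r["run_id"]))

    most_expensive = sorted(totals, key=totals.get, reverse=True)[:top_n]
    return {"totals": totals, "anomalies": anomalies,
            "most_expensive": most_expensive}
```

A median-based threshold is used here only because it is robust to the very outliers being hunted; any other baseline (a rolling mean, a fixed budget per workflow) would slot into the same loop.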
"The most common inefficiency the optimiser finds is unused MCP tools. Because LLM APIs are stateless, agent runtimes include tool schemas with every request, so a GitHub MCP server with 40 tools can add 10 to 15 KB of schema per turn. Removing unused entries cuts per-call context by 8 to 12 KB in GitHub's smoke-test workflows."
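To make the schema overhead concrete, the sketch below measures the serialised size of a tool list before and after pruning and multiplies the difference by the number of turns in a run. The tool definitions are assumed to be plain dictionaries with a `name` key, and the helper names are hypothetical; real MCP tool schemas and the way a given runtime serialises them will differ.

```python
import json


def schema_bytes(tools):
    """Size in bytes of the tool list as compact UTF-8 JSON."""
    return len(json.dumps(tools, separators=(",", ":")).encode("utf-8"))


def prune_tools(tools, used_names):
    """Keep only the tools the workflow actually calls."""
    return [t for t in tools if t["name"] in used_names]


def savings_per_run(tools, used_names, turns):
    """Bytes of context saved over a run of `turns` requests.

    Schemas are resent on every request, so the per-turn saving
    is multiplied by the number of turns.
    """
    before = schema_bytes(tools)
    after = schema_bytes(prune_tools(tools, used_names))
    return (before - after) * turns
```

With the article's figure of 8 to 12 KB saved per call, a 20-turn run would send roughly 160 to 240 KB less schema text, which is why unused tools show up so prominently in the optimiser's findings.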
"The team also replaced MCP calls for fetching pull request diffs and file contents with gh CLI commands, either pre-downloaded into workspace files before the agent starts or proxied at runtime through a transparent HTTP proxy that keeps a"
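The pre-download variant could be approximated as follows. The article does not show the commands GitHub actually runs, so this is only a sketch: it uses `gh pr diff` and `gh api` against the repository contents endpoint with the raw media type, and the function names, output layout (`pr.diff`, `files/`), and parameters are hypothetical.

```python
import subprocess
from pathlib import Path
from urllib.parse import quote


def pr_diff_cmd(repo, pr_number):
    """argv for fetching a pull request diff with the gh CLI."""
    return ["gh", "pr", "diff", str(pr_number), "--repo", repo]


def file_contents_cmd(repo, path, ref):
    """argv for fetching raw file contents through the REST contents API."""
    endpoint = f"repos/{repo}/contents/{quote(path)}?ref={quote(ref, safe='')}"
    return ["gh", "api", endpoint,
            "-H", "Accept: application/vnd.github.raw+json"]


def predownload(repo, pr_number, paths, ref, workspace):
    """Write the diff and selected files into the workspace
    before the agent starts, so it reads files instead of calling tools."""
    out = Path(workspace)
    out.mkdir(parents=True, exist_ok=True)

    diff = subprocess.run(pr_diff_cmd(repo, pr_number),
                          check=True, capture_output=True)
    (out / "pr.diff").write_bytes(diff.stdout)

    for p in paths:
        blob = subprocess.run(file_contents_cmd(repo, p, ref),
                              check=True, capture_output=True)
        target = out / "files" / p
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(blob.stdout)
```

Running `predownload` requires an authenticated `gh` installation; the command builders are kept separate so they can be inspected or logged without touching the network.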
Read at InfoQ