Anthropic: Claude quota drain not caused by cache tweaks

"The 5m TTL is disproportionately punishing for the long-session, high-context use case that defines Claude Code usage," user Sean Swanson reported.
"Writing to the five-minute cache costs 25 percent more in tokens, while writing to the one-hour cache costs 100 percent more, but reading from cache is around 10 percent of the base price."
"Jarred Sumner claimed that the change back to the five-minute cache made Claude Code cheaper because a meaningful share of requests are one-shot calls where the cached context is used once."
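Anthropic changed the Claude Code prompt cache TTL from one hour to five minutes, a change that affects users who rely on long-session, high-context interactions. Users reported that their quotas were depleting faster, while Anthropic maintains that the cache change is not the cause. Writing to the five-minute cache costs 25 percent more than the base token price and writing to the one-hour cache costs 100 percent more, while reading from cache costs around 10 percent of the base price. Jarred Sumner stated that returning to the five-minute cache made Claude Code cheaper because a meaningful share of requests are one-shot calls where the cached context is used only once. Sean Swanson noted that he had not hit a quota limit until the change, which suggests the new cache setting may penalize users who work in long sessions.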
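The pricing figures quoted above can be turned into a rough cost comparison. The sketch below is an illustrative model only, not Anthropic's billing logic: it assumes the whole context is either read from cache or rewritten in full on each request, that a cache hit refreshes the TTL, and that cost is measured in multiples of the base price for one pass over the context. The function name `session_cost` and the sample gap values are hypothetical.

```python
# Multipliers relative to the base input price, taken from the quoted figures:
# five-minute writes cost 25% more, one-hour writes cost 100% more,
# and cache reads cost about 10% of the base price.
WRITE_MULT = {"5m": 1.25, "1h": 2.00}
READ_MULT = 0.10
TTL_SECONDS = {"5m": 300, "1h": 3600}


def session_cost(gaps_seconds, ttl):
    """Relative cost of sending the same context repeatedly.

    gaps_seconds: idle time before each follow-up request, in seconds.
    ttl: "5m" or "1h".
    Assumption (hypothetical model): a request within the TTL is a full
    cache read, and anything later is a full cache rewrite.
    """
    cost = WRITE_MULT[ttl]  # the first request always writes the cache
    for gap in gaps_seconds:
        if gap <= TTL_SECONDS[ttl]:
            cost += READ_MULT        # cache hit
        else:
            cost += WRITE_MULT[ttl]  # cache expired, pay for a new write
    return cost


if __name__ == "__main__":
    one_shot = []                 # context used once, never read back
    long_session = [600] * 5      # five follow-ups, ten minutes apart
    for name, gaps in [("one-shot", one_shot), ("long session", long_session)]:
        for ttl in ("5m", "1h"):
            print(f"{name:12s} ttl={ttl}: {session_cost(gaps, ttl):.2f}x base")
```

Under these assumptions a one-shot call is cheaper with the five-minute cache (1.25x versus 2.00x), which is consistent with Sumner's argument, while a session with ten-minute pauses pays for a fresh write on every request with the five-minute cache (about 7.5x versus about 2.5x), which is consistent with Swanson's complaint. Real sessions mix both patterns, so the net effect depends on how requests are actually spaced.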
Read at The Register