#low-latency-inference

Artificial intelligence
from ComputerWeekly.com
3 hours ago

HPE taps Nvidia to transform distributed AI factories into intelligent AI grid | Computer Weekly

HPE launches AI Grid infrastructure powered by Nvidia GPUs to enable distributed, low-latency AI inference at edge locations for real-time applications across retail, manufacturing, healthcare, and telecommunications.
Artificial intelligence
from Techzine Global
1 month ago

OpenAI swaps Nvidia for Cerebras with GPT-5.3-Codex-Spark

GPT-5.3-Codex-Spark is a Cerebras-optimized, low-latency coding model that generates over 1,000 tokens/sec, enabling immediate, real-time code adjustments for developers.
Artificial intelligence
from TechCrunch
1 month ago

A new version of OpenAI's Codex is powered by a new dedicated chip | TechCrunch

OpenAI released GPT-5.3-Codex-Spark, a lightweight, low-latency Codex model using Cerebras WSE-3 hardware to accelerate inference and enable real-time collaboration and rapid prototyping.
Artificial intelligence
from TechCrunch
2 months ago

OpenAI signs deal, reportedly worth $10 billion, for compute from Cerebras | TechCrunch

OpenAI signed a multi-year agreement with Cerebras for 750 megawatts of compute through 2028 to accelerate low-latency inference and speed up customer-facing responses.