#server-npus

#ai-infrastructure
DevOps
fromMedium
1 day ago

The AI Infrastructure Stack in 2026: Companies Building the Future of AI

AI infrastructure companies are transforming the deployment and scaling of artificial intelligence into full production systems with essential governance and observability.
fromZDNET
1 month ago
Tech industry

Nvidia wants to own your AI data center from end to end

Nvidia expanded its AI infrastructure portfolio with five rack types, including a new LPX inference rack using Groq technology, positioning itself to control all data center processing.
fromComputerWeekly.com
1 month ago
Artificial intelligence

HPE taps Nvidia to transform distributed AI factories into intelligent AI grid | Computer Weekly

HPE launches AI Grid infrastructure powered by Nvidia GPUs to enable distributed, low-latency AI inference at edge locations for real-time applications across retail, manufacturing, healthcare, and telecommunications.
DevOps
fromTechzine Global
17 hours ago

95% of GPU capacity goes unused in Kubernetes clusters

GPU and CPU usage remains low despite rising cloud costs, highlighting inefficiencies in resource utilization as Kubernetes adoption increases.
#nvidia
Artificial intelligence
fromnews.bitcoin.com
2 days ago

Nvidia Releases Nemotron 3 Super, a 120B Open AI Model Built for Agentic Workloads

Nvidia launched Nemotron 3 Super, a 120 billion parameter model that significantly reduces AI compute costs and increases throughput.
Tech industry
fromInfoWorld
2 weeks ago

Nvidia's SchedMD acquisition puts open-source AI scheduling under scrutiny

Nvidia's acquisition of SchedMD, the company behind the Slurm workload manager, raises concerns about potential bias towards Nvidia's own hardware in workload management.
Tech industry
fromTheregister
2 weeks ago

Nvidia embraces optical scale-up as copper reaches limits

Nvidia plans to integrate over a thousand GPUs into a single system using photonic interconnects by 2028, investing heavily in optics and interconnect technology.
Artificial intelligence
fromComputerworld
1 week ago

Nvidia's Stephen Jones on the toolkit powering GPUs: 'A wild ride'

Nvidia's CUDA toolkit is foundational for AI advancements and is driving innovations in quantum computing, robotics, and autonomous vehicles.
Business
from24/7 Wall St.
4 weeks ago

Nvidia Could Hit $340 by 2031 and the AI Buildout Is Just Getting Started

NVIDIA's stock is projected to reach $209.50 in one year and $298.29 in five years, driven by strong growth and strategic partnerships.
Artificial intelligence
from24/7 Wall St.
4 weeks ago

NVIDIA's GTC Developments Were Far Bigger Than the Market Realizes

Nvidia's stock remains stagnant despite significant innovations, with uncertainty about future reactions to developments in the AI sector.
Business
from24/7 Wall St.
18 hours ago

Forget Nvidia: Why HPE Could Be the Overlooked AI Infrastructure Play of 2026

Hewlett Packard Enterprise is an overlooked investment opportunity in AI infrastructure with strong financial growth and expanding margins.
Gadgets
fromThe Verge
11 hours ago

Framework's first eGPUs turn its laptop into a desktop PC

Framework introduces the OCuLink Dev Kit for external GPU support, targeting power users with advanced connectivity options.
#ai-agents
Web frameworks
fromInfoQ
1 day ago

Cloudflare Introduces Project Think: A Durable Runtime for AI Agents

Cloudflare's Project Think introduces durable AI agents with a kernel-like runtime, enabling long-lived workloads and preserving execution progress during platform restarts.
Software development
fromTechzine Global
5 days ago

OpenAI's new Agents SDK focuses on safety and scalability

OpenAI's updated Agents SDK enables developers to create autonomous AI agents for complex tasks with enhanced usability and a sandbox environment.
Tech industry
fromTNW | Artificial-Intelligence
2 days ago

Google in talks with Marvell Technology to build new AI inference chips alongside Broadcom TPU programme

Google is collaborating with Marvell Technology to develop new AI chips, enhancing its custom silicon supply chain for inference processing.
Productivity
fromSilicon Canals
3 days ago

I let AI plan my workdays down to the minute for a week - the shock wasn't my output, it was realizing how much of my old schedule had been performance - Silicon Canals

Using ChatGPT to manage a calendar revealed that much of the scheduled time was performance rather than productive work.
Data science
fromInfoQ
1 week ago

Google's TurboQuant Compression May Support Faster Inference, Same Accuracy on Less Capable Hardware

TurboQuant compresses language models' Key-Value caches by up to 6x with near-zero accuracy loss, enabling efficient use of modest hardware.
Environment
fromAxios
4 days ago

The best and worst states for AI data centers

Texas is attracting data center investments with tax incentives, while Maine is implementing a moratorium to evaluate the impact of data centers.
European startups
fromTheregister
5 days ago

NodeWeaver: Perpetual licensing beats VMware nickel-and-dime

NodeWeaver offers a cost-effective alternative for VMware customers, focusing on edge computing solutions without the complexity of traditional virtualization.
fromTechzine Global
17 hours ago

Snowflake Intelligence and Cortex Code become the agentic AI control layer

"Snowflake gives customers one place to bring their data together, connect the systems they rely on, and turn AI into something that actually helps teams get work done," says Baris Gultekin, VP of AI at Snowflake.
Artificial intelligence
DevOps
fromInfoQ
1 day ago

Anthropic Introduces Managed Agents to Simplify AI Agent Deployment

Anthropic's Managed Agents streamline agent-based workflows by handling execution complexities, allowing developers to focus on behavior and tools.
#scale-computing
Tech industry
fromTechzine Global
6 days ago

Scale Computing gets new Velocity Partner Program

Scale Computing revamps its partner program to address market changes and strengthen relationships with partners amid industry challenges.
Software development
fromTechzine Global
5 days ago

Scale sets edge platform's software ever more free from hardware constraints

Scale Computing is reducing hardware requirements for its software, allowing more flexibility for partners and customers in choosing hardware platforms.
#ai

Business intelligence
from24/7 Wall St.
3 days ago

Nuclear's AI Moment Is Here -- There Is Only 1 Play for the 4X Data Center Demand Explosion

Global data center power demand will quadruple by 2034, with nuclear energy being crucial for meeting this surge in energy needs.
Artificial intelligence
from24/7 Wall St.
6 days ago

AI Compute Demand is Running Way Ahead of Supply - A Stock I'd Buy on That Signal

AI-driven power demand is outpacing supply, creating a significant energy shortfall that may impact top energy producers.
Artificial intelligence
fromTechCrunch
2 weeks ago

Anthropic ups compute deal with Google and Broadcom amid skyrocketing demand | TechCrunch

Anthropic signed a new agreement with Google and Broadcom to expand compute capacity for its Claude AI models amid soaring demand.
Tech industry
fromTheregister
4 days ago

IOWN targets datacenter interconnects to spread AI infra

IOWN Global Forum focuses on datacenter interconnect use cases to enhance AI infrastructure connectivity and reduce costs for users.
#intel
Gadgets
fromTheregister
4 days ago

Intel eases reliance on TSMC with Core Series 3 CPUs

Intel has introduced budget-oriented Core Series 3 processors manufactured in the US using a 2nm process, offering a solid upgrade for older systems.
Gadgets
fromEngadget
5 days ago

Intel launches new Core Series 3 chips for mainstream laptops

Intel's new Core Series 3 chips offer significant performance improvements and exceptional battery life for mainstream laptops.
Data science
fromTheregister
1 week ago

Nvidia slaps forehead: AI, that's what quantum needs!

Nvidia's AI models aim to reduce quantum processor error rates significantly, enhancing the reliability of quantum computing applications.
Science
fromNature
2 weeks ago

Breakthrough computer chip tech could help meet 'monumental demand' driven by AI

A new light source enables the creation of 8 nm wide structures on silicon wafers, increasing transistor density for advanced computer chips.
DevOps
fromComputerWeekly.com
4 days ago

AI, energy, and the new rules of cloud sustainability competition | Computer Weekly

Cloud providers offer sustainability metrics, but lack standardization makes it difficult for enterprises to compare workloads effectively.
#anthropic
fromAxios
1 day ago
Artificial intelligence

Anthropic bites back in the compute wars with Amazon partnership

Anthropic is investing heavily in compute capacity to enhance its Claude models, competing directly with OpenAI's infrastructure advantage.
Artificial intelligence
fromSilicon Canals
2 weeks ago

Why Anthropic is locking in 3.5 gigawatts of compute years before it comes online - Silicon Canals

Anthropic signed a major deal with Google and Broadcom for 3.5 gigawatts of compute capacity, signaling consolidation in the AI industry.
Software development
fromTNW | Anthropic
5 days ago

Claude Opus 4.7 leads on SWE-bench and agentic reasoning, beating GPT-5.4 and Gemini 3.1 Pro

Claude Opus 4.7 is Anthropic's most capable model, outperforming competitors in software engineering and agentic reasoning with significant improvements.
Artificial intelligence
fromInfoQ
2 days ago

Designing Memory for AI Agents: Inside LinkedIn's Cognitive Memory Agent

LinkedIn's Cognitive Memory Agent enables context-aware AI systems that retain knowledge across interactions, enhancing personalization and continuity.
DevOps
fromInfoQ
6 days ago

Cloudflare Launches Code Mode MCP Server to Optimize Token Usage for AI Agents

Cloudflare's Code Mode MCP server significantly reduces API interaction costs for AI agents, improving efficiency and enabling better task reasoning.
Software development
fromInfoWorld
5 days ago

The two-pass compiler is back - this time, it's fixing AI code generation

Multi-pass compilers revolutionized programming by separating analysis and optimization, a model that could enhance AI code generation.
Tech industry
from24/7 Wall St.
2 weeks ago

Broadcom's Long-Term Google TPU Deal Is Bigger Than It Looks for AI Infrastructure

Broadcom's long-term agreement with Alphabet for custom TPUs enhances revenue visibility and positions the company for significant growth in AI semiconductor revenue.
fromAxios
5 days ago

Anthropic's AI downgrade stings power users

"Claude has regressed to the point it cannot be trusted to perform complex engineering," an AMD senior director wrote in a widely shared post on GitHub.
Artificial intelligence
Tech industry
fromThe Verge
4 weeks ago

Arm's first CPU ever will plug into Meta's AI datacenters later this year

Arm AGI CPU features up to 136 cores and claims double the performance per watt compared to x86 chips.
Tech industry
fromTheregister
1 month ago

A closer look at Nvidia's Groq-powered LPX rack systems

Nvidia acquired Groq for $20 billion primarily to accelerate time-to-market for SRAM-heavy inference chips rather than develop the technology independently, enabling faster token generation for AI reasoning workloads.
#meta
Tech industry
fromComputerworld
1 month ago

System-level 'coopetition': Why Nvidia's DGX Rubin NVL8 runs on Intel Xeon 6

Nvidia's flagship DGX Rubin NVL8 AI systems use Intel Xeon 6 processors as host CPUs to maintain x86 compatibility and meet enterprise deployment requirements.
Tech industry
fromTheregister
1 month ago

Nvidia slaps Groq into new LPX racks for faster AI response

Nvidia integrates Groq's language processing units into Vera Rubin systems to dramatically accelerate LLM inference, enabling hundreds to thousands of tokens per second per user.
#ai-efficiency
DevOps
fromInfoWorld
1 month ago

5 requirements for using MCP servers to connect AI agents

Organizations deploying MCP servers for agent-to-agent communication must establish upfront strategy, nonfunctional requirements, and security protocols to ensure safer and more trustworthy deployments.
Data science
fromTechRepublic
1 month ago

Inside the Gas Engine Strategy Powering AI's Next Wave

Gas reciprocating engines are emerging as a critical power solution for AI data centers, with manufacturers like Caterpillar securing multi-gigawatt orders to meet demand that exceeds grid and turbine capacity.
Artificial intelligence
fromMedium
4 weeks ago

Less Compute, More Impact: How Model Quantization Fuels the Next Wave of Agentic AI

Model quantization and architectural optimization can outperform larger models, challenging the belief that more GPUs equal greater intelligence.
Artificial intelligence
fromTechCrunch
1 month ago

Niv-AI exits stealth to wring more power performance out of GPUs | TechCrunch

AI data centers waste significant power due to GPU demand surges, forcing operators to throttle performance by up to 30%, prompting startups like Niv-AI to develop precision power management solutions.
Artificial intelligence
fromComputerworld
1 month ago

Nvidia NemoClaw promises to run OpenClaw agents securely

Nvidia introduced NemoClaw with OpenShell security features to address OpenClaw's enterprise security vulnerabilities through sandbox isolation and policy enforcement.
Artificial intelligence
fromTechzine Global
1 month ago

Nvidia's Groq 3 LPU targets agentic AI inference at GTC 2026

Nvidia's acquisition of Groq technology produces the Groq 3 LPU, a specialized inference chip delivering 40 petabytes per second bandwidth, significantly outpacing GPU inference speeds.
Artificial intelligence
fromInfoWorld
1 month ago

Nvidia launches Nemotron 3 Super to power enterprise AI agents

Nemotron 3 Super's hybrid architecture combining Mamba and Transformer technologies enables enterprises to run complex AI agents more efficiently with lower costs and faster execution on existing infrastructure.
fromTheregister
2 months ago

Intel greets memory apocalypse with Xeon workstation CPUs

The Xeon 600 lineup spans from 12 to 86 performance cores (no cut-down efficiency cores here), with support for four to eight channels of DDR5 and 80 to 128 lanes of PCIe 5.0 connectivity. Compared to its aging W-3500-series chips, Intel claims a 9 percent uplift in single-threaded workloads and up to 61 percent higher performance in multithreaded jobs, thanks in no small part to the 22 additional processor cores this generation.
Tech industry
fromTechzine Global
1 month ago

Meta shifts to AI inference with its future chips

Four generations, MTIA 300, 400, 450, and 500, have been produced within less than two years, with several already in production and others scheduled for mass deployment in 2026 and 2027. The quick pace is deliberate. Rather than betting on a single chip generation and waiting years for results, Meta has adopted a roughly six-month cadence per generation, using modular chiplet architecture to enable incremental upgrades without replacing entire rack systems.
Artificial intelligence
#neoclouds
Artificial intelligence
fromComputerWeekly.com
1 month ago

Edge AI: What's working and what isn't | Computer Weekly

Edge AI deployment success depends on identifying efficient, narrow use cases with manageable risks rather than pursuing sophisticated, large-scale models across all applications.
Artificial intelligence
from24/7 Wall St.
1 month ago

NVIDIA Cements Its Role as the Backbone of AI Infrastructure

NVIDIA's networking revenue grew 162% year-over-year to $8.2 billion, nearly triple the pace of its GPU growth, signaling a shift from chip seller to integrated infrastructure provider selling complete AI data center systems.
fromCointelegraph
2 months ago

What Role Is Left for Decentralized GPU Networks in AI?

What we are beginning to see is that many open-source and other models are becoming compact enough and sufficiently optimized to run very efficiently on consumer GPUs.
Artificial intelligence
Artificial intelligence
fromInfoWorld
1 month ago

Why AI requires rethinking the storage-compute divide

AI workloads require continuous processing of unstructured multimodal data, causing redundant data movement and transformation that wastes infrastructure costs and data scientist time.
fromTechCrunch
2 months ago

Quadric rides the shift from cloud AI to on-device inference - and it's paying off | TechCrunch

The company, which is based in San Francisco and has an office in Pune, India, is targeting up to $35 million this year as it builds a royalty-driven on-device AI business. That growth has buoyed the company, which now has a post-money valuation of between $270 million and $300 million, up from around $100 million in its 2022 Series B, Kheterpal said.
Artificial intelligence
Artificial intelligence
fromTechzine Global
2 months ago

OpenAI seeks faster alternatives to Nvidia chips

OpenAI seeks alternative inference chips with larger on-chip SRAM to improve response speed for coding and AI-to-AI communication, aiming for about 10% of future inference capacity.
fromInfoQ
2 months ago

NVIDIA Dynamo Planner Brings SLO-Driven Automation to Multi-Node LLM Inference

The new capabilities center on two integrated components: the Dynamo Planner Profiler and the SLO-based Dynamo Planner. Together they solve the "rate matching" challenge in disaggregated serving, the term used when teams split inference workloads: prefill operations, which process the input context, are separated from decode operations, which generate the output tokens, and the two phases run on different GPU pools. Without the right tools, teams spend a lot of time determining the optimal GPU allocation for each phase.
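The rate-matching idea above can be sketched as a back-of-the-envelope calculation: size each GPU pool so its token throughput meets the target rate for its phase. All numbers and the function itself are invented for illustration; the actual Dynamo Planner profiles hardware and optimizes against SLOs far beyond this.

```python
# Hypothetical sketch of "rate matching" in disaggregated serving:
# given measured per-GPU throughputs for the prefill and decode phases,
# size each pool so it keeps up with its target token rate.
from math import ceil

def rate_match(prefill_tps_per_gpu: float,
               decode_tps_per_gpu: float,
               target_input_tps: float,
               target_output_tps: float) -> tuple[int, int]:
    """Return (prefill_gpus, decode_gpus); a profiler would supply
    the per-GPU throughput numbers."""
    prefill_gpus = ceil(target_input_tps / prefill_tps_per_gpu)
    decode_gpus = ceil(target_output_tps / decode_tps_per_gpu)
    return prefill_gpus, decode_gpus

# Prefill is compute-bound and fast per token; decode is memory-bound
# and slower, so the decode pool typically needs more GPUs.
print(rate_match(prefill_tps_per_gpu=50_000, decode_tps_per_gpu=4_000,
                 target_input_tps=200_000, target_output_tps=40_000))
# -> (4, 10)
```

The point of the sketch is that the two pools must be sized jointly against the workload's input/output token mix, which is exactly the allocation question the planner automates.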
Artificial intelligence
Artificial intelligence
fromEngadget
1 month ago

AI data centers could reduce power draw on demand, study says

AI data centers can dynamically reduce energy consumption by up to 40% without disrupting critical workloads, enabling grid stability and reducing infrastructure strain.