Alibaba looks to end reliance on Nvidia for AI inference
Briefly

"First reported by the Wall Street Journal Friday, the ecommerce giant's latest chip is aimed specifically at AI inference, which refers to serving models as opposed to training them. Alibaba's T-Heat division has been working on AI silicon for some time now. In 2019, it introduced the Hanguang 800. However, unlike modern chips from Nvidia and AMD, the part was primarily aimed at conventional machine learning models like ResNet - not the large language and diffusion models that power AI chatbots and image generators today."
"The new chip, it's reported, will be able to handle a more diverse set of workloads. Alibaba has become one of the leading developers of open models with its Qwen3 family launched in April. As such, its initial focus on inference isn't surprising. Serving models generally requires fewer resources than training them, making it a good place to start its transition to homegrown hardware. Alibaba is likely to continue using Nvidia accelerators for model training for the foreseeable future."
"While that might sound like CUDA - Nvidia's low-level programming language for GPUs - this is unlikely and isn't necessary for inference. More likely, Alibaba is targeting higher-level abstraction layers like PyTorch or TensorFlow, which, for the most part, provide a hardware-agnostic programming interface. We say largely, because there is still plenty of PyTorch code that makes use of libraries built exclusively for Nvidia hardware, though projects like Triton have addressed many of these edge cases."
Alibaba's T-Head division developed an AI accelerator aimed at inference workloads rather than model training. The company previously released the Hanguang 800 in 2019, which targeted conventional models like ResNet rather than large language or diffusion models. The new chip is reported to handle a broader set of workloads and to complement Alibaba's Qwen3 open models by optimizing model serving. Alibaba is expected to keep using Nvidia accelerators for training while shifting inference toward homegrown silicon. The chip is planned to be compatible with Nvidia-oriented software at higher abstraction layers such as PyTorch or TensorFlow. US export controls mean the chip must be manufactured domestically.
Read at The Register