Dodgy Huawei chips nearly sank DeepSeek's next-gen R2 model
Briefly

DeepSeek's development of its next-gen LLMs has been significantly delayed by problems with Huawei's AI chips. Government pressure to train on Huawei silicon hampered progress, as the hardware could not sustain stable training runs. After months of issues, including data-labeling challenges, DeepSeek reverted to Nvidia's H20 GPUs for model training. Huawei's Ascend 910C chips, which had been expected to outperform Nvidia's hardware, failed to deliver the necessary stability and were relegated to inference tasks. The pivot marked a significant setback for DeepSeek's anticipated model release.
Instability and performance issues with Huawei's chips forced DeepSeek to shift to Nvidia's H20 GPUs for training its next-gen LLMs.
DeepSeek faced significant setbacks in training the successor to its R1 model, which the company attributes to immature software and problematic hardware from Huawei.
The Ascend 910C chips from Huawei, which were expected to outperform Nvidia's offerings, could not deliver stable training results, causing delays in DeepSeek's model development.
Challenges with data labeling and unstable hardware led DeepSeek to relegate Huawei’s accelerators to inference tasks, delaying the anticipated release of DeepSeek R2.
Read at The Register