"The Chinese AI startup published a research paper on Wednesday, describing a method to train large language models that could shape "the evolution of foundational models," it said. The paper, co-authored by its founder Liang Wenfeng, introduces what DeepSeek calls "Manifold-Constrained Hyper-Connections," or mHC, a training approach designed to scale models without them becoming unstable or breaking altogether. As language models grow, researchers often try to improve performance by allowing different parts of a model to share more information internally."
"As language models grow, researchers often try to improve performance by allowing different parts of a model to share more information internally. However, this increases the risk of the information becoming unstable, the paper said. DeepSeek's latest research enables models to share richer internal communication in a constrained manner, preserving training stability and computational efficiency even as models scale, it added."
DeepSeek introduced Manifold-Constrained Hyper-Connections (mHC) to enable model scaling without training instability. mHC allows richer internal communication among model components while constraining those interactions to preserve stability and computational efficiency. The approach is designed to limit additional training cost and to improve performance even with modest extra compute, and it supports an end-to-end redesign of the training stack for rapid experimentation with unconventional architectures. The method positions DeepSeek to scale toward larger flagship models and to mitigate compute bottlenecks during model development.
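The article does not give the mathematical details of mHC, so the sketch below is only an illustration of the general idea it describes: several parallel residual streams exchange information through a learned mixing matrix whose weights are projected onto a constrained set so the combined signal stays bounded as depth and scale grow. The class name ManifoldConstrainedHyperConnection, the stream count n_streams, and the simplex-style constraint used here are hypothetical stand-ins, not the paper's actual formulation; PyTorch is assumed as the framework.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ManifoldConstrainedHyperConnection(nn.Module):
    """Illustrative sketch (not DeepSeek's implementation): a sub-block whose
    residual path is widened to several parallel streams. The streams exchange
    information through a learned mixing matrix that is re-normalized at each
    forward pass so mixing cannot amplify the signal as the model scales."""

    def __init__(self, d_model: int, n_streams: int = 4):
        super().__init__()
        self.n_streams = n_streams
        self.norm = nn.LayerNorm(d_model)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )
        # Unconstrained parameters for stream-to-stream mixing; constrained at
        # forward time so each output stream is a convex combination of inputs.
        self.mix_logits = nn.Parameter(torch.zeros(n_streams, n_streams))
        # Per-stream gate controlling how much sub-block output is injected back.
        self.inject = nn.Parameter(torch.full((n_streams,), 1.0 / n_streams))

    def forward(self, streams: torch.Tensor) -> torch.Tensor:
        # streams: (n_streams, batch, seq, d_model)
        # Constrain mixing weights to the probability simplex (a simple stand-in
        # for a manifold constraint): rows are non-negative and sum to 1.
        mix = F.softmax(self.mix_logits, dim=-1)
        mixed = torch.einsum("ij,jbtd->ibtd", mix, streams)
        # Run the sub-block on the averaged stream, then inject it per stream.
        h = self.ffn(self.norm(mixed.mean(dim=0)))
        return mixed + self.inject.view(-1, 1, 1, 1) * h.unsqueeze(0)


if __name__ == "__main__":
    block = ManifoldConstrainedHyperConnection(d_model=64, n_streams=4)
    x = torch.randn(4, 2, 16, 64)  # (streams, batch, seq, d_model)
    y = block(x)
    print(y.shape)  # torch.Size([4, 2, 16, 64])
```

The simplex constraint here is only one possible choice; the point of the sketch is that the extra cross-stream communication is never left unbounded, which is the stability property the article attributes to mHC.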
#manifold-constrained-hyper-connections-mhc #model-scaling #training-stability #computational-efficiency
Read at Business Insider