DeepSeek Paper Co-Signed by Liang Wenfeng Said to Open New Chapter in Model Architecture
Chinese AI startup DeepSeek released a paper on New Year's Day proposing a new architecture called mHC (Manifold-Constrained Hyper-Connections). The design aims to fix the training instability of conventional hyper-connections (HC) in large-scale models while retaining their performance gains. mHC extends the single residual stream of the standard Transformer into multiple parallel streams and uses the Sinkhorn-Knopp algorithm to constrain the connection matrix to the manifold of doubly stochastic matrices. According to the paper, this resolves the numerical instability and signal explosion that arise in large-scale training when HC breaks the identity-mapping property of the residual connection. The paper's first authors are Zhenda Xie, Yixuan Wei, and Huanqi Cao. DeepSeek founder Liang Wenfeng is also listed among the authors.
AASTOCKS Financial News Website: www.aastocks.com
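For readers unfamiliar with the Sinkhorn-Knopp step mentioned above, the following is a minimal illustrative sketch, not DeepSeek's implementation. It alternately normalizes the rows and columns of a positive matrix until both sum to approximately 1, which yields an approximately doubly stochastic matrix. The function name `sinkhorn_knopp`, the use of `exp` to make the entries positive, the iteration count, and the 3x3 input values are all assumptions made for illustration; the article does not describe how the paper parameterizes the connection matrix.

```python
import math


def sinkhorn_knopp(logits, iterations=100):
    """Approximately map a square matrix to a doubly stochastic one.

    Illustrative sketch only: the exp() parameterization and the fixed
    iteration count are assumptions, not details taken from the paper.
    """
    n = len(logits)
    # Exponentiate so that every entry is strictly positive.
    m = [[math.exp(x) for x in row] for row in logits]
    for _ in range(iterations):
        # Row step: scale each row so that it sums to 1.
        m = [[x / sum(row) for x in row] for row in m]
        # Column step: scale each column so that it sums to 1.
        col_sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
        m = [[m[i][j] / col_sums[j] for j in range(n)] for i in range(n)]
    return m


# Hypothetical unconstrained weights for mixing three residual streams.
mixing = sinkhorn_knopp([[0.5, -1.0, 2.0],
                         [1.5, 0.3, -0.7],
                         [0.0, 0.8, 1.1]])
row_sums = [sum(row) for row in mixing]
col_sums = [sum(row[j] for row in mixing) for j in range(3)]
```

Because every row and every column of a doubly stochastic matrix sums to 1, and the product of two such matrices is again doubly stochastic, repeatedly mixing streams with matrices of this kind redistributes the signal across streams without amplifying its total, which is one way to read the stability claim reported above.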
