  1. This figure compares the structural design of (a) standard Residual Connection, (b) Hyper-Connections (HC), and (c) our proposed Manifold-Constrained Hyper-Connections (mHC). …

  2. DeepSeek Unveils mHC: A Mathematical Fix for the "Exploding ...

    13 hours ago · The relentless pursuit of wider, more complex neural networks just hit a significant milestone. Researchers at DeepSeek-AI have released a paper on Manifold-Constrained …

  3. DeepSeek researchers detail a new mHC architecture they used ...

    1 day ago · Vincent Chow / South China Morning Post: DeepSeek researchers detail a new mHC architecture they used to train 3B, 9B, and 27B models, finding it scaled without adding …

  4. DeepSeek kicks off 2026 with paper signalling push to train ...

    20 hours ago · DeepSeek has published a technical paper co-authored by founder Liang Wenfeng proposing a rethink of its core deep learning architecture.

  6. deepseek-ai (DeepSeek) - Hugging Face

    Oct 20, 2023 · DeepSeek (深度求索), founded in 2023, is a Chinese company dedicated to making AGI a reality. Unravel the mystery of AGI with curiosity. Answer the essential question …

  7. DeepSeek-AI Open Source DeepSeek-VL2 Series: 3B, 16B, and …

    Dec 16, 2024 · The architecture of DeepSeek-VL2 is designed to optimize performance while reducing computational demands. The dynamic slicing approach ensures that high-resolution …