TechFlow, January 21: According to Jinshi Data, a new model identifier, "MODEL1", has surfaced as DeepSeek-R1 marks its first anniversary. DeepSeek updated its FlashMLA code on GitHub, and MODEL1 is mentioned in 28 of the 114 files, where it appears alongside V32 as a separate model. Since V32 refers to DeepSeek-V3.2, MODEL1 is likely a new architecture. In the code, the two differ in KV cache layout, sparsity handling, and FP8 decoding, with several variations in memory optimization. (QbitAI)




