<div id="readability-page-1">With its updated R1, DeepSeek has climbed to No. 2 worldwide and taken the top spot among open-source models.
Zhidongxi (智东西) reported on May 30 that Artificial Analysis, a well-known independent AI benchmarking and analysis firm, published a report today stating that DeepSeek, on the strength of its updated R1, has overtaken xAI, Meta, and Anthropic to become the world's No. 2 AI lab, tied with Google. Once shared, the report drew more than 300,000 views on X, along with extensive discussion and reposts.
<img src="https://imageproxy.pbkrs.com/https://wpimg-wscn.awtmt.com/5680c3e9-552b-4ab3-a997-338e20faac0d.jpeg?x-oss-process=image/auto-orient,1/interlace,1/resize,w_1440,h_1440/quality,q_95/format,jpg" width="1024" height="829" original-src="https://imageproxy.pbkrs.com/https://wpimg-wscn.awtmt.com/5680c3e9-552b-4ab3-a997-338e20faac0d.jpeg"/>
On the firm's Artificial Analysis Intelligence Index, DeepSeek-R1-0528's score jumped from 60 to 68, tying Google's Gemini 2.5 Pro for third place among models. The index aggregates 7 leading evaluations, including MMLU-Pro and GPQA Diamond, which Artificial Analysis runs independently on all leading models.
<img src="https://imageproxy.pbkrs.com/https://wpimg-wscn.awtmt.com/96a0fd23-5948-4708-b9e1-17b2a56e54eb.jpeg?x-oss-process=image/auto-orient,1/interlace,1/resize,w_1440,h_1440/quality,q_95/format,jpg" width="1024" height="548" original-src="https://imageproxy.pbkrs.com/https://wpimg-wscn.awtmt.com/96a0fd23-5948-4708-b9e1-17b2a56e54eb.jpeg"/>
DeepSeek's 8-point gain is the same size as the gap between OpenAI's o1 and o3 (62 to 70). It puts DeepSeek R1 ahead of xAI's Grok 3 mini (high), NVIDIA's Llama Nemotron Ultra, Meta's Llama 4 Maverick, and Alibaba's Qwen3-235B in intelligence, and on par with Google's Gemini 2.5 Pro.
<img src="https://imageproxy.pbkrs.com/https://wpimg-wscn.awtmt.com/4754e1cb-e29e-48a3-bf9a-5550293a64d3.png?x-oss-process=image/auto-orient,1/interlace,1/resize,w_1440,h_1440/quality,q_95/format,jpg" width="1024" height="845" original-src="https://imageproxy.pbkrs.com/https://wpimg-wscn.awtmt.com/4754e1cb-e29e-48a3-bf9a-5550293a64d3.png"/>
▲ User comments on X (translated from English into Chinese)
On X, many overseas users reacted with praise such as "So fast!", "Excellent!", and "Impressive."
Some users called DeepSeek-R1-0528's "leap a milestone for open-source AI," while others said the success of its RL (reinforcement learning)-driven improvements shows that "RL is more efficient than pretraining." Other users pointed out that benchmark results still differ from real-world use.
<img src="https://imageproxy.pbkrs.com/https://wpimg-wscn.awtmt.com/f2e5e4d8-a787-4d04-8daf-3e45cd5932fd.jpeg?x-oss-process=image/auto-orient,1/interlace,1/resize,w_1440,h_1440/quality,q_95/format,jpg" 
width="1024" height="776" original-src="https://imageproxy.pbkrs.com/https://wpimg-wscn.awtmt.com/f2e5e4d8-a787-4d04-8daf-3e45cd5932fd.jpeg"/>
▲ User comments on X (translated from English into Chinese)
Other users tied the release to the broader AI race, saying "DeepSeek's R1 moves like it's in a competition," and adding that with the next round of benchmarks on the way, the game has only just begun.
<img src="https://imageproxy.pbkrs.com/https://wpimg-wscn.awtmt.com/124901eb-76f6-4461-a959-97cf39f23d3c.jpeg?x-oss-process=image/auto-orient,1/interlace,1/resize,w_1440,h_1440/quality,q_95/format,jpg" width="1024" height="583" original-src="https://imageproxy.pbkrs.com/https://wpimg-wscn.awtmt.com/124901eb-76f6-4461-a959-97cf39f23d3c.jpeg"/>
▲ User comments on X (translated from English into Chinese)
<h2>DeepSeek becomes the world's No. 2 AI lab and No. 1 in open source</h2>
The Artificial Analysis Intelligence Index comprises 7 evaluations: MMLU-Pro, GPQA Diamond, Humanity's Last Exam, LiveCodeBench, SciCode, AIME, and MATH-500.
DeepSeek-R1-0528 improved across the board. The largest gains came in AIME 2024 (competition math, +21 points), LiveCodeBench (code generation, +15 points), GPQA Diamond (scientific reasoning, +10 points), and Humanity's Last Exam (reasoning and knowledge, +6 points).
As the chart below shows, DeepSeek-R1-0528 scores 68 on the index, behind only OpenAI o4-mini (high) at 70 and OpenAI o3 at 69, and level with Google's Gemini 2.5 Pro at 68.
<img src="https://imageproxy.pbkrs.com/https://wpimg-wscn.awtmt.com/3b507b16-7370-4556-a943-32790a9a1645.jpeg?x-oss-process=image/auto-orient,1/interlace,1/resize,w_1440,h_1440/quality,q_95/format,jpg" width="1024" height="479" original-src="https://imageproxy.pbkrs.com/https://wpimg-wscn.awtmt.com/3b507b16-7370-4556-a943-32790a9a1645.jpeg"/>
The gap between open-source and closed models is smaller than ever. In the chart below, blue bars represent open-source models and black bars represent closed models. DeepSeek-R1-0528 ranks first among open-source models with 68 points, followed by Qwen3-235B at 62.
<img src="https://imageproxy.pbkrs.com/https://wpimg-wscn.awtmt.com/4abca26c-6b76-4a79-a30a-2619fce7f84d.jpeg?x-oss-process=image/auto-orient,1/interlace,1/resize,w_1440,h_1440/quality,q_95/format,jpg" width="1024" height="548" original-src="https://imageproxy.pbkrs.com/https://wpimg-wscn.awtmt.com/4abca26c-6b76-4a79-a30a-2619fce7f84d.jpeg"/>
<h2>Strong coding and math, and three years of accelerating catch-up</h2>
Broken down by capability: in coding (based on LiveCodeBench and SciCode), DeepSeek-R1-0528 is tied for second place with 59 points, behind only OpenAI o4-mini (high) at 63.
<img 
src="https://imageproxy.pbkrs.com/https://wpimg-wscn.awtmt.com/1263405e-18cb-48e5-ba87-ebba29c42674.jpeg?x-oss-process=image/auto-orient,1/interlace,1/resize,w_1440,h_1440/quality,q_95/format,jpg" width="1024" height="484" original-src="https://imageproxy.pbkrs.com/https://wpimg-wscn.awtmt.com/1263405e-18cb-48e5-ba87-ebba29c42674.jpeg"/>
In math (based on AIME 2024 and MATH-500), DeepSeek-R1-0528 ranks fourth with 94 points, behind OpenAI o4-mini (high) at 96, Grok 3 mini Reasoning (high) at 96, and OpenAI o3 at 95.
<img src="https://imageproxy.pbkrs.com/https://wpimg-wscn.awtmt.com/b4261897-a000-4b4b-a668-f62230256d52.jpeg?x-oss-process=image/auto-orient,1/interlace,1/resize,w_1440,h_1440/quality,q_95/format,jpg" width="1024" height="484" original-src="https://imageproxy.pbkrs.com/https://wpimg-wscn.awtmt.com/b4261897-a000-4b4b-a668-f62230256d52.jpeg"/>
Over a longer time span, DeepSeek has been narrowing the gap with OpenAI for the past three years. It has consistently ranked among the leading AI labs, and in January 2025 it closed in sharply on OpenAI.
<img src="https://imageproxy.pbkrs.com/https://wpimg-wscn.awtmt.com/8d16f615-7cd8-4b02-890f-578c0cd66d1e.jpeg?x-oss-process=image/auto-orient,1/interlace,1/resize,w_1440,h_1440/quality,q_95/format,jpg" width="1024" height="548" original-src="https://imageproxy.pbkrs.com/https://wpimg-wscn.awtmt.com/8d16f615-7cd8-4b02-890f-578c0cd66d1e.jpeg"/>
The R1 that DeepSeek released in January marked the first time an open-weights model reached second place, and today's R1 update returns DeepSeek to that position.
<h2>Balancing intelligence and price: the "value-for-money king"</h2>
On price, DeepSeek-R1-0528 costs $0.96 per million tokens, OpenAI o4-mini (high) costs $1.93 per million tokens, and OpenAI o3 costs as much as $17.50 per million tokens, which makes DeepSeek-R1-0528 the "value-for-money king." Note that these figures blend the input and output prices at a 3:1 ratio.
<img src="https://imageproxy.pbkrs.com/https://wpimg-wscn.awtmt.com/ab0eff05-b906-467c-bd2a-60e8c90c9c9a.jpeg?x-oss-process=image/auto-orient,1/interlace,1/resize,w_1440,h_1440/quality,q_95/format,jpg" width="1024" height="703" original-src="https://imageproxy.pbkrs.com/https://wpimg-wscn.awtmt.com/ab0eff05-b906-467c-bd2a-60e8c90c9c9a.jpeg"/>
Looking at input and output prices separately, DeepSeek-R1-0528 charges $0.55 per million input tokens and $2.19 per million output tokens. That is lower than OpenAI o4-mini (high), at $1.10 per million input tokens and $4.40 per million output tokens, and far lower than o3, at $10 per million input tokens and $40 per million output tokens.
<img src="https://imageproxy.pbkrs.com/https://wpimg-wscn.awtmt.com/08747b30-cbbb-485b-913e-dc19aad5c862.jpeg?x-oss-process=image/auto-orient,1/interlace,1/resize,w_1440,h_1440/quality,q_95/format,jpg" width="1024" height="517" original-src="https://imageproxy.pbkrs.com/https://wpimg-wscn.awtmt.com/08747b30-cbbb-485b-913e-dc19aad5c862.jpeg"/>
On output speed, DeepSeek-R1-0528 generates 32.01 tokens per second, compared with 129.37 tokens per second for OpenAI o4-mini (high) and 150.73 tokens per second for o3.
<img src="https://imageproxy.pbkrs.com/https://wpimg-wscn.awtmt.com/b3782aa2-8d35-4951-9445-96929f8b6fee.jpeg?x-oss-process=image/auto-orient,1/interlace,1/resize,w_1440,h_1440/quality,q_95/format,jpg" width="1024" height="700" original-src="https://imageproxy.pbkrs.com/https://wpimg-wscn.awtmt.com/b3782aa2-8d35-4951-9445-96929f8b6fee.jpeg"/>
On time to the first answer token, DeepSeek-R1-0528 "thinks" for 65.6 seconds, which is relatively long.
<img src="https://imageproxy.pbkrs.com/https://wpimg-wscn.awtmt.com/f6481c8f-0edc-486b-9f43-f309b5d5176e.jpeg?x-oss-process=image/auto-orient,1/interlace,1/resize,w_1440,h_1440/quality,q_95/format,jpg" width="1024" height="511" original-src="https://imageproxy.pbkrs.com/https://wpimg-wscn.awtmt.com/f6481c8f-0edc-486b-9f43-f309b5d5176e.jpeg"/>
The updated R1 also uses more tokens. R1-0528 used 99 million tokens to complete the index evaluations, 40% more than the original R1's 71 million, meaning the new R1 thinks for longer than the original. This is still not the highest token usage we have seen: Gemini 2.5 Pro uses 30% more tokens than R1-0528.
<img src="https://imageproxy.pbkrs.com/https://wpimg-wscn.awtmt.com/d68d7077-bb0a-466f-b7e4-baa687a82e53.jpeg?x-oss-process=image/auto-orient,1/interlace,1/resize,w_1440,h_1440/quality,q_95/format,jpg" width="1024" height="510" original-src="https://imageproxy.pbkrs.com/https://wpimg-wscn.awtmt.com/d68d7077-bb0a-466f-b7e4-baa687a82e53.jpeg"/>
<h2>Conclusion: open source rivals closed source as Chinese AI labs catch up with US peers</h2>
The gap between open-source and closed models is now smaller than ever. The R1 that DeepSeek released in January marked the first time an open-weights model reached second place, and today's R1 update returns DeepSeek to that position.
At the same time, models from Chinese AI labs have almost completely caught up with their US counterparts, and this release continues that emerging trend. As of today, DeepSeek leads US AI labs including Anthropic and Meta on the Artificial Analysis Intelligence Index.</div>
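The blended prices and the token-usage increase quoted in the article can be checked with a few lines of arithmetic. The sketch below assumes the "3:1 ratio" means a weighted average with three parts input price to one part output price; this reading is an assumption, though it does reproduce the article's figures. The function names `blended_price` and `relative_increase` are illustrative and do not come from Artificial Analysis.

```python
# Per-million-token prices in USD, as quoted in the article.
PRICES = {
    "DeepSeek-R1-0528": {"input": 0.55, "output": 2.19},
    "OpenAI o4-mini (high)": {"input": 1.10, "output": 4.40},
    "OpenAI o3": {"input": 10.00, "output": 40.00},
}


def blended_price(input_price, output_price, input_weight=3, output_weight=1):
    """Weighted average price per million tokens.

    Assumption: the article's 3:1 ratio is input:output, i.e. three parts
    input tokens for every one part output tokens.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


def relative_increase(new, old):
    """Fractional increase of `new` over `old` (0.25 means 25% more)."""
    return (new - old) / old


if __name__ == "__main__":
    for model, price in PRICES.items():
        blended = blended_price(price["input"], price["output"])
        print(f"{model}: ${blended:.3f} per million tokens (blended)")

    # Token usage on the index evaluations, in millions of tokens.
    growth = relative_increase(99, 71)
    print(f"R1-0528 vs. original R1 token usage: +{growth:.1%}")
```

Under this weighting the three models come out at roughly $0.96, $1.925, and $17.50 per million tokens, in line with the article's $0.96, $1.93, and $17.50. The token-usage increase works out to a little over 39%, which the article rounds to 40%.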


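With its updated R1, DeepSeek has become the world's No. 2 AI lab, tied with Google. According to the Artificial Analysis report, DeepSeek's score on the Artificial Analysis Intelligence Index jumped from 60 to 68, putting it ahead of xAI, Meta, and Anthropic. The index evaluates multiple leading models, and DeepSeek's gain matches the gap between OpenAI's o1 and o3. On social media, users praised DeepSeek's performance and called the leap a milestone for open-source AI, though some pointed out that benchmark results differ from real-world use.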

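- With its updated R1, DeepSeek has become the world's No. 2 AI lab.  
- Its Artificial Analysis Intelligence Index score jumped from 60 to 68.  
- The gap between open-source and closed-source models has narrowed markedly.  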

DeepSeek becomes the world's No. 2 AI lab, and OpenAI and Google can't sit still