---
title: "HLE “Humanity's Last Exam” Tops 60 for the First Time! Eigen-1, Built on DeepSeek V3.1, Significantly Outperforms Grok 4 and GPT-5"
description: "The Eigen-1 multi-agent system has achieved a historic breakthrough on the HLE Bio/Chem Gold test set, reaching a Pass@1 accuracy of 48.3% and a Pass@5 accuracy of 61.74%. This is the first score above 60, ahead of Google's Gemini 2.5 Pro, OpenAI's GPT-5, and Grok 4. The result is built on the open-source DeepSeek V3.1 rather than on a closed-source ultra-large model."
type: "news"
locale: "zh-HK"
url: "https://longbridge.com/zh-HK/news/259215649.md"
published_at: "2025-09-28T11:59:11.000Z"
---

# HLE “Humanity's Last Exam” Tops 60 for the First Time! Eigen-1, Built on DeepSeek V3.1, Significantly Outperforms Grok 4 and GPT-5

> The Eigen-1 multi-agent system has achieved a historic breakthrough on the HLE Bio/Chem Gold test set, reaching a Pass@1 accuracy of 48.3% and a Pass@5 accuracy of 61.74%. This is the first score above 60, ahead of Google's Gemini 2.5 Pro, OpenAI's GPT-5, and Grok 4. The result is built on the open-source DeepSeek V3.1 rather than on a closed-source ultra-large model.

The Eigen-1 multi-agent system was recently developed jointly by a team that includes Xiangru Tang (唐相儒) and Yujie Wang (王昱婕) of Yale University, Wanghan Xu (徐望瀚) of Shanghai Jiao Tong University, Guancheng Wan (萬冠呈) of UCLA, Zhenfei Yin (尹榛菲) of the University of Oxford, and Di Jin (金帝) and Hanrui Wang (王瀚鋭) of Eigen AI, among others. On the HLE Bio/Chem Gold test set, the system reached a Pass@1 accuracy of 48.3%, and its Pass@5 accuracy climbed to 61.74%, crossing the 60-point mark for the first time. This result far exceeds those of Google's Gemini 2.5 Pro, OpenAI's GPT-5, and Grok 4. Most encouraging of all, the achievement does not rely on a closed-source ultra-large model: it is built entirely on the open-source DeepSeek V3.1.

### Related Stocks

- [OpenAI.NA - OpenAI](https://longbridge.com/zh-HK/quote/OpenAI.NA.md)
- [GOOG.US - Google-C](https://longbridge.com/zh-HK/quote/GOOG.US.md)

---

> **Disclaimer**: The content of this article is for reference only and does not constitute investment advice.