---
title: "Just now, Yang Zhilin personally open-sourced Kimi K2.5! A day of domestic large models fighting"
type: "News"
locale: "en"
url: "https://longbridge.com/en/news/273837470.md"
description: "Today, Yang Zhilin released Kimi K2.5, an open-source MoE foundational model with 1 trillion parameters. K2.5 shows significant improvements in visual understanding and programming capabilities, particularly excelling in evaluations such as HLE and BrowseComp, achieving the current best level. Compared to top proprietary models, K2.5's operating costs are only a fraction of theirs, attracting widespread attention and trial use"
datetime: "2026-01-27T11:53:03.000Z"
locales:
  - [zh-CN](https://longbridge.com/zh-CN/news/273837470.md)
  - [en](https://longbridge.com/en/news/273837470.md)
  - [zh-HK](https://longbridge.com/zh-HK/news/273837470.md)
---

# Just now, Yang Zhilin personally open-sourced Kimi K2.5! A day of domestic large models fighting

Today is truly a day for domestic large models to compete! Last night, Qianwen launched a new model, and today DeepSeek open-sourced OCR 2.

At noon, Kimi also unveiled its updates, with the website, app, API open platform, and programming assistant product Kimi Code model version fully updated; Kimi K2.5 has arrived.

Yang Zhilin, founder of Dark Side of the Moon (Moonshot AI), appeared in person for the first time to present the new model's capabilities.

Kimi K2.5 is a MoE base model with 1 trillion parameters. Compared to its predecessor, K2.5 has significantly enhanced visual understanding capabilities (it can now process videos), and its coding abilities have also improved noticeably. More importantly, K2.5 remains open-source.

Kimi K2.5 achieved the current best performance (SOTA) in highly challenging agent evaluations, including HLE, BrowseComp, and DeepSearchQA. For example, it scored 50.2% on HLE (Humanity's Last Exam) and 74.9% on BrowseComp.

At the same time, K2.5's programming capabilities are also outstanding, achieving 76.8% on SWE-bench Verified, narrowing the gap with top closed-source models. K2.5 also achieved the current best results in several visual understanding evaluations among open-source models.

As can be seen, in core benchmark tests, Kimi K2.5's performance is comparable to that of the most powerful closed-source models such as Opus 4.5, GPT 5.2 XHigh, and Gemini 3.0 Pro, with some scores even exceeding them.

It is worth mentioning that while Kimi K2.5 outperformed GPT-5.2-xhigh in several evaluations, its operating cost is only a fraction of that of GPT-5.2-xhigh.

With the momentum from K2 Thinking two months ago, the release of K2.5 has been unprecedentedly lively. On social networks, people are trying out the new model and sharing their results.

Some netizens said this is the strongest Chinese large model, full stop (no qualifiers), and that the pressure is now on DeepSeek R2.

## Screenshot as Code: Coding Now Has "Aesthetics"

It should be noted that Kimi K2.5 is an all-in-one model: visual and text capabilities, dialogue and agent functions, thinking and non-thinking modes are all concentrated in one unified model.

Since it enhances visual capabilities + code capabilities, the Kimi model now focuses on image-to-code conversion — not only does it eliminate the need to write code, but it also saves on prompt engineering; just provide a design draft to the AI, and it can generate the code you want.

Sometimes when you want to modify an interface, it's hard to explain with just text; now you only need to give the AI an image. You can circle the areas you want to change on the UI, and let the AI handle the rest.

If you've designed animation effects in other tools, you can also record a video and show it to Kimi, and it will automatically understand and write the code to reproduce it.

To be honest, it does feel a bit like directing subordinates to work.

With the addition of visual capabilities, Kimi K2.5 not only excels at writing code but also possesses a certain level of "design aesthetics" — it combines visual abilities to construct web pages with high aesthetics and animations, akin to those produced by professional designers.

On giving large models better "taste," it is hard not to recall the speech Yang Zhilin gave at the AGI-Next Frontier Summit over two weeks ago. He said that building models is essentially building a worldview, and that giving AI better taste is a key focus of Kimi's current development.

In addition to front-end design, Kimi is now also delving into software engineering. Based on Kimi K2.5, Kimi Code was officially released today; it can run in terminals and seamlessly integrate into IDEs like VSCode, Cursor, and Zed. During use, Kimi Code supports users in inputting images and videos, and it can automatically discover and migrate your existing skills and MCP to the Kimi Code working environment.

Just two weeks after Yang Zhilin provided direction, we can already experience AI based on the new route.

## Built-in Agent "Project Team"

To solve complex real-world problems, Kimi K2.5 introduces the "Agent Swarm" feature, currently in testing on Kimi.com, with advanced paid users receiving free credits.

When handling complex tasks, K2.5 no longer executes in a single thread: it acts as a conductor, coordinating up to 100 parallel sub-agents ("agent avatars"), supporting up to 1,500 tool calls, and running 4.5 times faster than a single-agent configuration.

The model has been trained with Parallel Agent Reinforcement Learning (PARL), so the agent swarm is created and orchestrated automatically by Kimi K2.5, with no predefined setup.

PARL uses trainable coordinator agents to decompose tasks into parallelizable subtasks, each executed by dynamically instantiated frozen sub-agents. Running these subtasks concurrently significantly reduces end-to-end latency compared to sequentially executed agents.
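To make the latency claim concrete, here is a minimal sketch of the decompose-then-run-in-parallel pattern. The names (`run_subagent`, `orchestrate`) and the timings are illustrative stand-ins, not Kimi's actual API:

```python
import concurrent.futures
import time

def run_subagent(subtask: str) -> str:
    """Stand-in for a frozen sub-agent executing one subtask."""
    time.sleep(0.1)  # simulate per-subtask work (e.g. tool calls)
    return f"done:{subtask}"

def orchestrate(task: str, n_subtasks: int = 4) -> list[str]:
    """Coordinator: decompose the task, then run subtasks concurrently."""
    subtasks = [f"{task}/part{i}" for i in range(n_subtasks)]
    with concurrent.futures.ThreadPoolExecutor() as pool:
        return list(pool.map(run_subagent, subtasks))

start = time.perf_counter()
results = orchestrate("survey")
elapsed = time.perf_counter() - start
# Four 0.1 s subtasks finish in roughly 0.1 s wall-clock, not 0.4 s.
```

Sequential execution would take the sum of the subtask times; the concurrent version takes roughly the maximum, which is where a multi-fold end-to-end speedup comes from.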

Training a reliable parallel orchestrator is highly challenging due to the delays, sparsity, and non-stationarity of feedback provided by independently running sub-agents. A common failure mode is serial collapse, where the orchestrator defaults to executing single-agent tasks despite having parallel capabilities. To address this issue, PARL employs a staged reward shaping strategy that encourages parallelism in the early training stages and gradually shifts focus to task success.
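The staged reward-shaping idea can be illustrated with a toy schedule (our own sketch, not the actual PARL objective): weight a parallelism bonus heavily early in training, then anneal toward pure task success.

```python
def shaped_reward(step: int, total_steps: int,
                  parallelism: float, task_success: float) -> float:
    """Toy staged reward: anneal from rewarding parallelism to task success.

    Both signals are assumed normalized to [0, 1]. Early on, the
    parallelism bonus dominates, countering "serial collapse"; by the
    end of training only task success matters.
    """
    w = max(0.0, 1.0 - step / total_steps)  # weight decays from 1 to 0
    return w * parallelism + (1.0 - w) * task_success
```

At step 0 the reward equals the parallelism score alone; at the final step it equals task success alone, so an orchestrator that stays serial forfeits reward early but is ultimately judged on outcomes.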

This parallel processing capability compresses work that would normally take days into just a few minutes.

Scaling the training of agent swarms is a hard problem. Dark Side of the Moon says it rebuilt its reinforcement learning infrastructure for this purpose and specifically optimized the training algorithms for maximum efficiency and performance.

In an example provided by Kimi, the agent swarm was fed 40 papers on psychology and AI. The agents read through the papers, spawned several sub-agents to write different sections of a report, and a main agent then reviewed everything and compiled it into a professional PDF survey spanning dozens of pages.

Kimi K2.5 also introduces agents into knowledge work in the real world.

K2.5 agents can handle high-density, large-scale office work end to end: they ingest large amounts of dense input, coordinate multi-step tool use, and deliver expert-level output directly through dialogue, covering document, spreadsheet, PDF, and presentation formats.

In the Kimi K2.5 era, we can have the agent complete advanced tasks such as adding comments in Word, building financial models with pivot tables, and writing LaTeX formulas in PDFs. The agent's output length is also unprecedented: it can produce a 10,000-word thesis or a 100-page document.

## First-Hand Testing: From Riddles to "Handcrafted" 3D Apartments

Opening the official website, we can see that the Kimi model has been fully updated, and we can also see the K2.5 Agent cluster that is in beta testing.

Kimi K2.5 series model names (bilingual version).

Next, we will test these new models one by one.

First up is K2.5 Instant, with the simplest task, a steganography mini-game: hide the message "evacuate at 3 PM tomorrow" inside a passage that reads like "late-night radio lyrics." The requirement: it must read as pure literature, with nothing out of place.

K2.5 Instant breezed through this warm-up, completing it in about a second.

Next, we will increase the difficulty. We will switch Kimi K2.5 to thinking mode to test its multimodal reasoning ability.

Here we found a hand-drawn floor plan of Sheldon’s apartment from "The Big Bang Theory" by Spanish interior designer Iñaki Aliste Lizarralde, and we will start with a basic test to see if it can correctly identify the background of this image:

The result is very good! Kimi K2.5 correctly identified the background based on the annotations on the image and explained the relevant context. Next, let's see if K2.5 can correctly understand the spatial implications of this image and reconstruct it into a 3D version.

4x speed video.

The generation took two and a half minutes, and K2.5 ultimately produced the following result:

The result is quite good, but it is clearly just a rough outline, lacking many details such as sofas, tables, chairs, and beds. In addition, all the rooms in this 3D scene are square, which differs noticeably from the reference image. Meanwhile, asking K2.5 Thinking to keep generating ran into a code-length limit (10,000 characters). No matter: time for the K2.5 Agent to take the stage.

This time, because we emphasized details, analysis and processing took much longer (nearly 20 minutes) and the code grew accordingly (1,042 lines). During execution we could watch the Kimi agent plan the task and carry it out step by step. The agent even deployed the finished result, so we can access it directly: https://ijohefkudygve.beta-ok.kimi.link/

10x speed video.

In the end, the result was not perfect, but it did not disappoint: it accurately restored the details of the two main apartments from The Big Bang Theory and even added a wireframe mode and an open-roof view:

Next, let's focus on the K2.5 Agent Swarm, which is currently in beta testing. In this mode, multiple agents can handle a task simultaneously. Here, we came up with a rather sci-fi task:

> Please develop a basic vocabulary for an intelligent species that "lives in the deep sea and communicates through glowing skin." The requirements include grammatical structures, 200 basic entries, and 3 creation myths of this species. The agent swarm must ensure that all neologisms have a high degree of internal logical consistency in phonetics and semantics.

At the beginning of the task, Kimi created four different agents: phonetics designer Ning Yi, grammatical structure designer Young Galileo, vocabulary designer Jing Chuan, and myth creator Professor Li.

In the first phase of the design work, phonetics and grammatical structures can proceed in parallel, so we can see Ning Yi and Young Galileo working together to build the foundation of this new language.

Then, it was time to create vocabulary. At this point, Kimi added some parallel-running agents based on the requirements, allowing them to create vocabulary on different themes.

The entire process took 38 minutes, and we witnessed the birth of a new language called "Luminous Language." This language uses different forms of light as phonemes and features a unique parallel clause grammar and spatial case system. Moreover, Kimi thoughtfully designed a romanization transcription system.

20x speed video.

Finally, let's test Kimi Code. Kimi Code offers two ways to use it: install the Kimi CLI with a single command, `uv tool install --python 3.13 kimi-cli`, or configure it inside third-party tools such as Claude Code.

Now, let's do a simple test of Kimi Code using the official Kimi CLI. After installation and configuration, we first let Kimi Code create a gold price monitor:

> Create a monitor for gold and silver prices, and notify me when the price fluctuation exceeds 1% within 24 hours.

4x speed video.

As we can see, the whole run took only about 4 minutes, though after the first round of interaction the result was a program that still required API configuration, plus a demo program. Even so, the outcome is quite satisfactory.
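The alert rule from our prompt reduces to a few lines of logic. The price-fetching API is the part that needed configuration, so it is left out here as an assumption, and the sketch simply takes a 24-hour price window:

```python
def fluctuation_pct(prices_24h: list[float]) -> float:
    """Percent swing over the window, relative to its oldest price."""
    base = prices_24h[0]
    return (max(prices_24h) - min(prices_24h)) / base * 100.0

def should_alert(prices_24h: list[float], threshold_pct: float = 1.0) -> bool:
    """True when the 24-hour fluctuation exceeds the threshold."""
    return fluctuation_pct(prices_24h) > threshold_pct

# Gold drifting from 2300 to 2330 over 24 h is a ~1.3% swing, so an alert fires.
```

In a real monitor, a loop would poll the configured API, maintain the rolling 24-hour window, and call `should_alert` on each update.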

Interestingly, along the way we also saw Kimi Code hit errors and automatically resolve them.

Of course, while this program is usable, it requires manual API configuration, which is a bit cumbersome. With Kimi Code, though, we can skip the hassle: one more command and it configures a free API directly.

4x speed video.

Soon, Kimi Code completed the task. Let's run it and see the effect:

At this point, the gold and silver prices correctly reflect the real-time prices. Of course, we can also let Kimi Code execute further actions, such as changing the price display to RMB per gram, packaging this Python program into an .exe, configuring reminder sounds and pop-ups, and achieving real-time display on the taskbar, etc.

But like similar tools, Kimi Code is not exclusively a programming tool. With the right configuration, it can be a powerful aid in everyday work, for example batch processing files. For our daily topic summary documents in .docx, we had Kimi Code use obsidian-skills to batch-convert them into Obsidian-compatible Markdown and tag them appropriately.

4x speed video.

It can be seen that Kimi Code correctly processed all 94 files in under two minutes, with context usage just over 10%. Along the way, Kimi Code correctly invoked obsidian-skills, and the results are very satisfactory: YAML front matter, callouts, and the like are all handled accurately.
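Under the hood, a batch conversion like this amounts to: extract each document's text, wrap it in YAML front matter, and attach tags. A pure-Python sketch of that final step (the docx text extraction and tag choices are placeholders; the real run relied on Kimi Code and obsidian-skills):

```python
def to_obsidian_md(title: str, body: str, tags: list[str]) -> str:
    """Wrap extracted text in Obsidian-compatible YAML front matter."""
    front_matter = "\n".join([
        "---",
        f"title: {title}",
        "tags: [" + ", ".join(tags) + "]",
        "---",
    ])
    return front_matter + "\n\n" + body

def convert_batch(extracted: dict[str, str], tags: list[str]) -> dict[str, str]:
    """Map {doc name: extracted text} to {output filename: Markdown text}."""
    return {f"{name}.md": to_obsidian_md(name, body, tags)
            for name, body in extracted.items()}
```

Each entry in the returned mapping can then be written out as a `.md` file in the Obsidian vault.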

Overall, we believe Kimi K2.5's intelligence is already on par with frontier models, and its agent swarm mode in particular performs remarkably well on complex tasks.

## Conclusion

China's open-source models are gradually becoming the new standard and are setting the rules. The release of Kimi K2.5 has set a new benchmark for global open-source large models.

At the same time, based on the development of K2.5's vision and agent capabilities, AI has unlocked more abilities to solve complex problems in the real world.

Now AI has aesthetics when writing code, and hundreds of agents can work collaboratively, bringing us one step closer to AGI.

Risk Warning and Disclaimer

The market has risks, and investment should be cautious. This article does not constitute personal investment advice and does not take into account individual users' specific investment objectives, financial conditions, or needs. Users should consider whether any opinions, views, or conclusions in this article align with their specific circumstances. Investing based on this is at one's own risk.

## Related News & Research

- [RMX Industries, Inc. Progresses Toward Launch of Next-Gen Visual Intelligence Solution | RMXI Stock News](https://longbridge.com/en/news/286098067.md)
- [In-depth analysis of the Shai-Hulud malware: Is open source a recipe for disaster?](https://longbridge.com/en/news/286230902.md)
- [Cognex OneVision™ Adoption Ramps as Manufacturers Scale AI Vision Globally | CGNX Stock News](https://longbridge.com/en/news/286266085.md)
- [Vmake Expands Image Enhancer with Advanced Old Photo Restoration Features Ahead of Mother's Day](https://longbridge.com/en/news/285414024.md)
- [The Questions You Couldn’t Ask in Time: How Agentic AI Accelerates Market Entry](https://longbridge.com/en/news/286300823.md)