Kimi takes on the full version of o1 with a multimodal model and reveals its training details for the first time! A new paradigm for scaling reinforcement learning is born

Wallstreetcn
2025.01.21 00:16

Kimi released k1.5, a multimodal thinking model, marking the rise of Chinese large language models. In Long CoT mode, the model's mathematical, coding, and multimodal reasoning capabilities reach the level of the full version of OpenAI's o1, and in Short CoT mode it significantly outperforms GPT-4o and Claude 3.5. The Kimi team extended the application of reinforcement learning, opening a new path in which a reward mechanism lets the model expand its training data autonomously, which in turn promotes the scaling of compute.