Xiaomi open-sources its strongest large voice model! 100 million hours of training, excels at stand-up comedy and fast-paced speech

Wallstreetcn
2025.09.19 08:55

Xiaomi has open-sourced Xiaomi-MiMo-Audio, its first native end-to-end voice model. The model has 7 billion parameters and was pre-trained on over 100 million hours of data, achieving SOTA results on voice intelligence and audio understanding benchmarks. Its capabilities include fluent conversation, audio captioning, and audio reasoning; it can speak Tianjin dialect naturally and can continue a given speech sample. Xiaomi describes the release as the "GPT-3 moment in the voice closed-source field." The models and technical report have been open-sourced.