StepFun (Jieyue Xingchen) releases Step Audio 2.5 TTS, redefining the boundaries of expressive voice generation

PingWest

2026.04.17 02:01

StepFun has released its next-generation voice generation model, StepAudio 2.5 TTS, which aims to move beyond the limits of traditional speech synthesis and make the leap from "reproducing sound" to "creating expression." The model has three core capabilities: global context control, in-text context control, and zero-shot replication with full timbre control. Together these provide high-quality voice solutions for scenarios such as audiobook production, film dubbing, and intelligent interaction.