Alibaba upgrades its next-generation voice model Qwen3-TTS, which can generate human-like tones based on text and voice references

36Kr
2025.12.24 08:06

36Kr learned that on December 24, Alibaba upgraded its Qwen3-TTS family of voice models, releasing two new models: Qwen3-TTS-VD (VoiceDesign) for voice creation and Qwen3-TTS-VC (VoiceClone) for voice cloning. The new models support do-it-yourself voice design and fine-grained ("pixel-level") voice imitation, and can even make animals "naturally" speak human language. The voices sound natural, the output is stable, and generation is efficient, which can accelerate the adoption of large voice models in professional fields such as audiobooks, AI comic dramas, and film dubbing.