
Expressive TTS with voice cloning in 15 languages
Microsoft's most expressive TTS model yet: voice cloning from short samples, fine-grained emotional control, and consistent voice identity across 15 languages. Now live in Azure AI Foundry at $22 per million characters, with integrations rolling out in VS Code, Dynamics 365 Contact Center, and Teams. Aimed at builders shipping voice agents who need production-grade prosody without the OpenAI Realtime API price tag.
Microsoft MAI-Voice-2 is a text-to-speech model that offers voice cloning and emotional control across 15 languages, available in Azure AI Foundry at $22 per million characters. It integrates with VS Code, Dynamics 365 Contact Center, and Teams, and targets developers building production-grade voice agents.
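
To put the $22 per million characters figure in practical terms, here is a rough cost estimator. It is a sketch only: it assumes billing scales linearly with character count, with no minimum charge, rounding, or volume tiers, none of which the listing confirms. The function name and the sample workload are hypothetical.

```python
# Rough cost estimate for the listed price of $22 per million characters.
# Assumption: billing is strictly linear in characters (no minimums, tiers,
# or rounding). How spaces or markup are counted is not stated in the listing.

PRICE_PER_MILLION_CHARS_USD = 22.0  # figure taken from the listing


def estimate_tts_cost(char_count: int,
                      price_per_million: float = PRICE_PER_MILLION_CHARS_USD) -> float:
    """Return the estimated cost in USD for synthesizing `char_count` characters."""
    if char_count < 0:
        raise ValueError("char_count must be non-negative")
    return char_count / 1_000_000 * price_per_million


if __name__ == "__main__":
    # Hypothetical workload: 10,000 agent replies of 500 characters each.
    replies = 10_000
    chars_per_reply = 500
    total_chars = replies * chars_per_reply
    print(f"{total_chars:,} characters: ${estimate_tts_cost(total_chars):.2f}")
```

Under these assumptions, a single 500-character reply costs about 1.1 cents, so the price scales with how much the agent says rather than with how long a session stays open.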