Microsoft’s AI division has introduced its first internally developed artificial intelligence models, MAI-Voice-1 and MAI-1-preview. The announcement was made on 28 August, marking a significant step in the company’s strategy to build its own AI tools to support its growing product ecosystem.
Advancing AI with MAI-Voice-1
The MAI-Voice-1 model is a speech-generation tool capable of producing one minute of audio in under one second using a single GPU. Microsoft has already integrated it into several features, including Copilot Daily, a service where an AI host delivers a summary of the day’s top news stories. The model is also being used to create podcast-style conversations to explain complex topics in an accessible way.
Users can experiment with MAI-Voice-1 through Copilot Labs, where they can input custom scripts and adjust both voice and speaking style. Microsoft is positioning the model as a flexible consumer tool and a preview of where AI-powered voice services are heading.
MAI-1-preview showcases future potential
Alongside the speech model, Microsoft introduced MAI-1-preview, a text-based AI model designed to follow instructions and provide detailed, helpful answers to everyday questions. Trained on approximately 15,000 Nvidia H100 GPUs, MAI-1-preview is intended for users seeking text-based AI support.
Mustafa Suleyman, chief executive of Microsoft AI, explained the company’s vision during an appearance on the Decoder podcast last year. “My logic is that we have to create something that works extremely well for the consumer and really optimise for our use case,” Suleyman said. “So, we have vast amounts of very predictive and very useful data on the ad side, on consumer telemetry, and so on. My focus is on building models that really work for the consumer companion.”
Microsoft has begun testing MAI-1-preview on LMArena, a public AI benchmarking platform. The company also plans to incorporate the model into its Copilot AI assistant, which currently relies on OpenAI’s large language models. Initial use cases will focus on text-based interactions, with further applications expected to follow.
A step toward specialised AI ecosystems
The launch of these models signals Microsoft’s ambition to develop an ecosystem of in-house AI systems tailored for specific functions. In a blog post announcing the release, the company wrote: “We have big ambitions for where we go next. Not only will we pursue further advances here, but we believe that orchestrating a range of specialised models serving different user intents and use cases will unlock immense value.”
With MAI-Voice-1 already in use and MAI-1-preview undergoing public testing, Microsoft is laying the groundwork for AI tools built in-house and customised for specific tasks.