Microsoft has introduced AI models that it trained internally and says it will begin using them in some products. The announcement signals a possible effort to reduce its dependence on OpenAI, despite Microsoft's substantial investment in that company. It comes more than a year after insider reports revealed that Microsoft had begun work on its own foundation models.
A post on the Microsoft AI blog describes two models. MAI-Voice-1 is a natural speech-generation model meant to deliver "high-fidelity, expressive audio across both single and multi-speaker scenarios." The idea is that voice will become one of the main ways users interact with AI tools, though that hasn't really come to fruition so far.
The second model, MAI-1-preview, is a foundation large language model trained specifically to drive Copilot, Microsoft's AI chatbot. It was trained on around 15,000 Nvidia H100 GPUs and runs inference on a single GPU. As reported last year, this model is significantly larger than those in Microsoft's earlier experiments, which focused on smaller models meant to run locally, like Phi-3.
To date, Copilot has primarily relied on OpenAI's models. Microsoft has invested enormous sums in OpenAI, and the two companies are unlikely to fully part ways any time soon. That said, tensions have surfaced in recent months as their incentives and objectives have diverged.