OpenInfer has raised $8 million in funding to redefine AI inference for edge applications.
It’s the brainchild of Behnam Bastani and Reza Nourai, who spent nearly a decade building and scaling AI systems together at Meta’s Reality Labs and Roblox.
Through their work at the forefront of AI and system design, Bastani and Nourai witnessed firsthand how deep system architecture enables continuous, large-scale AI inference. However, today’s AI inference remains locked behind cloud APIs and hosted systems—a barrier for low-latency, private, and cost-efficient edge applications. OpenInfer changes that. It wants to be agnostic to the types of devices at the edge, Bastani said in an interview with GamesBeat.
By enabling the seamless execution of large AI models directly on devices—from SoCs to the cloud—OpenInfer removes these barriers, allowing AI models to run inference without compromising performance.
The implication? Imagine a world where your phone anticipates your needs in real time — translating languages instantly, enhancing photos with studio-quality precision, or powering a voice assistant that truly understands you. With AI inference running directly on your device, users can expect faster performance, greater privacy, and uninterrupted functionality no matter where they are. This shift eliminates lag and brings intelligent, high-speed computing to the palm of your hand.
Building the OpenInfer Engine: an AI agent inference engine
OpenInfer’s founders
Since founding the company six months ago, Bastani and Nourai have assembled a team of seven, including former colleagues from their time at Meta. While at Meta, they built Oculus Link together, showcasing their expertise in low-latency, high-performance system design.