Customer support and service are among the hottest sectors in voice AI right now. But building a product that sounds human and responds without noticeable delay turns out to be much harder in some markets than others — and most of the major players weren’t built with Africa and the Middle East in mind.
AethexAI, a startup founded last year to close that gap, has raised $3 million in pre-seed funding led by 4DX Ventures, with participation from Enza Capital, Dorm Room Fund, Mojo Ventures, and Stanford GSB 26 Fund. Individual investors include Stanford faculty, telecom executives, and AI researchers from Anthropic.
Rather than using existing orchestration tools like Vapi and LiveKit, the company built its own small model and orchestration layer from scratch to handle the localized dialects of English, French, and Arabic spoken across its target markets — a decision driven, as we’ll get to, by the particular demands of operating in the region.
The company is also launching its platform for enterprises to try out its tech and sign up for its services, along with APIs and SDKs for developers to experiment with its models.
The startup was founded by Mariama Diallo and Ayooluwa Odemuyiwa. CEO Diallo worked at Goldman Sachs and later joined YC-backed ModelML as a product and growth hire. CTO Odemuyiwa graduated from Caltech, worked at Meta, and enrolled at the Stanford Graduate School of Business before co-founding the company. The pair wanted to build something for emerging markets and started looking for opportunities.
Businesses around the world are racing to adopt AI tools to automate parts of their operations. But that doesn’t always work out. In Egypt, a call center automated a significant share of its calls, but rolled the system back because of poor results, the founders found. Several support centers in Africa told them that finding and hiring engineers to automate calls at the right cost was a persistent headache.
“The latency and jitter that we saw on automated calls in this region were outrageous. If we had become orchestrators, we might have had to use large models that were hosted outside the region, resulting in higher latency. We realized that in order for this to work, we have to use very small models and cut latency at every step,” Odemuyiwa told TechCrunch about the decision to build the company’s own models and orchestration layer.
AI labs that deploy their latest models usually spend millions on training them and acquiring data. AethexAI found a way around both costs. Rather than chasing the largest possible models, it decided that small models are enough to tackle the latency problem while maintaining accuracy, and developed its own Kora series, with parameter counts ranging from 300 million to 1.7 billion. That’s a fraction of the size of the largest LLMs, which is precisely the point.
To train these models, the startup used anonymized recordings from a call center partner. It also shipped hard drives to radio stations across Africa to collect more audio data. To keep costs down, it built a contributor network of university students to annotate data and pronounce local names. As a result, the startup says, it’s now handling more than 17,000 calls per day.
On the business side, the company is taking care to walk clients who are new to voice AI through the process, offering onsite demos and workshops to help them identify the best use cases for automation.