It all started with cold calling.
In our new "AI/ML Product Management" class, the "pre-case" submissions (short assignments meant to prepare students for class discussion) were looking suspiciously good. Not "strong student" good. More like "this reads like a McKinsey memo that went through three rounds of editing" good.
So we started cold calling students randomly during class.
The result was... illuminating. Many students who had submitted thoughtful, well-structured work could not explain basic choices in their own submission after two follow-up questions. Some could not participate at all. This gap was too consistent to blame on nerves or bad luck. If you cannot defend your own work live, then the written artifact is not measuring what you think it is measuring.
Brian Jabarian has been doing interesting work on this problem, and his results both inspired us and gave us the confidence to try something that would have sounded absurd two years ago: running the final exam with a Voice AI agent.
Why oral exams? And why now?
The core problem is simple: students have immediate access to LLMs that can handle most exam questions we traditionally use for assessment. The old equilibrium, where take-home work could reliably measure understanding, is dead. Gone. Kaput.
Oral exams are a natural response. They force real-time reasoning, application to novel prompts, and defense of actual decisions. The problem? Oral exams are a logistical nightmare. You cannot run them for a large class without turning the final exam period into a month-long hostage situation.
Unless you cheat.
Enter the Voice Agent