TL;DR:
My iPhone 16 Pro Max produces garbage output when running MLX LLMs. An iPhone 15 Pro runs the same code perfectly, and so does a MacBook Pro. The tensor outputs on the 16 show numerical values that are off by an order of magnitude, which I suspect points to a hardware defect in the Neural Engine or some other part of the ML hardware stack.
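For the curious, here is a minimal sketch (not my actual debugging harness) of the kind of numerical comparison that exposes the problem, assuming the logits for the same prompt have been dumped to disk from each device; the file names are hypothetical:

```python
# Hedged illustration only: compare logits dumped from two devices for the
# same prompt. File names below are placeholders, not real exports.
import mlx.core as mx

ref = mx.load("logits_iphone15.npy")   # known-good device
test = mx.load("logits_iphone16.npy")  # suspect device

# On a healthy device the relative error stays near float16 rounding noise;
# values that are an order of magnitude off push it toward ~10.
rel_err = mx.abs(test - ref) / (mx.abs(ref) + 1e-6)
print("max relative error:", mx.max(rel_err).item())
print("mean relative error:", mx.mean(rel_err).item())
```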
It was a PITA to debug, but at least I got a blog post out of it.
How did I get there?
This was supposed to be a simple project to unwind with.
For the past few months I've been working on a Clawdbot/Moltbot clone that I've been calling Schmidt. It does basically the same kind of thing, but with a custom chat UI instead of Telegram, WhatsApp, or any other "I-can't-afford-to-be-banned-from" service. That project has been consuming early mornings and late nights, so to unwind I decided it would be a good idea to do something simpler. Since I recently subscribed to MiniMax M2.1, I thought I'd do what many do and build a simple expense-tracking app to test out the model.
The core functionality is simple:
Automatically add the expense to my app upon each payment
Update an Apple Watch complication with the % of my monthly budget spent
Categorize the purchase for later analysis