Launch HN: Cactus (YC S25) – AI inference on smartphones
Cactus is an energy-efficient AI inference framework, with its own kernels, for phones and AI-native hardware. Budget and mid-range phones make up over 70% of the market, but today's frameworks optimise for high-end phones with advanced chips. Cactus is designed from the bottom up, with no dependencies, to run on all mobile devices.

Example (CPU-only):

Model: Qwen3-600m-INT8
File size: 370-420 MB
16-20 tokens/s on Pixel 6a, Galaxy S21, iPhone 11 Pro
50-70 tokens/s on Pixel 9, Galaxy S25, iPhone 16

Architecture

Cactus exposes 4 levels of abstr