Latest Tech News

Stay updated with the latest in technology, AI, cybersecurity, and more

Filtered by: cactus

Launch HN: Cactus (YC S25) – AI inference on smartphones

Energy-efficient AI inference framework and kernels for phones and AI-native hardware. Budget and mid-range phones make up over 70% of the market, but today's frameworks optimise for high-end phones with advanced chips. Cactus is designed bottom-up, with no dependencies, for all mobile devices. Example (CPU-only): Model: Qwen3-600m-INT8; file size: 370-420 MB; 16-20 t/s on Pixel 6a, Galaxy S21, iPhone 11 Pro; 50-70 t/s on Pixel 9, Galaxy S25, iPhone 16. Architecture: Cactus exposes 4 levels of abstr

Show HN: Cactus – Ollama for Smartphones

Hey HN, Henry and Roman here. We've been building a cross-platform framework for deploying LLMs, VLMs, embedding models and TTS models locally on smartphones. Ollama enables deploying LLMs locally on laptops and edge servers; Cactus enables deploying them on phones. Deploying directly on phones makes it easier to build AI apps and agents capable of phone use without breaking privacy, and it supports real-time inference with no latency. We have seen personalised RAG pipelines for users and more. Apple a