Why This Matters
Gemma Gem is an AI assistant embedded directly in the browser. It uses WebGPU to run Google's Gemma 4 model locally, with no cloud services or API keys. Because inference happens on-device, page content and prompts stay on the user's machine and responses do not depend on a network round trip. It is an example of the move toward private, locally run AI tools for consumers and developers.
Key Takeaways
- Runs entirely on-device using WebGPU, so no data leaves your machine.
- Supports interactive browsing tasks like reading pages, clicking buttons, and executing JavaScript.
- Offers two model options (E2B and E4B) and installs as a Chrome extension.
Gemma Gem
Your personal AI assistant living right inside the browser. Gemma Gem runs Google's Gemma 4 model entirely on-device via WebGPU — no API keys, no cloud, no data leaving your machine. It can read pages, click buttons, fill forms, run JavaScript, and answer questions about any site you visit.
Requirements
Chrome with WebGPU support
~500 MB of disk space for the E2B model, ~1.5 GB for E4B (cached after first run)
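The two requirements above can be checked from a page before loading a model. The following is a minimal sketch, not code from Gemma Gem: `preflight`, `NavigatorLike`, and `MODEL_BYTES` are hypothetical names, the byte sizes are the approximate figures listed above, and it assumes the cached model counts against the origin's storage quota. It relies only on the standard `navigator.gpu.requestAdapter()` and `navigator.storage.estimate()` browser APIs.

```typescript
// Minimal structural types so the sketch compiles without DOM or WebGPU typings.
interface NavigatorLike {
  gpu?: { requestAdapter(): Promise<unknown> };
  storage?: { estimate(): Promise<{ quota?: number; usage?: number }> };
}

// Approximate download sizes from the requirements (assumed, not exact).
const MODEL_BYTES = { E2B: 500e6, E4B: 1.5e9 } as const;
type ModelId = keyof typeof MODEL_BYTES;

type Preflight =
  | { ok: true }
  | { ok: false; reason: "no-webgpu" | "no-adapter" | "low-storage" };

// Hypothetical helper: reports whether the browser looks able to run a model.
async function preflight(nav: NavigatorLike, model: ModelId): Promise<Preflight> {
  // navigator.gpu is undefined in browsers without WebGPU support.
  if (!nav.gpu) return { ok: false, reason: "no-webgpu" };

  // requestAdapter() resolves to null when no suitable GPU adapter is found.
  const adapter = await nav.gpu.requestAdapter();
  if (!adapter) return { ok: false, reason: "no-adapter" };

  // Skip the disk check when the Storage API or the quota figure is unavailable.
  if (nav.storage) {
    const { quota, usage = 0 } = await nav.storage.estimate();
    if (quota !== undefined && quota - usage < MODEL_BYTES[model]) {
      return { ok: false, reason: "low-storage" };
    }
  }
  return { ok: true };
}
```

In a browser, calling `preflight(navigator, "E2B")` would run the check against the real environment. Taking the navigator as a parameter keeps the function easy to exercise with stub objects.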
Setup
pnpm install
pnpm build
Load the extension in chrome://extensions (developer mode) from .output/chrome-mv3-dev/.
Usage
1. Navigate to any page
2. Click the gem icon (bottom-right corner) to open the chat
3. Wait for the model to load (progress is shown on the icon and in the chat)
4. Ask questions about the page or request actions