Show HN: Gemma Gem – AI model embedded in a browser – no API keys, no cloud

2026-04-06


Why This Matters

Gemma Gem is a Chrome extension that embeds an AI assistant directly in the browser and uses WebGPU to run Google's Gemma 4 model locally, with no cloud service and no API key. Because inference happens on the device, page content stays on the machine and requests do not incur a network round trip. The on-device design points toward more private and more accessible AI tools for consumers and developers alike.

Key Takeaways

Runs entirely on-device using WebGPU, so no data leaves your machine.
Supports interactive browsing tasks such as reading pages, clicking buttons, and executing JavaScript.
Offers two model options (E2B and E4B) and installs as a Chrome extension.

Gemma Gem

Your personal AI assistant living right inside the browser. Gemma Gem runs Google's Gemma 4 model entirely on-device via WebGPU — no API keys, no cloud, no data leaving your machine. It can read pages, click buttons, fill forms, run JavaScript, and answer questions about any site you visit.
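As a rough illustration of how page actions like these could be exposed to a model, here is a minimal tool-dispatch sketch. It is not taken from Gemma Gem's source: the tool names (read_page, click, fill), the ToolCall shape, and the PageLike wrapper are all hypothetical, chosen only to show the pattern of turning a structured model output into a DOM action.

```typescript
// Hypothetical tool-call shapes; Gemma Gem's real tool names may differ.
type ToolCall =
  | { tool: "read_page" }
  | { tool: "click"; selector: string }
  | { tool: "fill"; selector: string; value: string };

interface ToolResult {
  ok: boolean;
  output: string;
}

// Minimal stand-ins for the DOM so the sketch also runs outside a browser.
interface ElementLike {
  click(): void;
  value: string;
}
interface PageLike {
  title: string;
  text(): string;
  query(selector: string): ElementLike | null;
}

// Execute one tool call against the page and report what happened,
// so the result can be fed back to the model as text.
function runTool(page: PageLike, call: ToolCall): ToolResult {
  if (call.tool === "read_page") {
    return { ok: true, output: `${page.title}\n${page.text()}` };
  }
  const el = page.query(call.selector);
  if (el === null) {
    return { ok: false, output: `no element matches ${call.selector}` };
  }
  if (call.tool === "click") {
    el.click();
    return { ok: true, output: `clicked ${call.selector}` };
  }
  el.value = call.value;
  return { ok: true, output: `filled ${call.selector}` };
}
```

In a content script, a PageLike could plausibly be backed by document.title, document.body.innerText, and document.querySelector (with a cast to an input element for fill). Returning a failure result instead of throwing lets the model see that a selector missed and try another one.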

Requirements

Chrome with WebGPU support

~500MB disk for E2B model, ~1.5GB for E4B (cached after first run)
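Both requirements can be checked at runtime before starting a download. The sketch below is an assumption about how one might do that, not the extension's own code: it uses the standard navigator.gpu.requestAdapter() and navigator.storage.estimate() browser APIs, while the NavigatorLike type, the checkRequirements name, and the byte figures (derived from the approximate sizes above) are illustrative.

```typescript
// Structural stand-ins for the browser APIs, so this compiles
// without DOM or WebGPU typings.
interface NavigatorLike {
  gpu?: { requestAdapter(): Promise<object | null> };
  storage?: { estimate(): Promise<{ quota?: number; usage?: number }> };
}

type ModelId = "E2B" | "E4B";

// Approximate sizes from the requirements above, in bytes (assumed values).
const MODEL_BYTES: Record<ModelId, number> = {
  E2B: 500 * 1024 * 1024,
  E4B: 1.5 * 1024 * 1024 * 1024,
};

interface RequirementReport {
  webgpu: boolean;
  // null when the browser does not report a storage estimate
  enoughDisk: boolean | null;
}

async function checkRequirements(
  nav: NavigatorLike,
  model: ModelId,
): Promise<RequirementReport> {
  // navigator.gpu is undefined without WebGPU support, and
  // requestAdapter() can still resolve to null on unsupported hardware.
  const adapter = nav.gpu ? await nav.gpu.requestAdapter() : null;

  let enoughDisk: boolean | null = null;
  if (nav.storage) {
    const { quota, usage } = await nav.storage.estimate();
    if (quota !== undefined) {
      enoughDisk = quota - (usage ?? 0) >= MODEL_BYTES[model];
    }
  }
  return { webgpu: adapter !== null, enoughDisk };
}

// In a browser: checkRequirements(navigator, "E4B").then(console.log);
```

The storage estimate is a quota for the origin rather than free disk space, so a true result here is a hint, not a guarantee.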

Setup

pnpm install
pnpm build

Load the extension in chrome://extensions (developer mode) from .output/chrome-mv3-dev/.

Usage

1. Navigate to any page
2. Click the gem icon (bottom-right corner) to open the chat
3. Wait for the model to load (progress shown on icon + chat)
4. Ask questions about the page or request actions
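Model weights typically arrive as several files, so a single load percentage has to be aggregated across them. The following is a minimal sketch of one way to do that, assuming a hypothetical per-file event with file, loaded, and total fields; the event shape and the createLoadTracker name are illustrative and not taken from the extension.

```typescript
// Hypothetical per-file progress event; the real loader's event shape may differ.
interface FileProgress {
  file: string;
  loaded: number; // bytes downloaded so far
  total: number; // expected bytes for this file
}

function createLoadTracker() {
  // Latest known progress for each file, keyed by file name.
  const files = new Map<string, FileProgress>();
  return {
    // Record the latest event for a file and return the overall percent (0-100).
    update(event: FileProgress): number {
      files.set(event.file, event);
      let loaded = 0;
      let total = 0;
      files.forEach((f) => {
        loaded += f.loaded;
        total += f.total;
      });
      return total === 0 ? 0 : Math.round((loaded / total) * 100);
    },
  };
}
```

Because the total grows as new files are announced, the percentage can briefly drop when a download starts; a UI could clamp it to the highest value seen so far if that is distracting.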

