A proxy that embeds every web page you visit and lets you run similarity searches. Each successful HTTP GET (200) response, except for localhost, is re-fetched from pure.md to obtain clean Markdown. The cleaned text is embedded through llm, and a minimal Flask UI provides search and cached-page views.

## Installation

This is not a stand-alone program; it is a plugin for llm. If you are not using llm yet, install it with pipx first:

```bash
pipx install llm
```

Now you can install this plugin:

```bash
llm install git+https://github.com/mlang/llm-embed-proxy
```

To run a local embedding model, you need to install the llm-sentence-transformers plugin and register/download a model. This step is optional if you have an OpenAI API key and want to use their embedding endpoint instead:

```bash
llm install llm-sentence-transformers
llm sentence-transformers register Qwen/Qwen3-Embedding-0.6B
```

## Running

```bash
llm embed-proxy --model sentence-transformers/Qwen/Qwen3-Embedding-0.6B
```

Point your browser or system proxy at localhost:8080 and visit http://localhost:8080/ to search.

## TLS Certificate

llm-embed-proxy uses mitmproxy under the hood. If you haven't used mitmproxy before, the first time you launch llm embed-proxy, mitmproxy will generate a CA certificate in `~/.mitmproxy/`. To avoid certificate warnings, you can add the mitmproxy CA certificate to your system trust store. Here is how it would work on a Debian system:
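The commands below are a sketch that assumes mitmproxy's default certificate path (`~/.mitmproxy/mitmproxy-ca-cert.pem`); adjust the path if your setup differs.

```bash
# Copy the mitmproxy CA certificate into the system trust store.
# update-ca-certificates only picks up files with a .crt extension.
sudo cp ~/.mitmproxy/mitmproxy-ca-cert.pem /usr/local/share/ca-certificates/mitmproxy-ca-cert.crt

# Rebuild the system certificate bundle.
sudo update-ca-certificates
```

Note that some browsers (for example Firefox) maintain their own certificate store and may need the certificate imported separately.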