60% Fable cost cut by converting code to images and having the model OCR it

pxpipe

Cut Claude Code's input tokens by rendering bulky context as images — the same system prompt, tool docs, and history, in a fraction of the tokens.

An image's token cost is fixed by its pixel dimensions, not by how much text is inside it. Dense content (code, JSON, tool output) packs ~3.1 chars per image-token vs ~1 char per text-token on real Claude Code traffic. pxpipe is a local proxy that exploits that gap: it rewrites the bulky parts of your request (system prompt, tool docs, older history) into compact PNGs before the request leaves your machine.

Savings are workload-dependent — pxpipe wins on token-dense content and leaves sparse/small requests untouched — so these are measured snapshots, not constants. The primary, durable result is input-token reduction: dense system prompts, tool docs, and history go in as compact images instead of text (the example above is ≈25k text tokens rendered as ≈2.7k image tokens), every request measured against its own count_tokens counterfactual. Dollars are downstream of that — at current Fable list prices the token cut lands as a ~59–70% lower end-to-end bill (~72–74% on compressed requests; full pricing math in the FAQ). But list prices can change tomorrow and the token count won't, so tokens — not dollars — are the number to watch. Reproduce both from ~/.pxpipe/events.jsonl .

This is what the model sees instead of text:

~48k characters of system prompt + tool docs (this repo's own README, FINDINGS, and source), ≈25k tokens as text, ≈2.7k image tokens as this page. Produced by the real transformRequest pipeline: whitespace-minified, reflowed into full rows with ↵ marking original newlines, OCR instruction banner co-rendered on top. The model reads renders like this at 100/100 on a clean eval (see benchmarks).

Demo

Fable 5 demo (the default, 100/100 reader):

Fable-AB-Demo.mp4

Both demos with both panes on Fable 5 (plain left, pxpipe right).

... continue reading