Skip to content
Tech News
← Back to articles

60% Fable cost cut by converting code to images and having the model OCR it

read original more articles
Why This Matters

This innovation significantly reduces token costs for AI models by converting dense, bulky content into images, leading to substantial cost savings for users. It enhances efficiency in handling large prompts and history, making AI interactions more affordable and scalable for both developers and consumers.

Key Takeaways

pxpipe

Cut Claude Code's input tokens by rendering bulky context as images — the same system prompt, tool docs, and history, in a fraction of the tokens.

An image's token cost is fixed by its pixel dimensions, not by how much text is inside it. Dense content (code, JSON, tool output) packs ~3.1 chars per image-token vs ~1 char per text-token on real Claude Code traffic. pxpipe is a local proxy that exploits that gap: it rewrites the bulky parts of your request (system prompt, tool docs, older history) into compact PNGs before the request leaves your machine.

Savings are workload-dependent — pxpipe wins on token-dense content and leaves sparse/small requests untouched — so these are measured snapshots, not constants. The primary, durable result is input-token reduction: dense system prompts, tool docs, and history go in as compact images instead of text (the example above is ≈25k text tokens rendered as ≈2.7k image tokens), every request measured against its own count_tokens counterfactual. Dollars are downstream of that — at current Fable list prices the token cut lands as a ~59–70% lower end-to-end bill (~72–74% on compressed requests; full pricing math in the FAQ). But list prices can change tomorrow and the token count won't, so tokens — not dollars — are the number to watch. Reproduce both from ~/.pxpipe/events.jsonl .

This is what the model sees instead of text:

~48k characters of system prompt + tool docs (this repo's own README, FINDINGS, and source), ≈25k tokens as text, ≈2.7k image tokens as this page. Produced by the real transformRequest pipeline: whitespace-minified, reflowed into full rows with ↵ marking original newlines, OCR instruction banner co-rendered on top. The model reads renders like this at 100/100 on a clean eval (see benchmarks).

Demo

Fable 5 demo (the default, 100/100 reader):

Fable-AB-Demo.mp4

Both demos with both panes on Fable 5 (plain left, pxpipe right).

... continue reading