Tech News

Universal Claude.md – cut Claude output tokens

Why This Matters

This tool cuts token usage by trimming unnecessary verbosity and formatting noise from Claude's outputs, making large-scale automation and structured tasks cheaper and more consistent. It helps most in high-volume workflows; casual or low-volume use sees little benefit.

Key Takeaways

One file. Drop it into your project. Cuts Claude output verbosity by ~63%, with no code changes required.

Note: most Claude costs come from input tokens, not output. This file targets output behavior - sycophancy, verbosity, formatting noise. It won't fix your biggest bill, but it will fix your most annoying responses.

Model support: benchmarks were run on Claude only. The rules are model-agnostic and should work on any model that reads context, but results on local models such as llama.cpp or Mistral are untested. Community results welcome.
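The article doesn't reproduce the file itself, but a minimal sketch of the kind of rules such a CLAUDE.md might contain (the specific wording below is illustrative, not the actual file) looks like this:

```markdown
# CLAUDE.md (illustrative excerpt - not the actual file)

## Response style
- No openers ("Sure!", "Great question!") or closers ("Hope this helps!").
- Answer directly; do not restate the question.
- ASCII punctuation only: no em dashes, smart quotes, or decorative Unicode.
- No unsolicited suggestions beyond what was asked.
- Prefer the simplest implementation; no unrequested abstractions.
```

Claude Code reads a CLAUDE.md at the project root into context automatically, which is why no code changes are required.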

The Problem

When you use Claude Code, every word Claude generates costs tokens. Most people never control how Claude responds - they just get whatever the model decides to output.

By default, Claude:

Opens every response with "Sure!", "Great question!", "Absolutely!"

Ends with "I hope this helps! Let me know if you need anything!"

Uses em dashes (--), smart quotes, and other Unicode characters that break parsers

Restates your question before answering it

Adds unsolicited suggestions beyond what you asked

Over-engineers code with abstractions you never requested
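To get a feel for how much of a response is filler, here is a small sketch (not from the tool itself; the patterns and the ~4-characters-per-token heuristic are illustrative assumptions) that strips the kinds of phrases listed above and estimates the tokens saved:

```python
import re

# Hypothetical filler patterns, modeled on the behaviors listed above
# (sycophantic openers and "hope this helps" closers). The real CLAUDE.md
# prevents these at generation time rather than stripping them afterward.
FILLER_PATTERNS = [
    r"^(Sure|Great question|Absolutely)[!,.]?\s*",
    r"\s*I hope this helps[!.]?\s*(Let me know if you need anything[!.]?)?\s*$",
]

def strip_filler(text: str) -> str:
    """Remove common filler phrases from a model response."""
    for pattern in FILLER_PATTERNS:
        text = re.sub(pattern, "", text, flags=re.IGNORECASE)
    return text.strip()

def rough_tokens(text: str) -> int:
    """Crude estimate: ~4 characters per token (a common rule of thumb)."""
    return max(1, len(text) // 4)

response = (
    "Sure! Here is the fix: change the timeout to 30 seconds. "
    "I hope this helps! Let me know if you need anything!"
)
trimmed = strip_filler(response)
saved = rough_tokens(response) - rough_tokens(trimmed)
```

Even on a two-sentence reply, the filler accounts for a meaningful share of the output tokens; at automation scale that overhead compounds on every call.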
