LLMs behave like black boxes. You send them a request, hope the prompt is right, hope your agent didn't mutate it, hope the framework packaged it correctly — and then hope the response makes sense. In simple one-shot queries this usually works fine. But when you're building agents, tools, multi-step workflows, or RAG pipelines, it becomes very hard to see what the model is actually receiving. A single unexpected message, parameter, or system prompt change can shift the entire run.
Today we're introducing Debug Mode for LLM requests in vLLora, a feature that makes all of this visible and editable.
Here's what debugging looks like in practice. With Debug Mode enabled, vLLora sets a breakpoint on every outgoing LLM request, pausing it before it reaches the model.
You can:

- Inspect the exact request
- Edit anything
- Continue execution normally
This brings a familiar software-engineering workflow (pause -> inspect -> edit -> continue) to LLM development.
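To make that loop concrete, here is a minimal sketch of what a request breakpoint does conceptually. This is not vLLora's actual API: `debug_breakpoint`, `send_with_debug`, and the stand-in `send` callable are hypothetical names used only to illustrate the pause -> inspect -> edit -> continue cycle.

```python
# Illustrative sketch only -- not vLLora's API. Models a breakpoint that
# intercepts every outgoing LLM request before it is forwarded to the model.
import copy
import json
from typing import Any, Callable, Dict


def debug_breakpoint(request: Dict[str, Any]) -> Dict[str, Any]:
    """Pause on an outgoing LLM request before it reaches the model."""
    print("=== paused before model call ===")
    print(json.dumps(request, indent=2))   # inspect the exact request
    edited = input("Paste edited JSON, or press Enter to continue: ").strip()
    if edited:
        request = json.loads(edited)       # edit anything
    return request                         # continue execution


def send_with_debug(request: Dict[str, Any],
                    send: Callable[[Dict[str, Any]], Any],
                    debug_mode: bool = True) -> Any:
    """Wrap a model call so every request hits a breakpoint first."""
    if debug_mode:
        request = debug_breakpoint(copy.deepcopy(request))
    return send(request)


# Stand-in for a real model call, so the sketch runs on its own.
response = send_with_debug(
    {"model": "gpt-4o",
     "messages": [{"role": "user", "content": "Summarize this ticket."}],
     "temperature": 0.2},
    send=lambda req: {"echoed_request": req},
)
```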
If you've built anything beyond a simple chat interface, you've likely hit one of these: