Lightweight CLI, API and ChatGPT-like alternative to Open WebUI for accessing multiple LLMs, entirely offline, with all data kept private in browser storage.
Configure additional providers and models in llms.json
Mix and match local models with models from different API providers
Requests are automatically routed to the available providers that support the requested model (in the order defined)
Define free/cheapest/local providers first to save on costs
Any failures are automatically retried on the next available provider
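The ordering and fallback behavior above is driven by how providers are listed in llms.json. A minimal sketch of what such a config might look like, with a free local Ollama provider defined before a paid API provider (the field names and structure here are illustrative assumptions, not the project's actual schema):

```json
{
  "providers": {
    "ollama": {
      "enabled": true,
      "base_url": "http://localhost:11434",
      "models": ["llama3.3:70b"]
    },
    "openrouter": {
      "enabled": true,
      "api_key": "$OPENROUTER_API_KEY",
      "models": ["llama3.3:70b"]
    }
  }
}
```

With a layout like this, a request for `llama3.3:70b` would try the local Ollama instance first and only fall back to OpenRouter if that fails.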
Features
Lightweight: Single llms.py Python file with single aiohttp dependency (Pillow optional)
Multi-Provider Support: OpenRouter, Ollama, Anthropic, Google, OpenAI, Grok, Groq, Qwen, Z.ai, Mistral
OpenAI-Compatible API: Works with any client that supports OpenAI's chat completion API
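Because the server speaks OpenAI's chat completion protocol, a request can be built with nothing but the Python standard library. A minimal sketch that constructs (but does not send) such a request; the port, path, and model name are assumptions for illustration, not taken from the project's docs:

```python
import json
import urllib.request

# An OpenAI-style chat completion payload. Any OpenAI-compatible client
# can produce an equivalent request body. Model name is illustrative.
payload = {
    "model": "llama3.3:70b",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
}

def build_request(url: str) -> urllib.request.Request:
    """Build the HTTP request without sending it (no server needed here)."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Endpoint URL is a guess at a typical local setup; check the actual docs.
req = build_request("http://localhost:8000/v1/chat/completions")
print(req.get_method(), req.full_url)
```

Sending `req` with `urllib.request.urlopen(req)` against a running instance would return the familiar OpenAI-style JSON response, so existing OpenAI SDK code can also be pointed at the local base URL unchanged.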