The rapid deployment of large language models (LLMs) has introduced significant security vulnerabilities due to misconfigurations and inadequate access controls. This paper presents a systematic approach to identifying publicly exposed LLM servers, focusing on instances running the Ollama framework. Utilizing Shodan, a search engine for internet-connected devices, we developed a Python-based tool to detect unsecured LLM endpoints. Our study uncovered over 1,100 exposed Ollama servers, with approximately 20% actively hosting models susceptible to unauthorized access. These findings highlight the urgent need for security baselines in LLM deployments and provide a practical foundation for future research into LLM threat surface monitoring.
Introduction
The integration of large language models (LLMs) into diverse applications has surged in recent years, driven by their advanced capabilities in natural language understanding and generation. Widely adopted platforms such as ChatGPT, Grok, and DeepSeek have contributed to the mainstream visibility of LLMs, while open-source frameworks like Ollama and Hugging Face have significantly lowered the barrier to entry for deploying these models in custom environments. This has led to widespread adoption by both organizations and individuals for a broad range of tasks, including content generation, customer support, data analysis, and software development.
Despite their growing utility, the pace of LLM adoption has often outstripped the development and implementation of appropriate security practices. Many self-hosted or locally deployed LLM solutions are brought online without adequate hardening, frequently exposing endpoints due to default configurations, weak or absent authentication, and insufficient network isolation. These vulnerabilities are not only a byproduct of poor deployment hygiene but are also symptomatic of an ecosystem that has largely prioritized accessibility and performance over security. As a result, improperly secured LLM instances present an expanding attack surface, opening the door to risks such as:
Unauthorized API Access: Many exposed ML servers operate without authentication, allowing anyone to submit queries (see the sketch after this list).
Model Extraction Attacks: Attackers can attempt to reconstruct model parameters or behavior by repeatedly querying an exposed ML server.
Jailbreaking and Content Abuse: LLMs like GPT-4, LLaMA, and Mistral can be manipulated into generating restricted content, including misinformation, malware code, or other harmful outputs.
Resource Hijacking (ML DoS Attacks): Openly accessible models can be exploited for free computation, leading to degraded service and excessive costs for the host.
Backdoor Injection and Model Poisoning: Adversaries could exploit unsecured model endpoints to introduce malicious payloads or load untrusted models remotely.
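To make the first of these risks concrete, the sketch below shows how an unauthenticated Ollama endpoint can be enumerated and queried through its standard REST API. The host address and prompt are placeholders; the /api/tags and /api/generate routes and the default port 11434 are part of Ollama's documented API, but the surrounding logic is an illustrative sketch rather than the measurement tool described in this paper.

import requests

# Placeholder address; any publicly reachable, unauthenticated Ollama
# instance on the default port 11434 responds to the same requests.
HOST = "http://203.0.113.10:11434"

# 1. Enumerate the models hosted on the server (no credentials required).
tags = requests.get(f"{HOST}/api/tags", timeout=10).json()
models = [m["name"] for m in tags.get("models", [])]
print("Exposed models:", models)

# 2. Submit an arbitrary prompt to the first exposed model.
if models:
    resp = requests.post(
        f"{HOST}/api/generate",
        json={"model": models[0], "prompt": "Hello?", "stream": False},
        timeout=60,
    )
    print(resp.json().get("response", ""))

An endpoint that answers both requests without credentials exhibits every risk listed above, from free use of the host's compute to arbitrary prompt injection.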
This work investigates the prevalence and security posture of publicly accessible LLM servers, with a focus on instances utilizing the Ollama framework, which has gained popularity for its ease of use and local deployment capabilities. While Ollama enables flexible experimentation and local model execution, its deployment defaults and documentation do not explicitly emphasize security best practices, making it a compelling target for analysis.
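As a preview of the measurement approach, the following sketch uses the official shodan Python library to search for hosts that expose Ollama's banner and to record candidate endpoints for later probing. The query string, key handling, and output format are illustrative assumptions; the actual tool developed for this study may filter and verify results differently.

import shodan

API_KEY = "YOUR_SHODAN_API_KEY"  # placeholder; requires a valid Shodan account
# Assumed query: Ollama's root endpoint returns the banner "Ollama is running"
# and listens on port 11434 by default.
QUERY = '"Ollama is running" port:11434'

api = shodan.Shodan(API_KEY)
try:
    results = api.search(QUERY)
    print(f"Total candidate hosts: {results['total']}")
    for match in results["matches"]:
        # ip_str and port identify each candidate endpoint for follow-up probing.
        print(f"{match['ip_str']}:{match['port']}")
except shodan.APIError as exc:
    print(f"Shodan query failed: {exc}")

Candidates returned by the search are then verified by probing the Ollama API directly, as in the earlier sketch, to distinguish reachable servers from those actively hosting models.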