
AI Models Need a Virtual Machine

Applications using AI embed the AI model in a framework that interfaces between the model and the rest of the system, providing needed services such as tool calling and context retrieval. Software for early chatbots took user input, called the LLM, and returned the result to the user; essentially just a read-eval-print loop. But as the capabilities of LLMs have evolved and extension mechanisms such as MCP have been defined, the control software that calls the LLM has grown more complex. AI software systems require the same qualities that an operating system provides, including security, isolation, extensibility, and portability. For example, when an AI model needs to be given a file as part of its context, access control must determine whether the model should be allowed to view that file. We believe it is time to consider standardizing the ways in which AI models are embedded into software and to think of that control software layer as a virtual machine, where one of the machine instructions, albeit a super-powerful one, is to call the LLM.
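As a concrete illustration of that file-access check, here is a minimal sketch. The function name, the policy representation, and the context format are our own illustrative assumptions, not an interface from the article: before a file's contents enter the model's context, the control layer consults an access-control policy for that file.

```python
# Hypothetical sketch of the file-access check described above; the function
# name and the policy representation are illustrative, not from the article.
from pathlib import Path


def add_file_to_context(context: list[str], path: Path, allowed: set[Path]) -> None:
    """Append a file's contents to the model's context only if policy permits it.

    `allowed` is assumed to hold resolved, absolute paths the model may read.
    """
    if path.resolve() not in allowed:
        raise PermissionError(f"model is not permitted to read {path}")
    context.append(path.read_text())
```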

Our approach decouples model development from integration logic, allowing any model to “plug in” to a rich software ecosystem that includes tools, security controls, memory abstractions, and more. Much as the Java Virtual Machine did for Java programs, a specification of a VM for the AI orchestrator could enable a “write once, run anywhere” execution environment for AI models while also providing the familiar constraints and governance needed to maintain security and privacy in existing software systems. Below we outline related work in this direction, the motivation behind it, and the key benefits of an AI Model VM.

Introduction

AI models are being leveraged in existing software as application copilots and as assistants embedded in IDEs, and, with the rise of MCP, they are increasingly able to use tools, implement agents, and more. This rapid evolution of valuable use cases brings with it a greater need to ensure that AI-powered applications maintain privacy, are secure, and operate correctly. Guarantees of security and privacy are best provided when the underlying system is secure by design, not when security is added to the system as an afterthought. We take the Java Virtual Machine (JVM) as our inspiration in making the case for the importance of a standard AI Virtual Machine. The JVM guarantees memory safety by design, defines access control policies, and prevents code injection with bytecode verification. These properties allow Java programs running on the JVM to be executed with trust despite being shipped remotely, enabling “write once, run anywhere” software distribution.

How does the JVM relate to applications that use AI models? We use the following example to explain the analogy:

The diagram illustrates the role of the software layer that interacts with an AI model, which we call the Model Virtual Machine (MVM). That layer mediates between the model and the rest of the world. For example, a chatbot user might type a prompt (1) that the MVM then sends unmodified to the AI model (2). In practice, the MVM also adds further context, such as the system prompt and chat history, to the model’s input. The AI model generates a response, which in this example requires a specific tool to be called (3). The response follows a format mutually agreed upon between the model and the MVM, such as MCP. In our example, because it is important to restrict the model from making undesired tool calls, the MVM first consults the list of allowed tools (4) before deciding to call the tool the model requested (5). This check (4) guarantees that the model does not make unauthorized tool calls. Every commercial system using AI models requires some version of this control software.
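To make that control flow concrete, here is a minimal sketch of such an MVM loop. Every name in it, including the MVM class, the ToolCall type, the message format, and the injected model callable, is our illustrative assumption rather than a published interface; it only mirrors the numbered steps above.

```python
# A minimal sketch of the MVM loop described above. All names and the message
# format are illustrative assumptions, not a standard or a published API.
from dataclasses import dataclass, field
from typing import Any, Callable, Union


@dataclass
class ToolCall:
    name: str
    arguments: dict


@dataclass
class MVM:
    model: Callable[[list[dict]], Union[str, ToolCall]]  # the "call the LLM" instruction
    system_prompt: str
    allowed_tools: dict[str, Callable[..., Any]]          # tool registry (the step-4 allowlist)
    chat_history: list = field(default_factory=list)

    def handle_prompt(self, user_prompt: str) -> str:
        # Steps (1)-(2): forward the prompt together with system prompt and chat history.
        context = [{"role": "system", "content": self.system_prompt},
                   *self.chat_history,
                   {"role": "user", "content": user_prompt}]
        response = self.model(context)                     # step (3): model may request a tool

        if isinstance(response, ToolCall):
            # Step (4): consult the allowlist before honoring the tool-call request.
            tool = self.allowed_tools.get(response.name)
            if tool is None:
                return f"denied: tool '{response.name}' is not authorized"
            result = tool(**response.arguments)            # step (5): call the approved tool
            context.append({"role": "tool", "content": str(result)})
            response = self.model(context)                 # let the model use the tool result

        self.chat_history += [{"role": "user", "content": user_prompt},
                              {"role": "assistant", "content": response}]
        return response
```

With a stub model callable that returns a ToolCall, handle_prompt exercises the allowlist check end to end; a production orchestrator would of course add structured-output parsing (e.g., MCP), retries, and auditing on top of this skeleton.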

We make the analogy that the interface with the LLM should be a virtual machine. If that is the case, what are the instructions that this machine can execute? Here are examples of operations that existing AI model interfaces support (a sketch of such an interface follows the list):

Certifying, loading, initializing, and unloading a given AI model

Calling a model with context

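Spelling those operations out as an interface makes the analogy tangible. The signatures below are purely illustrative, a guess at how such an MVM instruction set could be expressed, not a proposed standard; only the operation names come from the list above.

```python
# Illustrative interface for the MVM "instruction set" listed above;
# method names and signatures are assumptions, not a proposed standard.
from abc import ABC, abstractmethod
from typing import Any


class ModelVirtualMachine(ABC):
    @abstractmethod
    def certify(self, model_artifact: bytes) -> bool:
        """Verify a model artifact (e.g., signature or provenance) before it may be used."""

    @abstractmethod
    def load(self, model_artifact: bytes) -> str:
        """Load a certified model and return a handle for subsequent instructions."""

    @abstractmethod
    def initialize(self, model_handle: str, config: dict[str, Any]) -> None:
        """Initialize the loaded model (system prompt, sampling parameters, etc.)."""

    @abstractmethod
    def call(self, model_handle: str, context: list[dict[str, Any]]) -> Any:
        """Call the model with assembled context; the super-powerful instruction."""

    @abstractmethod
    def unload(self, model_handle: str) -> None:
        """Release the model and any resources it holds."""
```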