Operational transformation (OT) is a technology for supporting a range of collaboration functionalities in advanced collaborative software systems. Hero illustration by Rose Wong. When we first started building multiplayer functionality Multiplayer Editing in Figma Today’s public release of Figma contains two long-awaited changes. in Figma four years ago, we decided to develop our own solution. No other design tool offered this feature, and we didn’t want to use operational transforms (a.k.a. OTs), the standard multiplayer algorithm popularized by apps like Google Docs. As a startup we value the ability to ship features quickly, and OTs were unnecessarily complex for our problem space. So we built a custom multiplayer system that's simpler and easier to implement. If you want to know more about where we’re headed next, check out "Beyond multiplayer: Building community together in Figma," where we share some exciting product announcements. At the time, we weren’t sure building this feature was the right product decision. No one was clamoring for a multiplayer design tool—if anything, people hated the idea. Designers worried that live collaborative editing would result in “hovering art directors” and “design by committee” catastrophes. But ultimately, we had to do it because it just felt wrong not to offer multiplayer as a tool on the web. It eliminates the need to export, sync, or email copies of files and allows more people to take part in the design process (like copy-writers and developers). Just by having the right link, everyone can view the current status of a design project without interrupting the person doing the work. Our bet paid off, and these days it’s obvious that multiplayer is the way all productivity tools on the web should work, not just design. But while we use products with live collaborative editing every day, there aren’t that many public case studies on these production systems. We decided it was time to share a peek into how we did it at Figma, in the hopes of helping others. It should be a fun read for those who like seeing how computer science theory is applied in practice. We’ll cover a lot but each section builds upon the previous ones. By the end, you should hopefully have an understanding of the entire system. Before talking about our multiplayer protocol, it's useful to have some context about how our multiplayer system is set up. We use a client/server architecture where Figma clients are web pages that talk with a cluster of servers over WebSockets. Our servers currently spin up a separate process for each multiplayer document which everyone editing that document connects to. If you’re interested in learning more, this article Rust in production at Figma How Mozilla’s new language dramatically improved our server-side performance talks about how we scale our production multiplayer servers. When a document is opened, the client starts by downloading a copy of the file. From that point on, updates to that document in both directions are synced over the WebSocket connection. Figma lets you go offline for an arbitrary amount of time and continue editing. When you come back online, the client downloads a fresh copy of the document, reapplies any offline edits on top of this latest state, and then continues syncing updates over a new WebSocket connection. This means that connecting and reconnecting are very simple and all of the complexity with multiplayer (which is what this blog post is about) is in dealing with updates to already connected documents. It’s worth noting that we only use multiplayer for syncing changes to Figma documents. We also sync changes to a lot of other data (comments, users, teams, projects, etc.) but that is stored in Postgres, not our multiplayer system, and is synced with clients using a completely separate system that won’t be discussed in this article. Although these two systems are similar, they have separate implementations because of different tradeoffs around certain properties such as performance, offline availability, and security. We didn't start with this setup though. When making a change this fundamental, it's important to be able to iterate quickly and experiment before committing to an approach. That's why we first created a prototype environment to test our ideas instead of working in the real codebase. This playground was a web page that simulated three clients connecting to a server and visualized the whole state of the system. It let us easily set up different scenarios around offline clients and bandwidth limited connections. Once we figured out how we wanted to build our multiplayer system, it was straightforward to graft the ideas from our prototype onto our existing codebase. We used this prototype to quickly research and evaluate different collaborative algorithms and data structures. A screenshot of our internal prototype Douglas Engelbart practicing for "The Mother of All Demos" Multiplayer technology has a rich history and has arguably been around at least since Douglas Engelbart's demo in 1968. Before we dive in too deep into how our own multiplayer system works, it’s worth a quick overview on the traditional approaches that informed ours: OTs and CRDTs. Critique of OT While the classic OT approach of defining operations through their offsets in the text seems to be simple and natural, real-world distributed systems raise serious issues. Namely, that operations propagate with finite speed, states of participants are often different, thus the resulting combinations of states and operations are extremely hard to foresee and understand. As Li and Li put it, "Due to the need to consider complicated case coverage, formal proofs are very complicated and error-prone, even for OT algorithms that only treat two characterwise primitives (insert and delete)." As mentioned earlier, OTs power most collaborative text-based apps such as Google Docs. They’re the most well-known technique but in researching them, we quickly realized they were overkill for what we wanted to achieve. They’re a great way of editing long text documents with low memory and performance overhead, but they are very complicated and hard to implement correctly. They result in a combinatorial explosion of possible states which is very difficult to reason about. Our primary goal when designing our multiplayer system was for it to be no more complex than necessary to get the job done. A simpler system is easier to reason about which then makes it easier to implement, debug, test, and maintain. Since Figma isn't a text editor, we didn't need the power of OTs and could get away with something less complicated. Conflict-free replicated data type In distributed computing, a conflict-free replicated data type (CRDT) is a data structure that is replicated across multiple computers in a network, with the following features: The application can update any replica independently, concurrently and without coordinating with other replicas. An algorithm (itself part of the data type) automatically resolves any inconsistencies that might occur. Although replicas may have different state at any particular point in time, they are guaranteed to eventually converge. Figma's tech is instead inspired by something called CRDTs, which stands for conflict-free replicated data types. CRDTs refer to a collection of different data structures commonly used in distributed systems. All CRDTs satisfy certain mathematical properties which guarantee eventual consistency. If no more updates are made, eventually everyone accessing the data structure will see the same thing. This constraint is required for correctness; we cannot allow two clients editing the same Figma document to diverge and never converge again. There are many types of CRDTs. See this list for a good overview. Some examples: Grow-only set: This is a set of elements. The only type of update is to add something to the set. Adding something twice is a no-op, so you can determine the contents of the set by just applying all of the updates in any order. This is a set of elements. The only type of update is to add something to the set. Adding something twice is a no-op, so you can determine the contents of the set by just applying all of the updates in any order. Last-writer-wins register: This is a container for a single value. Updates can be implemented as a new value, a timestamp, and a peer ID. You can determine the value of the register by just taking the value of the latest update (using the peer ID to break a tie). Figma isn't using true CRDTs though. CRDTs are designed for decentralized systems where there is no single central authority to decide what the final state should be. There is some unavoidable performance and memory overhead with doing this. Since Figma is centralized (our server is the central authority), we can simplify our system by removing this extra overhead and benefit from a faster and leaner implementation. It’s also worth noting that Figma's data structure isn't a single CRDT. Instead it's inspired by multiple separate CRDTs and uses them in combination to create the final data structure that represents a Figma document (described below). Even if you have a client-server setup, CRDTs are still worth researching because they provide a well-studied, solid foundation to start with. Understanding them helps build intuition on how to create a correct system. From that point, it's possible to relax some of the requirements of CRDTs based on the needs of the application as we have done. Ok, so we want to sync updates to Figma documents using CRDTs. What does the structure of a Figma document even look like? HTML DOM The HTML Document Object Model (DOM) is a standard object model and programming interface for HTML. It defines: The HTML elements as objects, the properties of all HTML elements, the methods to access all HTML elements, and the events for all HTML elements. Every Figma document is a tree of objects, similar to the HTML DOM. There is a single root object that represents the entire document. Underneath the root object are page objects, and underneath each page object is a hierarchy of objects representing the contents of the page. This tree is is presented in the layers panel on the left-hand side of the Figma editor. Each object has an ID and a collection of properties with values. One way to think about this is by picturing the document as a two-level map: Map> . Another way to think about this is a database with rows that store (ObjectID, Property, Value) tuples. This means that adding new features to Figma usually just means adding new properties to objects. For the rest of this post, we will talk about the details of Figma's multiplayer algorithm and how we solved some of the edge cases we encountered.