← Back to the algorithm list Published on May 19th, 2024
Collaboratively editing strings of text is a common desire in peer-to-peer applications. For example, a note-taking app might represent each document as a single collaboratively-edited string of text.
The algorithm presented here is one way to do this. It comes from a family of algorithms called CRDTs, which I will not describe here. It's similar to the approaches taken by popular collaborative text editing libraries such as Yjs and Automerge. Other articles have already been written about these similar approaches (see the references section below), but this article also has a nice interactive visualization of what goes on under the hood.
The algorithm:
Each character is assigned a unique identifier consisting of site (the identifier of the creator) and clock (a site-specific integer that is incremented after every operation) as well as a (possibly null) parent pointer to a previous character.
To insert a character, set its parent pointer to the character immediately before the insertion point at the time of insertion (or to null when inserting at the beginning). The character order is determined by a pre-order tree traversal that places parents before their children. This is tree-based indexing.
To order characters with the same parent, have each character also store a counter value and sort first by counter (use descending order), then by the site of the character (use arbitrary but consistent order). When inserting before a character with the same parent, use its counter incremented by 1 to be ordered before it. This order is unique because doing this means the same site will never reuse the same counter in the same spot.
To delete a character, put that character's identifier in a deleted set. Note that this means deleted characters persist forever, which is known as a "tombstone" in CRDT literature. Deleted character values can be forgotten but the positions must be remembered to be able to correctly order incoming changes that use one of the now-deleted characters as a parent.
This is not as expensive as it sounds because of three important optimizations:
Successive inserts from the same site can all be merged into a single block in memory, so for example pasting a large chunk of text uses the same amount of metadata as inserting a single character. This works because character identifiers are carefully designed to be able to be implicit in a contiguous run of text (each character will the same site and counter and have a clock that's 1 more than the previous character's clock ).
... continue reading