Tree-sitter vs. Language Servers

21 Jan 2026

I got asked a good question today: what is the difference between Tree-sitter and a language server? I don’t understand how either of these tools work in depth, so I’m just going to explain from an observable, pragmatic point of view.

Tree-sitter is a parser generator. What this means is that you can hand Tree-sitter a description for a programming language and it will create a program that will parse that language for you. What’s special about Tree-sitter is that it is a.) fast, and b.) can tolerate syntax errors in the input. These two properties make Tree-sitter ideal for creating syntax highlighting engines in text editors. When you’re editing a program, most of the time the program will be in a syntactically invalid state. During that time, you don’t want your colors changing or just outright breaking while you’re typing. Naïve regex-based syntax highlighters frequently suffer from this issue.

Tree-sitter also provides a query language where you can make queries against the parse tree. I use this in the Emacs package I’m trying to develop to add Typst support to the Citar citation/bibliography tool: I can ask Tree-sitter to find a particular syntax object; it is safer and more robust than using a regular expression because it can do similar parsing to the Typst engine itself.

In short, Tree-sitter provides syntax highlighting that is faithful to how the language implementation parses the program, instead of relying on regular expressions that incidentally come close.

Language server #

A language server is a program that can analyze a program and report interesting information about that program to a text editor. A standard, called the Language Server Protocol (LSP), defines the kinds of JSON messages that pass between a text editor and the server. The protocol is an open standard; any language and any text editor can take advantage of the protocol to get nice smart programming helps in their system. Language servers can provide information like locating the definition of a symbol, possible completions at the cursor point, etc. to a text editor which can then decide how and when to display or use this information.

Language servers solve the “ \(N \times M\) problem” where \(N\) programming languages and \(M\) text editors would mean there have to be \(N \times M\) implementations for language analyzers. Now, every language just needs a language server, and every editor needs to be able to speak the LSP protocol.

Language servers are powerful because they can hook into the language’s runtime and compiler toolchain to get semantically correct answers to user queries. For example, suppose you have two versions of a pop function, one imported from a stack library, and another from a heap library. If you use a tool like the dumb-jump package in Emacs I just want to say that I think dumb-jump is very cool and I am not trying to knock it down at all. It’s honest about its limitations and can be handy when you do not have a language server available. and you use it to jump to the definition for a call to pop , it might get confused as to where to go because it’s not sure what module is in scope at the point. A language server, on the other hand, should have access to this information and would not get confused.

Using a language server for highlighting #

... continue reading