Arch shares its wiki strategy with Debian [LWN subscriber-only content]
The Arch Linux project is especially well-known in the Linux community for two things: its rolling-release model and the quality of the documentation in the ArchWiki. No matter which Linux distribution one uses, the odds are that eventually the ArchWiki's documentation will prove useful. The Debian project recognized this and has sought to improve its own documentation game by inviting ArchWiki maintainers Jakub Klinkovský and Vladimir Lavallade to DebConf25 in Brest, France, to speak about how Arch manages its wiki. The talk has already borne fruit with the launch of an effort to revamp the Debian wiki.
Klinkovský and Lavallade were introduced by Debian developer Thomas Lange, who said that he had the idea to invite the pair to DebConf. Klinkovský said that he had been a maintainer of the wiki since about 2014, and that he is also a package maintainer for Arch Linux. He added that he contributes to many other projects " wherever I can ". For his part, Lavallade said that he has contributed to the wiki since 2021, but he had only recently joined the maintenance team: " I know just enough to be dangerous ."
Lavallade said that the talk was a good opportunity to cross-pollinate with another distribution, and to do some self-reflection on how the wiki team operates. They would explain how the wiki is run using the SWOT analysis format, with a focus on the content and how the maintenance team keeps the quality of pages as high as it can. "SWOT", for those who have been fortunate enough not to have encountered the acronym through corporate meetings, is short for "strengths, weaknesses, opportunities, and threats". SWOT analysis is generally used for decision-making processes to help analyze the current state and identify what an organization needs to improve.
ArchWiki:About
The ArchWiki was established in 2004; the project originally used PhpWiki as its backend—but Klinkovský said that it was quickly migrated to MediaWiki, which is still in use today. The wiki maintenance and translation teams were established " about 2010 ". The maintenance team is responsible for the contribution guidelines, style conventions, organization, and anything else that contributors need to know.
Today, the wiki has more than 4,000 topic pages; it has close to 30,000 pages if one counts talk pages, redirects, and help pages. " We are still quite a small wiki compared to Wikipedia ", Klinkovský said.
He displayed a slide, part of which is shown below, with graphs showing the number of edits and active users per month. The full set of slides is available online as well.
Since 2006, the wiki has had more than 840,000 edits by more than 86,000 editors; the project is averaging more than 2,000 edits by about 300 active contributors each month. Klinkovský noted that this " used to be quite a larger number ".
Strengths
Lavallade had a short list of the " best user-facing qualities " of the ArchWiki, which are the project's strengths. The first was " comprehensive content and a very large coverage of various topics ". He said this included not just how to run Arch Linux, but how to run important software on the distribution.
The next was having high-quality and up-to-date content. Given that Arch is a rolling-release distribution, he said, every page has to be updated to reflect the latest package provided with the distribution. That is only possible thanks to " a very involved community "; he noted that most of the edits on the ArchWiki were made by contributors outside the maintenance team.
All of that brought him to the last strength he wanted to discuss: its reach beyond the Arch community. He pulled up a slide that included a quote from Edward Snowden, which said:
Is it just me, or have search results become absolute garbage for basically every site? It's nearly impossible to discover useful information these days (outside the ArchWiki).
Contribution and content guidelines
The contribution guidelines and processes have a lot to do with the quality of the content on the wiki. Contributors, he said, have to follow three fundamental rules. The first is that they must use the edit summary to explain what has been done and why. The second rule is that contributors should not make complex edits all at once. As much as possible, Lavallade said, contributors should do " some kind of atomic editing " where each change is independent of the other ones. He did not go into specifics on this during the talk, but the guidelines have examples of best practices. The third rule is that major changes or rewrites should be announced on a topic's talk page to give others who are watching the topic a chance to weigh in.
The team also has three major content guidelines that Lavallade highlighted. One that is likely familiar to anyone contributing to technical documentation is the don't repeat yourself (DRY) principle. A topic should only exist in one place, rather than being repeated on multiple pages. He also said that the ArchWiki employed a " simple, but not stupid " approach to the documentation. This means that the documentation should be simple to read and maintain, but not offer too much hand-holding. Users also need to be willing to learn; they may need to read through more than one page to find the information they need to do something.
The final guideline is that everything is Arch-centric. Content on the site may be useful for users running different Linux distributions, and contributions are welcome that may apply to other distributions, but " something that will not work on Arch as-is is not something we will be hosting on our site ". That, he said, allowed the maintenance team to be focused on the content Arch provides and helps to keep maintenance more manageable.
Maintenance
Speaking of maintenance, Klinkovský said, the project has tools and templates to help make life easier for contributors. A reviewer might apply an accuracy template, for instance, which will add it to a page that lists all content that has been flagged as possibly inaccurate. The templates are usually used and acted on by people, but the project also has bots that can add some templates (such as dead link) and even fix some problems.
The review process is an important part of maintenance, he said. Everyone can participate in review, not just the maintainers of the wiki. He explained that it was not possible for the maintenance team to review everything, so much of the review is done by people interested in specific topics who watch pages to see when changes were made. If people spot errors, they are empowered to fix them on their own, or to use the templates to flag them for others to address. Maintainers are there, he said, " to make some authoritative decisions when needed, and mediate disputes if they came up ".
Klinkovský referred to watching and reviewing content on the wiki as " patrolling ", and said there were some basic rules that should be followed, starting with " assume good faith ". Most people do something because they think it is right; the maintainers rarely see outright vandalism on the wiki.
The second rule, he said, is " when in doubt, discuss changes with others before making a hasty decision ". If a change must be reverted, then a reviewer should always explain why it was reverted. This gives the original contributor a chance to come back and fix the problem or address it in a different way. Lastly, Klinkovský said, they wanted to avoid edit wars: " the worst thing that can happen on a wiki is a few people just reverting their changes after each other ".
Preventing edit wars and encouraging contributions was, Lavallade said, part of the broader topic of community management. The team tries to encourage contributors to not only make one change, but to learn the guidelines and keep contributing—and then help teach others the guidelines.
Arch has support forums, such as IRC, and when people ask for help there they are pointed to the wiki " because there is always the solution on the ArchWiki ". In the rare event that the wiki does not have the solution, he said, " we gently point them to where the page with the content needs to be " and invite the user to add it even if it's not perfect the first time. That helps to reinforce the idea that the wiki is a collaborative work that everyone should feel welcome to add to.
Weaknesses
Lavallade said that the contribution model also illustrated one of ArchWiki's weaknesses: there is a lot to learn about contributing to the wiki, and newcomers can get tripped up. For example, he said that the DRY principle was difficult for new contributors. Or a newcomer might add a large chunk of content into a page that should be broken up into several pages.
The MediaWiki markup language is another hurdle for new contributors. He called the markup " antiquated ", and acknowledged that the style conventions for the ArchWiki are not obvious either. It can take a lot of reading, cross-referencing, and back-and-forth discussions for a new contributor to make a content contribution correctly.
MediaWiki has a lot of strengths, Klinkovský said; it is battle-proven software, it is the de facto standard platform for wikis, and it has a nice API that can be used for integration with external applications such as bots. But MediaWiki is a weakness as well, he said. The platform is primarily developed for Wikipedia, and its developers are from Wikipedia. " Sometimes their decisions don't suit us ", he said, and there was little way to make things exactly as the ArchWiki maintenance team might want.
The primary weakness, though, was that its markup language is " very weird and hard to understand both for humans and machines ". In 2025, most people know and write Markdown daily, but MediaWiki markup is different. It is weird and fragile; changing a single token can completely break a page. It is also, he said, difficult to write a proper or robust parser for the language. This is particularly true because there is no formal specification of the language, just the reference implementation in the form of MediaWiki. That can change at any time: " so even if you had a perfect parser one day, it might not work the same or perfectly the next day ".
Since ArchWiki is developed by volunteer contributors, its content is essentially driven by popularity; people generally only edit the content that they have an interest in. Klinkovský said that this was not a weakness, necessarily, but it was related to some weaknesses. For example, some pages were edited frequently while others were not touched for years due to lack of interest. To a reader, it is not obvious whether page content is stale or recently updated.
There is also no perfect way to ensure that content makes its way to the wiki. He noted that people might solve their problem in a discussion on Arch's forums, but that the solution might never end up on the wiki.
Opportunities and threats
Klinkovský said that they had also identified several areas of opportunity—such as community involvement and support tools for editors—where the ArchWiki's work could be improved.
Lavallade said that one example of community involvement would be to work with derivatives from Arch Linux, such as SteamOS or Arch ports to CPU architectures other than x86-64. Currently, Arch is only supported on x86-64, he noted, but the project has passed an RFC to expand the number of architectures that would be supported.
Right now, the project has two tools for editors to use to make their work a bit easier: wiki-scripts and Wiki Monkey. Klinkovský explained that wiki-scripts was a collection of Python scripts used to automate common actions, such as checking if links actually work. Wiki Monkey is an interactive JavaScript tool that runs in the web browser, he said, and can help contributors improve content. For example, it has plugins to expand contractions, fix headers, convert HTML
tags into proper MediaWiki markup, and more.
There is much more that could be added or improved, he said, like linting software for grammar issues. The team might also consider incorporating machine learning or AI techniques into the editor workflow, " but this needs to be done carefully so we don't cause more trouble than we have right now ". The trouble the team has with AI right now will probably sound familiar to anyone running an open-source project today; specifically, AI-generated content that is not up to par and scraper bots.
People have already tried contributing to ArchWiki using AI, but Klinkovský pointed out that " current models are obviously not trained on our style guidelines, because the content does not fit ". Using AI for problem solving also prevents people from fully understanding a solution or how things work. That may be a problem for the whole of society, he said, not just ArchWiki.
The scraper bot problem is a more immediate concern, in that the project had to put the wiki behind Anubis in the early part of the year for about two months. Currently they do not need to use it, Klinkovský said, but they have it on standby if the bots come back. " So this is still a threat and we cannot consider it solved. "
Another, non-technical, threat that the project faces is burnout. Lavallade said that contributor burnout is a real problem, and that people who have stayed a long while " usually start with a good, strong string of changes, but they end up tapering their amount of contributions ". Everyone, he said, ends up running out of steam at some point. Because of that, there is a need to keep bringing in new contributors to replace people who have moved on.
Questions
One member of the audience wanted to know if there was a dedicated chat room for the wiki to discuss changes coming in. Lavallade said that there is an #archlinux-wiki room on Libera.Chat, and anyone is welcome there. However, the team frequently redirects conversations about changes to the talk pages on the wiki to ensure that everyone interested in a topic can discuss the change.
Steve McIntyre had two questions. He was curious about how many maintainers the ArchWiki had and what kind of hardware or setup was on the backend of the wiki " is this like, one virtual machine, or a cluster? " Klinkovský said that there were about 30 to 50 maintainers at the moment. As far as the setup, he said he was not on Arch's DevOps team and didn't know all the details, but he did know it was just one virtual machine " in the cloud ".
Like what you are reading? Try LWN for free for 1 month, no credit card required.
Another person wanted to know if the team would choose MediaWiki again if they were building the wiki today. Klinkovský did not quite answer directly, but he said that if a project does not like the markup language used by MediaWiki then it should look to a solution that uses Markdown. But, if a project needs all of the other features MediaWiki has, " like plugins or the API for writing bots and so on ", then MediaWiki is the best from all of the wiki software available.
One audience member pointed out that the chart seemed to show a spike in activity beginning with COVID and a steady decline since. They asked if the team had noticed that, and what they were doing about it. Klinkovský said that they had not looked at that problem as a whole team, or discussed what they could do about it. He said that if Arch added new architectures or accepted contributions from Arch-derivative distributions, it might reverse the trend.
Lange closed the session by saying that he thought it was funny that the presenters had said they wanted ArchWiki to be Arch-centric: " I think you failed, because a lot of other people are reading your really great, big wiki ".
Debian embraces MediaWiki
The session seems to have been a success in that it has helped to inspire the Debian project to revamp its own wiki. Immediately after the ArchWiki presentation, there was a Debian wiki BoF where it was decided to use MediaWiki. Debian currently uses the MoinMoin 1.9 branch, which depends on Python 2.7.
Since DebConf25, members of the wiki team have worked with the Debian's system administrators team to put up wiki2025.debian.org to eventually replace the current wiki. They have also created a new debian-wiki mailing list and decided to change the content licensing policy for material contributed to the wiki. Changes submitted to the wiki after July 24 are now licensed under the Creative Commons Attribution-ShareAlike 4.0 license unless otherwise noted.
If Debian can sustain the activity that has gone into the wiki revamp since DebConf25, its wiki might give the ArchWiki project a run for its money. In that case, given that ArchWiki has proven such a good resource for Linux users regardless of distribution, everybody will win.
[Thanks to the Linux Foundation, LWN's travel sponsor, for funding my travel to Brest for DebConf25.]
to post comments