Tech News

Is OOXML Artificially Complex?

A while ago, the official LibreOffice blog published a provocative article: “An artificially complex XML schema as a lock-in tool.” Its target is Microsoft’s family of XML-based file formats, Office Open XML (OOXML).

The article alleges that, although Microsoft put its Office formats through standardization, the spec is engineered to be so complex that it obstructs interoperability with third-party software. Moreover, the complexity is allegedly gratuitous and disconnected from real-world needs; it’s like advertising an “open” railway system while designing the signaling so only one manufacturer can run trains. Users, the argument continues, often accept proprietary technology uncritically, which makes it easy for Microsoft to lock people into its ecosystem.

A quick refresher: historically, Office used binary formats (.doc, .xls, and .ppt) whose contents weren’t human-readable. Starting with Office 2007, Microsoft switched the defaults to .docx, .xlsx, and .pptx, where the “x” stands for XML. These files are ZIP containers holding a set of XML parts and resources such as images. Both the XML structure and the packaging follow a published spec: OOXML.
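To make the "ZIP container of XML parts" idea concrete, here is a minimal Python sketch using only the standard library. It builds a stripped-down .docx-style package in memory and then reads it back. The function names (build_minimal_docx, list_parts, read_paragraphs) are hypothetical helpers invented for this illustration, and the package contains only the bare parts needed to show the layout. A file saved by Word carries many more parts (styles, settings, themes, and so on), so treat this as an illustration of the structure rather than a conforming writer or reader.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Main WordprocessingML namespace used inside word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
XML_DECL = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'


def build_minimal_docx(text: str) -> bytes:
    """Build a stripped-down .docx-style package in memory (illustrative only)."""
    content_types = (
        XML_DECL
        + '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
        '<Default Extension="rels" '
        'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
        '<Default Extension="xml" ContentType="application/xml"/>'
        '<Override PartName="/word/document.xml" '
        'ContentType="application/vnd.openxmlformats-officedocument.'
        'wordprocessingml.document.main+xml"/>'
        "</Types>"
    )
    root_rels = (
        XML_DECL
        + '<Relationships '
        'xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
        '<Relationship Id="rId1" '
        'Type="http://schemas.openxmlformats.org/officeDocument/2006/'
        'relationships/officeDocument" Target="word/document.xml"/>'
        "</Relationships>"
    )
    document = (
        XML_DECL
        + f'<w:document xmlns:w="{W_NS}"><w:body>'
        f"<w:p><w:r><w:t>{escape(text)}</w:t></w:r></w:p>"
        "</w:body></w:document>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", content_types)
        package.writestr("_rels/.rels", root_rels)
        package.writestr("word/document.xml", document)
    return buffer.getvalue()


def list_parts(data: bytes) -> list[str]:
    """Return the names of the entries stored in the ZIP container."""
    with zipfile.ZipFile(io.BytesIO(data)) as package:
        return package.namelist()


def read_paragraphs(data: bytes) -> list[str]:
    """Extract the plain text of each paragraph (w:p) in word/document.xml."""
    with zipfile.ZipFile(io.BytesIO(data)) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    paragraphs = []
    for paragraph in root.iter(f"{{{W_NS}}}p"):
        runs = paragraph.iter(f"{{{W_NS}}}t")
        paragraphs.append("".join(run.text or "" for run in runs))
    return paragraphs


package_bytes = build_minimal_docx("Hello, OOXML")
print(list_parts(package_bytes))
print(read_paragraphs(package_bytes))
```

The same list_parts and read_paragraphs helpers should also work on the bytes of a real .docx file, which makes it easy to see how many more parts a document saved by Word contains.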

With Microsoft’s backing, OOXML was adopted by international standards bodies, first as ECMA-376 and later as ISO/IEC 29500. Microsoft also put it under the Open Specification Promise (OSP), committing not to assert certain patent claims against compliant implementations.

On paper, then, anyone can parse, create, and edit OOXML to be compatible with Microsoft Office, which sounds great. But the LibreOffice article calls this premise into question, arguing that OOXML’s deliberate complexity turns this supposed openness into a trap, a tool for maintaining a monopoly.

Let’s be honest: few people would describe their experience with Microsoft Office as satisfying, which is part of why this article resonated widely. In my past life doing legal grunt work, battling convoluted Word documents was a daily ritual. I also authored the Word section of an Office tutorial series, where my main approach was to explain Word’s quirks by digging into the underlying OOXML format. Thus, I’m intimately familiar with what makes Office and OOXML painful.

Despite this, I disagree with LibreOffice’s framing and conclusion. Aiming for mass appeal, the post is heavy on emotion and accusation but light on factual analysis, missing a solid educational opportunity. (LibreOffice later published a more technical comparison, but it still jumped straight from code snippets to conclusions.)

In my view, OOXML is indeed complex, convoluted, and obscure. But that’s likely less about a plot to block third-party compatibility and more about self-interested negligence: Microsoft prioritized the convenience of its own implementation and neglected the clarity, simplicity, and universality that a general-purpose standard should have. Yes, that neglect has anticompetitive effects in practice, but the motive is different from deliberate sabotage and thus warrants a different judgment. (A detailed legal analysis is beyond the scope of this article.)

In other words, LibreOffice identified the right problem but may have reached the wrong conclusion. Here’s why.
