The History of a Security Hole

Warning: If you do not care for the finer points of x86 architecture, please stop reading right now—in the interest of your own sanity.

A while ago I was made aware of a strange problem causing a normal user process running on 32-bit i386 OpenBSD 6.3 to crash the OS (i386 only, not amd64). The problem turned out to be a security hole with a history that goes back more than three decades.

The crashing code looked like it had no business crashing, but the CPU was in a very odd state: both the kernel stack and the GDT were inaccessible. That’s extremely unhealthy, because any exception or interrupt then escalates to a triple fault and a CPU shutdown.

After much head scratching, I noticed that the (virtual) CPU’s A20 gate was off. That’s a big no-no because when the CPU is in protected mode, turning the A20 gate off has very nasty, unpredictable, and system-specific consequences. It’s one of those Just Don’t Even Try That things. But could a user process really turn off the A20 gate? That makes no sense.
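To illustrate why a disabled A20 gate is so destructive, here is a minimal sketch of the classic PC behavior, in which physical address line 20 is forced to zero while the gate is off. The function name `effective_phys_addr` is made up for illustration, and real systems may differ in the details (as noted above, the consequences are system-specific).

```c
#include <stdint.h>

/* Address line 20 selects between an even and an odd megabyte. */
#define A20_BIT (UINT32_C(1) << 20)

/*
 * Hypothetical model: return the physical address that actually reaches
 * memory. With the gate off, bit 20 is masked, so every odd megabyte
 * aliases onto the even megabyte just below it.
 */
uint32_t effective_phys_addr(uint32_t addr, int a20_enabled)
{
    return a20_enabled ? addr : (addr & ~A20_BIT);
}
```

Under this model, any kernel structure that happens to live in an odd megabyte of physical memory (a stack or a GDT, say) silently turns into whatever sits 1 MB below it, which is consistent with the symptoms described earlier.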

As it turns out, a user process really could do that on i386 OpenBSD 6.3 (again, i386 only, not amd64). A security hole allowed regular user processes to read and write many I/O ports, which is obviously very unhealthy. The chain of events that led to this is long, and probably the biggest player in it is Intel, with important contributions from NetBSD and OpenBSD developers. Thanks to the nature of open source, we can trace back exactly how it came to be, and perhaps even learn a thing or two from the mistakes.

Exposition, Intel Lays a Trap

When the 80286 was released in 1982, it introduced support for hardware task switching, something which, in certain circles, was in vogue in that era. The basic state of a task was held in a Task State Segment, or TSS. The TSS records the register state of an inactive (“switched away”) task, and also specifies the stack to use when switching to a ring with higher privilege (for that reason, every typical protected-mode OS must have a valid TSS).

John Crawford, one of the main 80386 designers, described the 286/386 task switching as “miles of microcode” which “never did work out quite right”, a very realistic assessment of the feature. But it’s baked into the x86 architecture, and TSSs are necessary even when hardware task switching isn’t used (see the AMD64 architecture—no hardware task switching, but TSSs are still a necessity).

When the 80386 first became available in silicon in 1985, the TSS was trivially extended (relative to the 286) to support 32-bit registers and also hold the task’s copy of the CR3 register (which massively complicated task switching, but that’s a different story).

In mid to late 1985, someone (likely Compaq and/or Microsoft) convinced Intel to add a permission bitmap for I/O port access, allowing the OS to trap certain port accesses while letting others proceed at full speed. It is known that the permission bitmap was not part of the original 386 specification, and there is no mention of it in the original 80386 datasheet (October 1985, Intel order no. 231630-001). The added granularity was very useful for V86 mode, and the feature was used by Compaq’s CEMM as early as 1986. Note that the I/O permission bitmap applies to every protected-mode task (with a 386 TSS), not just V86 ones. The caveat is that for V86 tasks the bitmap is consulted for every I/O port access, whereas for non-V86 tasks it is consulted only if CPL is numerically greater than IOPL (that is, when I/O would otherwise not be permitted).
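The rule just described can be sketched in C. This is a simplified model, not a description of the actual microcode: the `io_check_state` structure and the `io_access_allowed` function are hypothetical names, and the bitmap is modeled as a plain byte array whose length stands in for the part of the bitmap that lies within the TSS limit. A set bit means the port is trapped, and ports beyond the end of the bitmap are treated as denied.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of the state the I/O permission check depends on. */
struct io_check_state {
    bool           v86;          /* task runs in V86 mode (EFLAGS.VM set) */
    unsigned       cpl;          /* current privilege level, 0..3 */
    unsigned       iopl;         /* I/O privilege level from EFLAGS, 0..3 */
    const uint8_t *bitmap;       /* I/O permission bitmap taken from the TSS */
    size_t         bitmap_bytes; /* bitmap bytes lying within the TSS limit */
};

/*
 * Return true if an access of `width` bytes (1, 2 or 4) starting at `port`
 * would be allowed, false if it would trap.
 */
bool io_access_allowed(const struct io_check_state *s,
                       uint16_t port, unsigned width)
{
    /* Non-V86 code with CPL <= IOPL never consults the bitmap. */
    if (!s->v86 && s->cpl <= s->iopl)
        return true;

    /* Otherwise every port covered by the access must have a clear bit. */
    for (unsigned i = 0; i < width; i++) {
        uint32_t p = (uint32_t)port + i;
        size_t byte = p / 8;

        if (byte >= s->bitmap_bytes)
            return false;               /* past the end of the bitmap */
        if (s->bitmap[byte] & (1u << (p % 8)))
            return false;               /* bit set: access is trapped */
    }
    return true;
}
```

The point to take away is the first branch: for an ordinary ring 3 task with IOPL 0, the bitmap in the TSS is the only thing standing between user code and the hardware, so its contents and its extent matter a great deal.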
