I wanted to share something special. A friend of mine, Will, has been busy working on this project, and I wanted to share it here first. It is pretty technical, but it is still an interesting deep look into one of Microsoft’s early 32-bit, 386-based programs that would go on to revolutionize the world: Windows/386! It brought the V86 virtual machine to normal people, wrapped up in a nice GUI.

By Will Klees (CaptainWillStarblazer)

INTRODUCTION

I’m CaptainWillStarblazer, an author who has previously been featured on VirtuallyFun for my work on EmuWOW, which enables Win32 apps compiled for the MIPS and Alpha AXP architectures to run on x86 computers. While I was born in the 21st century, I have a keen interest in the computers of the past, particularly in the history of Microsoft. The foundations for the breakout success of Windows 3.0, 3.1, and 9x were laid with Windows/386, but until recently its inner workings have not been well understood. Beyond a very high-level picture, exactly how it works has been treated as an opaque black box, never explored in books (official or otherwise) the way its successors were. No longer.

FOREWORD

Before I begin, I would like to acknowledge that all of my work here was informed by the research of the late, great Geoff Chappell, whose many in-depth pages on this topic, and on many others, laid the groundwork for this post. His contributions to the scene are immeasurable, and I, along with many of you, stand on the shoulders of giants like him. It is unfortunate that up to this point Windows/386 has not received much reverse-engineering attention (especially in comparison to the better-documented Windows 3.x and 95), but for the first time, it is being examined.

ARCHITECTURE OF WINDOWS/386

Windows/386 Loader (WIN386.EXE)

The structure of Windows/386 is broadly similar to later versions of Windows running in enhanced mode. The journey begins with WIN386.EXE, which is a standard MZ EXE.
WIN386 first performs some checks to make sure that your machine can run Windows/386: that you have enough memory, the right version of DOS, and an 80386 that is not one of the early buggy steppings, among others. One of these checks is whether your computer is currently executing in Virtual 8086 Mode. If it is, then another piece of protected-mode software is already controlling the computer. In that case, WIN386 first checks whether Windows/386 is already running, and if so, displays an error message. Next, it checks whether the resident protected-mode software is a memory manager that it recognizes (either Compaq’s CEMM or Microsoft’s EMM386), and if so, uses the GEMMIS (Global EMM Import Specification) API to suck out all of the EMS mapping page tables from the LIMulator and then switch back into real-mode. If it doesn’t recognize the protected-mode software, it throws another error message at this point.

The check for early buggy 386 steppings was retained by Microsoft even into Windows 8.1, surprisingly enough. The system can also check for 386 chips with bad 32-bit multiplication, though in that case it only warns the user of potential issues; it doesn’t refuse to run as it does on a Model 1 Stepping 0 chip.

[photo of the code checking for the buggy 386]

Finally, it begins loading the Virtual DOS Machine Manager (VDMM) into memory from the file WIN386.386. This file is not an OS/2 Linear Executable like the 386 files from later versions of Windows (that format did not yet exist). Rather, it uses the 32-bit x.out executable format from Xenix-386 (thank you, Michal Necasek!), which makes sense, as it was the only 32-bit executable format that Microsoft would have had a linker for at the time (and it interoperated well with Microsoft’s OMF-based tools, such as MASM). Among the features of this format is a rather lengthy symbol table. This not only aids reverse-engineering; it is also a key part of the loading process.
The WIN386.EXE loader will populate parts of the loaded image with important data using these symbols.

Virtual DOS Machine Manager and Virtual Device Drivers (WIN386.386)

WIN386.386 contains a statically-linked binary image of the VDMM itself as well as all of the virtual device drivers. Disassembling Windows/386 was an interesting exercise. On my repo, I have a partial disassembly of EGA.3EX, the WIN386.EXE loader for the EGA version of WIN386.386, which is a standard MS-DOS executable and as such is easily examined by reverse-engineering tools. However, the 32-bit x.out format used by Windows/386 is not readily supported by any reverse-engineering tools that I am aware of. While it would be possible to write an IDA or Ghidra plugin, I figured the simplest solution was to convert it to a more standard executable format that the tools could understand: COFF. After extracting the 16-bit entry stub into a small flat binary to be disassembled on its own, the COFF file could finally be opened and examined (in reality, tools didn’t seem to like the COFF file very much, so I had to use GNU objcopy to convert it to ELF first).

[photo of the conversion program]

[objdump or dumpbin examining the resulting COFF image]

WIN386.386 starts execution in real-mode with a short stub that prepares the Global Descriptor Table, loads the page directory, switches into protected-mode, and does a far jump to the 32-bit entry point. At this point, it starts WIN86.COM (loaded by WIN386.EXE) to start a real-mode copy of Windows in the first VM, otherwise known as the “System VM”.

Two valuable resources for examining the code of Windows/386 have turned out to be the source code for MEMM (Microsoft Expanded Memory Manager, better known by its final name EMM386) from the MS-DOS 4.0 repository, and the Windows 3.0 DDK sample VxDs.
It is obvious from comparing the Windows/386 disassembly to the MEMM source code that portions of MEMM, both for EMS emulation and for the V86 monitor in particular, were simply lifted wholesale into Windows/386, and code comments even make reference to this. Amusingly, MEMM was assembled using MASM 4.00, which has poor support for the 80386, so copious macros are used to add in 386 instructions.

Perhaps the most interesting EMM386-related finding, however, was that parts of EMM386 were written in C. This seemed likely given the leading underscore and __cdecl-style calling convention in several Windows/386 functions, and examining the MEMM source confirms it.

[emm386 c & 32-bit corresponding asm]

Based on my examination, it appears that if you took the EMM386 C code and compiled it for a 32-bit flat model (EMM386’s code was compiled for a 16:16 far pointer model), you’d get the assembly in Windows/386. This is interesting because Windows/386 was previously thought to be written entirely in assembly, and the Microsoft 386 C compiler was in its infancy when Windows/386 was being written. It’s not entirely unbelievable, however, since Xenix-386, the earliest known user of the compiler, came out around the same time as Windows/386.

The other handy reference while disassembling Windows/386 has been the Windows 3.0 DDK. Since the VDMM contains all of the virtual device drivers statically linked into it, and many of Windows 3.0’s virtual device drivers can trace their beginnings to Windows/386, there’s often a strong correspondence. Many APIs have changed, however, including how VxDs call each other. In Windows/386, it’s just a simple call, while their status as separate modules in Windows 3.0 requires a VxDCall: a special interrupt that causes the VMM to transfer control to another VxD.
[comparing between a Windows 3.0 VxD and a Windows 2.03 VxD]

Examination of the MapLinear function finds that the memory map for Windows/386 2.xx is essentially identical to Windows 3.0. The first 4MB is the private per-VM arena (so chosen as it allows a task-switch to be as simple as altering the first PDE in the page directory, rather than having to switch page directories), then the 4-20MB range identity-maps the first 16MB of physical memory, and the VDMM is loaded at the 20MB mark.

As a quick example of one of the code paths in Windows/386, when Windows/386 needs to return an entry point to a client application (such as through the INT 2FH AX=1602H API), it needs some way to cause a client calling that entry point in Virtual 8086 Mode to trap into protected mode. As documented by Raymond Chen, they found that the quickest way to do this was via the invalid opcode fault, and the invalid opcode they chose for this was 63H, or ARPL. As part of a mechanism that is still in place in Windows 95, when a VM executes an ARPL instruction, it’ll trap into the VDMM, vectoring through the IDT to vm_trap06.

[photo of the IDT]

[photo of vm_trap06]

From there, it determines if the fault came from a VM or not. If not, it executes the Windows/386 error handler, but if it did, it calls into VmFault. VmFault looks up the faulting opcode through a table and invokes the appropriate handler for it. The appropriate handler for ARPL is called Patch_Fault. From there, it determines what kind of call this is, and if you’re lucky, it’ll end up in TS_VMDA_Call, which is described in the next section.

[photo of VmFault]

[photo of Patch_Fault]

The System VM – Windows 2 and WINOLDAP

The code running inside of the system VM is almost identical to a standard real-mode Windows 2.xx install, with one exception: WINOLDAP.
Responsible for executing MS-DOS (“old”) applications, WINOLDAP is totally different in Windows/386. As such, it is not functional if you try to load WIN86.COM directly from real-mode on its own, which otherwise provides a perfectly workable real-mode Windows experience. It makes heavy use of 386 instructions and of the “Virtual DOS Applications” API (VDA, otherwise known as VMDOSAPP), accessed via INT 2FH AX=1601H in Windows/386 2.03 and 2.11. This API is made available exclusively to the System VM, allowing WINOLDAP to control the execution of other virtual machines.

[photos of the dispatch tables and routines for VDA]

[example app using VDA]

While the details of this API certainly changed for Windows 3.0 and for later versions, WINOLDAP continued to work in fundamentally the same way, with the DOS application running in the System VM (intended to be Windows) being uniquely privileged to control operations in other virtual machines. Given that many people have figured out how to make Windows/386 start applications other than Windows itself (such as COMMAND.COM), nothing would stop a sufficiently enterprising developer from writing a text-based MS-DOS application that leveraged this API to provide multitasking. In fact, this is likely how Raymond Chen’s “character-mode task switcher” functioned. WINOLDAP is worthy of further examination to determine exactly how it works, and perhaps to develop a multitasking MS-DOS. Obviously, this API, intended to have only Windows as a client, is totally undocumented other than by myself and Geoff Chappell, but further work could reveal its secrets.

In addition to the VDA API, Windows/386 also provided a much more limited API to callers in other virtual machines (accessed via INT 2FH AX=1602H) that appears to still be available in Windows 3.0 and is primarily responsible for networking.
For most of my experimentation, I actually sidestepped booting Windows altogether so that I could run my own code in the System VM. This is fairly simple; all you need to do is copy COMMAND.COM over WIN86.COM, then start WIN386, and voilà! You’re running COMMAND.COM in Virtual 8086 Mode! Probably the most notable change is that if you didn’t already have any LIM EMS memory, you do now.

LOST WINDOWS/386 DDK

While no DDK for Windows/386 2.xx has been located, hints of its existence are scattered around. Most notably, the Windows 3.0 386 Virtual Device Adaptation Guide provides guidance on the differences between Windows/386 2.xx and Windows 3.0, and on how to port virtual display drivers from one to the other, suggesting that Microsoft did provide tools to enable third-party developers to write Windows/386 virtual device drivers.

It’s not difficult to imagine what this DDK would have looked like. Likely distributed alongside the regular Windows/286 real-mode DDK, the Windows/386-specific portions would include the 32-bit capable MASM5, along with early versions of MAPSYM32, WDEB386, and the Xenix x.out ld link editor. Very likely, Microsoft provided sample code for each of the VxDs included with Windows/386 (including the CGA, EGA, and Hercules VDDs), as well as a precompiled OMF object containing the VDMM itself, and then one would link everything together.

It bears repeating that the documentation on porting virtual device drivers from Windows/386 2.xx to Windows 3.0 was limited solely to virtual display drivers. The only other references to Windows/386 2.xx in the Virtual Device Adaptation Guide discuss the Windows/386 API callable by DOS applications running in a DOS box (many device drivers and applications, including network stacks, were Windows/386 aware). This could mean that other types of drivers could be more easily reassembled for Windows 3.0 without documentation, but I doubt it.
As it stands, most of the virtual device drivers included in Windows/386 were fairly generic; the COM port, timer, PIC, keyboard, and other such devices work almost identically in every PC-compatible computer. On the other hand, the display driver is the one major component that Windows would need to interact with and that would significantly change between different types of machines. Additionally, due to the statically-linked nature of Windows/386 at this point, having more than one VxD as the variable factor could balloon into a smorgasbord of different combinations of drivers statically linked into the WIN386.386 image. As such, it stands to reason that the only driver built by third parties (though no such driver has yet been located) is a virtual display driver.

This lines up with Microsoft’s own distribution of Windows/386, as the disks include separate 386 files for each supported display (the appropriate file being copied for your machine based on your selection during setup) and a matching 3EX file that gets copied to become the WIN386.EXE loader, and display drivers (also including their own complete Windows/386 images, obviously based on customizing the EGA/VGA VDD) have been found for other display adapters as well. This is compounded by the fact that during SETUP (including for real-mode Windows), the 16-bit display driver is statically linked into the Windows kernel (in other words, you can only load a display driver during SETUP) for the “fast-boot” configuration (though this can be disabled for a “slow-boot” on debug versions, more similar to how Windows 3.0 and above boot). A lot of reading between the lines is needed here, but it does seem that the only customization Microsoft intended was for OEMs to provide their own virtual display drivers.

BREAKING INTO WINDOWS/386

In the absence of the Windows/386 DDK and its associated debugger, options for peeking into the internals of Windows/386 while it is active are fairly limited.
Promise was initially found in WIN386.EXE making a call to INT 68H with AH=43H (D386_Identify, typically the first call made when initializing a program that uses WDEB386), no doubt trying to call out to its version of WDEB386, if present. INT 68H is the WDEB386 real-mode interface, also used by the Deb386 debugger developed for EMM386 (which no doubt was the immediate ancestor of WDEB386), as well as by compatible debuggers such as SoftICE. However, the version of WDEB386 from Windows 3.1 only partially worked. While a CTRL-C could break into WDEB386 at any time, it could only trace through Virtual 8086 Mode code (always breaking at an ARPL VM-86 breakpoint), and whenever you tried to resume execution using the G command, Windows/386 would exit.

As a result, I had to improvise my own debugger, which required me to gain the ability to execute my own 32-bit code within Windows/386, something that has never before been achieved. I immediately decided to adopt a similar approach to WDEB386: leave the debugger behind in conventional memory before Windows/386 starts up, and then have Windows/386 call into it. So I quickly set about writing a small TSR. The TSR hooked INT 69H with a routine called Intrude that would patch the IDT of Windows/386 (found by traversing the image symbol table) to point to my own code for interrupt vector 0 (the divide exception handler). That way, whenever a divide exception occurred, it would vector into my own code.

The next question you may be wondering about is how I got Windows/386 to invoke an INT 69H in the first place. The answer lies in the real-mode initialization stub of WIN386.386, the part that switches into protected-mode.
Examine the listing below:

    Enable_A20:
    01B7 803EBD00F8   CMP BYTE PTR [Computer_Type],0F8H   ; Check for fast A20 support
    01BC 7707         JA Enable_A20_Slow
    01BE E492         IN AL,92H                           ; Fast A20 enable
    01C0 0C02         OR AL,2                             ; Set bit 1 (A20 line control)
    01C2 E692         OUT 92H,AL                          ; Output back to port 92H
    01C4 C3           RET
    Enable_A20_Slow:
    01C5 B4DF         MOV AH,0DFH
    01C7 EB12         JMP Set_A20

By the time Enable_A20 (which checks the computer type from the BIOS) is called, most of the data structures needed to enter Windows/386 have already been set up, so I patched Windows/386 to simply remove Fast A20 support and always use the slow code, putting an INT 69H in the slack space. In other words, the patch replaces the instruction at TEXT16:01B7 with an INT 69H (CD 69). Since the original instruction is 5 bytes long, the remaining three bytes are padded with NOP (90). The instruction at TEXT16:01BC is then altered to be an unconditional jump (EB) so that the slow A20 line control is always invoked. Since the loaded object is always at offset 400H in the file, and the offsets appear to be the same for all versions of Windows/386 on all devices, the changes are:

    5B7: 80 -> CD
    5B8: 3E -> 69
    5B9: BD -> 90
    5BA: 00 -> 90
    5BB: F8 -> 90
    5BC: 77 -> EB

The trouble at this point was that, while my program did work, it left the protected-mode code sitting in conventional memory, which is part of the System VM’s inherited address space and thus subject to corruption. As a result, I wanted to move it up into extended memory, out of the reach of any pesky DOS programs. My first thought was to use XMS memory through HIMEM.SYS, which was introduced with Windows/386 2.11 to facilitate access to the HMA for Windows.
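As an aside on the byte patch listed above, here is a minimal Python sketch of how those six changes could be applied to a copy of WIN386.386 while checking that the original bytes are where the table says they are. The names PATCH, IMAGE_BASE, file_offset, and apply_patch are hypothetical and chosen only for illustration; they are not taken from W386DBG. The only facts used are the offsets and byte values from the table and the 400H file offset of the loaded object.

```python
# Sketch only: applies the six-byte patch from the table above to an
# in-memory copy of the image. All names here are hypothetical.

IMAGE_BASE = 0x400  # file offset of the loaded object, per the article

# (file offset, expected original byte, replacement byte)
PATCH = [
    (0x5B7, 0x80, 0xCD),  # first byte of CMP  -> INT opcode
    (0x5B8, 0x3E, 0x69),  # second byte of CMP -> interrupt number 69H
    (0x5B9, 0xBD, 0x90),  # rest of CMP        -> NOP
    (0x5BA, 0x00, 0x90),  # rest of CMP        -> NOP
    (0x5BB, 0xF8, 0x90),  # rest of CMP        -> NOP
    (0x5BC, 0x77, 0xEB),  # JA rel8            -> JMP rel8 (always slow A20)
]


def file_offset(text16_offset):
    """Translate a TEXT16 offset from the listing into a file offset."""
    return IMAGE_BASE + text16_offset


def apply_patch(image):
    """Return a patched copy of image, refusing to touch unexpected bytes."""
    patched = bytearray(image)
    for offset, old, new in PATCH:
        if offset >= len(patched):
            raise ValueError(f"image too short for offset {offset:#x}")
        if patched[offset] != old:
            raise ValueError(
                f"unexpected byte {patched[offset]:02X} at {offset:#x}, "
                f"expected {old:02X}"
            )
        patched[offset] = new
    return bytes(patched)
```

To use it, read a copy of the file into memory, call apply_patch on the bytes, and write the result out under a new name. A ValueError means the image does not match the layout described here, in which case nothing is written and the original is untouched.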
Unfortunately, while XMS did sort of work, it turns out that Windows/386 (which, if you’ll recall, was initially designed before XMS or HIMEM.SYS) does not respect XMS allocations made before Windows loads, and thus considers them part of its extended memory pool (a fact I learned when it corrupted the first two DWORDs of every 64K memory block starting after the HMA as part of its memory test). It is also important to realize that Windows/386 2.11 does not provide virtual XMS services to any client VMs (though Windows 3.0 and later versions do), except for HMA access to the System VM only. Windows/286 2.11 also used the HMA on 80286 and above systems, hence the “286” name, though it otherwise worked fine on XT-class machines, and since Windows/386 ran Windows/286 in the System VM, it made sense to also support the HMA there.

As a result, I used the “expand-down” memory allocation method: determine the amount of installed extended memory using INT 15H AH=88H, then hook that interrupt to report 132K less memory than there was before, and use the last 132K of extended memory for my own purposes. Since INT 15H AH=88H can report up to 64MB of installed extended memory, while INT 15H AH=87H (used to copy into extended memory) only supports up to 16MB, I had to write my own routines to copy into extended memory by switching into protected-mode and then back. As a result, W386DBG (the TSR) has to be loaded before any memory manager that places the machine into Virtual 8086 Mode, such as EMM386, or anything that allocates XMS memory (not that any such programs are likely to be used alongside Windows/386, since, as stated earlier, the XMS memory would be corrupted).

As you can see, if you cause a divide exception in DEBUG.COM, it’ll print out “W386DBG” in the upper-right of the screen and then hang the computer. This won’t work for a software INT 0, because software interrupts from Virtual 8086 Mode vector through the GPF handler.
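The expand-down reservation described above comes down to a little arithmetic, sketched here in Python. This is an illustration and not code from W386DBG: the function names are hypothetical, and the sketch assumes that INT 15H AH=88H reports a count of kilobytes of extended memory starting at the 1MB mark. The 132K reservation and the 16MB ceiling for INT 15H AH=87H are the figures given in the text.

```python
# Sketch only: the arithmetic behind carving 132K off the top of
# extended memory. Names are hypothetical.

ONE_MB = 0x100000
RESERVED_KB = 132           # amount taken from the top, per the article
AH87_CEILING = 0x1000000    # 16MB limit the article gives for INT 15H AH=87H


def reserve_top(ext_kb, reserve_kb=RESERVED_KB):
    """Return (KB to report to later callers, physical base of our block).

    ext_kb is the value the original INT 15H AH=88H handler returned,
    assumed to count kilobytes above the 1MB mark.
    """
    if not 0 <= ext_kb <= 0xFFFF:
        raise ValueError("AH=88H returns a 16-bit kilobyte count")
    if ext_kb < reserve_kb:
        raise ValueError("not enough extended memory to reserve")
    reported_kb = ext_kb - reserve_kb
    base = ONE_MB + reported_kb * 1024
    return reported_kb, base


def needs_own_copy_routine(base, size_bytes):
    """True if the block reaches past the 16MB ceiling of AH=87H."""
    return base + size_bytes > AH87_CEILING


# A machine with 16MB of RAM has 15MB (15360K) of extended memory.
reported, base = reserve_top(15360)
print(f"report {reported}K, reserved block at {base:#x}")
```

On a 16MB machine the reserved block sits just under the 16MB line, so the BIOS block move could still reach it. On a 32MB machine the same calculation puts the block near the 32MB mark, which is the case that calls for a hand-written copy routine.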
[photo of the program launching]

[photo of the hang with the VBox debugger showing where we hung]

Note that we lack any debug version of the VDMM, along with any symbols that it may contain or debug messages it may output. However, the VDMM itself does, as stated earlier, have a considerable symbol table, and we have debug versions of Windows 2 as part of its DDK, which were meant to be used with SYMDEB and include symbols. So at least we can have full debugging capabilities for the 16-bit components of Windows, simply by loading debug Windows 2 into the System VM, as no doubt one was intended to do when developing device drivers for Windows/386. Obviously, W386DBG is not yet a functional debugger, but it has gained the ability to grab control from Windows/386, which is perhaps the most important part.

INTO THE FUTURE WITH WINDOWS VERSION 3.0

Lately, I have become interested in turning my attention to the Windows 3.0 version 14 debug release that shipped to ISVs in early 1989. As one would expect, it shows many similarities to Windows 2.xx, but is already well on the way to becoming the Windows 3.0 that we know. Notably, the WIN386.386 file is now gone, having been merged into WIN386.EXE as with the final version of Windows 3.0, meaning that the same DOS executable both loads the VMM and contains it. However, the VMM itself (pointed to by the e_lfanew field in the MZ header) is not an OS/2 2.0 Cruiser Linear Executable like the final version (or, more properly, the W3 format which contains multiple LE VxDs within it), but rather another bespoke format with a “W386” signature that I have not yet torn into. All of the VxDs are still statically linked at this point, but the symbol file is showing movement toward the VMM we know from Windows 3.0.
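For anyone who wants to poke at such a file, the following Python sketch shows one way to follow e_lfanew and look at whatever signature sits there. It relies on the standard MZ header layout, in which e_lfanew is a 32-bit little-endian value at offset 3CH. It also assumes, without confirmation from the disassembly, that the “W386” signature is stored as four ASCII bytes at the very start of the embedded image. The function names are hypothetical.

```python
# Sketch only: follow e_lfanew in an MZ executable and classify the
# signature found there. Names are hypothetical.
import struct

E_LFANEW_OFFSET = 0x3C  # standard position of e_lfanew in the MZ header


def find_new_header(image):
    """Return (e_lfanew, first four bytes at that offset)."""
    if len(image) < 0x40 or image[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    (e_lfanew,) = struct.unpack_from("<I", image, E_LFANEW_OFFSET)
    if e_lfanew == 0 or e_lfanew + 4 > len(image):
        raise ValueError("e_lfanew does not point inside the file")
    return e_lfanew, image[e_lfanew:e_lfanew + 4]


def classify(signature):
    """Give a rough label for the formats mentioned in the text."""
    if signature == b"W386":      # assumed layout of the bespoke format
        return "W386"
    if signature[:2] == b"W3":    # container of LE VxDs in final 3.0
        return "W3"
    if signature[:2] == b"LE":    # Linear Executable
        return "LE"
    if signature[:2] == b"NE":    # 16-bit New Executable
        return "NE"
    return "unknown"
```

The “W386” test comes first on purpose, since that signature also begins with the two bytes “W3”. Everything past the signature (section tables, symbol data, entry point) is left alone here because the layout of the bespoke format has not been worked out yet.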
I haven’t disassembled all of the real-mode entry portion of WIN386 yet (doing so will allow me to fully understand the file format), but an interesting piece of code new to this build checks that the DOS major version is not only at least 3 (the minimum supported DOS version) but also less than 10. Since 10 is the major DOS version reported by OS/2 1.x’s 3xBox, this makes Windows/386 3.0.14 OS/2-aware (and avoidant).

One piece of Windows 3.0-related history that was recently discovered was the manual for Murray Sargent’s Scroll-Screen-Tracer debugger. The debugger is far too rich in features to go over them all here, but its incredible DOS-extending features include support for debugging applications in Virtual 8086 Mode (a la SoftICE), debugging Windows, and debugging regular MS-DOS applications running in the 80286’s Protected Mode, much as was described in “Saving Windows from the OS/2 Bulldozer”.

Interestingly, the DOS extender provided by WIN386, along with PKERNEL.EXE (the protected-mode Windows kernel), seems to have more in common with the 80286 DOS extender, DOSX.EXE, from Windows 3.0, along with the 80286 standard mode kernel, KRNL286.EXE, than with the enhanced mode counterparts. For example, like in the final version of Windows 3.0, DOSX (in this case, WIN386) switches into protected-mode before loading PKERNEL/KRNL286, giving it the unique distinction of being an MZ executable that starts in protected-mode, using a stub to start executing the NE portion of the file. By contrast, in Windows 3.1 (and 3.0 enhanced mode), the DOS extender switches back into real / Virtual 8086 Mode before loading the kernel, which then uses DPMI to switch into protected-mode. Along with translation for DOS API services, according to Michal Necasek of the OS/2 Museum, WIN386 appears to provide some sort of selector management interface via INT 31H that could be considered a sort of proto-DPMI.
Disassembling both WIN386 and PKERNEL promises to be an interesting exercise.

Not much is known about the early history of DPMI, but the first sign of it outside of Microsoft appears to date to Fall 1989:

“I will never forget how startled I was when I encountered the DOS-Protected Mode Interface (DPMI) in its primordial form for the first time. I was sitting in a Microsoft OS/2 2.0 ISV seminar in the Fall of 1989, with my mind only about half-engaged during an uninspiring session about OS/2 2.0’s Multiple Virtual DOS Machines (MVDMs), when the speaker mentioned in passing that OS/2 2.0 would support a new interface for the execution of DOS Extender applications. This casual remark focused my mind remarkably…

“After the speaker finished, I went up to him and asked for more information, explaining that his mystery interface was about to have a severe impact on a book project near and dear to my heart. In a couple of hours, the Microsoftie returned with a thick document entitled “DOS Protected Mode Interface Specification, Version 0.04” still warm from the Xerox machine and generously garnished with “CONFIDENTIAL” warning messages. I suspect I made a most amusing spectacle, as I flipped through the pages with my eyes bulging out and my jaw dropping to the floor. The document I had been handed was nothing less than the functional specification of a protected-mode version of MS-DOS!

“Microsoft originally defined the DPMI in two layers: a set of low-level functions for interrupt management, mode switching, and extended memory management; and a higher-level interface that provided access to MS-DOS, ROM BIOS, and mouse driver functionality via protected-mode execution of Int 21H, Int 10H, Int 33H, and so on. The higher-level DPMI functions were implemented, of course, in terms of the lower-level DPMI functions and the extant real-mode DOS and ROM BIOS interface.”

Ray Duncan, Extending DOS, 2d ed., 1992, pp. 433-438

Obviously, by this point, Microsoft, who was still heavily invested in OS/2, planned to implement DPMI in OS/2 2.0, though they would not do so for about a year afterwards. Crossover with the protected-mode DOS apps that would run on Windows (most crucially, Windows itself) was no doubt a goal for the OS/2 development team. I was surprised to learn that DPMI was already by this point mature enough to have even a preliminary specification released. Moreover, at the behest of DOS Extender vendors such as Phar Lap and Rational Systems, Microsoft went on to excise all of the higher-level DOS Extender components from the DPMI specification, and “DPMI 0.9” was born, containing only the low-level building blocks of a DOS Extender. As Andrew Schulman went on to say, the DOS Extender portions of DPMI ended up being split off into their own document:

“Microsoft has an internal document (“MS-DOS API Extensions for DPMI Hosts,” October 31, 1990) that devotes about 30 pages to the Windows 3.0 DOS extenders… For example, the 1990 document discusses the 32-bit DOS extender provided by DOSMGR. The DOS file read and write calls (INT 21h functions 3Fh and 40h) have the count register (ECX) extended to 32-bits, allowing 32-bit programs to perform DOS file I/O of more than 64K at a time.”

Andrew Schulman, Unauthorized Windows 95, 1994, pp. 151-52

On the PCjs website, Version 0.04 of the MS-DOS API Extensions for DPMI Hosts, from March 1991, can be found, and it is obviously quite a preliminary document. It seems that DPMI was designed simply to expose the Windows DOS Extender (used by the Windows kernel) to other DOS protected-mode software. DPMI sits on the AH=16H Windows/386 part of the INT 2FH multiplex (W386_Int_Multiplex), with the “Get Protected Mode Switch Entry Point” API from DPMI even being documented as part of INT2FAPI.INC from the Windows 3.0 DDK as W386_Get_PM_Switch_Addr.
The “Get Selector to Base of LDT” API from the MS-DOS API Extensions document is even part of INT2FAPI.INC as W386_Get_LDT_Base_Sel. DPMI was defined as an interface for protected-mode DOS software to interface with the Windows (and OS/2) DOS Extenders, and ultimately a subset of the Windows DOS Extender API got standardized and duplicated by other vendors; in effect, DPMI hosts implement a genericized version of the Windows DOS Extender.

If you’re interested in looking at my code and following future developments in the disassembly and W386DBG, check it out at https://github.com/BHTY/WIN386.