
Analyzing the Performance of WebAssembly vs. Native Code

Browsix modifies the Emscripten compiler to allow processes (which run in WebWorkers) to communicate with the Browsix kernel (which runs on the main thread of a page). Since Browsix compiles native programs to JavaScript, this is relatively straightforward: each process's memory is a buffer shared with the kernel (a SharedArrayBuffer), so system calls can directly read and write process memory. However, this approach has two significant drawbacks. First, it precludes growing the heap on demand; the shared memory must be sized large enough to meet the application's high-water-mark heap size for the entire life of the process. Second, JavaScript contexts (such as the main context and each web worker context) have a fixed limit on their heap sizes, which is currently approximately 2.2 GB in Google Chrome [6]. This cap seriously limits running multiple processes: if each process reserves a 500 MB heap, Browsix can run at most four concurrent processes. A deeper problem is that WebAssembly memory cannot be shared across WebWorkers and does not support the Atomic API, which Browsix processes use to wait for system calls.
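The four-process figure follows from dividing the heap cap by the per-process reservation. A minimal sketch of that arithmetic is below; the function name is hypothetical, and the cap is taken as roughly 2200 MB to match the approximate 2.2 GB quoted above.

```typescript
// Hypothetical helper (not part of Browsix): how many fixed-size process
// heaps fit under a per-context heap cap. Sizes are in megabytes.
function maxConcurrentProcesses(heapCapMB: number, perProcessHeapMB: number): number {
  if (perProcessHeapMB <= 0) {
    throw new RangeError("perProcessHeapMB must be positive");
  }
  // Only whole heaps count, so round down.
  return Math.floor(heapCapMB / perProcessHeapMB);
}

// Assumption: the "approximately 2.2 GB" cap from the text, expressed as 2200 MB.
const approxChromeHeapCapMB = 2200;

console.log(maxConcurrentProcesses(approxChromeHeapCapMB, 500)); // 4
```

Because every heap must be reserved at its high-water mark up front, halving the reservation is the only way to fit more processes under the same cap in this model.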

Browsix-Wasm uses a different approach to process-kernel communication that is also faster than the Browsix approach. Browsix-Wasm modifies the Emscripten runtime system to create an auxiliary buffer (of 64 MB) for each process that is shared with the kernel but is distinct from process memory. Since this auxiliary buffer is a SharedArrayBuffer, the Browsix-Wasm process and kernel can use the Atomic API for communication. When a system call references strings or buffers in the process's heap (e.g., writev or stat), the runtime system copies the data from process memory to the shared buffer and sends a message to the kernel with the locations of the copied data in auxiliary memory. Similarly, when a system call writes data to the auxiliary buffer (e.g., read), the runtime system copies the data from the shared buffer to process memory at the specified address. Moreover, if a system call specifies a buffer in process memory for the kernel to write to (e.g., read), the runtime allocates a corresponding buffer in auxiliary memory and passes it to the kernel. If a system call reads or writes more than 64 MB of data, Browsix-Wasm divides it into several calls, each of which reads or writes at most 64 MB. The cost of these memory copy operations is dwarfed by the overall cost of the system call invocation, which involves sending a message between the process and kernel JavaScript contexts. We show in §4.2.1 that Browsix-Wasm has negligible overhead.
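The copy-and-chunk scheme described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the Browsix-Wasm runtime: the names writeThroughAux, readThroughAux, KernelWrite, and KernelRead are hypothetical, the kernel is modeled as a plain callback rather than a message exchange between contexts, and a plain Uint8Array stands in for the SharedArrayBuffer-backed auxiliary buffer so the sketch runs anywhere.

```typescript
// Size of the per-process auxiliary buffer mentioned in the text (64 MB).
const AUX_BUFFER_BYTES = 64 * 1024 * 1024;

// Hypothetical kernel callbacks. Each returns the number of bytes handled.
type KernelWrite = (aux: Uint8Array, len: number) => number; // kernel consumes aux[0..len)
type KernelRead = (aux: Uint8Array, maxLen: number) => number; // kernel fills aux[0..n)

// Write-style call: copy process memory into the auxiliary buffer one chunk
// at a time, so no single kernel call handles more than aux.length bytes.
function writeThroughAux(
  processMemory: Uint8Array, ptr: number, len: number,
  aux: Uint8Array, kernelWrite: KernelWrite,
): number {
  if (aux.length === 0) throw new RangeError("auxiliary buffer is empty");
  let written = 0;
  while (written < len) {
    const chunk = Math.min(aux.length, len - written);
    aux.set(processMemory.subarray(ptr + written, ptr + written + chunk));
    const n = kernelWrite(aux, chunk);
    written += n;
    if (n < chunk) break; // short write: stop and report what was accepted
  }
  return written;
}

// Read-style call: the kernel fills the auxiliary buffer, then the runtime
// copies the bytes back into process memory at the requested address.
function readThroughAux(
  processMemory: Uint8Array, ptr: number, len: number,
  aux: Uint8Array, kernelRead: KernelRead,
): number {
  if (aux.length === 0) throw new RangeError("auxiliary buffer is empty");
  let total = 0;
  while (total < len) {
    const chunk = Math.min(aux.length, len - total);
    const n = kernelRead(aux, chunk);
    if (n <= 0) break; // nothing more to read
    processMemory.set(aux.subarray(0, n), ptr + total);
    total += n;
    if (n < chunk) break; // short read: the kernel had less data than requested
  }
  return total;
}

// Demo with a tiny 4-byte auxiliary buffer in place of AUX_BUFFER_BYTES.
const demoMemory = new Uint8Array(16);
for (let i = 0; i < demoMemory.length; i++) demoMemory[i] = i;
const demoAux = new Uint8Array(4);
const demoChunkSizes: number[] = [];
const demoWritten = writeThroughAux(demoMemory, 2, 10, demoAux, (_aux, n) => {
  demoChunkSizes.push(n);
  return n;
});
console.log(demoWritten, demoChunkSizes); // 10 [ 4, 4, 2 ]
```

In a real deployment the buffer would be allocated at AUX_BUFFER_BYTES, and each callback would correspond to a message sent to the kernel context, which is the dominant cost the text refers to.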