
Fundamentals of Virtual Memory




What and Why?

Have you ever wondered why computers need main memory (RAM) when they already have disk storage? The answer lies in access speed. Disk storage is persistent but much slower than main memory. RAM trades persistence for speed: data is lost when the power goes off, but access times are far shorter. As a result, the CPU works directly only with main memory, not with disk storage.

CPUs come with built-in registers, which are even faster than main memory. So why do we need main memory at all? It’s because registers are limited in number and size. Imagine a function that needs to work with a thousand variables—there’s no way to fit all of them into registers. And what if you need to store large data structures like arrays or structs? Registers simply don’t have the capacity. That’s where main memory comes in—it provides the space needed to handle larger and more complex data.

Main memory is a large array of bytes, ranging in size from hundreds of thousands to billions of bytes, each with its own address. Before a program can be executed, it must be mapped to absolute addresses and loaded into memory. Once loaded, a process (an active execution of a program) fetches instructions and reads and writes data using these absolute addresses. Likewise, before the CPU can process data stored on disk, that data must first be transferred to main memory by CPU-generated I/O calls.

Simple Allocation Strategy

Typically, there are multiple processes running on a computer, each with its own memory space allocated in main memory. It is the responsibility of the operating system to allocate memory for each process, ensuring that they don't interfere with each other. One of the simplest methods is to assign each process a variably sized contiguous block of memory, where each block contains exactly one process.

[Figure: Contiguous memory block allocation strategy]

When a process is created, the operating system takes into account its memory requirements and the amount of available memory space to allocate a sufficient partition for it. After allocation, the process is loaded into memory and starts its execution. Once the process finishes, the operating system reclaims the memory block, making it available for other processes. If there is not enough room for an incoming process, the operating system may swap some processes out to disk to free up memory space. Alternatively, it can place such processes into a wait queue; when memory is later released, the operating system checks the wait queue to see whether the freed space satisfies the memory demands of a waiting process.
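The allocate-and-reclaim cycle above can be sketched in a few dozen lines. This is a simplified first-fit simulation, not how a real OS is implemented: the memory size, block limit, and function names (`mem_alloc`, `mem_free`) are all invented for illustration, and hole coalescing is omitted.

```c
#include <stdio.h>
#include <string.h>

#define MEM_SIZE   100   /* total memory, in arbitrary units (hypothetical) */
#define MAX_BLOCKS 16

/* One variably sized partition: [start, start + size), free or holding a process. */
struct block { int start, size, pid; };   /* pid == -1 means the block is free */

static struct block blocks[MAX_BLOCKS];
static int nblocks;

void mem_init(void) {
    blocks[0] = (struct block){0, MEM_SIZE, -1};   /* one big free block */
    nblocks = 1;
}

/* First-fit: take the first free block large enough, split off the leftover,
   and return the start address, or -1 if the process must wait (or be swapped). */
int mem_alloc(int pid, int size) {
    for (int i = 0; i < nblocks; i++) {
        if (blocks[i].pid == -1 && blocks[i].size >= size) {
            int leftover = blocks[i].size - size;
            blocks[i].size = size;
            blocks[i].pid = pid;
            if (leftover > 0 && nblocks < MAX_BLOCKS) {
                /* shift later blocks right and insert the remaining hole */
                memmove(&blocks[i + 2], &blocks[i + 1],
                        (size_t)(nblocks - i - 1) * sizeof blocks[0]);
                blocks[i + 1] = (struct block){blocks[i].start + size, leftover, -1};
                nblocks++;
            }
            return blocks[i].start;
        }
    }
    return -1;   /* no room: caller would swap out a process or queue this one */
}

/* Reclaim a finished process's partition, making it free again. */
void mem_free(int pid) {
    for (int i = 0; i < nblocks; i++)
        if (blocks[i].pid == pid)
            blocks[i].pid = -1;
}
```

A typical sequence: process 1 takes 40 units at address 0, process 2 takes 30 at address 40; a request for 50 then fails (only 30 remain) and must wait, but succeeds once process 1's partition is reclaimed.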
