This article explains how a modern operating system makes it possible to use shared libraries with load-time relocation. It focuses on Linux running on 32-bit x86, but the general principles apply to other OSes and CPUs as well.
Note that shared libraries have many names: shared libraries, shared objects, dynamic shared objects (DSOs), and dynamically linked libraries (DLLs, if you're coming from a Windows background). For the sake of consistency, I will use the name "shared library" throughout this article.
Loading executables

Linux, similarly to other OSes with virtual memory support, loads executables to a fixed memory address. If we examine the ELF header of some random executable, we'll see an Entry point address:

    $ readelf -h /usr/bin/uptime
    ELF Header:
      Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
      Class:                             ELF32
      [...] some header fields
      Entry point address:               0x8048470
      [...] some header fields

This is placed by the linker to tell the OS where to start executing the executable's code. And indeed if we then load the executable with GDB and examine the address 0x8048470, we'll see the first instructions of the executable's .text segment there.

What this means is that the linker, when linking the executable, can fully resolve all internal symbol references (to functions and data) to fixed and final locations. The linker does some relocations of its own, but eventually the output it produces contains no additional relocations.

Or does it? Note that I emphasized the word internal in the previous paragraph. As long as the executable needs no shared libraries, it needs no relocations. But if it does use shared libraries (as do the vast majority of Linux applications), symbols taken from these shared libraries need to be relocated, because of how shared libraries are loaded.
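The entry point that readelf prints can also be read programmatically. The following is a minimal sketch, not a replacement for readelf: the helper name `elf_entry_point` is hypothetical, and the code assumes a Linux system that provides `<elf.h>` and a file whose byte order matches the host's.

```c
#include <assert.h>
#include <elf.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of any library): stores the ELF entry
 * point of the file at `path` in *entry and returns 0, or returns -1 if
 * the file cannot be read or is not an ELF file.
 * Assumption: the file uses the same byte order as the host. */
int elf_entry_point(const char *path, uint64_t *entry)
{
    unsigned char buf[sizeof(Elf64_Ehdr)] = {0};

    assert(path != NULL && entry != NULL);

    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    size_t n = fread(buf, 1, sizeof buf, f);
    fclose(f);

    /* Every ELF file starts with the magic bytes 0x7f 'E' 'L' 'F'. */
    if (n < sizeof(Elf32_Ehdr) || memcmp(buf, ELFMAG, SELFMAG) != 0)
        return -1;

    if (buf[EI_CLASS] == ELFCLASS32) {
        Elf32_Ehdr h;
        memcpy(&h, buf, sizeof h);
        *entry = h.e_entry;
        return 0;
    }
    if (buf[EI_CLASS] == ELFCLASS64 && n >= sizeof(Elf64_Ehdr)) {
        Elf64_Ehdr h;
        memcpy(&h, buf, sizeof h);
        *entry = h.e_entry;
        return 0;
    }
    return -1;
}
```

Calling `elf_entry_point("/usr/bin/uptime", &entry)` and printing `entry` in hexadecimal should show the same value as the "Entry point address" line in the readelf output for that file.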
Loading shared libraries

Unlike executables, when shared libraries are being built, the linker can't assume a known load address for their code. The reason for this is simple. Each program can use any number of shared libraries, and there's simply no way to know in advance where any given shared library will be loaded in the process's virtual memory. Many solutions were invented for this problem over the years, but in this article I will just focus on the ones currently used by Linux.

But first, let's briefly examine the problem. Here's some sample C code which I compile into a shared library:

    int myglob = 42;

    int ml_func(int a, int b)
    {
        myglob += a;
        return b + myglob;
    }

Note how ml_func references myglob a few times. When translated to x86 assembly, this will involve a mov instruction to pull the value of myglob from its location in memory into a register. mov requires an absolute address - so how does the linker know which address to place in it? The answer is - it doesn't. As I mentioned above, shared libraries have no pre-defined load address - it will be decided at runtime.

In Linux, the dynamic loader is a piece of code responsible for preparing programs for running. One of its tasks is to load shared libraries from disk into memory, when the running executable requests them. When a shared library is loaded into memory, it is then adjusted for its newly determined load location. It is the job of the dynamic loader to solve the problem presented in the previous paragraph.

There are two main approaches to solve this problem in Linux ELF shared libraries:

1. Load-time relocation
2. Position independent code (PIC)

Although PIC is the more common and nowadays-recommended solution, in this article I will focus on load-time relocation. Eventually I plan to cover both approaches and write a separate article on PIC, and I think starting with load-time relocation will make PIC easier to explain later.

(Update 03.11.2011: the article about PIC was published)
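To make the idea of "adjusting a library for its load location" concrete, here is a toy model in C. It is only a sketch of the general principle and not how the Linux dynamic loader is implemented: the "image" is a plain byte buffer that was laid out as if it would be loaded at address 0, and a list of offsets marks the places that hold absolute addresses. The function name `apply_relocations` and the relocation list format are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy model of load-time relocation (illustrative only).
 * `image` was linked as if its base address were 0. Each entry of
 * `reloc_offsets` is an offset inside the image where a 32-bit absolute
 * address is stored. Once the real load base is known, every such
 * address is patched by adding the base to it. */
void apply_relocations(uint8_t *image, size_t image_size,
                       const uint32_t *reloc_offsets, size_t nrelocs,
                       uint32_t load_base)
{
    for (size_t i = 0; i < nrelocs; i++) {
        uint32_t off = reloc_offsets[i];
        uint32_t addr;

        assert((size_t)off + sizeof addr <= image_size);

        memcpy(&addr, image + off, sizeof addr);  /* link-time address */
        addr += load_base;                        /* run-time address  */
        memcpy(image + off, &addr, sizeof addr);
    }
}
```

For example, if a slot in the image holds the link-time address 0x200C and the image ends up loaded at a (hypothetical) base of 0x40000000, the slot holds 0x4000200C after `apply_relocations` runs. The cost of this approach is that the loader must write into the image itself for every recorded location.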
Linking the shared library for load-time relocation

To create a shared library that has to be relocated at load-time, I'll compile it without the -fPIC flag (which would otherwise trigger PIC generation):

    gcc -g -c ml_main.c -o ml_mainreloc.o
    gcc -shared -o libmlreloc.so ml_mainreloc.o

The first interesting thing to see is the entry point of libmlreloc.so:

    $ readelf -h libmlreloc.so
    ELF Header:
      Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
      Class:                             ELF32
      [...] some header fields
      Entry point address:               0x3b0
      [...] some header fields

For simplicity, the linker just links the shared object for address 0x0 (the .text section starting at 0x3b0), knowing that the loader will move it anyway. Keep this fact in mind - it will be useful later in the article.

Now let's look at the disassembly of the shared library, focusing on ml_func:

    $ objdump -d -Mintel libmlreloc.so

    libmlreloc.so:     file format elf32-i386

    [...] skipping stuff

    0000046c
Load-time relocation in action

To see the load-time relocation in action, I will use our shared library from a simple driver executable. When running this executable, the OS will load the shared library and relocate it appropriately.

Curiously, due to the address space layout randomization feature which is enabled in Linux, relocation is relatively difficult to follow, because every time I run the executable, the libmlreloc.so shared library gets placed at a different virtual memory address.

This is a rather weak deterrent, however. There is a way to make sense of it all. But first, let's talk about the segments our shared library consists of:

    $ readelf --segments libmlreloc.so

    Elf file type is DYN (Shared object file)
    Entry point 0x3b0
    There are 6 program headers, starting at offset 52

    Program Headers:
      Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
      LOAD           0x000000 0x00000000 0x00000000 0x004e8 0x004e8 R E 0x1000
      LOAD           0x000f04 0x00001f04 0x00001f04 0x0010c 0x00114 RW  0x1000
      DYNAMIC        0x000f18 0x00001f18 0x00001f18 0x000d0 0x000d0 RW  0x4
      NOTE           0x0000f4 0x000000f4 0x000000f4 0x00024 0x00024 R   0x4
      GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x4
      GNU_RELRO      0x000f04 0x00001f04 0x00001f04 0x000fc 0x000fc R   0x1

     Section to Segment mapping:
      Segment Sections...
       00     .note.gnu.build-id .hash .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .eh_frame
       01     .ctors .dtors .jcr .dynamic .got .got.plt .data .bss
       02     .dynamic
       03     .note.gnu.build-id
       04
       05     .ctors .dtors .jcr .dynamic .got

To follow the myglob symbol, we're interested in the second segment listed here. Note a couple of things:

- In the section to segment mapping at the bottom, segment 01 is said to contain the .data section, which is the home of myglob.
- The VirtAddr column specifies that the second segment starts at 0x1f04 and has size 0x10c, meaning that it extends until 0x2010 and thus contains myglob, which is at 0x200C.

Now let's use a nice tool Linux gives us to examine the load-time linking process - the dl_iterate_phdr function, which allows an application to inquire at runtime which shared libraries it has loaded, and more importantly - take a peek at their program headers.

So I'm going to write the following code into driver.c:

    #define _GNU_SOURCE
    #include <link.h>
    #include <stdio.h>

    [...]
        printf("[...]\n",
               info->dlpi_name, info->dlpi_phnum, (void *)info->dlpi_addr);
        for (int j = 0; j < info->dlpi_phnum; j++) {
            printf("\t\t header %2d: address=%10p\n", j,
                   (void *)(info->dlpi_addr + info->dlpi_phdr[j].p_vaddr));
            printf("\t\t\t type=%u, flags=0x%X\n",
                   info->dlpi_phdr[j].p_type, info->dlpi_phdr[j].p_flags);
        }
        printf("\n");
    [...]
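A self-contained version of this kind of inspection code can be sketched as follows. It is an approximation under stated assumptions rather than the exact driver: the callback name `print_phdrs`, the wrapper `dump_loaded_objects`, the object counter, and the first format string are my own choices. The callback signature is the one `dl_iterate_phdr` requires, and the code assumes Linux with glibc.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <link.h>
#include <stddef.h>
#include <stdio.h>

/* Callback invoked by dl_iterate_phdr once per loaded object (the
 * executable itself and each shared library). The name and the counter
 * passed through `data` are assumptions made for this sketch. */
static int print_phdrs(struct dl_phdr_info *info, size_t size, void *data)
{
    int *objects_seen = data;
    (void)size;  /* size of *info; unused here */

    printf("name=%s (%d segments) address=%p\n",
           info->dlpi_name, info->dlpi_phnum, (void *)info->dlpi_addr);

    for (int j = 0; j < info->dlpi_phnum; j++) {
        /* Run-time address of a segment = load base + link-time p_vaddr. */
        printf("\t\t header %2d: address=%10p\n", j,
               (void *)(info->dlpi_addr + info->dlpi_phdr[j].p_vaddr));
        printf("\t\t\t type=%u, flags=0x%X\n",
               info->dlpi_phdr[j].p_type, info->dlpi_phdr[j].p_flags);
    }
    printf("\n");

    if (objects_seen != NULL)
        (*objects_seen)++;
    return 0;  /* returning 0 tells dl_iterate_phdr to keep iterating */
}

/* Hypothetical wrapper: prints every loaded object and returns how many
 * objects were visited. */
int dump_loaded_objects(void)
{
    int count = 0;
    dl_iterate_phdr(print_phdrs, &count);
    return count;
}
```

To observe libmlreloc.so in the output, the driver would also need to declare `extern int ml_func(int, int);`, call it from `main` after `dump_loaded_objects()`, and be linked against the library. With that in place, the second LOAD header reported for libmlreloc.so should appear at the library's load address plus 0x1f04, matching the VirtAddr column shown earlier.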