Dynamic Linking in ELF

ELF (Executable and Linkable Format) is a binary format that serves as both an executable and a linkable file. It is the de facto standard on Linux.

 

A. Linking Overview

As a program grows, modularization helps programmers maintain their code efficiently. During compilation, one object file is generated per module. Afterwards, a linker (e.g., ld or ld.gold) takes one or more object files and combines them into a single executable file, library file, or another object file. The linker plays a pivotal role in resolving locations, for example by re-targeting absolute jumps.

An object file contains symbols. A symbol is a named entity, such as a function or a variable, that can be referenced from the same or another object file. There are two kinds of symbols: local and external. A local symbol is visible only inside the object file that contains it (it often exists for relocation purposes at link time). An external symbol that is defined in the object file can be called from other modules. An external symbol that is undefined must be resolved by finding the definition it refers to. That definition can be located either in another object file or in a library file, which is a collection of object files that can be shared by other executables.

A library can be static or dynamic. If an application uses a symbol from a static library (.a extension), the linker merges the object file that contains the symbol directly into the final executable. If that object file in turn references a symbol in another object file, the second object file must also be resolved and combined at link time, even though the original program does not refer to it directly. With a dynamic library (a shared object, .so extension), the final executable does not embed the object file. Instead, the resolution of undefined symbols is delayed until runtime and left to a dynamic linker. Hence, a statically linked executable can run without the library being present at runtime, whereas a dynamically linked one cannot.
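The way a static library pulls in object files can be sketched with a toy model. This is only an illustration, not how ld is implemented: the function link_static, the member names (helper.o, unused.o) and the symbol names are all hypothetical.

```python
def link_static(objects, archive):
    """Toy model of static linking (not how ld is actually implemented).

    objects: dict name -> (defined symbols, referenced symbols) for input .o files.
    archive: same shape, for the members of a hypothetical static library.
    Returns (archive members pulled in, symbols still undefined).
    """
    defined, undefined = set(), set()
    for defs, refs in objects.values():
        defined |= set(defs)
        undefined |= set(refs)
    undefined -= defined

    pulled = []
    progress = True
    while progress:
        progress = False
        for name, (defs, refs) in archive.items():
            if name in pulled or not (set(defs) & undefined):
                continue
            # This member defines something we still need: merge the whole member.
            pulled.append(name)
            defined |= set(defs)
            undefined = (undefined | set(refs)) - defined
            progress = True
    return pulled, undefined


# Hypothetical inputs: test.o needs func, and func.o in turn needs helper.
objects = {"test.o": ({"main"}, {"func"})}
archive = {
    "func.o": ({"func"}, {"helper"}),   # needed directly by test.o
    "helper.o": ({"helper"}, set()),    # needed only by func.o
    "unused.o": ({"other"}, set()),     # never referenced, so never pulled in
}
pulled, undefined = link_static(objects, archive)
print(pulled, undefined)
```

Note that helper.o ends up in the result although test.o never mentions helper, while unused.o is left out.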

Here we demystify how dynamic linking works, using a simple example built as a shared object or position-independent code (the -fPIC option in gcc and clang) on x86_64 (AKA AMD64).

 

B. Sample Program

Here is a tiny program that has two modules (test.c and func.c) and one header file (func.h).

 

C. ELF Header and Tables for Program Header and Section Header

ELF consists of three parts: ELF Header, Program Header Table and Section Header Table. 

First off, the ELF header is quite self-explanatory given the struct definition itself, Elf64_Ehdr. See the comments below.
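As a rough sketch, the fixed layout of Elf64_Ehdr can be decoded with Python's struct module. The field names follow elf.h; the helper parse_ehdr and the sample values packed below are made up for illustration and are not taken from the sample binary.

```python
import struct

# Little-endian layout of Elf64_Ehdr: e_ident[16] followed by 13 fixed-width fields.
EHDR = struct.Struct("<16sHHIQQQIHHHHHH")
EHDR_FIELDS = ("e_ident", "e_type", "e_machine", "e_version", "e_entry",
               "e_phoff", "e_shoff", "e_flags", "e_ehsize", "e_phentsize",
               "e_phnum", "e_shentsize", "e_shnum", "e_shstrndx")


def parse_ehdr(data):
    """Decode the first 64 bytes of a little-endian ELF64 file into a dict."""
    hdr = dict(zip(EHDR_FIELDS, EHDR.unpack_from(data)))
    ident = hdr["e_ident"]
    if ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if ident[4] != 2 or ident[5] != 1:  # EI_CLASS = ELFCLASS64, EI_DATA = ELFDATA2LSB
        raise ValueError("only little-endian ELF64 is handled here")
    return hdr


# Synthetic header with made-up values, just to exercise the parser.
ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)
sample = EHDR.pack(ident, 2, 62, 1, 0x400430, 64, 0x1a50, 0, 64, 56, 9, 64, 30, 27)
hdr = parse_ehdr(sample)
print(hex(hdr["e_entry"]), hdr["e_phnum"], hdr["e_shnum"])
```

To try it on a real file, pass the first 64 bytes of the binary, e.g. the result of open(path, "rb").read(64).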

The program header table (PHT) describes how a loader maps the binary into the virtual address space (VA) at load time, whereas the section header table (SHT) holds one entry per section for use at link time. Each region mapped into the VA by a PHT entry is often called a segment, which is the loader's view. Likewise, each region described by an SHT entry is called a section, which is the linker's view.
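The loader's view can be sketched by walking the PHT entries (Elf64_Phdr) and decoding their permission flags. The helper parse_phdrs is hypothetical, and the two PT_LOAD entries in the synthetic table use made-up offsets and sizes; they are not read from the sample binary.

```python
import struct

PHDR = struct.Struct("<IIQQQQQQ")  # Elf64_Phdr, little-endian, 56 bytes
PT_LOAD = 1
PF_X, PF_W, PF_R = 1, 2, 4


def parse_phdrs(data, e_phoff, e_phnum):
    """Yield (p_type, flags, p_offset, p_vaddr, p_filesz, p_memsz) for each PHT entry."""
    for i in range(e_phnum):
        (p_type, p_flags, p_offset, p_vaddr,
         _p_paddr, p_filesz, p_memsz, _p_align) = PHDR.unpack_from(
            data, e_phoff + i * PHDR.size)
        flags = "".join(ch if p_flags & bit else "-"
                        for ch, bit in (("R", PF_R), ("W", PF_W), ("X", PF_X)))
        yield p_type, flags, p_offset, p_vaddr, p_filesz, p_memsz


# Synthetic table with two loadable segments (all numbers are made up).
table = (PHDR.pack(PT_LOAD, PF_R | PF_X, 0x0, 0x400000, 0x400000, 0x7a4, 0x7a4, 0x1000)
         + PHDR.pack(PT_LOAD, PF_R | PF_W, 0x9f0, 0x4019f0, 0x4019f0, 0x50, 0x58, 0x1000))
segs = list(parse_phdrs(table, 0, 2))
for p_type, flags, offset, vaddr, filesz, memsz in segs:
    print(p_type, flags, hex(offset), hex(vaddr), hex(filesz), hex(memsz))
```

On a real file, e_phoff and e_phnum would come from the ELF header.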

The final executable file main is shown below (Figure 1). The linker view on the left shows how each section is stored in the file on disk. The loader view on the right shows how each segment is loaded into a process at runtime. For instance, [S1] is the first section; its size is 0x1c and it is marked (A)llocatable, meaning the loader maps it into memory. R, X and W denote readable, executable and writable, respectively. In the loader view, there are four major chunks of memory: 0x400000-0x401000 (RX), 0x401000-0x402000 (RW), regions for shared objects, and the space for the stack and heap.

The handy readelf command shows what each segment and section looks like by reading these structures. Note that many sections are added during compilation and linking.

 

D. Linking Process

Before moving on, here is what the object file looks like (in crt1.o). The relocation records show that there are four locations that cannot be resolved during compilation. That is why each of those 4-byte address fields is filled with zeros for now. The first relocation entry, at offset 12, refers to __libc_csu_fini, which is defined in another object. We can also see that the _start function, which is the entry point of the final executable, refers to our main function at offset 0x20.
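How a linker later fills such a zeroed 4-byte field can be sketched for one common case, a PC-relative relocation, which the x86-64 ABI defines as S + A - P (symbol address plus addend minus the address of the field). The helper apply_pc32 and both addresses below are hypothetical; this sketch makes no claim about which relocation types crt1.o actually uses.

```python
import struct


def apply_pc32(code, offset, section_addr, symbol_addr, addend=-4):
    """Patch a 4-byte PC-relative field as R_X86_64_PC32 is defined: S + A - P."""
    place = section_addr + offset                        # P: address of the field itself
    value = (symbol_addr + addend - place) & 0xFFFFFFFF  # wrap to 32 bits
    struct.pack_into("<I", code, offset, value)
    return value


# "call rel32" is opcode e8 followed by four zero bytes awaiting the linker.
code = bytearray(b"\xe8\x00\x00\x00\x00")
# Hypothetical layout: the call sits at 0x400454 and the callee at 0x400420.
value = apply_pc32(code, offset=1, section_addr=0x400454, symbol_addr=0x400420)
print(code.hex(), hex(value))
```

The addend of -4 accounts for the CPU adding the displacement to the address of the next instruction, which is 4 bytes past the start of the field.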

Now, when the given program is compiled with default settings, the gcc or clang driver combines the necessary files (e.g., the C runtime, CRT) so that a loader can handle the result properly. Figure 2 illustrates them. (*) means the function is defined in that object file, whereas the others are only declared there and defined elsewhere (e.g., using the extern keyword in C). For example, crt1.o defines _start, but the function __libc_start_main has to be resolved at link time (or possibly later).

Let’s see the layout of all functions from each object file. Figure 3 covers part of sections 10 to 13 in Figure 1. Interestingly, the functions of a single object file stay together and are not interleaved with functions from other object files.

 

E. Dynamic Linking

When dynamic linking is required (modern compilers enable it by default), the linker generates a .dynamic section (section index 17 above). Note that executable files and shared object files each have their own procedure linkage table (PLT).

With our sample, the dynamic section contains 24 entries, as follows. Pay attention to the highlighted ones, which are required by the dynamic linker. The PLTGOT entry holds the address of the .got.plt section, which is the very place where the dynamic linker stores the final fix-ups. The .rela.plt (JMPREL) and .rela.dyn (RELA) sections are the relocation tables that hold the relocation entries.
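Each entry of the dynamic section is an Elf64_Dyn pair of a tag and a value, terminated by a DT_NULL entry. A minimal sketch of decoding such a table follows. The tag numbers come from elf.h; the helper parse_dynamic is hypothetical, and apart from the PLTGOT value 0x4019f8 the values in the synthetic table are made up.

```python
import struct

DYN = struct.Struct("<qQ")  # Elf64_Dyn: d_tag, then the d_val/d_ptr union
DT_NAMES = {0: "NULL", 1: "NEEDED", 2: "PLTRELSZ", 3: "PLTGOT", 7: "RELA",
            8: "RELASZ", 9: "RELAENT", 20: "PLTREL", 23: "JMPREL"}


def parse_dynamic(data):
    """Return [(tag name, value)] up to and including the DT_NULL terminator."""
    entries = []
    for i in range(len(data) // DYN.size):
        tag, val = DYN.unpack_from(data, i * DYN.size)
        entries.append((DT_NAMES.get(tag, hex(tag)), val))
        if tag == 0:  # DT_NULL ends the table
            break
    return entries


# Synthetic table: PLTGOT as reported for the sample, the rest made up.
blob = b"".join(DYN.pack(tag, val) for tag, val in [
    (3, 0x4019f8),   # DT_PLTGOT
    (2, 72),         # DT_PLTRELSZ: three 24-byte Elf64_Rela entries
    (20, 7),         # DT_PLTREL: the PLT relocations are of kind DT_RELA
    (23, 0x4003f0),  # DT_JMPREL (hypothetical address of .rela.plt)
    (0, 0),          # DT_NULL
    (1, 0x1234),     # trailing bytes after the terminator are ignored
])
entries = parse_dynamic(blob)
for name, val in entries:
    print(name, hex(val))
```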

Here are the symbols in the relocation sections. The type “R_X86_64_JUMP_SLOT” means that the symbol needs to be resolved at runtime by the dynamic linker. The offset is the location where the resolved address has to be stored.
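A relocation entry in .rela.plt is an Elf64_Rela record whose r_info field packs the symbol index (upper 32 bits) and the relocation type (lower 32 bits). The sketch below decodes such records. The helper parse_rela is hypothetical, the symbol indices are made up, and only the slot address 0x401a20 appears in this article; the other two offsets are assumed neighbouring slots.

```python
import struct

RELA = struct.Struct("<QQq")  # Elf64_Rela: r_offset, r_info, r_addend
R_X86_64_JUMP_SLOT = 7


def parse_rela(data):
    """Decode a packed array of Elf64_Rela records."""
    out = []
    for i in range(len(data) // RELA.size):
        r_offset, r_info, r_addend = RELA.unpack_from(data, i * RELA.size)
        out.append({
            "offset": r_offset,            # where the resolved address is stored
            "sym": r_info >> 32,           # ELF64_R_SYM
            "type": r_info & 0xFFFFFFFF,   # ELF64_R_TYPE
            "addend": r_addend,
        })
    return out


def make_info(sym, rtype):
    return (sym << 32) | rtype


# Synthetic .rela.plt with three jump slots (symbol indices are made up).
blob = b"".join(RELA.pack(off, make_info(sym, R_X86_64_JUMP_SLOT), 0)
                for off, sym in [(0x401a10, 1), (0x401a18, 2), (0x401a20, 3)])
relocs = parse_rela(blob)
for r in relocs:
    print(hex(r["offset"]), r["sym"], r["type"])
```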

Figures 4 (before resolution) and 5 (after resolution) illustrate how the dynamic linker resolves the references on the fly. In the disassembly, three symbols are called at 0x400448, 0x4004c4 and 0x4005c7. Each call first jumps to an entry in the PLT. The first instruction of that PLT entry is another jump, an indirect one through a slot in .got.plt. Before resolution, the slot holds the address of the next instruction in the same PLT entry (entry address + 6), so the jump simply falls through.

For example, the address of printf@plt is 0x400490, and it jumps to 0x400496 after dereferencing (rip+0x158a is 0x401a20, and 0x400496 is stored there). Then it pushes 0x2 and jumps to the start of the .plt section (PLT[0]).
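The arithmetic in this example can be checked with a few lines. The sketch assumes the usual x86-64 layout, in which the indirect jmp is 6 bytes long and .got.plt starts with three reserved 8-byte slots followed by one slot per PLT entry. The .got.plt base 0x4019f8 is the PLTGOT value reported for this sample; the function names are hypothetical.

```python
PLT_ENTRY = 0x400490   # printf@plt (from the text)
JMP_LEN = 6            # ff 25 <disp32>: jmp *disp(%rip)
DISP = 0x158a          # displacement of that jmp (from the text)
GOT_PLT = 0x4019f8     # base of .got.plt, as reported for PLTGOT
RELOC_INDEX = 2        # value pushed by the stub (from the text)


def got_slot_via_rip(plt_entry, disp):
    # RIP-relative addressing is based on the address of the *next* instruction.
    return plt_entry + JMP_LEN + disp


def got_slot_via_index(got_plt, index):
    # Assumed layout: three reserved 8-byte slots, then one slot per PLT entry.
    return got_plt + (3 + index) * 8


print(hex(got_slot_via_rip(PLT_ENTRY, DISP)))         # 0x401a20
print(hex(got_slot_via_index(GOT_PLT, RELOC_INDEX)))  # 0x401a20
print(hex(PLT_ENTRY + JMP_LEN))                       # 0x400496, the slot's initial value
```

Both routes arrive at the same slot, which suggests that the pushed index 0x2 and the slot at 0x401a20 describe the same PLT entry.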

Here is the snapshot after __libc_start_main and printf have been resolved in glibc. The slot for __gmon_start__ still points into the final executable itself (thus 0x400486). At this point, the references that have been called are resolved by the dynamic linker. Note that a reference is resolved only once, when it is called for the first time.
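The resolve-once behaviour can be mimicked with a toy model in which a dictionary stands in for .got.plt and a method stands in for the resolver. This is purely illustrative: the class LazyGot and its members are hypothetical and do not correspond to glibc code.

```python
class LazyGot:
    """Toy model of lazy binding: one slot per imported function."""

    def __init__(self, library):
        self.library = library   # stand-in for the symbols a shared object exports
        self.slots = {}          # name -> resolved callable (absent = unresolved)
        self.resolver_calls = 0

    def _resolve(self, name):
        # Plays the role of the resolver: look the symbol up, then patch the slot.
        self.resolver_calls += 1
        self.slots[name] = self.library[name]
        return self.slots[name]

    def call(self, name, *args):
        # Plays the role of the PLT stub: go through the slot, else ask the resolver.
        target = self.slots.get(name) or self._resolve(name)
        return target(*args)


got = LazyGot({"strlen": len})
first = got.call("strlen", "hello")    # first call goes through the resolver
second = got.call("strlen", "world!")  # second call uses the patched slot
print(first, second, got.resolver_calls)
```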

The address of the resolution routine is stored at 0x401a08, which is _dl_runtime_resolve_avx in this example.

For more curious readers, here are the source files from glibc that define _dl_runtime_resolve and _dl_fixup internally. Stepping through with several breakpoints in a debugger shows that the routine passes the link_map in the %rdi register and the reloc_index in the %rsi register. This index is the very one pushed in the .plt section.

 

[References]

http://www.skyfree.org/linux/references/ELF_Format.pdf
https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf
https://sourceware.org/git/?p=glibc.git;a=blob_plain;f=elf/elf.h
https://sourceware.org/git/?p=glibc.git;a=blob_plain;f=elf/dl-reloc.c
http://www.cs.stevens.edu/~jschauma/810/elf.html

2 thoughts on “Dynamic Linking in ELF”

  1. In E. Dynamic Linking
    0x0000000000000003 (PLTGOT) 0x4019f8 // address of .plt.got section

    To my understanding, it should be address of .got.plt section. However, I don’t know what .plt.got contains. Google .plt.got got me here.

  2. .plt.got looks like the parts related to the symbol actually used in conventional .plt section, it excludes the plt[0] which leads to the symbol resolver.
