The Executable and Linkable Format (ELF) is a common standard file format for executable files, object code, shared libraries, and core dumps. Originally developed for Unix System V, it has become the standard binary format for Unix-like operating systems, including Linux and BSD. ELF simplifies the development and execution of programs across various systems, providing a flexible and extensible structure for different types of binary files.
The three main uses of ELF are Executable, Shared Library, and Object File.
ELF File Structure
An ELF file is composed of several parts, each serving a specific purpose in the executable's lifecycle. The key components include:
- ELF Header
- Program Header Table
- Section Header Table
- Sections
- Segments

1. ELF Header
The ELF header is the starting point and first part of an ELF file. It contains metadata that describes the file's format and attributes such as the file type, machine architecture, version, entry point address, program header table offset, section header table offset and flags.
- Location: The ELF header is always located at the start of the file, i.e., offset 0.
- Size:
  - For 32-bit ELF files: 52 bytes.
  - For 64-bit ELF files: 64 bytes.
- It is a fixed-size structure that provides an overview of the file's layout and properties.
- The ELF header differs slightly between 32-bit and 64-bit ELF files, primarily in field sizes. For instance, addresses and offsets are 4 bytes in 32-bit ELF and 8 bytes in 64-bit ELF.
The ELF header contains metadata about the file itself. It includes:
- Identification (e_ident): A magic number and other information that identify the file as an ELF file and specify its class, data encoding, and target ABI.
  - Size: 16 bytes.
  - Fields:
    - Magic Number (EI_MAG0 to EI_MAG3): The first four bytes, which must be 0x7F followed by the characters 'E', 'L', 'F'. This identifies the file as an ELF file.
    - Class (EI_CLASS): Identifies the file as 32-bit (ELFCLASS32) or 64-bit (ELFCLASS64).
    - Data Encoding (EI_DATA): Specifies the data encoding: little-endian (ELFDATA2LSB) or big-endian (ELFDATA2MSB).
    - Version (EI_VERSION): The ELF header version, currently always 1 (EV_CURRENT).
    - OS/ABI (EI_OSABI): Identifies the target operating system and ABI (Application Binary Interface).
    - ABI Version (EI_ABIVERSION): Specifies the version of the ABI.
    - Padding (EI_PAD): Unused bytes that pad e_ident to its full size of 16 bytes.
- File Type (e_type): The type of the ELF file (e.g., relocatable, executable, shared object, core).
  - Size: 2 bytes.
  - Values include ET_REL for relocatable files, ET_EXEC for executable files, ET_DYN for shared objects, and ET_CORE for core dumps.
- Machine (e_machine): The target architecture (e.g., EM_386 for Intel 80386, EM_X86_64 for AMD x86-64).
- Version (e_version): The version of the ELF specification, currently always 1 (EV_CURRENT).
- Entry Point Address (e_entry): The entry point address, where the program starts executing.
- The virtual address to which the system first transfers control, effectively the starting point of the executable.
- e_phoff: The offset to the Program Header Table.
- The file offset in bytes where the program header table is located.
- e_shoff: The offset to the Section Header Table.
- The file offset in bytes where the section header table is located.
- e_flags: Processor-specific flags.
- Architecture-specific flags.
- e_ehsize: The size of the ELF header.
- The size of this header in bytes.
- e_phentsize: The size of each entry in the Program Header Table.
- Typically, this is either 32 or 56 bytes depending on whether it's a 32-bit or 64-bit ELF file.
- e_phnum: The number of entries in the Program Header Table.
- Indicates how many program header entries are present.
- e_shentsize: The size of each entry in the Section Header Table.
- Typically, this is either 40 or 64 bytes depending on whether it's a 32-bit or 64-bit ELF file.
- e_shnum: The number of entries in the Section Header Table.
- Indicates how many section header entries are present.
- e_shstrndx: The index of the section header string table.
- The index of the section header table entry that contains the section names.
We can view all of this information for any ELF file with the readelf command and its -h option:
readelf -h elf_file
Sample Output:
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x401000
Start of program headers: 64 (bytes into file)
Start of section headers: 8056 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 8
Size of section headers: 64 (bytes)
Number of section headers: 29
Section header string table index: 28
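As a rough illustration of how the identification bytes map to the values readelf prints, the following C sketch decodes the Magic line from the sample output above. The helper names (ident_has_elf_magic, ident_class_bits, ident_byte_order) and the sample_ident array are made up for this example and do not come from any ELF library.

```c
#include <stdint.h>
#include <string.h>

/* Indices into e_ident, as listed in the field descriptions above. */
enum { EI_CLASS = 4, EI_DATA = 5, EI_VERSION = 6, EI_OSABI = 7 };

/* The 16 bytes of the "Magic:" line from the sample readelf output. */
static const uint8_t sample_ident[16] = {
    0x7f, 0x45, 0x4c, 0x46, 0x02, 0x01, 0x01, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
};

/* Returns 1 if the first four bytes are 0x7F 'E' 'L' 'F', else 0.
   The literal is split so that 'E' is not parsed as part of the hex escape. */
int ident_has_elf_magic(const uint8_t ident[16]) {
    return memcmp(ident, "\x7f" "ELF", 4) == 0;
}

/* Returns 32 for ELFCLASS32 (1), 64 for ELFCLASS64 (2), 0 otherwise. */
int ident_class_bits(const uint8_t ident[16]) {
    switch (ident[EI_CLASS]) {
    case 1:  return 32;
    case 2:  return 64;
    default: return 0;
    }
}

/* Returns 'L' for ELFDATA2LSB (1), 'B' for ELFDATA2MSB (2), '?' otherwise. */
char ident_byte_order(const uint8_t ident[16]) {
    switch (ident[EI_DATA]) {
    case 1:  return 'L';
    case 2:  return 'B';
    default: return '?';
    }
}
```

For the sample bytes, these helpers report a valid magic number, a 64-bit class, and little-endian encoding, which agrees with the Class and Data lines in the readelf output.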
ELF Header for 32-bit (Elf32_Ehdr) struct:
#define EI_NIDENT 16
typedef struct {
unsigned char e_ident[EI_NIDENT]; // Magic number and other info
uint16_t e_type; // Object file type
uint16_t e_machine; // Architecture
uint32_t e_version; // Object file version
uint32_t e_entry; // Entry point virtual address
uint32_t e_phoff; // Program header table file offset
uint32_t e_shoff; // Section header table file offset
uint32_t e_flags; // Processor-specific flags
uint16_t e_ehsize; // ELF header size in bytes
uint16_t e_phentsize; // Program header table entry size
uint16_t e_phnum; // Program header table entry count
uint16_t e_shentsize; // Section header table entry size
uint16_t e_shnum; // Section header table entry count
uint16_t e_shstrndx; // Section header string table index
} Elf32_Ehdr;
In assembly (NASM syntax), with a single program header:
; ELF header structure
ELFMAG0 equ 0x7F
ELFMAG1 equ 'E'
ELFMAG2 equ 'L'
ELFMAG3 equ 'F'
ELFCLASS32 equ 1
ELFDATA2LSB equ 1
EV_CURRENT equ 1
ET_EXEC equ 2
EM_386 equ 3
Elf32_Ehdr:
db ELFMAG0 ; e_ident[EI_MAG0]
db ELFMAG1 ; e_ident[EI_MAG1]
db ELFMAG2 ; e_ident[EI_MAG2]
db ELFMAG3 ; e_ident[EI_MAG3]
db ELFCLASS32 ; e_ident[EI_CLASS]
db ELFDATA2LSB ; e_ident[EI_DATA]
db EV_CURRENT ; e_ident[EI_VERSION]
db 0 ; e_ident[EI_OSABI]
db 0 ; e_ident[EI_ABIVERSION]
times 7 db 0 ; e_ident[EI_PAD]
dw ET_EXEC ; e_type
dw EM_386 ; e_machine
dd EV_CURRENT ; e_version
dd _start ; e_entry (entry point)
dd ph_offset ; e_phoff (program header table offset)
dd 0 ; e_shoff (section header table offset)
dd 0 ; e_flags
dw ehdr_size ; e_ehsize (ELF header size)
dw phdr_size ; e_phentsize (program header entry size)
dw 1 ; e_phnum (number of program headers)
dw 0 ; e_shentsize (section header entry size)
dw 0 ; e_shnum (number of section headers)
dw 0 ; e_shstrndx (section header string table index)
ehdr_size equ $ - Elf32_Ehdr
; Note: _start, ph_offset, segment_start and segment_size are assumed to be
; defined elsewhere in the file (ph_offset is typically Elf32_Phdr - Elf32_Ehdr).
; Program header structure
PT_LOAD equ 1
PF_X equ 1
PF_W equ 2
PF_R equ 4
Elf32_Phdr:
dd PT_LOAD ; p_type (loadable segment)
dd 0 ; p_offset (offset in file)
dd segment_start ; p_vaddr (virtual address in memory)
dd segment_start ; p_paddr (physical address, irrelevant here)
dd segment_size ; p_filesz (size in file)
dd segment_size ; p_memsz (size in memory)
dd PF_R + PF_W + PF_X ; p_flags (segment permissions)
dd 0x1000 ; p_align (alignment)
phdr_size equ $ - Elf32_Phdr ; computed after the program header so it is 32, not 0
The same layouts as NASM struc definitions, using reserved space (resb/resw/resd):
; ELF Header (32-bit)
struc Elf32_Ehdr
.e_ident resb 16 ; Magic number and other info
.e_type resw 1 ; Object file type
.e_machine resw 1 ; Architecture
.e_version resd 1 ; Object file version
.e_entry resd 1 ; Entry point virtual address
.e_phoff resd 1 ; Program header table file offset
.e_shoff resd 1 ; Section header table file offset
.e_flags resd 1 ; Processor-specific flags
.e_ehsize resw 1 ; ELF header size in bytes
.e_phentsize resw 1 ; Program header table entry size
.e_phnum resw 1 ; Program header table entry count
.e_shentsize resw 1 ; Section header table entry size
.e_shnum resw 1 ; Section header table entry count
.e_shstrndx resw 1 ; Section header string table index
endstruc
; Program Header (32-bit)
struc Elf32_Phdr
.p_type resd 1 ; Segment type
.p_offset resd 1 ; Segment file offset
.p_vaddr resd 1 ; Segment virtual address
.p_paddr resd 1 ; Segment physical address
.p_filesz resd 1 ; Segment size in file
.p_memsz resd 1 ; Segment size in memory
.p_flags resd 1 ; Segment flags
.p_align resd 1 ; Segment alignment
endstruc
; Section Header (32-bit)
struc Elf32_Shdr
.sh_name resd 1 ; Section name (string table index)
.sh_type resd 1 ; Section type
.sh_flags resd 1 ; Section flags
.sh_addr resd 1 ; Section virtual address in memory
.sh_offset resd 1 ; Section file offset
.sh_size resd 1 ; Section size in bytes
.sh_link resd 1 ; Link to another section
.sh_info resd 1 ; Additional section information
.sh_addralign resd 1 ; Section alignment
.sh_entsize resd 1 ; Entry size if section holds a table
endstruc
Offset | Field | Size (bytes) | Description |
---|---|---|---|
0x00 | e_ident | 16 | Magic number and other info |
0x10 | e_type | 2 | Object file type |
0x12 | e_machine | 2 | Architecture |
0x14 | e_version | 4 | Object file version |
0x18 | e_entry | 4 | Entry point virtual address |
0x1C | e_phoff | 4 | Program header table file offset |
0x20 | e_shoff | 4 | Section header table file offset |
0x24 | e_flags | 4 | Processor-specific flags |
0x28 | e_ehsize | 2 | ELF header size in bytes |
0x2A | e_phentsize | 2 | Program header table entry size |
0x2C | e_phnum | 2 | Program header table entry count |
0x2E | e_shentsize | 2 | Section header table entry size |
0x30 | e_shnum | 2 | Section header table entry count |
0x32 | e_shstrndx | 2 | Section header string table index |
ELF Header for 64-bit (Elf64_Ehdr) struct:
#define EI_NIDENT 16
typedef struct {
unsigned char e_ident[EI_NIDENT]; // Magic number and other info
uint16_t e_type; // Object file type
uint16_t e_machine; // Architecture
uint32_t e_version; // Object file version
uint64_t e_entry; // Entry point virtual address
uint64_t e_phoff; // Program header table file offset
uint64_t e_shoff; // Section header table file offset
uint32_t e_flags; // Processor-specific flags
uint16_t e_ehsize; // ELF header size in bytes
uint16_t e_phentsize; // Program header table entry size
uint16_t e_phnum; // Program header table entry count
uint16_t e_shentsize; // Section header table entry size
uint16_t e_shnum; // Section header table entry count
uint16_t e_shstrndx; // Section header string table index
} Elf64_Ehdr;
- Magic Number: Identifies the file as an ELF file.
- Class: Specifies whether the file is 32-bit or 64-bit.
- Data Encoding: Indicates the endianness of the file.
- Version: The ELF specification version.
- OS/ABI: Specifies the target operating system and ABI.
- Type: Indicates the type of file (e.g., executable, shared object, core file).
- Machine: Specifies the target architecture (e.g., x86, ARM).
- Entry Point: The memory address where execution starts.
- Program Header Offset: Offset to the program header table.
- Section Header Offset: Offset to the section header table.
- Flags: Architecture-specific flags.
- Header Size: Size of the ELF header.
- Program Header Entry Size and Count: Size and number of entries in the program header table.
- Section Header Entry Size and Count: Size and number of entries in the section header table.
- String Table Index: Index of the section header string table.
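Because e_entry, e_phoff, and e_shoff are 8 bytes wide in the 64-bit header, every field after e_entry sits at a different offset than in the 32-bit layout. The sketch below restates the Elf64_Ehdr definition shown above and uses compile-time assertions to spell out the resulting offsets. It assumes a typical ABI in which the compiler inserts no unexpected padding; the build fails if any assumption does not hold on the target.

```c
#include <stddef.h>
#include <stdint.h>

#define EI_NIDENT 16

/* Same field order and types as the Elf64_Ehdr definition above. */
typedef struct {
    unsigned char e_ident[EI_NIDENT];
    uint16_t e_type;
    uint16_t e_machine;
    uint32_t e_version;
    uint64_t e_entry;
    uint64_t e_phoff;
    uint64_t e_shoff;
    uint32_t e_flags;
    uint16_t e_ehsize;
    uint16_t e_phentsize;
    uint16_t e_phnum;
    uint16_t e_shentsize;
    uint16_t e_shnum;
    uint16_t e_shstrndx;
} Elf64_Ehdr;

/* The first four fields match the 32-bit layout. */
_Static_assert(offsetof(Elf64_Ehdr, e_type)      == 0x10, "e_type");
_Static_assert(offsetof(Elf64_Ehdr, e_machine)   == 0x12, "e_machine");
_Static_assert(offsetof(Elf64_Ehdr, e_version)   == 0x14, "e_version");
_Static_assert(offsetof(Elf64_Ehdr, e_entry)     == 0x18, "e_entry");
/* From here on the 64-bit offsets diverge from the 32-bit ones. */
_Static_assert(offsetof(Elf64_Ehdr, e_phoff)     == 0x20, "e_phoff");
_Static_assert(offsetof(Elf64_Ehdr, e_shoff)     == 0x28, "e_shoff");
_Static_assert(offsetof(Elf64_Ehdr, e_flags)     == 0x30, "e_flags");
_Static_assert(offsetof(Elf64_Ehdr, e_ehsize)    == 0x34, "e_ehsize");
_Static_assert(offsetof(Elf64_Ehdr, e_phentsize) == 0x36, "e_phentsize");
_Static_assert(offsetof(Elf64_Ehdr, e_phnum)     == 0x38, "e_phnum");
_Static_assert(offsetof(Elf64_Ehdr, e_shentsize) == 0x3A, "e_shentsize");
_Static_assert(offsetof(Elf64_Ehdr, e_shnum)     == 0x3C, "e_shnum");
_Static_assert(offsetof(Elf64_Ehdr, e_shstrndx)  == 0x3E, "e_shstrndx");
_Static_assert(sizeof(Elf64_Ehdr) == 64, "64-bit ELF header is 64 bytes");
```

The final assertion matches the 64-byte header size stated earlier for 64-bit ELF files.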
Explanation of ELF Header Fields
- e_ident (Offset 0x00, Size 16 bytes): A 16-byte array (EI_NIDENT is its size in bytes) that identifies the ELF object. It always starts with the magic number 0x7F 'E' 'L' 'F' and also carries the file class, data encoding, and version.
  - e_ident[EI_MAG0] to e_ident[EI_MAG3]: Magic number (0x7F, 'E', 'L', 'F') identifying the file as an ELF file.
  - e_ident[EI_CLASS]: Identifies the file as 32-bit or 64-bit.
    - ELFCLASS32 (1): 32-bit objects
    - ELFCLASS64 (2): 64-bit objects
  - e_ident[EI_DATA]: Specifies the endianness.
    - ELFDATA2LSB (1): Little-endian
    - ELFDATA2MSB (2): Big-endian
  - e_ident[EI_VERSION]: Version of the ELF specification; currently it should be EV_CURRENT (1).
  - e_ident[EI_OSABI]: Identifies the target operating system and ABI.
    - ELFOSABI_SYSV (0): UNIX System V ABI
    - ELFOSABI_HPUX (1): HP-UX
    - ELFOSABI_NETBSD (2): NetBSD
    - ELFOSABI_LINUX (3): Linux
    - Others: Various OS-specific values
  - e_ident[EI_ABIVERSION]: ABI version.
  - e_ident[EI_PAD]: Padding bytes, unused and reserved for future use.
- e_type (Offset 0x10, Size 2 bytes): Specifies the type of the ELF file.
  - ET_NONE (0): Unknown or unspecified file type.
  - ET_REL (1): Relocatable file (a .o object file).
  - ET_EXEC (2): Executable file.
  - ET_DYN (3): Shared object, i.e., a shared library or a position-independent executable.
  - ET_CORE (4): Core dump file.
  - ET_LOOS to ET_HIOS (0xfe00 to 0xfeff): Operating system-specific.
  - ET_LOPROC to ET_HIPROC (0xff00 to 0xffff): Processor-specific.
- e_machine (Offset 0x12, Size 2 bytes): Specifies the target architecture.
  - EM_NONE (0): No machine.
  - EM_M32 (1): AT&T WE 32100.
  - EM_SPARC (2): SPARC.
  - EM_386 (3): Intel 80386.
  - EM_68K (4): Motorola 68000.
  - EM_88K (5): Motorola 88000.
  - EM_860 (7): Intel 80860.
  - EM_MIPS (8): MIPS RS3000.
  - EM_PARISC (15): HP/PA.
  - EM_SPARC32PLUS (18): SPARC with enhanced instruction set.
  - EM_PPC (20): PowerPC.
  - EM_PPC64 (21): PowerPC 64-bit.
  - EM_ARM (40): ARM.
  - EM_X86_64 (62): AMD x86-64.
  - EM_AARCH64 (183): ARM 64-bit.
  - Other values specify different architectures.
- e_version (Offset 0x14, Size 4 bytes): The version of the ELF specification (should be 1 for the current version).
  - EV_NONE (0): Invalid version.
  - EV_CURRENT (1): Current version.
- e_entry (Offset 0x18, Size 4 bytes): The virtual address to which the system first transfers control, i.e., the entry point of the executable. In 64-bit ELF this field, e_phoff, and e_shoff are 8 bytes each, so the offsets of all later fields listed here apply to the 32-bit layout only.
- e_phoff (Offset 0x1C, Size 4 bytes): The offset of the program header table in the file.
- e_shoff (Offset 0x20, Size 4 bytes): The offset of the section header table in the file.
- e_flags (Offset 0x24, Size 4 bytes): Processor-specific flags, i.e., flags specific to the target architecture.
- e_ehsize (Offset 0x28, Size 2 bytes): The size of the ELF header in bytes (64 bytes for 64-bit ELF and 52 bytes for 32-bit ELF).
- e_phentsize (Offset 0x2A, Size 2 bytes): The size of each entry in the program header table.
- e_phnum (Offset 0x2C, Size 2 bytes): The number of entries in the program header table.
- e_shentsize (Offset 0x2E, Size 2 bytes): The size of each entry in the section header table.
- e_shnum (Offset 0x30, Size 2 bytes): The number of entries in the section header table.
- e_shstrndx (Offset 0x32, Size 2 bytes): The index of the section header string table, which contains the names of the sections.
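Multi-byte header fields are stored in the byte order announced by e_ident[EI_DATA], so a reader should not assume the host's endianness. The following C sketch reads a 2-byte field such as e_type (offset 0x10) or e_machine (offset 0x12), which sit at the same offsets in both the 32-bit and 64-bit layouts. The function name elf_read_u16 and the demo_header bytes are hypothetical, chosen only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Reads a 2-byte field at `offset` in an ELF header buffer.
   e_ident[EI_DATA] is byte 5: 2 means big-endian (ELFDATA2MSB);
   this sketch treats every other value as little-endian. */
uint16_t elf_read_u16(const uint8_t *hdr, size_t offset) {
    const uint8_t *p = hdr + offset;
    if (hdr[5] == 2)
        return (uint16_t)((p[0] << 8) | p[1]);
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* First 20 bytes of a made-up little-endian 64-bit header:
   e_type = 2 (ET_EXEC) at 0x10, e_machine = 62 (EM_X86_64) at 0x12. */
static const uint8_t demo_header[20] = {
    0x7f, 'E', 'L', 'F', 2, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0x02, 0x00, 0x3e, 0x00
};
```

With demo_header, elf_read_u16(demo_header, 0x10) yields 2 and elf_read_u16(demo_header, 0x12) yields 62. A production parser would also validate the buffer length and reject unknown EI_DATA values.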
-> e_type defines:
#define ET_NONE 0 /* No file type */
#define ET_REL 1 /* Relocatable file */
#define ET_EXEC 2 /* Executable file */
#define ET_DYN 3 /* Shared object file */
#define ET_CORE 4 /* Core file */
#define ET_NUM 5 /* Number of defined types */
#define ET_LOOS 0xfe00 /* OS-specific range start */
#define ET_HIOS 0xfeff /* OS-specific range end */
#define ET_LOPROC 0xff00 /* Processor-specific range start */
#define ET_HIPROC 0xffff /* Processor-specific range end */
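As a small sketch of how these constants might be used, the function below maps an e_type value to a printable name, including the two reserved ranges. The name elf_type_name and the returned strings are illustrative choices, not part of any standard API.

```c
#include <stdint.h>

/* Maps an e_type value to a short description.
   Numeric values follow the e_type defines listed above. */
const char *elf_type_name(uint16_t e_type) {
    switch (e_type) {
    case 0: return "ET_NONE";
    case 1: return "ET_REL";
    case 2: return "ET_EXEC";
    case 3: return "ET_DYN";
    case 4: return "ET_CORE";
    default: break;
    }
    if (e_type >= 0xfe00 && e_type <= 0xfeff)
        return "OS-specific";          /* ET_LOOS..ET_HIOS */
    if (e_type >= 0xff00)
        return "processor-specific";   /* ET_LOPROC..ET_HIPROC */
    return "unknown";
}
```

For example, elf_type_name(2) returns "ET_EXEC", and any value from 0xfe00 through 0xfeff is reported as OS-specific.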
1 Executable Files (ET_EXEC):
- These files contain a program that is ready to be executed. When you run a command in a Unix-like operating system, the shell asks the kernel to load the corresponding executable ELF file.
- Generation Process:
  - Compilation: Source code files (e.g., .c, .cpp) are compiled into object files (.o) using a compiler (e.g., gcc, clang).
  - Linking: The object files are linked together by a linker (e.g., ld) to produce an executable file. During this process, the linker resolves symbol references and assigns runtime addresses.
- Tools Involved:
  - Compiler: gcc, clang, etc.
  - Linker: ld, or the linker stage of gcc, clang, etc.
- Example Command: gcc -o myprogram myprogram.c (on toolchains that build position-independent executables by default, the result is ET_DYN unless -no-pie is passed).
2 Relocatable Files (ET_REL)
- These are object files created by a compiler or assembler that can be linked with other object files to produce either a shared object file or an executable file. They contain code and data in a form suitable for linking but not for execution.
- Generation Process:
  - Compilation: Source code files are compiled into relocatable object files. These files contain code and data that are not yet assigned final memory addresses, making them suitable for linking but not for execution.
- Tools Involved:
  - Compiler: gcc, clang, etc.
- Example Command: gcc -c mymodule.c
  - This produces mymodule.o, a relocatable file.
3 Shared Object Files (ET_DYN)
- Also known as shared libraries, these files can be dynamically linked with an executable at run time or with other shared objects to form a single executable image in memory. They enable code reuse and modular programming.
- Generation Process:
  - Compilation and Linking: Source code files are compiled into object files, and then these object files are linked together into a shared library. Shared libraries contain code that can be shared by multiple programs at runtime.
- Tools Used:
  - Compiler: gcc, clang, etc.
  - Linker: ld, gold, etc.
- Example: gcc -shared -o mylib.so lib.o
-> e_machine defines:
#define EM_NONE 0 /* No machine */
#define EM_M32 1 /* AT&T WE 32100 */
#define EM_SPARC 2 /* SUN SPARC */
#define EM_386 3 /* Intel 80386 */
#define EM_68K 4 /* Motorola m68k family */
#define EM_88K 5 /* Motorola m88k family */
#define EM_IAMCU 6 /* Intel MCU */
#define EM_860 7 /* Intel 80860 */
#define EM_MIPS 8 /* MIPS R3000 big-endian */
#define EM_S370 9 /* IBM System/370 */
#define EM_MIPS_RS3_LE 10 /* MIPS R3000 little-endian */
/* reserved 11-14 */
#define EM_PARISC 15 /* HPPA */
/* reserved 16 */
#define EM_VPP500 17 /* Fujitsu VPP500 */
#define EM_SPARC32PLUS 18 /* Sun's "v8plus" */
#define EM_960 19 /* Intel 80960 */
#define EM_PPC 20 /* PowerPC */
#define EM_PPC64 21 /* PowerPC 64-bit */
#define EM_S390 22 /* IBM S390 */
#define EM_SPU 23 /* IBM SPU/SPC */
/* reserved 24-35 */
#define EM_V800 36 /* NEC V800 series */
#define EM_FR20 37 /* Fujitsu FR20 */
#define EM_RH32 38 /* TRW RH-32 */
#define EM_RCE 39 /* Motorola RCE */
#define EM_ARM 40 /* ARM */
#define EM_FAKE_ALPHA 41 /* Digital Alpha */
#define EM_SH 42 /* Hitachi SH */
#define EM_SPARCV9 43 /* SPARC v9 64-bit */
#define EM_TRICORE 44 /* Siemens Tricore */
#define EM_ARC 45 /* Argonaut RISC Core */
#define EM_H8_300 46 /* Hitachi H8/300 */
#define EM_H8_300H 47 /* Hitachi H8/300H */
#define EM_H8S 48 /* Hitachi H8S */
#define EM_H8_500 49 /* Hitachi H8/500 */
#define EM_IA_64 50 /* Intel Merced */
#define EM_MIPS_X 51 /* Stanford MIPS-X */
#define EM_COLDFIRE 52 /* Motorola Coldfire */
#define EM_68HC12 53 /* Motorola M68HC12 */
#define EM_MMA 54 /* Fujitsu MMA Multimedia Accelerator */
#define EM_PCP 55 /* Siemens PCP */
#define EM_NCPU 56 /* Sony nCPU embedded RISC */
#define EM_NDR1 57 /* Denso NDR1 microprocessor */
#define EM_STARCORE 58 /* Motorola Start*Core processor */
#define EM_ME16 59 /* Toyota ME16 processor */
#define EM_ST100 60 /* STMicroelectronic ST100 processor */
#define EM_TINYJ 61 /* Advanced Logic Corp. Tinyj emb.fam */
#define EM_X86_64 62 /* AMD x86-64 architecture */
#define EM_PDSP 63 /* Sony DSP Processor */
#define EM_PDP10 64 /* Digital PDP-10 */
#define EM_PDP11 65 /* Digital PDP-11 */
#define EM_FX66 66 /* Siemens FX66 microcontroller */
#define EM_ST9PLUS 67 /* STMicroelectronics ST9+ 8/16 mc */
#define EM_ST7 68 /* STmicroelectronics ST7 8 bit mc */
#define EM_68HC16 69 /* Motorola MC68HC16 microcontroller */
#define EM_68HC11 70 /* Motorola MC68HC11 microcontroller */
#define EM_68HC08 71 /* Motorola MC68HC08 microcontroller */
#define EM_68HC05 72 /* Motorola MC68HC05 microcontroller */
#define EM_SVX 73 /* Silicon Graphics SVx */
#define EM_ST19 74 /* STMicroelectronics ST19 8 bit mc */
#define EM_VAX 75 /* Digital VAX */
#define EM_CRIS 76 /* Axis Communications 32-bit emb.proc */
#define EM_JAVELIN 77 /* Infineon Technologies 32-bit emb.proc */
#define EM_FIREPATH 78 /* Element 14 64-bit DSP Processor */
#define EM_ZSP 79 /* LSI Logic 16-bit DSP Processor */
#define EM_MMIX 80 /* Donald Knuth's educational 64-bit proc */
#define EM_HUANY 81 /* Harvard University machine-independent object files */
#define EM_PRISM 82 /* SiTera Prism */
#define EM_AVR 83 /* Atmel AVR 8-bit microcontroller */
#define EM_FR30 84 /* Fujitsu FR30 */
#define EM_D10V 85 /* Mitsubishi D10V */
#define EM_D30V 86 /* Mitsubishi D30V */
#define EM_V850 87 /* NEC v850 */
#define EM_M32R 88 /* Mitsubishi M32R */
#define EM_MN10300 89 /* Matsushita MN10300 */
#define EM_MN10200 90 /* Matsushita MN10200 */
#define EM_PJ 91 /* picoJava */
#define EM_OPENRISC 92 /* OpenRISC 32-bit embedded processor */
#define EM_ARC_COMPACT 93 /* ARC International ARCompact */
#define EM_XTENSA 94 /* Tensilica Xtensa Architecture */
#define EM_VIDEOCORE 95 /* Alphamosaic VideoCore */
#define EM_TMM_GPP 96 /* Thompson Multimedia General Purpose Proc */
#define EM_NS32K 97 /* National Semi. 32000 */
#define EM_TPC 98 /* Tenor Network TPC */
#define EM_SNP1K 99 /* Trebia SNP 1000 */
#define EM_ST200 100 /* STMicroelectronics ST200 */
#define EM_IP2K 101 /* Ubicom IP2xxx */
#define EM_MAX 102 /* MAX processor */
#define EM_CR 103 /* National Semi. CompactRISC */
#define EM_F2MC16 104 /* Fujitsu F2MC16 */
#define EM_MSP430 105 /* Texas Instruments msp430 */
#define EM_BLACKFIN 106 /* Analog Devices Blackfin DSP */
#define EM_SE_C33 107 /* Seiko Epson S1C33 family */
#define EM_SEP 108 /* Sharp embedded microprocessor */
#define EM_ARCA 109 /* Arca RISC */
#define EM_UNICORE 110 /* PKU-Unity & MPRC Peking Uni. mc series */
#define EM_EXCESS 111 /* eXcess configurable cpu */
#define EM_DXP 112 /* Icera Semi. Deep Execution Processor */
#define EM_ALTERA_NIOS2 113 /* Altera Nios II */
#define EM_CRX 114 /* National Semi. CompactRISC CRX */
#define EM_XGATE 115 /* Motorola XGATE */
#define EM_C166 116 /* Infineon C16x/XC16x */
#define EM_M16C 117 /* Renesas M16C */
#define EM_DSPIC30F 118 /* Microchip Technology dsPIC30F */
#define EM_CE 119 /* Freescale Communication Engine RISC */
#define EM_M32C 120 /* Renesas M32C */
/* reserved 121-130 */
#define EM_TSK3000 131 /* Altium TSK3000 */
#define EM_RS08 132 /* Freescale RS08 */
#define EM_SHARC 133 /* Analog Devices SHARC family */
#define EM_ECOG2 134 /* Cyan Technology eCOG2 */
#define EM_SCORE7 135 /* Sunplus S+core7 RISC */
#define EM_DSP24 136 /* New Japan Radio (NJR) 24-bit DSP */
#define EM_VIDEOCORE3 137 /* Broadcom VideoCore III */
#define EM_LATTICEMICO32 138 /* RISC for Lattice FPGA */
#define EM_SE_C17 139 /* Seiko Epson C17 */
#define EM_TI_C6000 140 /* Texas Instruments TMS320C6000 DSP */
#define EM_TI_C2000 141 /* Texas Instruments TMS320C2000 DSP */
#define EM_TI_C5500 142 /* Texas Instruments TMS320C55x DSP */
#define EM_TI_ARP32 143 /* Texas Instruments App. Specific RISC */
#define EM_TI_PRU 144 /* Texas Instruments Prog. Realtime Unit */
/* reserved 145-159 */
#define EM_MMDSP_PLUS 160 /* STMicroelectronics 64bit VLIW DSP */
#define EM_CYPRESS_M8C 161 /* Cypress M8C */
#define EM_R32C 162 /* Renesas R32C */
#define EM_TRIMEDIA 163 /* NXP Semi. TriMedia */
#define EM_QDSP6 164 /* QUALCOMM DSP6 */
#define EM_8051 165 /* Intel 8051 and variants */
#define EM_STXP7X 166 /* STMicroelectronics STxP7x */
#define EM_NDS32 167 /* Andes Tech. compact code emb. RISC */
#define EM_ECOG1X 168 /* Cyan Technology eCOG1X */
#define EM_MAXQ30 169 /* Dallas Semi. MAXQ30 mc */
#define EM_XIMO16 170 /* New Japan Radio (NJR) 16-bit DSP */
#define EM_MANIK 171 /* M2000 Reconfigurable RISC */
#define EM_CRAYNV2 172 /* Cray NV2 vector architecture */
#define EM_RX 173 /* Renesas RX */
#define EM_METAG 174 /* Imagination Tech. META */
#define EM_MCST_ELBRUS 175 /* MCST Elbrus */
#define EM_ECOG16 176 /* Cyan Technology eCOG16 */
#define EM_CR16 177 /* National Semi. CompactRISC CR16 */
#define EM_ETPU 178 /* Freescale Extended Time Processing Unit */
#define EM_SLE9X 179 /* Infineon Tech. SLE9X */
#define EM_L10M 180 /* Intel L10M */
#define EM_K10M 181 /* Intel K10M */
/* reserved 182 */
#define EM_AARCH64 183 /* ARM AARCH64 */
/* reserved 184 */
#define EM_AVR32 185 /* Atmel 32-bit microprocessor */
#define EM_STM8 186 /* STMicroelectronics STM8 */
#define EM_TILE64 187 /* Tilera TILE64 */
#define EM_TILEPRO 188 /* Tilera TILEPro */
#define EM_MICROBLAZE 189 /* Xilinx MicroBlaze */
#define EM_CUDA 190 /* NVIDIA CUDA */
#define EM_TILEGX 191 /* Tilera TILE-Gx */
#define EM_CLOUDSHIELD 192 /* CloudShield */
#define EM_COREA_1ST 193 /* KIPO-KAIST Core-A 1st gen. */
#define EM_COREA_2ND 194 /* KIPO-KAIST Core-A 2nd gen. */
#define EM_ARC_COMPACT2 195 /* Synopsys ARCompact V2 */
#define EM_OPEN8 196 /* Open8 RISC */
#define EM_RL78 197 /* Renesas RL78 */
#define EM_VIDEOCORE5 198 /* Broadcom VideoCore V */
#define EM_78KOR 199 /* Renesas 78KOR */
#define EM_56800EX 200 /* Freescale 56800EX DSC */
#define EM_BA1 201 /* Beyond BA1 */
#define EM_BA2 202 /* Beyond BA2 */
#define EM_XCORE 203 /* XMOS xCORE */
#define EM_MCHP_PIC 204 /* Microchip 8-bit PIC(r) */
/* reserved 205-209 */
#define EM_KM32 210 /* KM211 KM32 */
#define EM_KMX32 211 /* KM211 KMX32 */
#define EM_EMX16 212 /* KM211 KMX16 */
#define EM_EMX8 213 /* KM211 KMX8 */
#define EM_KVARC 214 /* KM211 KVARC */
#define EM_CDP 215 /* Paneve CDP */
#define EM_COGE 216 /* Cognitive Smart Memory Processor */
#define EM_COOL 217 /* Bluechip CoolEngine */
#define EM_NORC 218 /* Nanoradio Optimized RISC */
#define EM_CSR_KALIMBA 219 /* CSR Kalimba */
#define EM_Z80 220 /* Zilog Z80 */
#define EM_VISIUM 221 /* Controls and Data Services VISIUMcore */
#define EM_FT32 222 /* FTDI Chip FT32 */
#define EM_MOXIE 223 /* Moxie processor */
#define EM_AMDGPU 224 /* AMD GPU */
/* reserved 225-242 */
#define EM_RISCV 243 /* RISC-V */
#define EM_BPF 247 /* Linux BPF -- in-kernel virtual machine */
#define EM_CSKY 252 /* C-SKY */
#define EM_NUM 253
/* Old spellings/synonyms. */
#define EM_ARC_A5 EM_ARC_COMPACT
/* If it is necessary to assign new unofficial EM_* values, please
pick large random numbers (0x8523, 0xa7f2, etc.) to minimize the
chances of collision with official or non-GNU unofficial values. */
#define EM_ALPHA 0x9026
-> e_version defines:
#define EV_NONE 0 /* Invalid ELF version */
#define EV_CURRENT 1 /* Current version */
#define EV_NUM 2
2 Program Header Table (Segment Header Table)
The Program Header Table provides the system with the information necessary to load and execute a program. Each entry describes a segment, or other information the system needs to prepare the program for execution; loadable segments must be mapped into memory before the program can run. Each entry records the segment's type, file offset, virtual address, physical address, file size, memory size, flags, and alignment.
- Location:
- The Program Header Table's location within an ELF file is specified by the ELF header, which resides at the very beginning of the file. The ELF header contains an offset and a count of entries for the Program Header Table.
- ELF Header Fields Relevant to Program Headers:
- e_phoff: This field in the ELF header specifies the offset of the Program Header Table in the file.
- e_phentsize: This field specifies the size of each entry in the Program Header Table.
- e_phnum: This field specifies the number of entries in the Program Header Table.
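Taken together, these three fields locate every program header in the file: entry i starts at e_phoff + i * e_phentsize. The C sketch below shows that arithmetic with a bounds check against e_phnum. The function name phdr_file_offset and the convention of returning 0 for an out-of-range index are assumptions made for this example.

```c
#include <stdint.h>

/* Returns the file offset of program header `i`, or 0 if `i` is out of
   range. Offset 0 always holds the ELF header itself, so 0 cannot be a
   valid program header offset and is safe to use as an error value. */
uint64_t phdr_file_offset(uint64_t e_phoff, uint16_t e_phentsize,
                          uint16_t e_phnum, uint16_t i) {
    if (i >= e_phnum)
        return 0;
    return e_phoff + (uint64_t)i * e_phentsize;
}
```

Using the values from the sample readelf -h output earlier (program headers start at byte 64, 56 bytes each, 8 entries), entry 0 is at offset 64 and entry 7 is at offset 456.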
Structure of the Program Header Table
The Program Header Table is an array of program headers, each defined by the Elf32_Phdr structure for 32-bit ELF files and the Elf64_Phdr structure for 64-bit ELF files. Each 32-bit entry contains:
Offset | Field | Size (bytes) | Description |
---|---|---|---|
0x00 | p_type | 4 | Segment type |
0x04 | p_offset | 4 | Segment file offset |
0x08 | p_vaddr | 4 | Segment virtual address |
0x0C | p_paddr | 4 | Segment physical address (unused in many systems) |
0x10 | p_filesz | 4 | Size of segment in the file |
0x14 | p_memsz | 4 | Size of segment in memory |
0x18 | p_flags | 4 | Segment flags |
0x1C | p_align | 4 | Segment alignment |
1 ELF Program Header for 32-bit (Elf32_Phdr):
typedef struct {
uint32_t p_type; // Segment type
uint32_t p_offset; // Segment file offset
uint32_t p_vaddr; // Segment virtual address
uint32_t p_paddr; // Segment physical address
uint32_t p_filesz; // Segment size in file
uint32_t p_memsz; // Segment size in memory
uint32_t p_flags; // Segment flags
uint32_t p_align; // Segment alignment
} Elf32_Phdr;
2 ELF Program Header for 64-bit (Elf64_Phdr):
typedef struct {
uint32_t p_type; // Segment type
uint32_t p_flags; // Segment flags
uint64_t p_offset; // Segment file offset
uint64_t p_vaddr; // Segment virtual address
uint64_t p_paddr; // Segment physical address
uint64_t p_filesz; // Segment size in file
uint64_t p_memsz; // Segment size in memory
uint64_t p_align; // Segment alignment
} Elf64_Phdr;
Explanation of Program Header Fields:
The offsets and sizes below are for Elf32_Phdr. In Elf64_Phdr, p_flags directly follows p_type at offset 0x04, and the offset, address, size, and alignment fields are 8 bytes wide.
- p_type (Offset 0x00, Size 4 bytes): Identifies the type of the segment, i.e., its purpose and semantics (e.g., loadable segment, dynamic linking information, program interpreter information). Common values include:
  - PT_NULL (0): Unused entry.
  - PT_LOAD (1): Loadable segment. This is the most common type.
  - PT_DYNAMIC (2): Dynamic linking information.
  - PT_INTERP (3): Interpreter information.
  - PT_NOTE (4): Auxiliary information.
  - PT_SHLIB (5): Reserved.
  - PT_PHDR (6): Program header table itself.
- p_offset (Offset 0x04, Size 4 bytes): Offset of the segment in the file.
- p_vaddr (Offset 0x08, Size 4 bytes): The virtual address at which the segment should be loaded into memory.
- p_paddr (Offset 0x0C, Size 4 bytes): Physical address of the segment, relevant only on systems where physical addressing is used; it is usually the same as p_vaddr.
- p_filesz (Offset 0x10, Size 4 bytes): Size of the segment in the file.
- p_memsz (Offset 0x14, Size 4 bytes): Size of the segment in memory. It can be larger than p_filesz if the segment includes zero-initialized data.
- p_flags (Offset 0x18, Size 4 bytes): Segment permission flags, which may be combined:
  - PF_X (1): Execute permission.
  - PF_W (2): Write permission.
  - PF_R (4): Read permission.
- p_align (Offset 0x1C, Size 4 bytes): Alignment of the segment in memory and in the file. Must be a power of two; 0 and 1 mean no alignment is required.
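Since p_flags is a bit mask, the three permission bits can be decoded independently. The sketch below turns a p_flags value into a three-character string in the style of the Flags column printed by readelf (for example "R E" for a read and execute segment). The function name phdr_flags_string and the lookup-table approach are illustrative assumptions.

```c
#include <stdint.h>

/* Returns a 3-character permission string for the low three bits of
   p_flags. The table index is the bit mask itself:
   PF_X = 1, PF_W = 2, PF_R = 4. A missing permission is a space. */
const char *phdr_flags_string(uint32_t p_flags) {
    static const char *const names[8] = {
        "   ",  /* 0: no permissions */
        "  E",  /* 1: PF_X */
        " W ",  /* 2: PF_W */
        " WE",  /* 3: PF_W | PF_X */
        "R  ",  /* 4: PF_R */
        "R E",  /* 5: PF_R | PF_X */
        "RW ",  /* 6: PF_R | PF_W */
        "RWE"   /* 7: PF_R | PF_W | PF_X */
    };
    return names[p_flags & 7u];
}
```

A value of 0x5 (PF_R | PF_X) therefore decodes to "R E", and 0x6 (PF_R | PF_W) decodes to "RW ". Higher bits, which are reserved for OS- and processor-specific use, are ignored by this sketch.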
Example of ELF Executable File With Segments:
To better understand the content of the Program Header Table, let’s break down a few possible entries:
Example: Segment 1: Loadable Segment (Code Segment)
p_type: 1 (PT_LOAD)
p_offset: 0x1000
p_vaddr: 0x80481000
p_paddr: 0x80481000
p_filesz: 0x1000
p_memsz: 0x2000
p_flags: 0x5 (PF_R | PF_X)
p_align: 0x1000
- p_type (1): This segment is loadable.
- p_offset (0x1000): This segment starts at offset 0x1000 in the file.
- p_vaddr (0x80481000): The segment should be loaded at virtual address 0x80481000.
- p_paddr (0x80481000): The physical address is 0x80481000; systems that load the file through virtual memory ignore this field.
- p_filesz (0x1000): The segment occupies 0x1000 bytes in the file.
- p_memsz (0x2000): The segment occupies 0x2000 bytes in memory; the 0x1000 bytes beyond p_filesz are zero-filled.
- p_flags (0x5): The segment is readable and executable.
- p_align (0x1000): The segment must be aligned on a 0x1000-byte boundary.
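Two small calculations follow from these fields: how many bytes a loader has to zero-fill, and whether the file offset and virtual address agree with the alignment. For loadable segments the ELF specification expects p_vaddr and p_offset to be congruent modulo p_align. The sketch below shows both checks in C; the function names zero_fill_bytes and alignment_ok are hypothetical and exist only for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Number of bytes to zero-fill after the file-backed part of a segment. */
uint32_t zero_fill_bytes(uint32_t p_filesz, uint32_t p_memsz)
{
    return (p_memsz > p_filesz) ? (p_memsz - p_filesz) : 0;
}

/* Checks that p_align is 0, 1, or a power of two, and that p_vaddr and
   p_offset are congruent modulo p_align. */
bool alignment_ok(uint32_t p_offset, uint32_t p_vaddr, uint32_t p_align)
{
    if (p_align <= 1)
        return true;                        /* no alignment constraint */
    if ((p_align & (p_align - 1)) != 0)
        return false;                       /* not a power of two */
    return (p_vaddr % p_align) == (p_offset % p_align);
}
```

With the values of Segment 1, zero_fill_bytes(0x1000, 0x2000) gives 0x1000, and alignment_ok(0x1000, 0x80481000, 0x1000) holds because both the offset and the address are multiples of 0x1000.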
Example: Segment 2: Loadable Segment (Data Segment)
Type: PT_LOAD
Flags: PF_W (Writable), PF_R (Readable)
Offset in File: 0x0000000000000C00
Virtual Address (VADDR): 0x0000000000600000
Physical Address (PADDR): 0x0000000000600000
File Size: 0x0000000000000400 (1024 bytes)
Memory Size: 0x0000000000000400 (1024 bytes)
Alignment: 0x0000000000000000
Viewing Program Header Table with readelf:
readelf -l file
This command displays the file type, the entry point, and the program headers. Here is an example of what the program header table might look like:
Elf file type is EXEC (Executable file)
Entry point 0x401020
There are 9 program headers, starting at offset 64
Program Headers:
Type           Offset             VirtAddr           PhysAddr
               FileSiz            MemSiz              Flags  Align
LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
               0x0000000000000644 0x0000000000000644  R E    200000
LOAD           0x0000000000001000 0x0000000000601000 0x0000000000601000
               0x0000000000000224 0x0000000000000224  RW     200000
DYNAMIC        0x0000000000001028 0x0000000000601028 0x0000000000601028
               0x00000000000001d0 0x00000000000001d0  RW     8
NOTE           0x0000000000000064 0x0000000000400064 0x0000000000400064
               0x0000000000000020 0x0000000000000020  R      4
GNU_EH_FRAME   0x0000000000000514 0x0000000000400514 0x0000000000400514
               0x000000000000004c 0x000000000000004c  R      4
GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
               0x0000000000000000 0x0000000000000000  RW     8
GNU_RELRO      0x0000000000001000 0x0000000000601000 0x0000000000601000
               0x00000000000000f0 0x000000000
3 Section Header Table
The section header table describes the sections that make up the ELF file, which hold the various types of data used during linking.
Sections matter to the linker and to tools such as debuggers; the loader does not need them at runtime.
The Section Header Table is an array of section header entries, each corresponding to a section in the ELF file. The table itself is located at a file offset specified in the ELF header. The number of entries and the size of each entry are also specified in the ELF header.
Each entry in the table describes a single section:
- sh_name: Section name (offset in the section header string table).
- sh_type: Section type.
- sh_flags: Section flags.
- sh_addr: Virtual address of the section in memory (if applicable).
- sh_offset: Offset of the section in the file image.
- sh_size: Size of the section.
- sh_link: Section index link.
- sh_info: Extra information about the section.
- sh_addralign: Section alignment.
- sh_entsize: Size of entries, if the section has a table.
Structure of the Section Header Table:
The section header table consists of an array of section headers, each defined by the Elf32_Shdr structure for 32-bit ELF files and the Elf64_Shdr structure for 64-bit ELF files.
1 ELF Section Header for 32-bit (Elf32_Shdr):
Each entry is 40 bytes in size.
typedef struct {
    uint32_t sh_name;      // Section name (string table index)
    uint32_t sh_type;      // Section type
    uint32_t sh_flags;     // Section flags
    uint32_t sh_addr;      // Section virtual address at execution
    uint32_t sh_offset;    // Section file offset
    uint32_t sh_size;      // Section size in bytes
    uint32_t sh_link;      // Link to another section
    uint32_t sh_info;      // Additional section information
    uint32_t sh_addralign; // Section alignment
    uint32_t sh_entsize;   // Entry size if section holds table
} Elf32_Shdr;
2 ELF Section Header for 64-bit (Elf64_Shdr):
typedef struct {
    uint32_t sh_name;      // Section name (string table index)
    uint32_t sh_type;      // Section type
    uint64_t sh_flags;     // Section flags
    uint64_t sh_addr;      // Section virtual address at execution
    uint64_t sh_offset;    // Section file offset
    uint64_t sh_size;      // Section size in bytes
    uint32_t sh_link;      // Link to another section
    uint32_t sh_info;      // Additional section information
    uint64_t sh_addralign; // Section alignment
    uint64_t sh_entsize;   // Entry size if section holds table
} Elf64_Shdr;
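The entry sizes quoted for the two layouts (40 bytes for 32-bit, 64 bytes for 64-bit) can be checked by the compiler. The following sketch repeats both layouts under hypothetical names (Shdr32Sketch and Shdr64Sketch, so that it does not clash with <elf.h>) and uses C11 _Static_assert to confirm the sizes and two field offsets. It assumes a common ABI in which these structures need no padding.

```c
#include <stddef.h>
#include <stdint.h>

/* Local copies of the two layouts; the type names are illustrative only. */
typedef struct {
    uint32_t sh_name, sh_type, sh_flags, sh_addr, sh_offset;
    uint32_t sh_size, sh_link, sh_info, sh_addralign, sh_entsize;
} Shdr32Sketch;

typedef struct {
    uint32_t sh_name, sh_type;
    uint64_t sh_flags, sh_addr, sh_offset, sh_size;
    uint32_t sh_link, sh_info;
    uint64_t sh_addralign, sh_entsize;
} Shdr64Sketch;

/* Compilation fails if the layouts do not match the documented sizes. */
_Static_assert(sizeof(Shdr32Sketch) == 40, "32-bit section header is 40 bytes");
_Static_assert(sizeof(Shdr64Sketch) == 64, "64-bit section header is 64 bytes");
_Static_assert(offsetof(Shdr64Sketch, sh_flags) == 8, "sh_flags follows two 4-byte fields");
_Static_assert(offsetof(Shdr64Sketch, sh_link) == 40, "sh_link follows four 8-byte fields");

/* Size of one section header entry for the given ELF class. */
size_t section_header_size(int is_64bit)
{
    return is_64bit ? sizeof(Shdr64Sketch) : sizeof(Shdr32Sketch);
}
```

A parser would normally still read the entry size from the ELF header instead of hardcoding it, since that is where the file records it.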
-: Explanation of Section Header Fields :-
32-bit Section Header Table Entry
Each entry is 40 bytes in size and includes the following fields:
- sh_name (4 bytes):
- Index into the section header string table, which provides the name of the section.
- sh_type (4 bytes):
- Identifies the type of the section (e.g., SHT_PROGBITS, SHT_SYMTAB, SHT_STRTAB).
- SHT_NULL: Section table entry unused.
- SHT_PROGBITS: Program data (such as machine instructions or constants).
- SHT_SYMTAB: Symbol table (static symbol table).
- SHT_STRTAB: String table.
- SHT_RELA: Relocation entries with addends.
- SHT_HASH: Symbol hash table.
- SHT_DYNAMIC: Dynamic linking information.
- SHT_NOTE: Notes.
- SHT_NOBITS: Uninitialized data.
- SHT_REL: Relocation entries without addends.
- SHT_SHLIB: Reserved.
- SHT_DYNSYM: Dynamic linker symbol table (the symbol table used by the dynamic linker).
- sh_flags (4 bytes):
- Section attributes (e.g., SHF_WRITE, SHF_ALLOC, SHF_EXECINSTR).
- SHF_WRITE: Writable at runtime.
- SHF_ALLOC: The section will be loaded to virtual memory at runtime.
- SHF_EXECINSTR: Contains executable instructions.
- sh_addr (4 bytes):
- Virtual address of the section in memory when the ELF file is loaded.
- sh_offset (4 bytes):
- Offset of the section in the file image.
- sh_size (4 bytes):
- Size of the section in bytes.
- sh_link (4 bytes):
- Section index link. Interpretation depends on the section type.
- sh_info (4 bytes):
- Additional section information. Interpretation depends on the section type.
- sh_addralign (4 bytes):
- Required alignment of the section. Must be a power of two.
- sh_entsize (4 bytes):
- Size of each entry if the section contains a table of fixed-size entries. Otherwise, this field is zero.
64-bit Section Header Table Entry
Each entry is 64 bytes in size and includes the following fields:
- sh_name (4 bytes): An index into the section header string table, which gives the name of the section.
- sh_type (4 bytes): The type of the section, such as SHT_PROGBITS, SHT_SYMTAB, etc.
- sh_flags (8 bytes): Section attributes, such as SHF_WRITE, SHF_ALLOC, SHF_EXECINSTR, etc.
- sh_addr (8 bytes): The virtual address of the section in memory (for sections that are to be loaded).
- sh_offset (8 bytes): The offset of the section in the file image.
- sh_size (8 bytes): The size of the section in bytes.
- sh_link (4 bytes): Link to another section, depending on the type.
- sh_info (4 bytes): Additional section information, depending on the type.
- sh_addralign (8 bytes): The required alignment of the section.
- sh_entsize (8 bytes): The size of each entry, if the section holds a table of fixed-size entries.
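Because sh_name is only an offset, turning it into a readable name requires the contents of the section header string table. The sketch below shows that lookup in C, with bounds checks so that a corrupt offset cannot read past the table. The function name section_name and its parameters are hypothetical; it assumes the string table has already been read into memory.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns the NUL-terminated name stored at offset sh_name inside the
   section header string table, or NULL if the offset is out of range or
   no terminator exists before the end of the table. */
const char *section_name(const char *shstrtab, size_t shstrtab_size,
                         uint32_t sh_name)
{
    if (sh_name >= shstrtab_size)
        return NULL;
    /* Make sure the string ends inside the table. */
    if (memchr(shstrtab + sh_name, '\0', shstrtab_size - sh_name) == NULL)
        return NULL;
    return shstrtab + sh_name;
}
```

For a table holding the bytes "\0.text\0.data\0", offset 1 yields ".text" and offset 7 yields ".data". Offset 0 yields the empty string, which is the name of the initial NULL section.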
Inspecting the Section Header Table:
To view the section header table of an ELF file, you can use the readelf utility:
readelf -S file
Sample Output (one line per section, the layout readelf uses for a 64-bit file when the -W option is also given):
Section Headers:
[Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
[ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
[ 1] .interp           PROGBITS        0000000000000238 000238 00001c 00   A  0   0  1
[ 2] .note.ABI-tag     NOTE            0000000000000254 000254 000020 00   A  0   0  4
[ 3] .gnu.hash         GNU_HASH        0000000000000274 000274 000024 00   A  4   0  4
[ 4] .dynsym           DYNSYM          0000000000000298 000298 0000b0 18   A  5   1  8
[ 5] .dynstr           STRTAB          0000000000000348 000348 00007f 00   A  0   0  1
[ 6] .gnu.version      VERSYM          00000000000003c8 0003c8 00000e 02   A  4   0  2
[ 7] .gnu.version_r    VERNEED         00000000000003d8 0003d8 000020 00   A  5   1  8
[ 8] .rela.dyn         RELA            00000000000003f8 0003f8 000030 18   A  4   0  8
[ 9] .rela.plt         RELA            0000000000000428 000428 0000c0 18  AI  4  23  8
[10] .init             PROGBITS        00000000000004e8 0004e8 00001a 00  AX  0   0  4
[11] .plt              PROGBITS        0000000000000508 000508 000060 10  AX  0   0  8
[12] .text             PROGBITS        0000000000000570 000570 000293 00  AX  0   0 16
[13] .fini             PROGBITS        0000000000000804 000804 000009 00  AX  0   0  4
[14] .rodata           PROGBITS        0000000000000810 000810 00008c 00   A  0   0  4
[15] .eh_frame_hdr     PROGBITS        000000000000089c 00089c 00003c 00   A  0   0  4
[16] .eh_frame         PROGBITS        00000000000008d8 0008d8 00010c 00   A  0   0  8
[17] .init_array       INIT_ARRAY      0000000000001cf0 001cf0 000010 08  WA  0   0  8
[18] .fini_array       FINI_ARRAY      0000000000001d00 001d00 000008 08  WA  0   0  8
[19] .dynamic          DYNAMIC         0000000000001d08 001d08 0001d0 10  WA  5   0  8
[20] .got              PROGBITS        0000000000001ee0 001ee0 000028 08  WA  0   0  8
[21] .got.plt          PROGBITS        0000000000001f08 001f08 000038 08  WA  0   0  8
[22] .data             PROGBITS        0000000000001f40 001f40 000014 00  WA  0   0  4
[23] .bss              NOBITS          0000000000002000 002000 000008 00  WA  0   0  1
[24] .comment          PROGBITS        0000000000000000 002000 000028 00      0   0  1
[25] .debug_aranges    PROGBITS        0000000000000000 002028 000030 00      0   0  1
[26] .debug_info       PROGBITS        0000000000000000 002058 000190 00      0   0  1
[27] .debug_abbrev     PROGBITS        0000000000000000 0021e8 0000a7 00      0   0  1
[28] .debug_line       PROGBITS        0000000000000000 002290 00005a 00      0   0  1
[29] .shstrtab         STRTAB          0000000000000000 0022ea 0000b3 00      0   0  1
4 Sections
Each ELF file can contain zero or more sections. Sections are used at link time.
The first entry in the section header table of every ELF file is defined by the ELF standard to be a NULL entry. The type of the entry is SHT_NULL, and all fields in the section header are zeroed out.
Sections:
- .init: Executable code that performs initialization tasks and must run before any other code in the binary, so it carries the SHF_EXECINSTR flag. The system executes the code in the .init section before transferring control to the main entry point of the binary.
- .fini: The counterpart of .init; it holds executable code that must run after the main program completes.
- .text: Where the main code of the program resides, so it carries the SHF_EXECINSTR flag. Its type is SHT_PROGBITS because it holds user-defined code.
- .bss: Contains uninitialized data (type SHT_NOBITS). It occupies no space on disk, and its memory is initialized to zero at runtime. It is writable.
- .data: Initialized program data (type SHT_PROGBITS). It is writable.
- .rodata: Read-only data, such as hardcoded strings passed to printf. Data that must be writable goes in .data instead.
- .plt: Stands for Procedure Linkage Table. It holds code used for dynamic linking that calls external functions from shared libraries with the help of the GOT (Global Offset Table).
- .got.plt: A table that stores the resolved addresses of external functions. It is writable by default because lazy binding is used by default, unless Relocation Read-Only (RELRO) is enabled or the LD_BIND_NOW environment variable is exported so that all imported functions are resolved at program initialization.
- .rel.*: Contains information about how parts of an ELF object or process image need to be fixed up or modified at link time or runtime, without addends (type SHT_REL).
- .rela.*: Contains information about how parts of an ELF object or process image need to be fixed up or modified at link time or runtime, with addends (type SHT_RELA).
- .dynamic: Dynamic linking structures and objects. Contains a table of ElfN_Dyn structures, including pointers to other important information required by the dynamic linker (for instance, the dynamic string table, dynamic symbol table, .got.plt section, and dynamic relocation section, pointed to by tags of type DT_STRTAB, DT_SYMTAB, DT_PLTGOT, and DT_RELA, respectively).
- .init_array: Contains an array of pointers to functions to use as constructors (each of these functions is called in turn when the binary is initialized). In gcc, you can mark functions in your C source files as constructors by decorating them with __attribute__((constructor)). By default, there is an entry in .init_array for executing frame_dummy.
- .fini_array: Contains an array of pointers to functions to use as destructors.
- .shstrtab: An array of NUL-terminated strings that contain the names of all the sections in the binary.
- .symtab: Contains a symbol table, which is a table of ElfN_Sym structures, each of which associates a symbolic name with a piece of code or data elsewhere in the binary, such as a function or variable.
- .strtab: Contains the strings that hold the symbolic names. These strings are pointed to by the ElfN_Sym structures.
- .dynsym: Same as .symtab but contains symbols needed for dynamic linking rather than static linking.
- .dynstr: Same as .strtab but contains strings needed for dynamic linking rather than static linking.
- .interp: Holds the path of the program interpreter (the runtime dynamic linker, RTLD) as an embedded string.
- .rel.dyn: Global variable relocation table.
- .rel.plt: Function relocation table.
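To make the role of .init_array concrete, here is a minimal sketch of a constructor function. It relies on the GCC/Clang __attribute__((constructor)) extension and assumes a toolchain that places constructor pointers in .init_array (older ones used .ctors). The names setup and constructor_runs are hypothetical.

```c
static int setup_count; /* incremented by the constructor below */

/* GCC/Clang extension: a pointer to this function is emitted into
   .init_array, so the startup code calls it before main() begins. */
__attribute__((constructor))
static void setup(void)
{
    setup_count++;
}

/* Reports how many times the constructor has run so far. */
int constructor_runs(void)
{
    return setup_count;
}
```

Calling constructor_runs() from main returns 1 even though main never calls setup itself. After building, readelf -S on the resulting binary should list an .init_array section large enough to hold the extra pointer.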
Older gcc version sections:
- .ctors: Equivalent of .init_array produced by older versions of gcc.
- .dtors: Equivalent of .fini_array produced by older versions of gcc.
Sections are the building blocks of ELF files. They include:
- .text: Contains the executable code.
- .data: Contains initialized global and static variables.
- .bss: Contains uninitialized global and static variables (only size information is stored).
- .rodata: Contains read-only data, such as string literals.
- .symtab: A symbol table that includes all symbols in the file.
- .strtab: A string table containing the names of symbols.
- .rel.text or .rela.text: Relocation information for the .text section.
- .dynamic: Dynamic linking information.
- .got: Global Offset Table.
- .plt: Procedure Linkage Table.
- .comment: Contains version control information, which in practice is usually the identification string of the compiler that produced the file.
- .note: Contains additional notes and metadata.
- .shstrtab: Section header string table, which holds the names of sections.
5 Segments
Each ELF file can contain zero or more segments.
In everyday English, "section" and "segment" are near synonyms, but in ELF they describe two completely different things.
- Segments are used at runtime.
Segments are built from one or more sections and are described by the program header table, which the loader uses to map the executable into memory. They include:
- Loadable Segments: Contain code and data that need to be loaded into memory.
- Dynamic Linking Information: Contains information needed by the dynamic linker.
- Interpreter Information: Specifies the program interpreter (e.g., /lib64/ld-linux-x86-64.so.2).
- Note Segments: Used to store additional information.
- TLS (Thread-Local Storage) Segments: Contains data for thread-local storage.
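Two Views of ELF File
An ELF file can be read in two ways: the linking view, organized into sections through the section header table, and the execution view, organized into segments through the program header table.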
ELF File Types
Executable and Linkable Format (ELF) is a standard file format used for executables, object code, shared libraries, and core dumps. There are several different types of ELF files, each serving a specific purpose in the process of program compilation and execution.
ELF files can be categorized into different types based on their usage:
- Executable Files: Contain the code and data necessary to execute a program.
- It contains a program that can be executed directly by the operating system.
- This file type has a program header table that specifies how segments are to be loaded into memory.
- Relocatable Files:
- Contains code and data that can be linked with other object files to create an executable or a shared object file.
- This file type is not directly executable. It contains sections like .text, .data, and .bss, which are not yet assigned to specific memory addresses.
- Shared Object Files: Libraries that can be dynamically linked during program execution.
- Shared libraries (.so files) are dynamically linked at runtime by executables or other shared objects.
- Core Dumps: Files created when a program crashes, containing the memory image of the process at the time of the crash.
- Used for debugging purposes to analyze the state of the program at the time of the crash.
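These categories correspond to the e_type field of the ELF header, a 2-byte value stored at offset 16, directly after e_ident. As a rough sketch, the C helpers below read that field from a little-endian header held in memory and map it to a short description. The function names are hypothetical, and the numeric values are the standard ET_* constants written out as literals.

```c
#include <stdint.h>

/* Maps an e_type value to a short description of the file type. */
const char *elf_type_name(uint16_t e_type)
{
    switch (e_type) {
    case 0:  return "NONE (no file type)";
    case 1:  return "REL (relocatable file)";
    case 2:  return "EXEC (executable file)";
    case 3:  return "DYN (shared object file)";
    case 4:  return "CORE (core file)";
    default: return "unknown or processor-specific";
    }
}

/* Reads e_type from an ELF header stored in little-endian byte order.
   The caller must supply at least 18 bytes. */
uint16_t read_e_type_le(const unsigned char *header)
{
    return (uint16_t)(header[16] | (header[17] << 8));
}
```

Note that position-independent executables are also reported as DYN, so this field alone does not distinguish a shared library from such an executable.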
-: Relocatable Type :-
Purpose: Relocatable files contain code and data that need to be linked with other object files to create an executable or a shared object file. They are produced by the assembler from the source code and used by the linker to produce the final output file.
Characteristics:
- Not directly executable: They need to be linked to form an executable file.
- Contains sections: These files are divided into sections, each serving a specific purpose, such as holding code, data, or metadata.
Compilation and Linking Process:
- Compilation: Source code is compiled into ELF relocatable files by the compiler. Each source file produces one or more relocatable files.
- Linking: The linker takes multiple relocatable files and combines them into a single executable or shared object file. During this process, the linker resolves symbol references and performs relocation, adjusting addresses based on where sections are placed in the final output.
Note:
ELF relocatable files (.o files) are not directly executable. They are designed to be linked together with other object files by a linker to produce an executable file or a shared object file.
Why ELF Relocatable Files Are Not Executable:
- Lack of Entry Point:
- Relocatable files do not have an entry point. The entry point is the address where the execution starts, and it's defined in the ELF header of executable files. Since relocatable files are intended to be linked with other files, they do not specify a single entry point.
- Relocation Needed:
- Relocatable files contain symbols and references that need to be resolved. These references are placeholders that the linker must fill with the correct addresses. Until this linking process is completed, the code in relocatable files cannot be executed because it contains unresolved symbols and addresses.
- Incomplete Program:
- A single relocatable file typically represents only a part of a program. It may contain a few functions or data definitions, but not the entire logic needed to perform a task. Executable files, on the other hand, are complete programs that include all the necessary code and data.
Loading and Executing Code from Relocatable Files:
To execute code from relocatable files, they must go through a linking process. Here's how it works:
- Compilation:
- Source code is compiled into one or more relocatable object files (.o). Each object file contains machine code, data, and metadata, but with unresolved references.
- Linking:
- A linker combines these relocatable files with other object files and libraries to produce an executable file or a shared library. The linker performs several tasks:
- Resolving Symbols: The linker resolves references to symbols (functions, variables) by determining their final memory addresses.
- Relocation: The linker adjusts addresses in the code and data sections to reflect their actual positions in memory.
- Creating Executable: The linker generates an ELF executable file that includes an entry point and can be loaded and executed by the operating system.
- Execution:
- The operating system's loader loads the executable file into memory, sets up the execution environment, and starts executing the program from the entry point.
ELF File Lifecycle
- Compilation: Source code is compiled into ELF object files. Each source file is transformed into an object file with machine code and data.
- Linking: Object files and libraries are linked to form an executable or shared object. The linker resolves symbols and addresses and combines the necessary sections. The ELF format supports both static and dynamic linking. During linking, the linker processes the ELF sections and headers to resolve symbols, perform relocations, and organize the final executable layout.
- Execution: When an ELF executable is loaded into memory, the operating system's loader reads the ELF header to determine the layout of the program. It uses the program header table to load the necessary segments into memory, sets up the memory image according to the specified virtual addresses, and finally transfers control to the entry point specified in the ELF header.
Tools for Working with ELF Files
Several tools are available for working with ELF files:
1 readelf:
Displays information about ELF files.
readelf -a filename
2 objdump:
Displays detailed information about object files, including disassembly.
objdump -d filename
3 nm:
Lists symbols from object files.
nm filename
4 ld:
The GNU linker for linking object files into executables or libraries.
ld -o output input.o
5 strip:
Removes symbols and other unnecessary information from ELF files.
strip filename
6 gdb:
The GNU Debugger for debugging ELF executables.
gdb filename
Process of Parsing ELF Kernel
1 Load the ELF File:
- Read the ELF file from the disk into a designated memory area.
- This can be done through BIOS disk services or any storage interface driver.
2 Parse the ELF File:
- Verify the ELF magic number.
- Compare the first 4 bytes at the load address with 0x464C457F, which is the little-endian 32-bit encoding of the bytes 0x7F, 'E', 'L', 'F'.
load_kernel:
    mov esi, 0x1000             ; Address of loaded kernel
    mov edi, 0x200000           ; Kernel load address in memory (2MB)
    ; Verify ELF magic number
    cmp dword [esi], 0x464C457F ; Check for 0x7F 'E' 'L' 'F'
    jne invalid_elf
- Locate the program header table using the offset provided in the ELF header.
    ; Read ELF header
    mov eax, [esi + 0x1C]        ; e_phoff (program header table offset)
    add eax, esi                 ; Convert to absolute address
    mov ebx, eax                 ; ebx = program header table address
    movzx ecx, word [esi + 0x2C] ; e_phnum (number of program headers, a 16-bit field)
3 Load Segments into Memory:
- Iterate through the program headers and load each PT_LOAD segment into its designated memory location.
load_segments:
    ; Read each program header
    mov eax, [ebx]              ; p_type
    cmp eax, 1                  ; PT_LOAD
    jne skip_segment
    ; Save the registers that the copy below overwrites
    push ecx                    ; Remaining program header count
    push esi                    ; Base address of the ELF image
    push edi                    ; Load base address
    ; Source: ELF image base + p_offset
    add esi, [ebx + 4]          ; p_offset
    ; Destination: load base address + p_vaddr
    add edi, [ebx + 8]          ; p_vaddr
    ; Copy p_filesz bytes from the file image
    mov ecx, [ebx + 0x10]       ; p_filesz
    cld                         ; Copy forward
    rep movsb                   ; Copy bytes from source to destination
    ; Zero out remaining memory if p_memsz > p_filesz
    mov ecx, [ebx + 0x14]       ; p_memsz
    sub ecx, [ebx + 0x10]       ; ecx = p_memsz - p_filesz
    jbe segment_done            ; Nothing to clear
    xor eax, eax
    rep stosb                   ; edi already points just past the copied bytes
segment_done:
    pop edi                     ; Restore load base address
    pop esi                     ; Restore ELF image base
    pop ecx                     ; Restore program header count
skip_segment:
next_segment:
    ; Move to the next program header
    add ebx, 0x20               ; Size of one program header (32 bytes)
    loop load_segments
4 Transfer Control to the Kernel
- Jump to the Kernel's entry point as specified in the ELF header.
    ; After loading all segments, we can jump to the entry point
    ; Jump to kernel entry point
    mov eax, [esi + 0x18]       ; e_entry
    add eax, edi                ; Entry point address + Load base address
    jmp eax                     ; Jump to the kernel's entry point
invalid_elf:
    ; Handle invalid ELF file
    hlt
    jmp invalid_elf
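For readers who find the steps easier to follow in C, the same four steps can be sketched as a single function. This is an illustration under stated assumptions rather than a drop-in loader: it handles only 32-bit little-endian images, the name load_elf32 and its parameters are hypothetical, and the mem buffer stands in for the load base, so each segment is copied to mem + p_vaddr. Unlike the assembly, it also bounds-checks every offset before using it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Little-endian reads from possibly unaligned bytes. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Copies every PT_LOAD segment of a 32-bit little-endian ELF image into
   mem and zero-fills the part beyond p_filesz. On success returns 0 and
   stores e_entry in *entry; returns -1 on malformed input. */
int load_elf32(const uint8_t *image, size_t image_size,
               uint8_t *mem, size_t mem_size, uint32_t *entry)
{
    static const uint8_t magic[4] = { 0x7F, 'E', 'L', 'F' };

    /* Step 2: verify the magic number (the ELF32 header is 52 bytes). */
    if (image_size < 52 || memcmp(image, magic, 4) != 0)
        return -1;

    uint32_t e_entry     = rd32(image + 0x18);
    uint32_t e_phoff     = rd32(image + 0x1C);
    uint16_t e_phentsize = rd16(image + 0x2A);
    uint16_t e_phnum     = rd16(image + 0x2C); /* 16-bit field */
    if (e_phnum != 0 && e_phentsize < 32)
        return -1;

    /* Step 3: walk the program header table. */
    for (uint16_t i = 0; i < e_phnum; i++) {
        uint64_t off = (uint64_t)e_phoff + (uint64_t)i * e_phentsize;
        if (off + 32 > image_size)
            return -1;
        const uint8_t *ph = image + off;
        if (rd32(ph) != 1)                 /* not PT_LOAD */
            continue;

        uint32_t p_offset = rd32(ph + 0x04);
        uint32_t p_vaddr  = rd32(ph + 0x08);
        uint32_t p_filesz = rd32(ph + 0x10);
        uint32_t p_memsz  = rd32(ph + 0x14);
        if (p_filesz > p_memsz ||
            (uint64_t)p_offset + p_filesz > image_size ||
            (uint64_t)p_vaddr + p_memsz > mem_size)
            return -1;

        memcpy(mem + p_vaddr, image + p_offset, p_filesz);
        memset(mem + p_vaddr + p_filesz, 0, p_memsz - p_filesz);
    }

    /* Step 4: report the entry point; the caller decides how to jump. */
    *entry = e_entry;
    return 0;
}
```

In a real bootloader, mem would be the physical load address and the final step would be a jump to mem + *entry, matching the e_entry plus load base calculation in the assembly.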