ELF Format

The Executable and Linkable Format (ELF) is a common standard file format for executable files, object code, shared libraries, and core dumps. Originally developed for Unix System V, it has become the standard binary format for Unix-like operating systems, including Linux and BSD. ELF simplifies the development and execution of programs across various systems, providing a flexible and extensible structure for different types of binary files.

The three most common kinds of ELF file are executables, shared libraries, and relocatable object files; core dumps use the same format.

ELF File Structure

An ELF file is composed of several parts, each serving a specific purpose in the executable's lifecycle. The key components include:

  1. ELF Header
  2. Program Header Table
  3. Section Header Table
  4. Sections
  5. Segments

1. ELF Header

The ELF header is the starting point and first part of an ELF file. It contains metadata that describes the file's format and attributes such as the file type, machine architecture, version, entry point address, program header table offset, section header table offset and flags.

  • Location: The ELF header is always located at the start of the file, i.e., offset 0.
  • Size:
    • For 32-bit ELF files: 52 bytes.
    • For 64-bit ELF files: 64 bytes.
  • It is a fixed-size structure that provides an overview of the file's layout and properties.
  • The ELF header differs slightly between 32-bit and 64-bit ELF files primarily in terms of field sizes. For instance, addresses and offsets are 4 bytes in 32-bit ELF and 8 bytes in 64-bit ELF.

The ELF header contains metadata about the file itself. It includes:

  • Identification (e_ident): A magic number and other information to identify the file as an ELF file and specify the architecture and format.
    • Size: 16 Bytes.
    • Fields:
      • Magic Number (EI_MAG0 to EI_MAG3): The first four bytes, which should contain the magic number 0x7F, followed by the characters ELF. This identifies the file as an ELF file.
      • Class (EI_CLASS): Identifies the file as 32-bit (ELFCLASS32) or 64-bit (ELFCLASS64).
      • Data Encoding (EI_DATA): Specifies the data encoding (little-endian ELFDATA2LSB or big-endian ELFDATA2MSB).
      • Version (EI_VERSION): The ELF header version, currently always 1 (EV_CURRENT).
      • OS/ABI (EI_OSABI): Identifies the target operating system and ABI (Application Binary Interface).
      • ABI Version (EI_ABIVERSION): Specifies the version of the ABI.
      • Padding (EI_PAD): Unused bytes, padded to the size of 16 bytes for alignment.
  • File Type (e_type): The type of the ELF file (e.g., relocatable, executable, shared object, core).
    • Size: 2 Bytes
    • Explanation:
      • Specifies the type of the ELF file (e.g., ET_EXEC for executable files, ET_REL for relocatable files, ET_DYN for shared objects, ET_CORE for core dumps).
  • Machine (e_machine): The target architecture (e.g., Intel 80386).
    • Specifies the target architecture (e.g., EM_386 for Intel 80386, EM_X86_64 for AMD x86-64).
  • Version (e_version): The ELF version.
    • The version of the ELF specification, currently always 1 (EV_CURRENT).
  • Entry Point Address (e_entry): The entry point address, where the program starts executing.
    • The virtual address to which the system first transfers control, effectively the starting point of the executable.
  • e_phoff: The offset to the Program Header Table.
    • The file offset in bytes where the program header table is located.
  • e_shoff: The offset to the Section Header Table.
    • The file offset in bytes where the section header table is located.
  • e_flags: Processor-specific flags.
    • Architecture-specific flags.
  • e_ehsize: The size of the ELF header.
    • The size of this header in bytes.
  • e_phentsize: The size of each entry in the Program Header Table.
    • Typically, this is either 32 or 56 bytes depending on whether it's a 32-bit or 64-bit ELF file.
  • e_phnum: The number of entries in the Program Header Table.
    • Indicates how many program header entries are present.
  • e_shentsize: The size of each entry in the Section Header Table.
    • Typically, this is either 40 or 64 bytes depending on whether it's a 32-bit or 64-bit ELF file.
  • e_shnum: The number of entries in the Section Header Table.
    • Indicates how many section header entries are present.
  • e_shstrndx: The index of the section header string table.
    • The index of the section header table entry that contains the section names.

All of this information can be viewed for any ELF file with the readelf command and its -h option:

readelf -h elf_file

Sample Output:

ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x401000
  Start of program headers:          64 (bytes into file)
  Start of section headers:          8056 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         8
  Size of section headers:           64 (bytes)
  Number of section headers:         29
  Section header string table index: 28
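The Magic line in the sample output can also be decoded by hand. The following is a minimal sketch in C, not readelf's implementation: ident_info and decode_ident are hypothetical names introduced for illustration, while the byte indices and values follow the e_ident fields listed earlier.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Indices into e_ident, as defined by the ELF specification. */
enum { EI_CLASS = 4, EI_DATA = 5, EI_VERSION = 6, EI_OSABI = 7, EI_NIDENT = 16 };

/* Hypothetical result type for this sketch; not part of any ELF API. */
typedef struct {
    int bits;          /* 32 or 64; 0 if the class byte is unknown */
    int little_endian; /* 1 for ELFDATA2LSB, 0 for ELFDATA2MSB, -1 if unknown */
    int version;       /* value of e_ident[EI_VERSION] */
    int osabi;         /* value of e_ident[EI_OSABI] */
} ident_info;

/* Returns 0 on success, -1 if the buffer is too short or the magic is wrong. */
int decode_ident(const unsigned char *ident, size_t len, ident_info *out)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };

    assert(ident != NULL && out != NULL);
    if (len < (size_t)EI_NIDENT)
        return -1;
    if (memcmp(ident, magic, sizeof magic) != 0)
        return -1;

    out->bits = ident[EI_CLASS] == 1 ? 32 : ident[EI_CLASS] == 2 ? 64 : 0;
    out->little_endian = ident[EI_DATA] == 1 ? 1 : ident[EI_DATA] == 2 ? 0 : -1;
    out->version = ident[EI_VERSION];
    out->osabi = ident[EI_OSABI];
    return 0;
}
```

Passing the 16 bytes from the Magic line (7f 45 4c 46 02 01 01 00 ...) yields a 64-bit, little-endian, version 1, System V result, matching the Class, Data, Version, and OS/ABI lines that readelf prints.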

ELF Header for 32-bit (Elf32_Ehdr) struct:

#include <stdint.h>   // uint16_t, uint32_t

#define EI_NIDENT 16

typedef struct {
    unsigned char e_ident[EI_NIDENT]; // Magic number and other info
    uint16_t      e_type;             // Object file type
    uint16_t      e_machine;          // Architecture
    uint32_t      e_version;          // Object file version
    uint32_t      e_entry;            // Entry point virtual address
    uint32_t      e_phoff;            // Program header table file offset
    uint32_t      e_shoff;            // Section header table file offset
    uint32_t      e_flags;            // Processor-specific flags
    uint16_t      e_ehsize;           // ELF header size in bytes
    uint16_t      e_phentsize;        // Program header table entry size
    uint16_t      e_phnum;            // Program header table entry count
    uint16_t      e_shentsize;        // Section header table entry size
    uint16_t      e_shnum;            // Section header table entry count
    uint16_t      e_shstrndx;         // Section header string table index
} Elf32_Ehdr;

The same header in NASM assembly:

; ELF header structure
ELFMAG0      equ 0x7F
ELFMAG1      equ 'E'
ELFMAG2      equ 'L'
ELFMAG3      equ 'F'
ELFCLASS32   equ 1
ELFDATA2LSB  equ 1
EV_CURRENT   equ 1
ET_EXEC      equ 2
EM_386       equ 3

Elf32_Ehdr:
    db ELFMAG0                  ; e_ident[EI_MAG0]
    db ELFMAG1                  ; e_ident[EI_MAG1]
    db ELFMAG2                  ; e_ident[EI_MAG2]
    db ELFMAG3                  ; e_ident[EI_MAG3]
    db ELFCLASS32               ; e_ident[EI_CLASS]
    db ELFDATA2LSB              ; e_ident[EI_DATA]
    db EV_CURRENT               ; e_ident[EI_VERSION]
    db 0                        ; e_ident[EI_OSABI]
    db 0                        ; e_ident[EI_ABIVERSION]
    times 7 db 0                ; e_ident[EI_PAD]
    dw ET_EXEC                  ; e_type
    dw EM_386                   ; e_machine
    dd EV_CURRENT               ; e_version
    dd _start                   ; e_entry (entry point)
    dd ph_offset                ; e_phoff (program header table offset)
    dd 0                        ; e_shoff (section header table offset)
    dd 0                        ; e_flags
    dw ehdr_size                ; e_ehsize (ELF header size)
    dw phdr_size                ; e_phentsize (program header entry size)
    dw 1                        ; e_phnum (number of program headers)
    dw 0                        ; e_shentsize (section header entry size)
    dw 0                        ; e_shnum (number of section headers)
    dw 0                        ; e_shstrndx (section header string table index)

ehdr_size equ $ - Elf32_Ehdr

; Program header structure
PT_LOAD     equ 1
PF_X        equ 1
PF_W        equ 2
PF_R        equ 4

Elf32_Phdr:
    dd PT_LOAD                 ; p_type (loadable segment)
    dd 0                       ; p_offset (offset in file)
    dd segment_start           ; p_vaddr (virtual address in memory)
    dd segment_start           ; p_paddr (physical address, irrelevant here)
    dd segment_size            ; p_filesz (size in file)
    dd segment_size            ; p_memsz (size in memory)
    dd PF_R + PF_W + PF_X      ; p_flags (segment permissions)
    dd 0x1000                  ; p_align (alignment)

; Must come after the program header it measures; placed before it,
; "$ - Elf32_Phdr" would evaluate to a negative distance.
phdr_size equ $ - Elf32_Phdr

; _start, ph_offset, segment_start and segment_size are assumed to be
; defined elsewhere in the file.

The same structures declared with NASM struc and reserved fields:

; ELF Header (32-bit)
struc Elf32_Ehdr
    .e_ident resb 16              ; Magic number and other info
    .e_type resw 1                ; Object file type
    .e_machine resw 1             ; Architecture
    .e_version resd 1             ; Object file version
    .e_entry resd 1               ; Entry point virtual address
    .e_phoff resd 1               ; Program header table file offset
    .e_shoff resd 1               ; Section header table file offset
    .e_flags resd 1               ; Processor-specific flags
    .e_ehsize resw 1              ; ELF header size in bytes
    .e_phentsize resw 1           ; Program header table entry size
    .e_phnum resw 1               ; Program header table entry count
    .e_shentsize resw 1           ; Section header table entry size
    .e_shnum resw 1               ; Section header table entry count
    .e_shstrndx resw 1            ; Section header string table index
endstruc

; Program Header (32-bit)
struc Elf32_Phdr
    .p_type resd 1                ; Segment type
    .p_offset resd 1              ; Segment file offset
    .p_vaddr resd 1               ; Segment virtual address
    .p_paddr resd 1               ; Segment physical address
    .p_filesz resd 1              ; Segment size in file
    .p_memsz resd 1               ; Segment size in memory
    .p_flags resd 1               ; Segment flags
    .p_align resd 1               ; Segment alignment
endstruc

; Section Header (32-bit)
struc Elf32_Shdr
    .sh_name resd 1               ; Section name (string table index)
    .sh_type resd 1               ; Section type
    .sh_flags resd 1              ; Section flags
    .sh_addr resd 1               ; Section virtual address in memory
    .sh_offset resd 1             ; Section file offset
    .sh_size resd 1               ; Section size in bytes
    .sh_link resd 1               ; Link to another section
    .sh_info resd 1               ; Additional section information
    .sh_addralign resd 1          ; Section alignment
    .sh_entsize resd 1            ; Entry size if section holds a table
endstruc

Field layout of Elf32_Ehdr:

Offset  Field        Size (bytes)  Description
0x00    e_ident      16            Magic number and other info
0x10    e_type       2             Object file type
0x12    e_machine    2             Architecture
0x14    e_version    4             Object file version
0x18    e_entry      4             Entry point virtual address
0x1C    e_phoff      4             Program header table file offset
0x20    e_shoff      4             Section header table file offset
0x24    e_flags      4             Processor-specific flags
0x28    e_ehsize     2             ELF header size in bytes
0x2A    e_phentsize  2             Program header table entry size
0x2C    e_phnum      2             Program header table entry count
0x2E    e_shentsize  2             Section header table entry size
0x30    e_shnum      2             Section header table entry count
0x32    e_shstrndx   2             Section header string table index
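
The offsets in the 32-bit layout above can be checked at compile time. This is a small sketch, assuming a C11 compiler and a target without unusual struct padding; ehdr32_sketch is a hypothetical name that mirrors Elf32_Ehdr so that it cannot clash with a system <elf.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EI_NIDENT 16

/* Same field order and widths as Elf32_Ehdr. */
typedef struct {
    unsigned char e_ident[EI_NIDENT];
    uint16_t e_type;
    uint16_t e_machine;
    uint32_t e_version;
    uint32_t e_entry;
    uint32_t e_phoff;
    uint32_t e_shoff;
    uint32_t e_flags;
    uint16_t e_ehsize;
    uint16_t e_phentsize;
    uint16_t e_phnum;
    uint16_t e_shentsize;
    uint16_t e_shnum;
    uint16_t e_shstrndx;
} ehdr32_sketch;

/* Each check fails the build if the compiler lays the struct out differently. */
static_assert(offsetof(ehdr32_sketch, e_type) == 0x10, "e_type offset");
static_assert(offsetof(ehdr32_sketch, e_entry) == 0x18, "e_entry offset");
static_assert(offsetof(ehdr32_sketch, e_phoff) == 0x1C, "e_phoff offset");
static_assert(offsetof(ehdr32_sketch, e_shoff) == 0x20, "e_shoff offset");
static_assert(offsetof(ehdr32_sketch, e_flags) == 0x24, "e_flags offset");
static_assert(offsetof(ehdr32_sketch, e_ehsize) == 0x28, "e_ehsize offset");
static_assert(offsetof(ehdr32_sketch, e_shstrndx) == 0x32, "e_shstrndx offset");
static_assert(sizeof(ehdr32_sketch) == 52, "32-bit ELF header is 52 bytes");
```

Because every field is naturally aligned, no padding is inserted and the struct can be overlaid directly on the first 52 bytes of a 32-bit ELF file of matching byte order.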

ELF Header for 64-bit (Elf64_Ehdr) struct:

#include <stdint.h>   // uint16_t, uint32_t, uint64_t

#define EI_NIDENT 16

typedef struct {
    unsigned char e_ident[EI_NIDENT]; // Magic number and other info
    uint16_t      e_type;             // Object file type
    uint16_t      e_machine;          // Architecture
    uint32_t      e_version;          // Object file version
    uint64_t      e_entry;            // Entry point virtual address
    uint64_t      e_phoff;            // Program header table file offset
    uint64_t      e_shoff;            // Section header table file offset
    uint32_t      e_flags;            // Processor-specific flags
    uint16_t      e_ehsize;           // ELF header size in bytes
    uint16_t      e_phentsize;        // Program header table entry size
    uint16_t      e_phnum;            // Program header table entry count
    uint16_t      e_shentsize;        // Section header table entry size
    uint16_t      e_shnum;            // Section header table entry count
    uint16_t      e_shstrndx;         // Section header string table index
} Elf64_Ehdr;
  • Magic Number: Identifies the file as an ELF file.
  • Class: Specifies whether the file is 32-bit or 64-bit.
  • Data Encoding: Indicates the endianness of the file.
  • Version: The ELF specification version.
  • OS/ABI: Specifies the target operating system and ABI.
  • Type: Indicates the type of file (e.g., executable, shared object, core file).
  • Machine: Specifies the target architecture (e.g., x86, ARM).
  • Entry Point: The memory address where execution starts.
  • Program Header Offset: Offset to the program header table.
  • Section Header Offset: Offset to the section header table.
  • Flags: Architecture-specific flags.
  • Header Size: Size of the ELF header.
  • Program Header Entry Size and Count: Size and number of entries in the program header table.
  • Section Header Entry Size and Count: Size and number of entries in the section header table.
  • String Table Index: Index of the section header string table.
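
As a sketch of how these fields might be read from a file image, the code below decodes a 64-bit little-endian header byte by byte, so the result does not depend on the host's byte order. The names read_le, ehdr64_fields, and parse_ehdr64_le are hypothetical, and only a subset of the fields is extracted; the offsets follow from the Elf64_Ehdr field order and widths.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assemble an nbytes-wide little-endian integer, most significant byte first. */
static uint64_t read_le(const unsigned char *p, size_t nbytes)
{
    uint64_t v = 0;
    for (size_t i = nbytes; i-- > 0; )
        v = (v << 8) | p[i];
    return v;
}

/* Hypothetical subset of Elf64_Ehdr fields, chosen for this sketch. */
typedef struct {
    uint16_t type, machine, phentsize, phnum, shentsize, shnum, shstrndx;
    uint64_t entry, phoff, shoff;
} ehdr64_fields;

/* Returns 0 on success, -1 if buf is not a 64-bit little-endian ELF header. */
int parse_ehdr64_le(const unsigned char *buf, size_t len, ehdr64_fields *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 64)
        return -1;                      /* Elf64_Ehdr is 64 bytes */
    if (buf[0] != 0x7f || buf[1] != 'E' || buf[2] != 'L' || buf[3] != 'F')
        return -1;                      /* bad magic */
    if (buf[4] != 2 || buf[5] != 1)
        return -1;                      /* not ELFCLASS64 + ELFDATA2LSB */

    out->type      = (uint16_t)read_le(buf + 0x10, 2);
    out->machine   = (uint16_t)read_le(buf + 0x12, 2);
    out->entry     = read_le(buf + 0x18, 8);
    out->phoff     = read_le(buf + 0x20, 8);
    out->shoff     = read_le(buf + 0x28, 8);
    out->phentsize = (uint16_t)read_le(buf + 0x36, 2);
    out->phnum     = (uint16_t)read_le(buf + 0x38, 2);
    out->shentsize = (uint16_t)read_le(buf + 0x3A, 2);
    out->shnum     = (uint16_t)read_le(buf + 0x3C, 2);
    out->shstrndx  = (uint16_t)read_le(buf + 0x3E, 2);
    return 0;
}
```

A caller would read the first 64 bytes of a file into a buffer and pass it in; big-endian and 32-bit files are rejected here rather than handled, to keep the sketch short.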

Explanation of ELF Header Fields

  • EI_NIDENT is the size in bytes of the e_ident array (16).
  • e_ident (Offset 0x00, Size 16 bytes): A 16-byte array that identifies the file as an ELF object; it always starts with the bytes 0x7F 'E' 'L' 'F'.
    • The first 16 bytes of the ELF header contain the magic number and other information such as the file class, data encoding, and version.
    • The magic number (first 4 bytes) should be 0x7F 'E' 'L' 'F'.
    • e_ident[EI_MAG0] to e_ident[EI_MAG3]: Magic number (0x7f, 'E', 'L', 'F') identifying the file as an ELF file.
    • e_ident[EI_CLASS]: Identifies the file as 32-bit (ELFCLASS32) or 64-bit (ELFCLASS64).
      • ELFCLASS32 (1): 32-bit objects
      • ELFCLASS64 (2): 64-bit objects
    • e_ident[EI_DATA]: Specifies the endianness (little-endian or big-endian).
      • ELFDATA2LSB (1): Little-endian
      • ELFDATA2MSB (2): Big-endian
    • e_ident[EI_VERSION]: Version of the ELF specification (currently, it should be 1).
      • EV_CURRENT (1)
    • e_ident[EI_OSABI]: Identifies the target operating system and ABI.
      • ELFOSABI_SYSV (0): UNIX System V ABI
      • ELFOSABI_HPUX (1): HP-UX
      • ELFOSABI_NETBSD (2): NetBSD
      • ELFOSABI_LINUX (3): Linux
      • Others: Various OS-specific values
    • e_ident[EI_ABIVERSION]: ABI version.
    • e_ident[EI_PAD]: Unused, reserved for future use.
      • Padding bytes (unused)
  • e_type (Offset 0x10, Size 2 bytes):
    • Specifies the type of the ELF file (e.g., ET_EXEC for executable files, ET_DYN for shared objects, ET_REL for relocatable files).
      • ET_NONE (0): No file type; the format is unknown or not specified.
      • ET_REL (1): Relocatable file (a .o object file).
      • ET_EXEC (2): Executable file.
      • ET_DYN (3): Shared object: a shared library or a position-independent executable.
      • ET_CORE (4): Core dump file.
      • ET_LOOS to ET_HIOS (0xfe00 to 0xfeff): Operating system-specific
      • ET_LOPROC to ET_HIPROC (0xff00 to 0xffff): Processor-specific
  • e_machine (Offset 0x12, Size 2 bytes):
    • Specifies the target architecture (e.g., EM_386 for Intel 80386, EM_X86_64 for x86-64).
      • EM_NONE (0): No machine.
      • EM_M32 (1): AT&T WE 32100.
      • EM_SPARC (2): SPARC.
      • EM_386 (3): Intel 80386.
      • EM_68K (4): Motorola 68000.
      • EM_88K (5): Motorola 88000.
      • EM_860 (7): Intel 80860.
      • EM_MIPS (8): MIPS RS3000.
      • EM_PARISC (15): HP/PA.
      • EM_SPARC32PLUS (18): SPARC with enhanced instruction set.
      • EM_PPC (20): PowerPC.
      • EM_PPC64 (21): PowerPC 64-bit.
      • EM_ARM (40): ARM.
      • EM_X86_64 (62): AMD x86-64.
      • EM_AARCH64 (183): ARM 64-bit.
      • Other values specify different architectures.
  • e_version (Offset 0x14, Size 4 bytes):
    • The version of the ELF specification (should be 1 for the current version).
      • EV_NONE (0): Invalid version.
      • EV_CURRENT (1): Current version.
  • e_entry (Offset 0x18; Size 4 bytes in ELF32, 8 bytes in ELF64):
    • The virtual address to which the system first transfers control, i.e., the entry point of the executable.
  • e_phoff (ELF32: Offset 0x1C, Size 4 bytes; ELF64: Offset 0x20, Size 8 bytes):
    • The offset of the program header table in the file.
  • e_shoff (ELF32: Offset 0x20, Size 4 bytes; ELF64: Offset 0x28, Size 8 bytes):
    • The offset of the section header table in the file.
  • e_flags (ELF32: Offset 0x24; ELF64: Offset 0x30; Size 4 bytes):
    • Processor-specific flags, i.e., flags specific to the target architecture.
  • e_ehsize (ELF32: Offset 0x28; ELF64: Offset 0x34; Size 2 bytes):
    • The size of the ELF header in bytes: 52 bytes for 32-bit ELF and 64 bytes for 64-bit ELF.
  • e_phentsize (ELF32: Offset 0x2A; ELF64: Offset 0x36; Size 2 bytes):
    • The size of each entry in the program header table.
  • e_phnum (ELF32: Offset 0x2C; ELF64: Offset 0x38; Size 2 bytes):
    • The number of entries in the program header table.
  • e_shentsize (ELF32: Offset 0x2E; ELF64: Offset 0x3A; Size 2 bytes):
    • The size of each entry in the section header table.
  • e_shnum (ELF32: Offset 0x30; ELF64: Offset 0x3C; Size 2 bytes):
    • The number of entries in the section header table.
  • e_shstrndx (ELF32: Offset 0x32; ELF64: Offset 0x3E; Size 2 bytes):
    • The index of the section header string table, which contains the names of the sections.

-> e_type defines:

#define ET_NONE		0		/* No file type */
#define ET_REL		1		/* Relocatable file */
#define ET_EXEC		2		/* Executable file */
#define ET_DYN		3		/* Shared object file */
#define ET_CORE		4		/* Core file */
#define	ET_NUM		5		/* Number of defined types */
#define ET_LOOS		0xfe00		/* OS-specific range start */
#define ET_HIOS		0xfeff		/* OS-specific range end */
#define ET_LOPROC	0xff00		/* Processor-specific range start */
#define ET_HIPROC	0xffff		/* Processor-specific range end */
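
To illustrate how these values are interpreted, the following sketch maps an e_type value to a short label similar to the ones readelf prints (REL, EXEC, DYN, CORE). The function name e_type_name and the exact label strings are assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Labels indexed by the standard values ET_NONE (0) through ET_CORE (4). */
static const char *const et_names[] = { "NONE", "REL", "EXEC", "DYN", "CORE" };
static_assert(sizeof et_names / sizeof et_names[0] == 5,
              "one label per value below ET_NUM");

/* Returns a static string describing e_type; never returns NULL. */
const char *e_type_name(uint16_t e_type)
{
    if (e_type < 5)
        return et_names[e_type];
    if (e_type >= 0xfe00 && e_type <= 0xfeff)   /* ET_LOOS..ET_HIOS */
        return "OS-specific";
    if (e_type >= 0xff00)                       /* ET_LOPROC..ET_HIPROC */
        return "processor-specific";
    return "unknown";
}
```

For the sample header shown earlier, whose Type line reads EXEC, the e_type field holds 2 and the function returns "EXEC".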

1 Executable Files (ET_EXEC):

  • These files contain a program that is ready to be executed. When you run a command in Unix-like operating systems, the shell loads the corresponding executable ELF file.
  • Generation Process:
    • Compilation: Source code files (e.g., .c, .cpp) are compiled into object files (.o) using a compiler (e.g., gcc, clang).
    • Linking: The object files are linked together by a linker (e.g., ld) to produce an executable file. During this process, the linker resolves symbol references and assigns runtime addresses.
  • Tools Involved:
    • Compiler: gcc, clang, etc.
    • Linker: ld or the linker stage of gcc, clang, etc.
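  • Example Command:
    • gcc -o myprogram myprogram.c
      • Many current distributions configure gcc to build position-independent executables by default, which readelf reports as DYN; passing -no-pie produces an ET_EXEC file.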

2 Relocatable Files (ET_REL)

  • These are object files created by a compiler or assembler that can be linked with other object files to produce either a shared object file or an executable file. They contain code and data in a form suitable for linking but not for execution.
  • Generation Process:
    • Compilation: Source code files are compiled into relocatable object files. These files contain code and data that are not yet assigned final memory addresses, making them suitable for linking but not for execution.
  • Tools Involved:
    • Compiler: gcc, clang, etc.
  • Example Command:
    • gcc -c mymodule.c
      • This produces mymodule.o, a relocatable file.

3 Shared Object Files (ET_DYN)

  • Also known as shared libraries, these files can be dynamically linked with an executable at run time or with other shared objects to form a single executable image in memory. They enable code reuse and modular programming.
  • Generation Process:
    • Compilation and Linking: Source code files are compiled into object files, and then these object files are linked together into a shared library. Shared libraries contain code that can be shared by multiple programs at runtime.
  • Tools Used:
    • Compiler: gcc, clang, etc.
    • Linker: ld, gold, etc.
  • Example:
    • gcc -shared -o mylib.so lib.o

-> e_machine defines:

#define EM_NONE		 0	/* No machine */
#define EM_M32		 1	/* AT&T WE 32100 */
#define EM_SPARC	 2	/* SUN SPARC */
#define EM_386		 3	/* Intel 80386 */
#define EM_68K		 4	/* Motorola m68k family */
#define EM_88K		 5	/* Motorola m88k family */
#define EM_IAMCU	 6	/* Intel MCU */
#define EM_860		 7	/* Intel 80860 */
#define EM_MIPS		 8	/* MIPS R3000 big-endian */
#define EM_S370		 9	/* IBM System/370 */
#define EM_MIPS_RS3_LE	10	/* MIPS R3000 little-endian */
				/* reserved 11-14 */
#define EM_PARISC	15	/* HPPA */
				/* reserved 16 */
#define EM_VPP500	17	/* Fujitsu VPP500 */
#define EM_SPARC32PLUS	18	/* Sun's "v8plus" */
#define EM_960		19	/* Intel 80960 */
#define EM_PPC		20	/* PowerPC */
#define EM_PPC64	21	/* PowerPC 64-bit */
#define EM_S390		22	/* IBM S390 */
#define EM_SPU		23	/* IBM SPU/SPC */
				/* reserved 24-35 */
#define EM_V800		36	/* NEC V800 series */
#define EM_FR20		37	/* Fujitsu FR20 */
#define EM_RH32		38	/* TRW RH-32 */
#define EM_RCE		39	/* Motorola RCE */
#define EM_ARM		40	/* ARM */
#define EM_FAKE_ALPHA	41	/* Digital Alpha */
#define EM_SH		42	/* Hitachi SH */
#define EM_SPARCV9	43	/* SPARC v9 64-bit */
#define EM_TRICORE	44	/* Siemens Tricore */
#define EM_ARC		45	/* Argonaut RISC Core */
#define EM_H8_300	46	/* Hitachi H8/300 */
#define EM_H8_300H	47	/* Hitachi H8/300H */
#define EM_H8S		48	/* Hitachi H8S */
#define EM_H8_500	49	/* Hitachi H8/500 */
#define EM_IA_64	50	/* Intel Merced */
#define EM_MIPS_X	51	/* Stanford MIPS-X */
#define EM_COLDFIRE	52	/* Motorola Coldfire */
#define EM_68HC12	53	/* Motorola M68HC12 */
#define EM_MMA		54	/* Fujitsu MMA Multimedia Accelerator */
#define EM_PCP		55	/* Siemens PCP */
#define EM_NCPU		56	/* Sony nCPU embedded RISC */
#define EM_NDR1		57	/* Denso NDR1 microprocessor */
#define EM_STARCORE	58	/* Motorola Star*Core processor */
#define EM_ME16		59	/* Toyota ME16 processor */
#define EM_ST100	60	/* STMicroelectronic ST100 processor */
#define EM_TINYJ	61	/* Advanced Logic Corp. Tinyj emb.fam */
#define EM_X86_64	62	/* AMD x86-64 architecture */
#define EM_PDSP		63	/* Sony DSP Processor */
#define EM_PDP10	64	/* Digital PDP-10 */
#define EM_PDP11	65	/* Digital PDP-11 */
#define EM_FX66		66	/* Siemens FX66 microcontroller */
#define EM_ST9PLUS	67	/* STMicroelectronics ST9+ 8/16 mc */
#define EM_ST7		68	/* STmicroelectronics ST7 8 bit mc */
#define EM_68HC16	69	/* Motorola MC68HC16 microcontroller */
#define EM_68HC11	70	/* Motorola MC68HC11 microcontroller */
#define EM_68HC08	71	/* Motorola MC68HC08 microcontroller */
#define EM_68HC05	72	/* Motorola MC68HC05 microcontroller */
#define EM_SVX		73	/* Silicon Graphics SVx */
#define EM_ST19		74	/* STMicroelectronics ST19 8 bit mc */
#define EM_VAX		75	/* Digital VAX */
#define EM_CRIS		76	/* Axis Communications 32-bit emb.proc */
#define EM_JAVELIN	77	/* Infineon Technologies 32-bit emb.proc */
#define EM_FIREPATH	78	/* Element 14 64-bit DSP Processor */
#define EM_ZSP		79	/* LSI Logic 16-bit DSP Processor */
#define EM_MMIX		80	/* Donald Knuth's educational 64-bit proc */
#define EM_HUANY	81	/* Harvard University machine-independent object files */
#define EM_PRISM	82	/* SiTera Prism */
#define EM_AVR		83	/* Atmel AVR 8-bit microcontroller */
#define EM_FR30		84	/* Fujitsu FR30 */
#define EM_D10V		85	/* Mitsubishi D10V */
#define EM_D30V		86	/* Mitsubishi D30V */
#define EM_V850		87	/* NEC v850 */
#define EM_M32R		88	/* Mitsubishi M32R */
#define EM_MN10300	89	/* Matsushita MN10300 */
#define EM_MN10200	90	/* Matsushita MN10200 */
#define EM_PJ		91	/* picoJava */
#define EM_OPENRISC	92	/* OpenRISC 32-bit embedded processor */
#define EM_ARC_COMPACT	93	/* ARC International ARCompact */
#define EM_XTENSA	94	/* Tensilica Xtensa Architecture */
#define EM_VIDEOCORE	95	/* Alphamosaic VideoCore */
#define EM_TMM_GPP	96	/* Thompson Multimedia General Purpose Proc */
#define EM_NS32K	97	/* National Semi. 32000 */
#define EM_TPC		98	/* Tenor Network TPC */
#define EM_SNP1K	99	/* Trebia SNP 1000 */
#define EM_ST200	100	/* STMicroelectronics ST200 */
#define EM_IP2K		101	/* Ubicom IP2xxx */
#define EM_MAX		102	/* MAX processor */
#define EM_CR		103	/* National Semi. CompactRISC */
#define EM_F2MC16	104	/* Fujitsu F2MC16 */
#define EM_MSP430	105	/* Texas Instruments msp430 */
#define EM_BLACKFIN	106	/* Analog Devices Blackfin DSP */
#define EM_SE_C33	107	/* Seiko Epson S1C33 family */
#define EM_SEP		108	/* Sharp embedded microprocessor */
#define EM_ARCA		109	/* Arca RISC */
#define EM_UNICORE	110	/* PKU-Unity & MPRC Peking Uni. mc series */
#define EM_EXCESS	111	/* eXcess configurable cpu */
#define EM_DXP		112	/* Icera Semi. Deep Execution Processor */
#define EM_ALTERA_NIOS2 113	/* Altera Nios II */
#define EM_CRX		114	/* National Semi. CompactRISC CRX */
#define EM_XGATE	115	/* Motorola XGATE */
#define EM_C166		116	/* Infineon C16x/XC16x */
#define EM_M16C		117	/* Renesas M16C */
#define EM_DSPIC30F	118	/* Microchip Technology dsPIC30F */
#define EM_CE		119	/* Freescale Communication Engine RISC */
#define EM_M32C		120	/* Renesas M32C */
				/* reserved 121-130 */
#define EM_TSK3000	131	/* Altium TSK3000 */
#define EM_RS08		132	/* Freescale RS08 */
#define EM_SHARC	133	/* Analog Devices SHARC family */
#define EM_ECOG2	134	/* Cyan Technology eCOG2 */
#define EM_SCORE7	135	/* Sunplus S+core7 RISC */
#define EM_DSP24	136	/* New Japan Radio (NJR) 24-bit DSP */
#define EM_VIDEOCORE3	137	/* Broadcom VideoCore III */
#define EM_LATTICEMICO32 138	/* RISC for Lattice FPGA */
#define EM_SE_C17	139	/* Seiko Epson C17 */
#define EM_TI_C6000	140	/* Texas Instruments TMS320C6000 DSP */
#define EM_TI_C2000	141	/* Texas Instruments TMS320C2000 DSP */
#define EM_TI_C5500	142	/* Texas Instruments TMS320C55x DSP */
#define EM_TI_ARP32	143	/* Texas Instruments App. Specific RISC */
#define EM_TI_PRU	144	/* Texas Instruments Prog. Realtime Unit */
				/* reserved 145-159 */
#define EM_MMDSP_PLUS	160	/* STMicroelectronics 64bit VLIW DSP */
#define EM_CYPRESS_M8C	161	/* Cypress M8C */
#define EM_R32C		162	/* Renesas R32C */
#define EM_TRIMEDIA	163	/* NXP Semi. TriMedia */
#define EM_QDSP6	164	/* QUALCOMM DSP6 */
#define EM_8051		165	/* Intel 8051 and variants */
#define EM_STXP7X	166	/* STMicroelectronics STxP7x */
#define EM_NDS32	167	/* Andes Tech. compact code emb. RISC */
#define EM_ECOG1X	168	/* Cyan Technology eCOG1X */
#define EM_MAXQ30	169	/* Dallas Semi. MAXQ30 mc */
#define EM_XIMO16	170	/* New Japan Radio (NJR) 16-bit DSP */
#define EM_MANIK	171	/* M2000 Reconfigurable RISC */
#define EM_CRAYNV2	172	/* Cray NV2 vector architecture */
#define EM_RX		173	/* Renesas RX */
#define EM_METAG	174	/* Imagination Tech. META */
#define EM_MCST_ELBRUS	175	/* MCST Elbrus */
#define EM_ECOG16	176	/* Cyan Technology eCOG16 */
#define EM_CR16		177	/* National Semi. CompactRISC CR16 */
#define EM_ETPU		178	/* Freescale Extended Time Processing Unit */
#define EM_SLE9X	179	/* Infineon Tech. SLE9X */
#define EM_L10M		180	/* Intel L10M */
#define EM_K10M		181	/* Intel K10M */
				/* reserved 182 */
#define EM_AARCH64	183	/* ARM AARCH64 */
				/* reserved 184 */
#define EM_AVR32	185	/* Atmel 32-bit microprocessor */
#define EM_STM8		186	/* STMicroelectronics STM8 */
#define EM_TILE64	187	/* Tilera TILE64 */
#define EM_TILEPRO	188	/* Tilera TILEPro */
#define EM_MICROBLAZE	189	/* Xilinx MicroBlaze */
#define EM_CUDA		190	/* NVIDIA CUDA */
#define EM_TILEGX	191	/* Tilera TILE-Gx */
#define EM_CLOUDSHIELD	192	/* CloudShield */
#define EM_COREA_1ST	193	/* KIPO-KAIST Core-A 1st gen. */
#define EM_COREA_2ND	194	/* KIPO-KAIST Core-A 2nd gen. */
#define EM_ARC_COMPACT2	195	/* Synopsys ARCompact V2 */
#define EM_OPEN8	196	/* Open8 RISC */
#define EM_RL78		197	/* Renesas RL78 */
#define EM_VIDEOCORE5	198	/* Broadcom VideoCore V */
#define EM_78KOR	199	/* Renesas 78KOR */
#define EM_56800EX	200	/* Freescale 56800EX DSC */
#define EM_BA1		201	/* Beyond BA1 */
#define EM_BA2		202	/* Beyond BA2 */
#define EM_XCORE	203	/* XMOS xCORE */
#define EM_MCHP_PIC	204	/* Microchip 8-bit PIC(r) */
				/* reserved 205-209 */
#define EM_KM32		210	/* KM211 KM32 */
#define EM_KMX32	211	/* KM211 KMX32 */
#define EM_EMX16	212	/* KM211 KMX16 */
#define EM_EMX8		213	/* KM211 KMX8 */
#define EM_KVARC	214	/* KM211 KVARC */
#define EM_CDP		215	/* Paneve CDP */
#define EM_COGE		216	/* Cognitive Smart Memory Processor */
#define EM_COOL		217	/* Bluechip CoolEngine */
#define EM_NORC		218	/* Nanoradio Optimized RISC */
#define EM_CSR_KALIMBA	219	/* CSR Kalimba */
#define EM_Z80		220	/* Zilog Z80 */
#define EM_VISIUM	221	/* Controls and Data Services VISIUMcore */
#define EM_FT32		222	/* FTDI Chip FT32 */
#define EM_MOXIE	223	/* Moxie processor */
#define EM_AMDGPU	224	/* AMD GPU */
				/* reserved 225-242 */
#define EM_RISCV	243	/* RISC-V */

#define EM_BPF		247	/* Linux BPF -- in-kernel virtual machine */
#define EM_CSKY		252     /* C-SKY */

#define EM_NUM		253

/* Old spellings/synonyms.  */

#define EM_ARC_A5	EM_ARC_COMPACT

/* If it is necessary to assign new unofficial EM_* values, please
   pick large random numbers (0x8523, 0xa7f2, etc.) to minimize the
   chances of collision with official or non-GNU unofficial values.  */

#define EM_ALPHA	0x9026

-> e_version defines:

#define EV_NONE		0		/* Invalid ELF version */
#define EV_CURRENT	1		/* Current version */
#define EV_NUM		2

2. Program Header Table (Segment Header Table)

The Program Header Table provides the system with the information it needs to load and execute a program. Each entry describes a segment, or other information the system needs to prepare the program for execution; loadable segments are mapped into memory before the program runs. An entry records the segment's type, file offset, virtual address, physical address, file size, memory size, flags, and alignment.

  • Location:
    • The Program Header Table's location within an ELF file is specified by the ELF header, which resides at the very beginning of the file. The ELF header contains an offset and a count of entries for the Program Header Table.
  • ELF Header Fields Relevant to Program Headers:
    • e_phoff: This field in the ELF header specifies the offset of the Program Header Table in the file.
    • e_phentsize: This field specifies the size of each entry in the Program Header Table.
    • e_phnum: This field specifies the number of entries in the Program Header Table.

Structure of the Program Header Table

The Program Header Table is an array of program headers, each defined by the Elf32_Phdr structure for 32-bit ELF files and by the Elf64_Phdr structure for 64-bit ELF files. Each entry contains the following fields, shown here with their Elf32_Phdr offsets and sizes:

Offset  Field     Size (bytes)  Description
0x00    p_type    4             Segment type
0x04    p_offset  4             Segment file offset
0x08    p_vaddr   4             Segment virtual address
0x0C    p_paddr   4             Segment physical address (unused in many systems)
0x10    p_filesz  4             Size of segment in the file
0x14    p_memsz   4             Size of segment in memory
0x18    p_flags   4             Segment flags
0x1C    p_align   4             Segment alignment

1 ELF Program Header for 32-bit (Elf32_Phdr):

typedef struct {
    uint32_t p_type;   // Segment type
    uint32_t p_offset; // Segment file offset
    uint32_t p_vaddr;  // Segment virtual address
    uint32_t p_paddr;  // Segment physical address
    uint32_t p_filesz; // Segment size in file
    uint32_t p_memsz;  // Segment size in memory
    uint32_t p_flags;  // Segment flags
    uint32_t p_align;  // Segment alignment
} Elf32_Phdr;

2 ELF Program Header for 64-bit (Elf64_Phdr):

typedef struct {
    uint32_t   p_type;   // Segment type
    uint32_t   p_flags;  // Segment flags
    uint64_t   p_offset; // Segment file offset
    uint64_t   p_vaddr;  // Segment virtual address
    uint64_t   p_paddr;  // Segment physical address
    uint64_t   p_filesz; // Segment size in file
    uint64_t   p_memsz;  // Segment size in memory
    uint64_t   p_align;  // Segment alignment
} Elf64_Phdr;
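
The two layouts differ in more than field width: in the 64-bit structure p_flags sits directly after p_type, so the 8-byte fields stay naturally aligned without padding. The following sketch checks the resulting sizes and offsets at compile time. Phdr32 and Phdr64 are local stand-in names chosen for this example so that it does not depend on a system header.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for Elf32_Phdr and Elf64_Phdr (hypothetical names). */
typedef struct {
    uint32_t p_type, p_offset, p_vaddr, p_paddr;
    uint32_t p_filesz, p_memsz, p_flags, p_align;
} Phdr32;

typedef struct {
    uint32_t p_type, p_flags;
    uint64_t p_offset, p_vaddr, p_paddr;
    uint64_t p_filesz, p_memsz, p_align;
} Phdr64;

/* Compile-time checks (C11): entry sizes and the position of p_flags. */
_Static_assert(sizeof(Phdr32) == 32, "32-bit program header is 32 bytes");
_Static_assert(sizeof(Phdr64) == 56, "64-bit program header is 56 bytes");
_Static_assert(offsetof(Phdr32, p_flags) == 0x18, "p_flags near the end");
_Static_assert(offsetof(Phdr64, p_flags) == 0x04, "p_flags after p_type");
```

These sizes are the values a well-formed file would normally carry in e_phentsize.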

Explanation of Program Header Fields (offsets and sizes are for the 32-bit Elf32_Phdr; in Elf64_Phdr, p_flags moves to offset 0x04 and the offset, address, size, and alignment fields widen to 8 bytes):

  • p_type (Offset 0x00, Size 4 bytes):
    • Identifies the type of the segment, which determines how the rest of the entry is interpreted (e.g., loadable segment, dynamic linking information, program interpreter information).
    • Common values include:
      • PT_NULL (0): Unused entry.
      • PT_LOAD (1): Loadable segment. This is the most common type.
      • PT_DYNAMIC (2): Dynamic linking information.
      • PT_INTERP (3): Path of the program interpreter.
      • PT_NOTE (4): Auxiliary information.
      • PT_SHLIB (5): Reserved.
      • PT_PHDR (6): Location and size of the program header table itself.
  • p_offset (Offset 0x04, Size 4 bytes):
    • Offset from the start of the file to the first byte of the segment.
  • p_vaddr (Offset 0x08, Size 4 bytes):
    • Virtual address at which the first byte of the segment should be placed in memory.
  • p_paddr (Offset 0x0C, Size 4 bytes):
    • Physical address of the segment, for systems where physical addressing is relevant.
    • Only meaningful on certain architectures and environments; usually, it’s the same as p_vaddr.
  • p_filesz (Offset 0x10, Size 4 bytes):
    • Size of the segment in the file, in bytes.
  • p_memsz (Offset 0x14, Size 4 bytes):
    • Size of the segment in memory, in bytes. It can be larger than p_filesz if the segment includes zero-initialized data; the extra bytes are filled with zeros.
  • p_flags (Offset 0x18, Size 4 bytes):
    • Segment-dependent flags (e.g., PF_R, PF_W, PF_X for read, write, and execute permissions).
      • PF_X (1): Execute permission.
      • PF_W (2): Write permission.
      • PF_R (4): Read permission.
  • p_align (Offset 0x1C, Size 4 bytes):
    • Alignment of the segment in memory and in the file. Must be zero or a power of two, and p_vaddr must be congruent to p_offset modulo p_align.
    • 0 and 1 mean no alignment is required.
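
To make the flag bits concrete, here is a small sketch that turns a p_flags value into the three-character form shown by readelf ("R E", "RW "). The function name pflags_to_string is an assumption made for this example; the PF_* values are the ones listed above.

```c
#include <stdint.h>

#define PF_X 0x1 /* execute */
#define PF_W 0x2 /* write */
#define PF_R 0x4 /* read */

/* Writes three flag characters plus a terminating NUL into out.
 * A missing permission is shown as a space. */
void pflags_to_string(uint32_t p_flags, char out[4])
{
    out[0] = (p_flags & PF_R) ? 'R' : ' ';
    out[1] = (p_flags & PF_W) ? 'W' : ' ';
    out[2] = (p_flags & PF_X) ? 'E' : ' ';
    out[3] = '\0';
}
```

A value of 0x5 (PF_R | PF_X) yields "R E", and 0x6 (PF_R | PF_W) yields "RW ".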

Example of ELF Executable File With Segments:

To better understand the content of the Program Header Table, let’s break down a few possible entries:

Example: Segment 1: Loadable Segment (Code Segment)

p_type:         1  (PT_LOAD)
p_offset:       0x1000
p_vaddr:        0x80481000
p_paddr:        0x80481000
p_filesz:       0x1000
p_memsz:        0x2000
p_flags:        0x5  (PF_R | PF_X)
p_align:        0x1000
  • p_type (1): This segment is loadable.
  • p_offset (0x1000): This segment starts at offset 0x1000 in the file.
  • p_vaddr (0x80481000): The segment should be loaded at virtual address 0x80481000.
  • p_paddr (0x80481000): The segment should be loaded at physical address 0x80481000.
  • p_filesz (0x1000): The segment occupies 0x1000 bytes in the file.
  • p_memsz (0x2000): The segment occupies 0x2000 bytes in memory.
  • p_flags (0x5): The segment is readable and executable.
  • p_align (0x1000): The segment must be aligned on a 0x1000-byte boundary.
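
Two things can be computed directly from this entry: how many bytes the loader has to zero-fill, and whether the alignment rule holds. The sketch below does both; zero_fill_bytes and alignment_ok are hypothetical helper names used only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bytes that exist only in memory and must be set to zero. */
uint32_t zero_fill_bytes(uint32_t p_filesz, uint32_t p_memsz)
{
    return p_memsz > p_filesz ? p_memsz - p_filesz : 0;
}

/* For p_align > 1, p_vaddr and p_offset must be congruent
 * modulo p_align. Values 0 and 1 impose no constraint. */
bool alignment_ok(uint32_t p_offset, uint32_t p_vaddr, uint32_t p_align)
{
    if (p_align <= 1)
        return true;
    return (p_vaddr % p_align) == (p_offset % p_align);
}
```

For the entry above, 0x2000 - 0x1000 = 0x1000 bytes are zero-filled, and both 0x1000 and 0x80481000 leave remainder 0 when divided by 0x1000, so the alignment rule is satisfied.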

Example: Segment 2: Loadable Segment (Data Segment)

Type: PT_LOAD
Flags: PF_W (Writable), PF_R (Readable)
Offset in File: 0x0000000000000C00
Virtual Address (VADDR): 0x0000000000600000
Physical Address (PADDR): 0x0000000000600000
File Size: 0x0000000000000400 (1024 bytes)
Memory Size: 0x0000000000000400 (1024 bytes)
Alignment: 0x0000000000000000

Viewing Program Header Table with readelf:

readelf -l file

This command displays the file type, the entry point, and the program headers. Here is an example of what the program header table might look like:

Elf file type is EXEC (Executable file)
Entry point 0x401020
There are 9 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                 0x0000000000000644 0x0000000000000644  R E    200000
  LOAD           0x0000000000001000 0x0000000000601000 0x0000000000601000
                 0x0000000000000224 0x0000000000000224  RW     200000
  DYNAMIC        0x0000000000001028 0x0000000000601028 0x0000000000601028
                 0x00000000000001d0 0x00000000000001d0  RW     8
  NOTE           0x0000000000000064 0x0000000000400064 0x0000000000400064
                 0x0000000000000020 0x0000000000000020  R      4
  GNU_EH_FRAME   0x0000000000000514 0x0000000000400514 0x0000000000400514
                 0x000000000000004c 0x000000000000004c  R      4
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     8
  GNU_RELRO      0x0000000000001000 0x0000000000601000 0x0000000000601000
                 0x00000000000000f0 0x000000000

3. Section Header Table

The section header table describes the various sections that make up the ELF file, which hold various types of data used during linking.

Sections are used during linking and have no meaning at runtime.

The Section Header Table is an array of section header entries, each corresponding to a section in the ELF file. The table itself is located at a file offset specified in the ELF header. The number of entries and the size of each entry are also specified in the ELF header.

Each entry in the table describes a single section:

  • sh_name: Section name (offset in the section header string table).
  • sh_type: Section type.
  • sh_flags: Section flags.
  • sh_addr: Virtual address of the section in memory (if applicable).
  • sh_offset: Offset of the section in the file image.
  • sh_size: Size of the section.
  • sh_link: Section index link.
  • sh_info: Extra information about the section.
  • sh_addralign: Section alignment.
  • sh_entsize: Size of entries, if the section has a table.
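
Because sh_name is an offset rather than a string, resolving a section name means indexing into the section header string table. A minimal sketch, assuming the table has already been read into memory (section_name is a hypothetical helper, not a library function):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns the name stored at offset sh_name in the string table, or
 * NULL if the offset is out of range or the name is not terminated. */
const char *section_name(const char *shstrtab, size_t shstrtab_size,
                         uint32_t sh_name)
{
    if (sh_name >= shstrtab_size)
        return NULL;
    if (memchr(shstrtab + sh_name, '\0', shstrtab_size - sh_name) == NULL)
        return NULL;
    return shstrtab + sh_name;
}
```

With a string table containing "\0.text\0.data\0", offset 1 resolves to ".text" and offset 7 to ".data". The bounds checks matter when parsing files that may be malformed.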

Structure of the Section Header Table:

The section header table consists of an array of section headers, each defined by the Elf32_Shdr structure for 32-bit ELF files and the Elf64_Shdr structure for 64-bit ELF files.

1 ELF Section Header for 32-bit (Elf32_Shdr):

Each entry is 40 bytes in size.

typedef struct {
    uint32_t sh_name;      // Section name (string table index)
    uint32_t sh_type;      // Section type
    uint32_t sh_flags;     // Section flags
    uint32_t sh_addr;      // Section virtual address at execution
    uint32_t sh_offset;    // Section file offset
    uint32_t sh_size;      // Section size in bytes
    uint32_t sh_link;      // Link to another section
    uint32_t sh_info;      // Additional section information
    uint32_t sh_addralign; // Section alignment
    uint32_t sh_entsize;   // Entry size if section holds table
} Elf32_Shdr;

2 ELF Section Header for 64-bit (Elf64_Shdr):

typedef struct {
    uint32_t   sh_name;      // Section name (string table index)
    uint32_t   sh_type;      // Section type
    uint64_t   sh_flags;     // Section flags
    uint64_t   sh_addr;      // Section virtual address at execution
    uint64_t   sh_offset;    // Section file offset
    uint64_t   sh_size;      // Section size in bytes
    uint32_t   sh_link;      // Link to another section
    uint32_t   sh_info;      // Additional section information
    uint64_t   sh_addralign; // Section alignment
    uint64_t   sh_entsize;   // Entry size if section holds table
} Elf64_Shdr;

-: Explanation of Section Header Fields :-

32-bit Section Header Table

Each entry is 40 bytes in size and includes the following fields:

  1. sh_name (4 bytes):
    1. Index into the section header string table, which provides the name of the section.
  2. sh_type (4 bytes):
    1. Identifies the type of the section (e.g., SHT_PROGBITS, SHT_SYMTAB, SHT_STRTAB).
      • SHT_NULL: Unused section table entry.
      • SHT_PROGBITS: Program data (such as machine instructions or constants).
      • SHT_SYMTAB: Static symbol table.
      • SHT_STRTAB: String table.
      • SHT_RELA: Relocation entries with addends.
      • SHT_HASH: Symbol hash table.
      • SHT_DYNAMIC: Dynamic linking information.
      • SHT_NOTE: Notes.
      • SHT_NOBITS: Uninitialized data (occupies no space in the file).
      • SHT_REL: Relocation entries without addends.
      • SHT_SHLIB: Reserved.
      • SHT_DYNSYM: Symbol table used by the dynamic linker.
  3. sh_flags (4 bytes):
    1. Section attributes (e.g., SHF_WRITE, SHF_ALLOC, SHF_EXECINSTR).
      • SHF_WRITE: Writable at runtime.
      • SHF_ALLOC: The section will be loaded to virtual memory at runtime.
      • SHF_EXECINSTR: Contains executable instructions.
  4. sh_addr (4 bytes):
    1. Virtual address of the section in memory when the ELF file is loaded.
  5. sh_offset (4 bytes):
    1. Offset of the section in the file image.
  6. sh_size (4 bytes):
    1. Size of the section in bytes.
  7. sh_link (4 bytes):
    1. Section index link. Interpretation depends on the section type.
  8. sh_info (4 bytes):
    1. Additional section information. Interpretation depends on the section type.
  9. sh_addralign (4 bytes):
    1. Required alignment of the section. Must be a power of two.
  10. sh_entsize (4 bytes):
    1. Size of each entry if the section contains a table of fixed-size entries. Otherwise, this field is zero.

64-bit Section Header Table Entry

Each entry is 64 bytes in size and includes the following fields:

  • sh_name (4 bytes): An index into the section header string table, which gives the name of the section.
  • sh_type (4 bytes): The type of the section, such as SHT_PROGBITS, SHT_SYMTAB, etc.
  • sh_flags (8 bytes): Section attributes, such as SHF_WRITE, SHF_ALLOC, SHF_EXECINSTR, etc.
  • sh_addr (8 bytes): The virtual address of the section in memory (for sections that are to be loaded).
  • sh_offset (8 bytes): The offset of the section in the file image.
  • sh_size (8 bytes): The size of the section in bytes.
  • sh_link (4 bytes): Link to another section, depending on the type.
  • sh_info (4 bytes): Additional section information, depending on the type.
  • sh_addralign (8 bytes): The required alignment of the section.
  • sh_entsize (8 bytes): The size of each entry, if the section holds a table of fixed-size entries.

Inspecting the Section Header Table:

To view the section header table of an ELF file, you can use the readelf utility:

readelf -S file

Sample output (shown in the single-line wide format that readelf prints with the -W option):

Section Headers:
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
  [ 1] .interp           PROGBITS        0000000000000238 000238 00001c 00   A  0   0  1
  [ 2] .note.ABI-tag     NOTE            0000000000000254 000254 000020 00   A  0   0  4
  [ 3] .gnu.hash         GNU_HASH        0000000000000274 000274 000024 00   A  4   0  4
  [ 4] .dynsym           DYNSYM          0000000000000298 000298 0000b0 18   A  5   1  8
  [ 5] .dynstr           STRTAB          0000000000000348 000348 00007f 00   A  0   0  1
  [ 6] .gnu.version      VERSYM          00000000000003c8 0003c8 00000e 02   A  4   0  2
  [ 7] .gnu.version_r    VERNEED         00000000000003d8 0003d8 000020 00   A  5   1  8
  [ 8] .rela.dyn         RELA            00000000000003f8 0003f8 000030 18   A  4   0  8
  [ 9] .rela.plt         RELA            0000000000000428 000428 0000c0 18  AI  4  23  8
  [10] .init             PROGBITS        00000000000004e8 0004e8 00001a 00  AX  0   0  4
  [11] .plt              PROGBITS        0000000000000508 000508 000060 10  AX  0   0  8
  [12] .text             PROGBITS        0000000000000570 000570 000293 00  AX  0   0 16
  [13] .fini             PROGBITS        0000000000000804 000804 000009 00  AX  0   0  4
  [14] .rodata           PROGBITS        0000000000000810 000810 00008c 00   A  0   0  4
  [15] .eh_frame_hdr     PROGBITS        000000000000089c 00089c 00003c 00   A  0   0  4
  [16] .eh_frame         PROGBITS        00000000000008d8 0008d8 00010c 00   A  0   0  8
  [17] .init_array       INIT_ARRAY      0000000000001cf0 001cf0 000010 08  WA  0   0  8
  [18] .fini_array       FINI_ARRAY      0000000000001d00 001d00 000008 08  WA  0   0  8
  [19] .dynamic          DYNAMIC         0000000000001d08 001d08 0001d0 10  WA  5   0  8
  [20] .got              PROGBITS        0000000000001ee0 001ee0 000028 08  WA  0   0  8
  [21] .got.plt          PROGBITS        0000000000001f08 001f08 000038 08  WA  0   0  8
  [22] .data             PROGBITS        0000000000001f40 001f40 000014 00  WA  0   0  4
  [23] .bss              NOBITS          0000000000002000 002000 000008 00  WA  0   0  1
  [24] .comment          PROGBITS        0000000000000000 002000 000028 00      0   0  1
  [25] .debug_aranges    PROGBITS        0000000000000000 002028 000030 00      0   0  1
  [26] .debug_info       PROGBITS        0000000000000000 002058 000190 00      0   0  1
  [27] .debug_abbrev     PROGBITS        0000000000000000 0021e8 0000a7 00      0   0  1
  [28] .debug_line       PROGBITS        0000000000000000 002290 00005a 00      0   0  1
  [29] .shstrtab         STRTAB          0000000000000000 0022ea 0000b3 00      0   0  1
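
One detail visible in this output is that .bss (type NOBITS) and .comment share file offset 0x2000: a NOBITS section has a size in memory but contributes no bytes to the file. A small sketch of that rule, using hypothetical helper names:

```c
#include <stdint.h>

#define SHT_NOBITS 8 /* section occupies no space in the file */

/* Number of bytes the section actually occupies in the file. */
uint64_t section_file_bytes(uint32_t sh_type, uint64_t sh_size)
{
    return sh_type == SHT_NOBITS ? 0 : sh_size;
}

/* File offset just past the section's data. */
uint64_t section_file_end(uint32_t sh_type, uint64_t sh_offset,
                          uint64_t sh_size)
{
    return sh_offset + section_file_bytes(sh_type, sh_size);
}
```

Applied to the listing, .bss ends at file offset 0x2000 (no bytes consumed), while .comment (PROGBITS, offset 0x2000, size 0x28) ends at 0x2028, which is where .debug_aranges begins.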

4. Sections

Each ELF file can contain zero or more sections. Sections are used at link time.

The first entry in the section header table of every ELF file is defined by the ELF standard to be a NULL entry. The type of the entry is SHT_NULL, and all fields in the section header are zeroed out.

Sections:

  • .init: Executable code that performs initialization tasks and must run before any other code in the binary (so it has the SHF_EXECINSTR flag). The system executes the code in the .init section before control reaches the program's main function.
  • .fini: The counterpart of .init. It holds executable code that must run after the main program completes.
  • .text: Where the main code of the program resides (so it has the SHF_EXECINSTR flag). Its type is SHT_PROGBITS because its contents are defined by the program.
  • .bss: Contains uninitialized data (type SHT_NOBITS). It occupies no space on disk; the memory is zero-filled when the program is loaded. It is writable.
  • .data: Initialized program data (type SHT_PROGBITS). It is writable.
  • .rodata: Read-only data, such as hardcoded strings passed to printf. Data that must be writable goes in .data instead.
  • .plt: Stands for Procedure Linkage Table. It contains code used for dynamic linking that helps call external functions from shared libraries with the help of the GOT (Global Offset Table).
  • .got.plt: A table where the resolved addresses of external functions are stored. It is writable by default because lazy binding is the default. If the LD_BIND_NOW environment variable is set, all imported functions are resolved at program startup; with full RELRO (Relocation Read-Only), the table is also made read-only once they are resolved.
  • .rel.*: Contains information about how parts of an ELF object or process image need to be fixed up or modified at link time or runtime, without explicit addends (type SHT_REL).
  • .rela.*: Contains the same kind of relocation information, with an explicit addend in each entry (type SHT_RELA).
  • .dynamic: Dynamic linking structures and objects. Contains a table of ElfN_Dyn structures, including pointers to other important information required by the dynamic linker (for instance, the dynamic string table, dynamic symbol table, .got.plt section, and dynamic relocation section, pointed to by tags of type DT_STRTAB, DT_SYMTAB, DT_PLTGOT, and DT_RELA, respectively).
  • .init_array: Contains an array of pointers to functions to use as constructors (each of these functions is called in turn when the binary is initialized). In gcc, you can mark functions in your C source files as constructors by decorating them with __attribute__((constructor)). By default, there is an entry in .init_array for executing frame_dummy.
  • .fini_array: Contains an array of pointers to functions to use as destructors.
  • .shstrtab: An array of NUL-terminated strings that contain the names of all the sections in the binary.
  • .symtab: Contains a symbol table, which is a table of ElfN_Sym structures, each of which associates a symbolic name with a piece of code or data elsewhere in the binary, such as a function or variable.
  • .strtab: Contains the strings that hold the symbolic names. These strings are pointed to by the ElfN_Sym structures.
  • .dynsym: Same as .symtab but contains symbols needed for dynamic linking rather than static linking.
  • .dynstr: Same as .strtab but contains strings needed for dynamic linking rather than static linking.
  • .interp: Holds the path of the program interpreter (the runtime dynamic linker, RTLD) as a NUL-terminated string.
  • .rel.dyn: Relocation table for data references, such as global variables.
  • .rel.plt: Relocation table for imported functions called through the PLT.

Older gcc version sections:

  • .ctors: Equivalent of .init_array produced by older versions of gcc.
  • .dtors: Equivalent of .fini_array produced by older versions of gcc.

Sections are the building blocks of ELF files. They include:

  • .text: Contains the executable code.
  • .data: Contains initialized global and static variables.
  • .bss: Contains uninitialized global and static variables (only size information is stored).
  • .rodata: Contains read-only data, such as string literals.
  • .symtab: A symbol table that includes all symbols in the file.
  • .strtab: A string table containing the names of symbols.
  • .rel.text or .rela.text: Relocation information for the .text section.
  • .dynamic: Dynamic linking information.
  • .got: Global Offset Table.
  • .plt: Procedure Linkage Table.
  • .comment: Contains version control information.
  • .note: Contains additional notes and metadata.
  • .shstrtab: Section header string table, which holds the names of sections.

5. Segments

Each ELF file can contain zero or more segments.

In everyday English, "section" and "segment" are near synonyms. In ELF, however, they describe two completely different things.

  • Segments are used at runtime.

Segments are built from one or more sections and are described by the program header table, which the loader uses to map the executable into memory. They include:

  • Loadable Segments: Contain code and data that need to be loaded into memory.
  • Dynamic Linking Information: Contains information needed by the dynamic linker.
  • Interpreter Information: Specifies the program interpreter (e.g., /lib64/ld-linux-x86-64.so.2).
  • Note Segments: Used to store additional information.
  • TLS (Thread-Local Storage) Segments: Contains data for thread-local storage.
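
Since a segment groups sections by address range, a simplified way to decide whether a section belongs to a segment is a containment test on virtual addresses. This is only a sketch under that simplification: real tools also consider file offsets, the SHF_ALLOC flag, and special cases such as TLS. The function name section_in_segment is an assumption for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if [sh_addr, sh_addr + sh_size) lies entirely inside
 * [p_vaddr, p_vaddr + p_memsz). */
bool section_in_segment(uint64_t sh_addr, uint64_t sh_size,
                        uint64_t p_vaddr, uint64_t p_memsz)
{
    return sh_addr >= p_vaddr &&
           sh_addr + sh_size <= p_vaddr + p_memsz;
}
```

For instance, a section at 0x400570 of size 0x100 falls inside a segment covering 0x400000 to 0x401000, while a section at 0x601000 does not.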

Two Views of ELF File

[Figure: the two views of an ELF file]

ELF File Types

There are several types of ELF files, each serving a specific purpose during compilation, linking, and execution.

ELF files can be categorized into different types based on their usage:

  1. Executable Files: Contain the code and data necessary to execute a program.
    1. It contains a program that can be executed directly by the operating system.
    2. This file type has a program header table that specifies how segments are to be loaded into memory.
  2. Relocatable File:
    1. Contains code and data that can be linked with other object files to create an executable or a shared object file.
    2. This file type is not directly executable. It contains sections like .text, .data, and .bss, which are not yet assigned to specific memory addresses.
  3. Shared Object Files: Libraries that can be dynamically linked during program execution.
    1. Shared libraries (.so files) are dynamically linked at runtime by executables or other shared objects.
  4. Core Dumps: Files created when a program crashes, containing the memory image of the process at the time of the crash.
    1. Used for debugging purposes to analyze the state of the program at the time of the crash.
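
The file type is recorded in the e_type field of the ELF header. The sketch below maps the standard values to the short names that readelf prints; elf_type_name is a hypothetical helper written for this example. Note that position-independent executables are also marked as type DYN.

```c
#include <stdint.h>

/* Maps an e_type value to a short name. */
const char *elf_type_name(uint16_t e_type)
{
    switch (e_type) {
    case 0:  return "NONE"; /* ET_NONE: no file type */
    case 1:  return "REL";  /* ET_REL: relocatable file */
    case 2:  return "EXEC"; /* ET_EXEC: executable file */
    case 3:  return "DYN";  /* ET_DYN: shared object */
    case 4:  return "CORE"; /* ET_CORE: core dump */
    default: return "UNKNOWN";
    }
}
```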

-: Relocatable Type :-

Purpose: Relocatable files contain code and data that need to be linked with other object files to create an executable or a shared object file. They are produced by the assembler from the source code and used by the linker to produce the final output file.

Characteristics:

  • Not directly executable: They need to be linked to form an executable file.
  • Contains sections: These files are divided into sections, each serving a specific purpose, such as holding code, data, or metadata.

Compilation and Linking Process:

  1. Compilation: Source code is compiled into ELF relocatable files by the compiler. Each source file normally produces one relocatable file.
  2. Linking: The linker takes multiple relocatable files and combines them into a single executable or shared object file. During this process, the linker resolves symbol references and performs relocation, adjusting addresses based on where sections are placed in the final output.

Note:

ELF relocatable files (.o files) are not directly executable. They are designed to be linked together with other object files by a linker to produce an executable file or a shared object file.

Why ELF Relocatable Files Are Not Executable:

  1. Lack of Entry Point:
    • Relocatable files do not have an entry point. The entry point is the address where the execution starts, and it's defined in the ELF header of executable files. Since relocatable files are intended to be linked with other files, they do not specify a single entry point.
  2. Relocation Needed:
    • Relocatable files contain symbols and references that need to be resolved. These references are placeholders that the linker must fill with the correct addresses. Until this linking process is completed, the code in relocatable files cannot be executed because it contains unresolved symbols and addresses.
  3. Incomplete Program:
    • A single relocatable file typically represents only a part of a program. It may contain a few functions or data definitions, but not the entire logic needed to perform a task. Executable files, on the other hand, are complete programs that include all the necessary code and data.

Loading and Executing Code from Relocatable Files:

To execute code from relocatable files, they must go through a linking process. Here's how it works:

  1. Compilation:
    • Source code is compiled into one or more relocatable object files (.o). Each object file contains machine code, data, and metadata, but with unresolved references.
  2. Linking:
    • A linker combines these relocatable files with other object files and libraries to produce an executable file or a shared library. The linker performs several tasks:
      • Resolving Symbols: The linker resolves references to symbols (functions, variables) by determining their final memory addresses.
      • Relocation: The linker adjusts addresses in the code and data sections to reflect their actual positions in memory.
      • Creating Executable: The linker generates an ELF executable file that includes an entry point and can be loaded and executed by the operating system.
  3. Execution:
    • The operating system's loader loads the executable file into memory, sets up the execution environment, and starts executing the program from the entry point.

ELF File Lifecycle

  1. Compilation: Source code is compiled into ELF object files. Each source file is transformed into an object file with machine code and data.
  2. Linking: Object files and libraries are linked to form an executable or shared object. The ELF format supports both static and dynamic linking. During linking, the linker processes the ELF sections and headers to resolve symbols, perform relocations, combine the necessary sections, and organize the final executable layout.
  3. Execution: When an ELF executable is loaded into memory, the operating system's loader reads the ELF header to determine the layout of the program. It uses the program header table to load the necessary segments into memory, sets up the memory image according to the specified virtual addresses, and finally transfers control to the entry point specified in the ELF header.

Tools for Working with ELF Files

Several tools are available for working with ELF files:

1 readelf:

Displays information about ELF files.

readelf -a filename

2 objdump:

Displays detailed information about object files, including disassembly.

objdump -d filename

3 nm:

Lists symbols from object files.

nm filename

4 ld:

The GNU linker for linking object files into executables or libraries.

ld -o output input.o

5 strip:

Removes symbols and other unnecessary information from ELF files.

strip filename

6 gdb:

The GNU Debugger for debugging ELF executables.

Process of Parsing ELF Kernel

1 Load the ELF File:

  • Read the ELF file from the disk into a designated memory area.
    • This can be done using either BIOS services or a disk interface driver.

2 Parse the ELF File:

  • Verify the ELF magic number.
    • The first 4 bytes of the loaded image must be 0x7F, 'E', 'L', 'F'. Read as a single little-endian 32-bit value, that is 0x464C457F.

load_kernel:
    mov esi, 0x1000     ; Address of loaded kernel
    mov edi, 0x200000   ; Kernel load address in memory (2MB)

    ; Verify ELF magic number
    cmp dword [esi], 0x464C457F  ; Check for 0x7F 'E' 'L' 'F'
    jne invalid_elf
  • Locate the program header table using the offset provided in the ELF header.

    ; Read ELF header fields (32-bit ELF layout)
    mov eax, [esi + 0x1C]        ; e_phoff (program header table offset)
    add eax, esi                 ; Convert to absolute address
    mov ebx, eax                 ; ebx = program header table address
    movzx ecx, word [esi + 0x2C] ; e_phnum is a 16-bit field (number of program headers)
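
For comparison, the same parsing step can be sketched in C. The field offsets (0x1C for e_phoff, 0x2C for e_phnum) are those of the 32-bit ELF header; the function and helper names are assumptions for this example, and the caller is assumed to pass a buffer of at least 52 bytes holding a little-endian file.

```c
#include <stdint.h>
#include <string.h>

/* Read little-endian values byte by byte, independent of host order. */
static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 and fills e_phoff and e_phnum on success, or -1 if the
 * image does not start with the ELF magic number. */
int parse_elf32_header(const uint8_t *image,
                       uint32_t *e_phoff, uint16_t *e_phnum)
{
    static const uint8_t magic[4] = { 0x7F, 'E', 'L', 'F' };

    if (memcmp(image, magic, sizeof magic) != 0)
        return -1;
    *e_phoff = read_le32(image + 0x1C);
    *e_phnum = read_le16(image + 0x2C); /* 16-bit field */
    return 0;
}
```

Reading e_phnum as 16 bits matters because the bytes that follow it belong to the next header field (e_shentsize).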

3 Load Segments into Memory:

  • Iterate through the program headers and load each PT_LOAD segment into its designated memory location.

load_segments:
    ; Read each program header
    cmp dword [ebx], 1    ; p_type == PT_LOAD?
    jne next_segment

    ; rep movsb/stosb overwrite esi, edi and ecx, so save them first
    ; (assumes a valid stack has been set up)
    push esi              ; ELF image base
    push edi              ; Load base address
    push ecx              ; Remaining program headers

    mov edx, [ebx + 0x14] ; p_memsz
    mov ecx, [ebx + 0x10] ; p_filesz = number of bytes to copy
    sub edx, ecx          ; edx = p_memsz - p_filesz (bytes to zero)

    add esi, [ebx + 4]    ; Source = image base + p_offset
    add edi, [ebx + 8]    ; Destination = load base + p_vaddr
    cld                   ; Copy forward
    rep movsb             ; Copy bytes from source to destination

    ; Zero out remaining memory if p_memsz > p_filesz
    mov ecx, edx          ; Number of bytes to zero (may be 0)
    xor eax, eax
    rep stosb             ; edi already points just past the copied data

    pop ecx
    pop edi
    pop esi

next_segment:
    ; Move to the next program header
    add ebx, 0x20         ; Size of one program header (32 bytes)
    loop load_segments
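
The core of the loop body, copying p_filesz bytes and zero-filling the rest up to p_memsz, can also be expressed in a few lines of C. This is a sketch rather than a complete loader: load_segment is a hypothetical name, and the caller is assumed to have validated the sizes and to own the destination memory.

```c
#include <stdint.h>
#include <string.h>

/* Loads one PT_LOAD segment: copies the file-backed bytes, then
 * zero-fills the remainder of the in-memory size. `dest` is the
 * address where the segment should live. */
void load_segment(uint8_t *dest, const uint8_t *image,
                  uint32_t p_offset, uint32_t p_filesz, uint32_t p_memsz)
{
    memcpy(dest, image + p_offset, p_filesz);
    if (p_memsz > p_filesz)
        memset(dest + p_filesz, 0, p_memsz - p_filesz);
}
```

The zero-fill step is what gives uninitialized data its initial value of zero even though those bytes are never stored in the file.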

4 Transfer Control to the Kernel

  • Jump to the kernel's entry point as specified in the ELF header.

    ; After loading all segments, jump to the kernel entry point
    mov eax, [esi + 0x18] ; e_entry
    add eax, edi          ; Entry point address + Load base address
    jmp eax               ; Jump to the kernel's entry point

invalid_elf:
    ; Handle invalid ELF file
    hlt
    jmp invalid_elf