Set Kernel Output as ELF

Up to this point, our kernel was built as a flat binary file. We now switch its output to an ELF executable targeting the 32-bit x86 architecture. Below is the updated linker.ld script.

OUTPUT_FORMAT(elf32-i386)
ENTRY(start)
phys = 0x00100000;
SECTIONS
{
	.text phys : AT(phys) {
		code = .;
		*(.text)
		*(.rodata)
		. = ALIGN(4096);
	}
	.data : AT(phys + (data - code))
	{
		data = .;
		*(.data)
		. = ALIGN(4096);
	}
	.bss : AT(phys + (bss - code))
	{
		bss = .;
		*(.bss)
		. = ALIGN(4096);
	}
	end = .;
	/DISCARD/ :
	{
		*(.comment)
		*(.eh_frame)
		*(.note.gnu.build-id)
	}
}
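Once the kernel is linked with this script, the output file starts with an ELF header instead of raw code. As a rough sanity check, the sketch below inspects the first bytes of a file image and reports whether they look like a 32-bit little-endian x86 ELF header. The helper name is_elf32_i386 and the enum names are hypothetical and not part of the kernel; the offsets and values come from the ELF format (magic bytes 0x7F 'E' 'L' 'F', class 1 for 32-bit, data encoding 1 for little-endian, machine 3 for Intel 80386).

```c
#include <stddef.h>
#include <stdint.h>

/* Field offsets and values taken from the ELF format. */
enum {
    EI_CLASS_OFFSET   = 4,  /* e_ident[EI_CLASS]: 1 means 32-bit objects */
    EI_DATA_OFFSET    = 5,  /* e_ident[EI_DATA]: 1 means little-endian */
    E_MACHINE_OFFSET  = 18, /* 16-bit e_machine field after e_ident and e_type */
    ELFCLASS32_VALUE  = 1,
    ELFDATA2LSB_VALUE = 1,
    EM_386_VALUE      = 3
};

/*
 * Hypothetical helper: returns 1 if buf begins with an ELF32
 * little-endian i386 header, 0 otherwise.
 */
int is_elf32_i386(const uint8_t *buf, size_t len)
{
    /* Need at least e_ident (16 bytes), e_type (2) and e_machine (2). */
    if (buf == NULL || len < 20)
        return 0;

    /* Magic number: 0x7F followed by "ELF". */
    if (buf[0] != 0x7F || buf[1] != 'E' || buf[2] != 'L' || buf[3] != 'F')
        return 0;

    if (buf[EI_CLASS_OFFSET] != ELFCLASS32_VALUE)
        return 0;
    if (buf[EI_DATA_OFFSET] != ELFDATA2LSB_VALUE)
        return 0;

    /* e_machine is stored little-endian in an ELFDATA2LSB file. */
    uint16_t machine = (uint16_t)(buf[E_MACHINE_OFFSET] |
                                  (buf[E_MACHINE_OFFSET + 1] << 8));
    return machine == EM_386_VALUE;
}
```

A flat binary produced by the earlier build would fail the magic-number check, which makes this a quick way to confirm that the new output format took effect.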

1 OUTPUT_FORMAT(elf32-i386):

  • Specifies the output format of the executable as ELF for the 32-bit x86 architecture.

2 ENTRY(start):

  • Sets the program's entry point to the symbol start. The linker records the address of start in the ELF header, and execution of the program begins there.

3 phys = 0x00100000;:

  • Defines a symbol phys with the value 0x00100000 (1 MiB), the physical address at which the kernel will be loaded into memory. This is commonly referred to as the base address.

4 SECTIONS:

  • Marks the beginning of the section definitions.

5 .text phys : AT(phys):

  • Defines the .text output section. The phys after the section name sets its virtual (run-time) address, and AT(phys) sets its load address to the same value. The code = .; line defines the symbol code at the start of the section. The *(.text) and *(.rodata) directives pull in all .text and .rodata sections from the input files. The . = ALIGN(4096); directive advances the location counter to the next 4096-byte boundary (it stays where it is if already aligned), so the section that follows starts on a page boundary.

6 .data : AT(phys + (data - code)):

  • Defines the .data output section with a load address of phys plus the offset of the data symbol from the code symbol. Because code equals phys in this script, the load address works out to the same value as the section's virtual address. The data = .; line defines the symbol data at the start of the section. The *(.data) directive pulls in all .data sections from the input files. The . = ALIGN(4096); directive advances the location counter to the next 4096-byte boundary.

7 .bss : AT(phys + (bss - code)):

  • Defines the .bss output section with a load address of phys plus the offset of the bss symbol from the code symbol. Because .bss holds uninitialized data, it normally occupies memory at run time but no space in the output file. The bss = .; line defines the symbol bss at the start of the section. The *(.bss) directive pulls in all .bss sections from the input files. The . = ALIGN(4096); directive advances the location counter to the next 4096-byte boundary.

8 end = .;:

  • Defines a symbol end with the value of the current location counter, marking the end of all sections.

9 /DISCARD/:

  • /DISCARD/ is a special output section name: any input sections assigned to it are left out of the output file. In this case, it discards the .comment, .eh_frame, and .note.gnu.build-id sections.
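The address arithmetic in the script can be checked by hand. The sketch below mirrors two expressions from it: ALIGN(4096), which rounds the location counter up to a multiple of 4096, and AT(phys + (data - code)), which derives a section's load address. The function names align_up and load_address are hypothetical, and the 0x1234-byte code size used in the note that follows is an assumed figure chosen only for illustration.

```c
#include <stdint.h>

/*
 * Round addr up to the next multiple of align, which must be a power
 * of two. An address that is already aligned is returned unchanged,
 * matching the behaviour of ALIGN() in the linker script.
 */
uint32_t align_up(uint32_t addr, uint32_t align)
{
    return (addr + align - 1) & ~(align - 1);
}

/*
 * Load address of a section, following the script's pattern
 * AT(phys + (section_start - code)).
 */
uint32_t load_address(uint32_t phys, uint32_t section_start, uint32_t code)
{
    return phys + (section_start - code);
}
```

With phys = 0x00100000 and an assumed 0x1234 bytes of code and read-only data, .text ends at 0x00101234, so align_up(0x00101234, 4096) puts the start of .data at 0x00102000. Since code equals phys, load_address(0x00100000, 0x00102000, 0x00100000) is also 0x00102000: load and virtual addresses coincide throughout this script.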
