Flat Binary Tiny C Kernel

In the last chapter we wrote the 32-bit kernel_entry code that prints the welcome message. Now it's time to take it further by calling C code from kernel_entry.asm.

The main challenge:

  • We don't have an ELF loader yet, so the C and assembly code together must end up as a single flat binary.

There are different ways of achieving this. We will explore each of them.

1️⃣ First Way

With kernel/kernel_entry.asm:

The first thing to note is that our kernel entry point is the kernel_entry.asm file, which is written in assembly and which we already wrote in the last chapter. There, its output format was flat binary:

nasm -f bin kernel_entry.asm -o build/kernel_entry.o

In this chapter we introduce a tiny C kernel that is called from this kernel_entry file.

Problem: If kernel_entry needs a function defined in another assembly file, we can pull it in by including that file (%include ___.inc). This time, however, the function is written in C, so it lives in a separate object file. How do we call a function that is defined in another object file, whether that file comes from C/C++ or assembly? The function will exist after linking, but that is a later phase; the call already fails in the first phase, assembling.

image-164.png

This is the kind of error we get when calling a function that is not declared in the current file, even if it is defined in another object file that we plan to link in later.

Solution: When a function is defined in another object file that will be linked together with ours, the extern keyword tells the assembler to leave the reference unresolved and let the linker resolve it at link time. It is like saying: yes, the function is defined, but in another object file that will be linked in. Resolving the reference is the linker's job. So we have to declare k_main as extern before calling it in kernel_entry.asm:

extern k_main		; extern declaration

... other code

call k_main			; calling the extern function defined in the kernel.c file.

And here we encountered another problem which is:

image-165.png

Problem: The output format of our kernel_entry file is binary, which doesn't support external references.

Solution: We have to change the kernel_entry output format to ELF (Executable and Linkable Format), because ELF supports external references.

Change it as follows:

nasm -f elf kernel_entry.asm -o build/kernel_entry.elf

Note: In NASM, elf means elf32 by default, so we can use either elf or elf32, as we haven't moved our kernel_entry code to 64-bit.

Problem: We have changed the output format to ELF, but there is still a problem: the origin (org) directive.

image-173.png

Solution: The ELF output format does not support NASM's org directive, because the final addresses of an object file are assigned by the linker, not the assembler. We have to comment out the org directive. After doing so we are ready.

Now we have the kernel_entry.asm output format as elf.

Here is the complete code of it:

;; kernel_entry.asm
;; The origin directive is no longer needed with the ELF output format.
; org 0xb000               ; Set the origin address for the code. This tells the assembler
                         ; that the code will be loaded at memory address 0xB000,
                         ; so all jmp targets and string offsets are
                         ; calculated based on it.
;; The org directive is invalid in the ELF output format;
;; it applies only to the binary output format.


BITS 32                  ; Specify that the code is 32-bit.

kernel_entry:            ; Label for the kernel entry point.

jmp start		; jmp after the includes

;; Include files
%include "print32.inc"
extern k_main
start:
	;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
	;; Clear the Screen
	call ClearScreen32
	;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;


	;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
	;; Print Welcome Message
	mov esi, sKernelWelcomeStatement
	mov bl, YELLOW		; Foreground color = Yellow
	mov bh, BLACK		; Background color = Black
	call PrintString32
	;;
	;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
	
	
	;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
	;; Call the C kernel
	call k_main
	;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
	

jmp $		; Infinite loop to halt execution after printing the message.


;; Put data in second sector
times 512 - ($ - $$) db 0

sKernelWelcomeStatement: db 'Welcome to Flat Binary 32-Bit Kernel Land.', 0
                         ; Define the welcome message string, terminated by a null byte (0).

times 1024 - ($ - $$) db 0	; Fill the rest of the 1 KB (1024 bytes) space with zeros.

Here is how we can inspect our built elf file:

readelf -h kernel_entry.elf

This command prints the header of the ELF file.

If it is a valid ELF file, it shows output like the following:

image-174.png

kernel/kernel.c:

Now we have to compile the kernel.c file. We will use gcc with -ffreestanding, which tells the compiler that the code runs in a freestanding environment, where the standard library and the usual program startup may not be available. That is what we want, since we are not building a program for the host system. Later in the series we will use a cross-compiler that targets a particular processor independently of the host. But since our kernel is small, the host's compiler without any external libraries is enough for now.

Here is our code for this file:

// kernel.c

// C kernel entry.
void k_main(void){

	// VGA text-mode buffer; volatile so the compiler keeps every write.
	volatile unsigned char* mem = (volatile unsigned char*) 0xb8000;

// Print BIN
	mem[0] ='B';
	mem[1] = 0x07;

	mem[2] ='I';
	mem[3] = 0x07;

	mem[4] ='N';
	mem[5] = 0x07;

// Infinite loop
	while(1){}

}
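
The listing above relies on the layout of the VGA text buffer at 0xb8000: every screen cell takes two bytes, the character first and an attribute byte second, with the foreground color in the low nibble and the background in the high nibble (0x07 is light grey on black). The following is a minimal sketch of that arithmetic, assuming the standard 80-column text mode. The names VGA_COLS, vga_attr, vga_offset and vga_put are hypothetical helpers made up for illustration; they are not part of the chapter's source code.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: standard 80-column VGA text mode. */
#define VGA_COLS 80u

/* Attribute byte: background in the high nibble, foreground in the low nibble. */
uint8_t vga_attr(uint8_t fg, uint8_t bg)
{
    return (uint8_t)(((bg & 0x0Fu) << 4) | (fg & 0x0Fu));
}

/* Byte offset of the cell at (row, col); each cell occupies two bytes. */
size_t vga_offset(unsigned row, unsigned col)
{
    return ((size_t)row * VGA_COLS + col) * 2u;
}

/* Write one cell. In the kernel, buf would be (volatile uint8_t *)0xB8000;
   on a host machine any sufficiently large array can stand in for it. */
void vga_put(volatile uint8_t *buf, unsigned row, unsigned col,
             char ch, uint8_t attr)
{
    size_t off = vga_offset(row, col);
    buf[off]     = (uint8_t)ch;  /* character byte */
    buf[off + 1] = attr;         /* attribute byte */
}
```

With these helpers, the three writes in k_main correspond to vga_put(mem, 0, 0, 'B', 0x07), vga_put(mem, 0, 1, 'I', 0x07) and vga_put(mem, 0, 2, 'N', 0x07).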

Compile it as follows:

gcc -m32 -ffreestanding -c kernel/kernel.c -o build/kernel_main.elf

Inspect the generated output file:

# print the header of the elf file
readelf -h kernel_main.elf
image-175.png

Now both kernel_entry and kernel_main are in ELF format.

Link them together:

Since we have two ELF object files, we need to link them into a single ELF file. We will use ld for this.

ld -m elf_i386 -Ttext 0xB000 -o kernel.elf kernel_entry.elf kernel_main.elf

We can use the linker script as well but not now.

Here is the full explanation of the options used here:

  • -m elf_i386 = This option selects the linker emulation, so that ld produces 32-bit x86 ELF (elf32) output.
  • -Ttext 0xB000 = This option sets the starting address of the text (code, .text) section in the output file. It has to match the address at which stage 2 loads the kernel.
  • -o kernel.elf = Output file name.
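
To see why the -Ttext value matters, here is a small sketch of the address arithmetic involved. It assumes stage 2 loads the kernel in real mode at segment:offset 0x0000:0xB000, and that the .text section of kernel_entry.elf is placed first in the image because it is listed first on the ld command line. The function names are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Real-mode physical address: segment * 16 + offset. */
uint32_t real_mode_phys(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/* Simplified model of linking: the address of a symbol is the base passed
   to -Ttext plus the symbol's offset inside the .text section. */
uint32_t linked_address(uint32_t text_base, uint32_t offset_in_text)
{
    return text_base + offset_in_text;
}
```

Under these assumptions, real_mode_phys(0x0000, 0xB000) gives 0xB000, the same value passed to -Ttext, and a string placed 512 bytes into kernel_entry (such as sKernelWelcomeStatement after the first padding) would be linked at 0xB200. If the load address and the link address differed, absolute references like that one would point to the wrong memory.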

Note:

The generated kernel.elf file is quite large for so little code. Its size is 13.3 KB, which is too much for barely 50 to 60 lines of code. That's why we need another way, and we have one.

image-168.png
  • Here, we have the complete kernel in ELF format. We could load it from stage 2 of our bootloader, but stage 2 cannot interpret an ELF file: the start address and the sizes are stored in its headers, and we have no ELF loader to parse them. So we need to convert it to a flat binary, which contains the code exactly as it will sit in memory, with no headers and no section tables.

Convert elf to Flat Binary:

We will use the objcopy tool which can convert one file format into another.

You can get more information about it here: https://thejat.in/learn/objcopy-utility

objcopy -O binary kernel.elf kernel.bin

This will convert kernel.elf into kernel.bin.
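
One quick way to picture the difference between the two files: an ELF file always begins with the four magic bytes 0x7F 'E' 'L' 'F', followed by the rest of the header, whereas a flat binary begins directly with the first machine instruction. The sketch below shows such a check on a byte buffer. The function name looks_like_elf is hypothetical, and this is only an illustration of the idea, not how readelf or objcopy are implemented.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if the buffer starts with the 4-byte ELF magic, 0 otherwise. */
int looks_like_elf(const unsigned char *buf, size_t len)
{
    static const unsigned char magic[4] = { 0x7F, 'E', 'L', 'F' };

    /* A buffer shorter than the magic cannot be an ELF file. */
    if (len < sizeof magic)
        return 0;

    return memcmp(buf, magic, sizeof magic) == 0;
}
```

Applied to the first bytes of kernel.elf this check would succeed, while for kernel.bin it would be expected to fail, since that file starts with the code of kernel_entry.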

Modify the stage2 to load the complete kernel

With the introduction of C code our kernel has become a bit heavier; earlier it was just 1 KB. Now we have to check the size of the output kernel.bin file and load that many sectors.

image-170.png

Here, we can see that our kernel.bin is 12.3 KB. Rounding up to 13 KB means we have to load 26 sectors in all, since one sector is 512 bytes (13 × 1024 / 512 = 26).
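
The same calculation can be written as a small helper that rounds any partial sector up. This is a minimal sketch assuming 512-byte sectors; SECTOR_SIZE and sectors_needed are hypothetical names introduced only for this example.

```c
#include <stdint.h>

/* Assumption: 512-byte sectors, as used throughout this chapter. */
#define SECTOR_SIZE 512u

/* Number of sectors needed to hold image_bytes, rounding a partial
   sector up to a whole one. Not meant for sizes close to UINT32_MAX. */
uint32_t sectors_needed(uint32_t image_bytes)
{
    return (image_bytes + SECTOR_SIZE - 1u) / SECTOR_SIZE;
}
```

For the rounded size of 13 KB (13312 bytes) this gives 26, and for the earlier 1 KB kernel it gives 2. Feeding in the exact byte count of kernel.bin instead of the rounded figure may give a slightly smaller sector count.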

Let's modify the stage 2 code to load 26 sectors.

	;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
	;; Load Flat Kernel (26 sectors = 13 KB) at location 0xB000
	xor eax, eax		; Clear out the eax
	mov esi, eax		; Clear out the esi
	mov ax, MEMLOCATION_KERNEL_LOAD_SEGMENT	; 0x0000
	mov es, ax		; Set es to 0x0000
	mov bx, MEMLOCATION_KERNEL_LOAD_OFFSET	; 0xB000
	mov eax, 59		; Starting sector low 32 bit (0-indexed LBA)
	mov esi, 0		; Starting sector high 32 bit
	mov ecx, 26		; Sector count
	mov edx, 512		; Sector sizes in bytes
	call ReadFromDiskUsingExtendedBIOSFunction

Output:

image-171.png

Source Code:

The Source code is hosted here at: https://github.com/The-Jat/TheTaaJ/tree/126ac27707479dd751ad79b944e5cdfbbab1ca22

2️⃣ Second Way - Use LD to Generate Binary Output

Another way is to have ld generate the binary output file directly instead of an ELF file. This way we don't need the objcopy utility to convert ELF to binary.

ld -m elf_i386 -Ttext 0xB000 --oformat binary -o kernel.bin kernel_entry.elf kernel_main.elf

Notice the change:

  • --oformat binary: It tells the linker to output a raw binary file instead of an ELF or other formatted executable.

After doing so we run into a small problem:

image-176.png

The error undefined reference to _GLOBAL_OFFSET_TABLE_ typically arises when code compiled as a position-independent executable (PIE) is linked into a raw binary file with --oformat binary. Many distributions configure gcc to build PIE code by default, and on 32-bit x86 such code refers to the global offset table. PIE relies on relocation tables, which are not compatible with raw binary output.

Solution: Compile the kernel_main.c file with the -fno-pie option, so that gcc generates ordinary position-dependent code:

gcc -m32 -fno-pie -ffreestanding -c kernel_main.c -o kernel_main.elf

Note:

This method is also more space efficient than the former one. There, the kernel image came to 12.3 KB, while here the binary written directly by ld is 4.1 KB, which is a significant difference.

Output:

image-177.png

Source Code:

You can see the complete source code here: https://github.com/The-Jat/TheTaaJ/tree/e3a0bf57bea65366edfa70e773575480c2817e7f