When writing a bootloader in x86 assembly, you may come across the instruction jmp 0:start at the beginning of the code. This seemingly simple instruction plays a critical role in making the bootloader work reliably across different systems.
Recap: Real Mode Memory Addressing
In real mode (the mode in which the CPU starts when powered on), memory addresses are calculated using segment:offset addressing. The physical address is computed as:
Physical Address = (Segment Register * 16) + Offset
For code execution, the relevant pair is CS:IP:
- CS (Code Segment): Determines the segment for code execution.
- IP (Instruction Pointer): Holds the offset within the code segment.
For example:
- If CS = 0x0000 and IP = 0x7C00, the physical address is 0x0000 * 16 + 0x7C00 = 0x7C00.
- If CS = 0x07C0 and IP = 0x0000, the physical address is 0x07C0 * 16 + 0x0000 = 0x7C00.
Both segment:offset pairs point to the same physical address (0x7C00), but the segment and offset values are different.
The Problem with BIOS Behavior
When the BIOS loads your bootloader, it typically places it at the physical address 0x7C00. However, the BIOS does not guarantee how it sets the CS and IP registers. Different BIOS implementations may behave differently:
- Some BIOSes set CS = 0x0000 and IP = 0x7C00.
- Others set CS = 0x07C0 and IP = 0x0000.
While both configurations result in the same physical address (0x7C00), the segment:offset pair affects how your code references memory. If your code is assembled for CS = 0x0000 but the BIOS sets CS = 0x07C0, every reference that is resolved through CS (an absolute jump target, a cs: override, a far pointer built from CS) ends up 0x7C00 bytes away from where it should be, leading to crashes or unpredictable behavior.
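As a rough illustration of which references depend on CS, the NASM sketch below contrasts a relative near call with two CS-dependent forms. The labels entry, helper, target, and flag are hypothetical names chosen for this sketch. It assumes the sector is assembled for offset 0x7C00 and that the BIOS left a usable stack in SS:SP.

```nasm
[BITS 16]
[ORG 0x7C00]            ; every label below gets a 0x7C00-based offset

entry:
    call helper         ; near call: encoded as a displacement from IP, so it
                        ; works for CS:IP = 0x0000:0x7C00 and 0x07C0:0x0000
                        ; (assumes the BIOS-provided SS:SP is usable)

    mov al, [cs:flag]   ; CS-dependent: reads physical CS*16 + flag, which is
                        ; the real flag byte only when CS = 0

    mov bx, target      ; BX = absolute offset of target (0x7C00-based)
    jmp bx              ; CS-dependent: IP = BX, so execution continues at
                        ; CS*16 + BX, 0x7C00 bytes too high if CS = 0x07C0

helper:
    ret

target:
    cli                 ; stop here: interrupts off, halt in a loop
.hang:
    hlt
    jmp .hang

flag db 1               ; hypothetical data byte

times 510-($-$$) db 0   ; pad to 510 bytes
dw 0xAA55               ; boot signature
```

If the BIOS enters this sector with CS = 0x07C0, only the near call behaves as intended. The read through cs: and the jump through BX both resolve 0x7C00 bytes past their labels.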
The Role of jmp 0:start
The instruction jmp 0:start is a far jump that explicitly sets the CS register to 0 and the IP register to the offset of the label start. This ensures that:
- CS is set to 0x0000.
- Execution continues at the correct offset (start).
By doing this, you guarantee that your code runs with CS = 0x0000, regardless of how the BIOS initialized CS and IP. The jump does not touch the other segment registers (DS, ES, SS), so those still have to be set explicitly.
How It Works
Here's a breakdown of the instruction:
jmp 0:start
- 0: Specifies the new value for CS (0x0000).
- start: Specifies the new value for IP (the offset of the start label).
After executing this instruction:
- CS = 0x0000
- IP = start
This ensures that CS matches the origin the code was assembled for, so every reference resolved through CS lands on the intended address.
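To make the two operands more tangible, the sketch below spells out by hand the bytes a 16-bit direct far jump consists of: opcode 0xEA, then the 16-bit offset, then the 16-bit segment. It is meant only as an illustration of the encoding, assuming the same ORG 0x7C00 layout used in this article. In practice you would simply write jmp 0:start and let the assembler emit these bytes.

```nasm
[BITS 16]
[ORG 0x7C00]

; Hand-encoded equivalent of `jmp 0:start`
    db 0xEA             ; opcode for a direct far jump (JMP ptr16:16)
    dw start            ; new IP: offset of start (0x7C05 with this ORG,
                        ; because the jump itself occupies 5 bytes)
    dw 0x0000           ; new CS

start:
    cli                 ; nothing else to do in this sketch: halt in a loop
.hang:
    hlt
    jmp .hang

times 510-($-$$) db 0   ; pad to 510 bytes
dw 0xAA55               ; boot signature
```

The offset comes first and the segment second, which is the reverse of how the instruction is written in source (segment:offset).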
Example Bootloader Code
Here's an example of a bootloader that uses jmp 0:start:
[BITS 16] ; 16-bit real mode
[ORG 0x7C00] ; Bootloader origin (BIOS loads Stage 1 at 0x7C00)
; Set CS to 0 and jump to start
jmp 0:start ; Far jump to set CS to 0 and IP to start
start:
; Initialize segment registers
xor ax, ax ; Clear AX
mov ds, ax ; Set DS to 0
mov es, ax ; Set ES to 0
; Set up stack
mov ss, ax ; Set SS to 0
mov sp, 0x7C00 ; Stack grows downward from 0x7C00
; Print "Hello, World!"
mov si, msg_hello
call print_string
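; Halt the system: disable interrupts and loop, so a wake-up
; cannot fall through into print_string
cli
.halt:
hlt
jmp .halt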
; Function to print a string
print_string:
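cld ; Clear direction flag so lodsb increments SI
mov ah, 0x0E ; BIOS teletype function (INT 0x10, AH = 0x0E)
xor bh, bh ; Display page 0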
.print_char:
lodsb ; Load next character from SI into AL
cmp al, 0 ; Check for null terminator
je .done ; If null, done
int 0x10 ; Print character
jmp .print_char ; Repeat for next character
.done:
ret
; Data
msg_hello db 'Hello, World!', 0
; Bootloader padding and signature
times 510-($-$$) db 0 ; Pad to 510 bytes
dw 0xAA55 ; Boot signature (0xAA55)
What Happens Without jmp 0:start?
If you omit the jmp 0:start instruction, your bootloader might work on some systems but fail on others. Here’s why:
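- Incorrect Memory References:
  - If the BIOS sets CS = 0x07C0 but your code was assembled for CS = 0x0000, every reference that is resolved through CS is calculated incorrectly.
  - For example, an absolute jump such as jmp ax (with AX holding a label's offset) or a read such as mov al, [cs:msg_hello] lands 0x7C00 bytes past the intended location. Data reached through DS, like mov si, msg_hello followed by lodsb, depends on DS instead, which is why the example also sets DS to 0.
- ORG Directive Mismatch:
  - The [ORG 0x7C00] directive tells the assembler that the code will be loaded at offset 0x7C00 within its segment.
  - That is only true when the segment base is 0. If CS is 0x07C0, the code actually sits at offset 0x0000 within its segment, so the label offsets the assembler computed no longer match.
- Unreliable Behavior:
  - Your bootloader might work on one machine but fail on another, making it non-portable.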
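The ORG mismatch can also be seen from the other direction. The sketch below is a hypothetical variant, not the approach used in this article: it assembles the sector for offset 0 and normalizes CS to 0x07C0 instead of 0. The point is only that the far jump's segment and the ORG value have to agree. The label msg and the string 'OK' are placeholders chosen for this sketch.

```nasm
[BITS 16]
[ORG 0]                 ; labels are offsets from the start of the sector

    jmp 0x07C0:start    ; far jump: CS = 0x07C0, IP = offset of start

start:
    mov ax, cs          ; AX = 0x07C0
    mov ds, ax          ; DS must use the same base as the labels
    mov es, ax

    cli                 ; keep interrupts off while SS:SP is changed
    xor ax, ax
    mov ss, ax
    mov sp, 0x7C00      ; stack grows downward from physical 0x7C00
    sti

    cld                 ; lodsb should increment SI
    mov si, msg         ; DS:SI = 0x07C0:msg, physical 0x7C00 + msg
    mov ah, 0x0E        ; BIOS teletype function
    xor bh, bh          ; display page 0
.next:
    lodsb               ; AL = next character
    test al, al         ; null terminator?
    jz .hang
    int 0x10            ; print character
    jmp .next

.hang:
    hlt                 ; idle; loop again if an interrupt wakes the CPU
    jmp .hang

msg db 'OK', 0          ; placeholder message

times 510-($-$$) db 0   ; pad to 510 bytes
dw 0xAA55               ; boot signature
```

Either convention works as long as the far jump and ORG match. Mixing them (ORG 0x7C00 with CS = 0x07C0, or ORG 0 with CS = 0) reproduces the failures listed above.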