When writing a bootloader in x86 assembly, you may come across the instruction jmp 0:start at the beginning of the code. This seemingly simple instruction plays a critical role in making the bootloader work reliably across different BIOS implementations.

Recap: Real Mode Memory Addressing

In real mode (the mode in which the CPU starts when powered on), memory addresses are calculated using segment:offset addressing. The physical address is computed as:

Physical Address = (Segment Register * 16) + Offset

For instruction fetches, the segment and offset come from two registers:

  • CS (Code Segment): Determines the segment for code execution.
  • IP (Instruction Pointer): Holds the offset within the code segment.

For example:

  • If CS = 0x0000 and IP = 0x7C00, the physical address is:

    0x0000 * 16 + 0x7C00 = 0x7C00
  • If CS = 0x07C0 and IP = 0x0000, the physical address is:

    0x07C0 * 16 + 0x0000 = 0x7C00

Both segment:offset pairs point to the same physical address (0x7C00), but the segment and offset values are different.

The Problem with BIOS Behavior

When the BIOS loads your bootloader, it typically places the bootloader at the physical address 0x7C00. However, the BIOS does not guarantee how it sets the CS and IP registers. Different BIOS implementations may behave differently:

  1. Some BIOSes set CS = 0x0000 and IP = 0x7C00.
  2. Others set CS = 0x07C0 and IP = 0x0000.

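While both configurations reach the same physical address (0x7C00), they disagree about the offset of everything in your code. With CS = 0x0000 the first instruction sits at offset 0x7C00; with CS = 0x07C0 it sits at offset 0x0000. Relative jumps and calls work either way, but anything that combines an absolute offset with CS (an indirect jump through a register, a far pointer built from CS, or a DS value copied from CS) resolves to the wrong physical address if the code was assembled for the other convention, leading to crashes or unpredictable behavior.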
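As a sketch of the kind of code that is sensitive to this, the fragment below performs an absolute near jump through a register. It is an illustration only, not part of the bootloader discussed in this article, and the label target is hypothetical.

```nasm
[BITS 16]
[ORG 0x7C00]             ; Labels are assembled as offsets from segment 0

    ; No far jump here, so CS is whatever the BIOS chose.
    mov ax, target       ; AX = 0x7C05, an absolute offset (hypothetical label)
    jmp ax               ; Near absolute jump: IP = AX, CS unchanged
                         ;   CS = 0x0000 -> 0x0000*16 + 0x7C05 = 0x7C05 (correct)
                         ;   CS = 0x07C0 -> 0x07C0*16 + 0x7C05 = 0xF805 (wrong)

target:
    cli                  ; Stop here so the outcome is easy to observe
.hang:
    hlt
    jmp .hang

times 510-($-$$) db 0    ; Pad to 510 bytes
dw 0xAA55                ; Boot signature
```

A plain jmp target would assemble to a relative jump and work under either convention, which is why this class of bug can stay hidden until the code uses an absolute address.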

The Role of jmp 0:start

The instruction jmp 0:start is a far jump that explicitly sets the CS register to 0 and the IP register to the offset of the label start. This ensures that:

  1. CS is set to 0x0000.
  2. Execution continues at the correct offset (start).

By doing this, you guarantee that your code runs with CS = 0x0000, regardless of how the BIOS initialized the segment registers.

How It Works

Here's a breakdown of the instruction:

jmp 0:start
  • 0: Specifies the new value for CS (0x0000).
  • start: Specifies the new value for IP (the offset of the start label).

After executing this instruction:

  • CS = 0x0000
  • IP = start

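This gives the code a known CS that matches the offsets the assembler generated, so absolute offsets resolved against CS are correct. The jump does not touch DS, ES, or SS; those registers still have to be initialized explicitly, as the example bootloader does.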
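To make the two operands concrete, here is a minimal sketch that writes the same far jump out by hand as raw bytes. It assumes 16-bit code assembled with NASM at origin 0x7C00, where the direct far jump is 5 bytes long (opcode 0xEA, then the 16-bit offset, then the 16-bit segment), so start lands at offset 0x7C05.

```nasm
[BITS 16]
[ORG 0x7C00]

    ; Hand-encoded equivalent of: jmp 0:start
    db 0xEA              ; Opcode for a direct far jump (JMP ptr16:16)
    dw start             ; New IP: offset of start (0x7C05 with this ORG)
    dw 0x0000            ; New CS

start:
    cli                  ; Nothing else to do in this sketch: stop here
.hang:
    hlt
    jmp .hang

times 510-($-$$) db 0    ; Pad to 510 bytes
dw 0xAA55                ; Boot signature
```

The offset is stored before the segment, so the bytes in memory are EA 05 7C 00 00. In real code you would simply write jmp 0:start and let the assembler produce this encoding.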

Example Bootloader Code

Here's an example of a bootloader that uses jmp 0:start:

[BITS 16]                ; 16-bit real mode
[ORG 0x7C00]             ; Bootloader origin (BIOS loads Stage 1 at 0x7C00)

; Set CS to 0 and jump to start
jmp 0:start              ; Far jump to set CS to 0 and IP to start

start:
    ; Initialize segment registers
    xor ax, ax           ; Clear AX
    mov ds, ax           ; Set DS to 0
    mov es, ax           ; Set ES to 0

    ; Set up stack
    mov ss, ax           ; Set SS to 0
    mov sp, 0x7C00       ; Stack grows downward from 0x7C00

    ; Print "Hello, World!"
    mov si, msg_hello
    call print_string

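    ; Halt the system
    cli                  ; Disable maskable interrupts so they cannot wake HLT
.hang:
    hlt
    jmp .hang            ; If the CPU wakes anyway, halt again instead of falling into print_string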

; Function to print a string
print_string:
    cld                  ; Clear direction flag so LODSB advances SI
    mov ah, 0x0E         ; BIOS teletype function
    xor bh, bh           ; Display page 0
.print_char:
    lodsb                ; Load next character from DS:SI into AL
    cmp al, 0            ; Check for null terminator
    je .done             ; If null, done
    int 0x10             ; Print character
    jmp .print_char      ; Repeat for next character
.done:
    ret

; Data
msg_hello db 'Hello, World!', 0

; Bootloader padding and signature
times 510-($-$$) db 0    ; Pad to 510 bytes
dw 0xAA55                ; Boot signature (0xAA55)

What Happens Without jmp 0:start?

If you omit the jmp 0:start instruction, your bootloader might work on some systems but fail on others. Here’s why:

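  1. Incorrect Memory References:
    • If the BIOS sets CS = 0x07C0 but your code was assembled for CS = 0x0000, any address derived from CS is calculated incorrectly.
    • For example, if the code copies CS into DS (mov ax, cs followed by mov ds, ax), then mov si, msg_hello followed by lodsb reads from 0x07C0:msg_hello, which is 0x7C00 bytes past the real string. The example bootloader avoids this particular failure only because it sets DS to 0 explicitly.
  2. ORG Directive Mismatch:
    • The [ORG 0x7C00] directive tells the assembler that the code will be loaded at offset 0x7C00 within its segment.
    • If CS is 0x07C0 instead, the code actually starts at offset 0x0000, so every absolute label offset is 0x7C00 too large when combined with CS. Relative jumps and calls are unaffected, which is why such a bootloader can appear to work until it uses an absolute address.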
  3. Unreliable Behavior:
    • Your bootloader might work on one machine but fail on another, making it non-portable.
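The data-side failure described above can be sketched as follows. This is a deliberately fragile fragment for illustration, assuming NASM and a BIOS that may start the code with either CS value; the labels hang and msg are hypothetical.

```nasm
[BITS 16]
[ORG 0x7C00]

    ; Deliberately no jmp 0:start.
    mov ax, cs
    mov ds, ax           ; DS = CS: 0x0000 or 0x07C0, depending on the BIOS
    cld                  ; Make LODSB advance SI
    mov si, msg          ; SI = offset of msg, computed for segment 0
    lodsb                ; AL = byte at DS:SI
                         ;   DS = 0x0000 -> reads the first byte of msg
                         ;   DS = 0x07C0 -> reads 0x7C00 bytes past msg

    cli
hang:
    hlt
    jmp hang

msg db 'Hi', 0           ; Hypothetical data label

times 510-($-$$) db 0    ; Pad to 510 bytes
dw 0xAA55                ; Boot signature
```

Starting with jmp 0:start, or loading DS with an explicit 0 instead of CS, removes the dependency on the BIOS's choice.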